Join datasets
The Join step lets you combine the current dataset with another dataset listed in the dataHub. It brings columns from the joined (right) dataset into the current dataset, matching rows based on corresponding columns.
Step parameters
Select a dataset to join (as a right dataset)
column(string)*: Select a dataset to join as the right dataset.
Select a join type
dropdown(string)*: Choose from "left", "inner", or "left outer" join.
left: will keep every row of the current dataset and fill unmatched rows with null values.
left outer:
inner: will only keep rows that match rows of the joined dataset.
Join based on columns:
Specify one or more column pairs that will be compared to determine which rows correspond between the two datasets. The first element of a pair is the column in the current dataset, and the second is the corresponding column in the right dataset to be joined. If you specify more than one pair, rows match only when every specified pair matches between the two datasets (logical 'AND').
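The matching rule described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the product's implementation: the join function, its parameters, and the sample rows are all hypothetical. The sketch also assumes that the key columns of the right dataset are not copied into the result, and it leaves out the "left outer" type because its behavior is not described here.

```python
def join(current, right, on, join_type="left"):
    """Join two lists of dict rows (hypothetical helper, for illustration).

    `on` is a list of (current_column, right_column) pairs. A row of the
    right dataset matches only if every pair is equal (logical AND).
    """
    if join_type not in ("left", "inner"):
        raise ValueError(f"unsupported join type: {join_type}")

    # Columns brought in from the right dataset. Assumption: the right-side
    # key columns are skipped, and column name collisions are not handled.
    right_keys = {right_col for _, right_col in on}
    right_columns = []
    for row in right:
        for col in row:
            if col not in right_keys and col not in right_columns:
                right_columns.append(col)

    result = []
    for row in current:
        matches = [
            r for r in right
            if all(row.get(cur_col) == r.get(right_col) for cur_col, right_col in on)
        ]
        if matches:
            for r in matches:
                merged = dict(row)
                merged.update({col: r.get(col) for col in right_columns})
                result.append(merged)
        elif join_type == "left":
            # Unmatched row: keep it and fill the right columns with null (None).
            merged = dict(row)
            merged.update({col: None for col in right_columns})
            result.append(merged)
        # With "inner", unmatched rows are dropped.
    return result


# Hypothetical sample data, not taken from the example on this page.
employees = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Linus"},
]
salaries = [
    {"emp_id": 1, "salary": 100},
]

left_rows = join(employees, salaries, on=[("id", "emp_id")], join_type="left")
inner_rows = join(employees, salaries, on=[("id", "emp_id")], join_type="inner")
```

With this sample data, the "left" join keeps both employees and sets salary to None for the employee without a match, while the "inner" join keeps only the matched employee.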
Example
Input


Configuration
{
  "right_pipeline": "dataset_to_join",
  "type": "left",
  "on": [
    {
      "id": "emp_id"
    }
  ]
}
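As a rough sketch of how this configuration could be read programmatically, the Python snippet below parses it and extracts the column pairs. It assumes that each entry of "on" maps a column of the current dataset (the key) to the corresponding column of the right dataset (the value). That reading is inferred from the parameter description and is not a documented format. The variable names are hypothetical.

```python
import json

# The configuration shown above, as a JSON string.
config_text = """
{
  "right_pipeline": "dataset_to_join",
  "type": "left",
  "on": [
    {"id": "emp_id"}
  ]
}
"""

config = json.loads(config_text)

# Assumption: each "on" entry is {current_dataset_column: right_dataset_column}.
column_pairs = [
    (current_col, right_col)
    for entry in config["on"]
    for current_col, right_col in entry.items()
]

print(config["right_pipeline"], config["type"], column_pairs)
```

Under that assumption, the configuration asks for a left join with "dataset_to_join", matching the id column of the current dataset against emp_id in the right dataset.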
Output
