Filter rows

The Filter step includes or excludes rows from your dataset based on the conditions you specify. It is one step in a data processing pipeline and works on tabular data, that is, datasets organized in rows and columns.

Step parameters

Condition: This is where you define your filtering criteria. You can create three types of conditions:

  • Simple condition*:

    • Column(string)*: Enter the name of the column you want to filter on.

    • Operator[operators]*: Choose an operator to filter your data. Defaults to eq.

    • Value*: Enter the value to compare against (not required for isnull and notnull operators).

  • ADD CONDITION: Combine multiple simple conditions, bound by either an "AND" or an "OR" logical operator.

  • ADD GROUP (): Add a group of simple conditions, bound by either an "AND" or an "OR" logical operator. A group cannot be nested inside another group.
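
The three condition types can be sketched as plain Python dictionaries that mirror the JSON configuration used in the example on this page. The layout of a group nested inside a combination is an assumption (the page only shows a flat "OR" list), and `nesting_depth` and `is_allowed` are hypothetical helpers written for illustration, not part of the product.

```python
# A simple condition: column, operator, value.
simple = {"column": "department", "operator": "eq", "value": "IT"}

# ADD CONDITION: several simple conditions bound by one logical operator.
combined = {
    "AND": [
        simple,
        {"column": "country", "operator": "eq", "value": "Canada"},
    ]
}

# ADD GROUP: a group sitting next to a simple condition.
# Assumption: a group is serialized as a nested "AND"/"OR" object.
with_group = {
    "OR": [
        simple,
        {
            "AND": [
                {"column": "country", "operator": "eq", "value": "Canada"},
                {"column": "salary", "operator": "gt", "value": 50000},
            ]
        },
    ]
}

LOGICAL = ("AND", "OR")


def nesting_depth(cond):
    """Return 0 for a simple condition, 1 for a flat combination, 2 for one level of groups."""
    for key in LOGICAL:
        if key in cond:
            return 1 + max((nesting_depth(child) for child in cond[key]), default=0)
    return 0


def is_allowed(cond):
    """Groups may appear inside the top-level combination, but not inside another group."""
    return nesting_depth(cond) <= 2
```

Under these assumptions, a condition tree deeper than two logical levels would correspond to a group nested in another group, which the step does not allow.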

Example

Input

Filter rows - input

Configuration

{
    "condition": {
        "OR": [
            {
                "column": "department",
                "value": "IT",
                "operator": "eq"
            },
            {
                "column": "country",
                "value": "Canada",
                "operator": "eq"
            }
        ]
    }
}

Rows that meet at least one of these conditions appear in the filter's output.
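
To make the "OR" behavior concrete, here is a minimal Python sketch that applies the same condition to a few rows. The sample rows and the `row_matches` helper are hypothetical (the actual input table is not reproduced in the text), and only the eq operator is handled.

```python
# Hypothetical sample rows; the real input table is not shown in the text.
rows = [
    {"name": "Alice", "department": "IT", "country": "France"},
    {"name": "Bob", "department": "Sales", "country": "Canada"},
    {"name": "Chloe", "department": "Sales", "country": "France"},
]

# Same condition as in the configuration above.
condition = {
    "OR": [
        {"column": "department", "value": "IT", "operator": "eq"},
        {"column": "country", "value": "Canada", "operator": "eq"},
    ]
}


def row_matches(row, cond):
    """Evaluate a condition tree against one row (sketch: eq only)."""
    if "OR" in cond:
        return any(row_matches(row, child) for child in cond["OR"])
    if "AND" in cond:
        return all(row_matches(row, child) for child in cond["AND"])
    if cond["operator"] != "eq":
        raise NotImplementedError(cond["operator"])
    return row.get(cond["column"]) == cond["value"]


kept = [row for row in rows if row_matches(row, condition)]
print([row["name"] for row in kept])  # prints ['Alice', 'Bob']
```

Alice is kept because her department is IT, Bob because his country is Canada; Chloe satisfies neither condition and is dropped.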

Output

Filter - filter rows - output

[Operators]

The following operators are available for conditions:

  • eq (equals)

  • ne (does not equal)

  • gt (is greater than)

  • ge (is greater than or equal to)

  • lt (is less than)

  • le (is less than or equal to)

  • in (is one of)

  • nin (is not one of)

  • matches (matches pattern)

  • notmatches (doesn't match pattern)

  • isnull (is null)

  • notnull (is not null)

  • from (starting in/on)

  • until (ending in/on)

The matches and notmatches operators test the value against a regular expression.
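
As a rough illustration of what these operators mean, the sketch below maps most of them to Python functions. This is not the step's implementation: the `OPERATORS` table and `apply_operator` are hypothetical names, and treating matches as a search anywhere in the string (rather than a full match) is an assumption. The from and until operators are left out here, and comparisons against null cells are not handled.

```python
import operator
import re

# Hypothetical mapping from operator name to a function (cell, value) -> bool.
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
    "in": lambda cell, values: cell in values,
    "nin": lambda cell, values: cell not in values,
    # Assumption: the pattern may match anywhere in the cell text.
    "matches": lambda cell, pattern: re.search(pattern, str(cell)) is not None,
    "notmatches": lambda cell, pattern: re.search(pattern, str(cell)) is None,
    # isnull and notnull ignore the value argument.
    "isnull": lambda cell, _value: cell is None,
    "notnull": lambda cell, _value: cell is not None,
}


def apply_operator(name, cell, value=None):
    """Apply the named operator to a cell and a comparison value."""
    return OPERATORS[name](cell, value)


print(apply_operator("in", "IT", ["IT", "HR"]))        # prints True
print(apply_operator("matches", "Canada", r"^Can"))    # prints True
print(apply_operator("isnull", None))                  # prints True
```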

Values

The value depends on the selected operator (e.g., a list when used with the in operator, or null when used with the isnull operator).

Value can be:

  • a variable, or

  • a fixed value of one of the following types:

    • date,

    • string,

    • int,

    • float,

    • array

For date values, only the from (starting in/on), until (ending in/on), isnull (is null), and notnull (is not null) operators are available.
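
The date operators can be sketched in Python as follows. The helper `date_matches` is hypothetical; treating both bounds as inclusive is an assumption based on the "in/on" wording, and treating a null date as failing from and until is also an assumption.

```python
from datetime import date


def date_matches(cell, op, value=None):
    """Evaluate one of the four date operators against a date cell (sketch)."""
    if op == "isnull":
        return cell is None
    if op == "notnull":
        return cell is not None
    if cell is None:
        # Assumption: a null date never satisfies from or until.
        return False
    if op == "from":
        # Assumption: inclusive lower bound ("starting in/on").
        return cell >= value
    if op == "until":
        # Assumption: inclusive upper bound ("ending in/on").
        return cell <= value
    raise ValueError(f"operator {op!r} is not available for dates")


hired = date(2023, 6, 1)
print(date_matches(hired, "from", date(2023, 1, 1)))   # prints True
print(date_matches(hired, "until", date(2023, 1, 1)))  # prints False
```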
