# Group by

The Group by step allows you to group your data by one or more columns and perform calculations on other columns. This step is useful for summarizing data and creating reports.

### Step parameters

1. `Group rows by` **column(array)\***: Select one or more columns whose distinct value combinations define the groups. For example, you might group by "product" or "category".
2. `And aggregate...` **array(aggregation)\***: Define one or more aggregations to perform on your grouped data. For each aggregation, you need to specify:
   * `Columns` **column(array)\***: the columns to be aggregated (you can apply the same aggregation function to several columns at once)
   * `Function` **(string)\***: the aggregation function to be applied (`sum`, `avg`, `count`, `min`, or `max`)
3. `Keep Original Granularity` **(boolean)**: Select whether to keep the original granularity. If checked, the original rows are kept and the computed aggregations are added as new columns. If unchecked, the output contains only the grouped and aggregated data.
4. `Count null values like regular values` **(boolean)**: Select whether to include `null` values in the count. If checked, `null` values are counted as regular entries.
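The parameters above can be illustrated with a minimal plain-Python sketch. It is not Toucan's implementation: the `group_by` helper, its `output_column` argument, the sample rows, and the choice to skip `null` values in `sum`, `avg`, `min`, and `max` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical helper table: maps the documented function names to Python callables.
AGG_FUNCTIONS = {
    "sum": sum,
    "avg": lambda values: sum(values) / len(values),
    "min": min,
    "max": max,
}


def group_by(rows, on, column, function, output_column,
             keep_original_granularity=False, count_nulls=False):
    """Aggregate `column` per group of `on` columns (illustrative only)."""
    # Collect the values of `column` for each distinct combination of `on`.
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in on)
        groups[key].append(row[column])

    results = {}
    for key, values in groups.items():
        non_null = [v for v in values if v is not None]
        if function == "count":
            # "Count null values like regular values" decides which list is counted.
            results[key] = len(values) if count_nulls else len(non_null)
        else:
            # Assumption: the other functions ignore nulls.
            results[key] = AGG_FUNCTIONS[function](non_null) if non_null else None

    if keep_original_granularity:
        # Every original row is kept; the aggregate is added as a new column.
        return [{**row, output_column: results[tuple(row[c] for c in on)]}
                for row in rows]
    # Otherwise, one output row per group.
    return [{**dict(zip(on, key)), output_column: value}
            for key, value in results.items()]


rows = [
    {"product": "A", "sales": 10},
    {"product": "A", "sales": None},
    {"product": "B", "sales": 5},
    {"product": "A", "sales": 20},
]

summary = group_by(rows, ["product"], "sales", "sum", "sales")
# [{'product': 'A', 'sales': 30}, {'product': 'B', 'sales': 5}]

detailed = group_by(rows, ["product"], "sales", "sum", "sales_sum",
                    keep_original_granularity=True)
# Four rows, each with an extra 'sales_sum' column.

counts = group_by(rows, ["product"], "sales", "count", "sales")
counts_with_nulls = group_by(rows, ["product"], "sales", "count", "sales",
                             count_nulls=True)
# Group "A" counts 2 values without nulls and 3 with nulls.
```

With `keep_original_granularity=False` the result has one row per group; with `True` it has as many rows as the input, which is the distinction the `Keep Original Granularity` option describes.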

### Example

**Input**

<figure><img src="/files/DMOMm2LD956Zb22FPURT" alt=""><figcaption><p>Aggregate - group by input</p></figcaption></figure>

**Configuration**

```json
{
    "on": [],
    "aggregations": [
        {
            "columns": [],
            "aggfunction": ""
        },
        {
            "columns": [],
            "aggfunction": ""
        }
    ],
    "keep_original_granularity": false
}
```

**Output**

<figure><img src="/files/E22F7efzIRsZPX9qXifJ" alt=""><figcaption><p>Aggregate - group by output</p></figcaption></figure>

{% hint style="info" %}
If only one aggregation function is applied to a column, the output column keeps the same name and replaces the aggregated column.

If two or more aggregation functions are applied to the same column, the aggregated columns are named `column_name-aggfunction`.

For example, if you compute both the sum and the average of a sales column, you will get one column named `sales-sum` and another named `sales-avg`.
{% endhint %}
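The naming rule described in the hint can be sketched in Python as follows. This is only an illustration of the rule as stated on this page, not Toucan's code: the `output_column_names` helper and the `quantity` column are hypothetical, while the `columns` and `aggfunction` keys come from the configuration shown above.

```python
from collections import Counter


def output_column_names(aggregations):
    """Return the output column names implied by a list of aggregations."""
    # Count how many times each column is aggregated across all aggregations.
    usage = Counter(col for agg in aggregations for col in agg["columns"])

    names = []
    for agg in aggregations:
        for col in agg["columns"]:
            if usage[col] == 1:
                # Aggregated once: the output column keeps the original name.
                names.append(col)
            else:
                # Aggregated several times: suffix with the function name.
                names.append(f"{col}-{agg['aggfunction']}")
    return names


aggregations = [
    {"columns": ["sales"], "aggfunction": "sum"},
    {"columns": ["sales", "quantity"], "aggfunction": "avg"},
]

names = output_column_names(aggregations)
# ['sales-sum', 'sales-avg', 'quantity']
```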

