About the Summarize Tool
The Summarize Tool calculates summaries and aggregations over your data so that large datasets are easier to analyze. It provides options for grouping, sorting, and window aggregations, along with a set of aggregation functions you can apply to any field.
Configuration Options
1. Group By
Use the Group By section to group your data before applying aggregations. Click the + button to add a field to group by. To remove a field, click the trash can icon next to it.
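To illustrate what grouping does conceptually, here is a minimal Python sketch. It is not the tool's implementation; the sample rows, the field names (region, product, sales), and the helper group_rows are all hypothetical. Each field you add to Group By is assumed to narrow the buckets that rows fall into.

```python
from collections import defaultdict

# Hypothetical sample rows; field names are illustrative only.
rows = [
    {"region": "East", "product": "A", "sales": 100},
    {"region": "West", "product": "A", "sales": 80},
    {"region": "East", "product": "B", "sales": 50},
    {"region": "East", "product": "A", "sales": 70},
]

def group_rows(rows, group_fields):
    """Bucket rows by the tuple of values found in group_fields."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[field] for field in group_fields)
        groups[key].append(row)
    return dict(groups)

# One Group By field: one bucket per region.
by_region = group_rows(rows, ["region"])

# Two Group By fields: one bucket per (region, product) pair.
by_region_product = group_rows(rows, ["region", "product"])
```

With one field there are two buckets (East and West); adding a second field splits East further into (East, A) and (East, B).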
2. Window Sort
In the Window Sort section, you can add sorting rules for your grouped data. Select a field to sort by, and use the ascending/descending icons to set the order. Click the X button to remove a sort configuration.
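The effect of sort rules can be sketched in Python as follows. This is an illustration only: the sample data and the helper sort_rows are hypothetical, and it assumes that the first rule listed has the highest priority, which this page does not state.

```python
from operator import itemgetter

# Hypothetical rows that already belong to one group.
east = [
    {"day": 2, "sales": 50},
    {"day": 1, "sales": 100},
    {"day": 3, "sales": 70},
]

def sort_rows(rows, sort_rules):
    """Sort by (field, descending) rules, first rule = highest priority (assumed)."""
    ordered = list(rows)
    # Python's sort is stable, so applying the rules from lowest to
    # highest priority produces a correct multi-field ordering.
    for field, descending in reversed(sort_rules):
        ordered.sort(key=itemgetter(field), reverse=descending)
    return ordered

east_by_day = sort_rows(east, [("day", False)])         # ascending
east_by_sales_desc = sort_rows(east, [("sales", True)])  # descending
```

The order produced here matters for any aggregation that depends on row position, such as FIRST, LAST, or a window calculation.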
3. Window Aggregations
Enable Window Aggregations if you need calculations over a specific window or subset within your data. Define the window using the From and To options:
From:
First Row: Includes all rows from the start of the partition.
Current Row: Refers only to the current row.
Row Above: Includes a specified number of rows above the current row.
Row Below: Includes a specified number of rows below the current row.
Last Row: Includes all rows to the end of the partition.
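The following Python sketch shows one way to think about a window defined by a start and an end boundary. It is a hedged illustration, not the tool's behavior: the function window_sum and its parameters rows_above and rows_below are hypothetical, and the mapping of the options above onto these offsets is an assumption.

```python
def window_sum(values, rows_above=None, rows_below=0):
    """Sum the values inside a frame around each row.

    rows_above=None: frame starts at the first row of the partition.
    rows_below=None: frame extends to the last row of the partition.
    An offset of 0 means the boundary is the current row.
    """
    n = len(values)
    result = []
    for i in range(n):
        start = 0 if rows_above is None else max(0, i - rows_above)
        end = n - 1 if rows_below is None else min(n - 1, i + rows_below)
        result.append(sum(values[start:end + 1]))
    return result

# Hypothetical sorted values within one partition.
sales = [100, 50, 70, 30]

# First Row to Current Row: a running total.
running_total = window_sum(sales)

# One Row Above to one Row Below: a three-row moving sum.
moving_sum = window_sum(sales, rows_above=1, rows_below=1)

# First Row to Last Row: the partition total repeated on every row.
partition_total = window_sum(sales, rows_above=None, rows_below=None)
```

Because the frame is defined relative to row position, the result depends on the order set in Window Sort.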
4. Aggregations
To add a calculation or aggregation:
Click the + button under Aggregations.
Choose a field or use an expression from the Argument dropdown.
Select an aggregation function suitable for your data type (e.g., SUM, COUNT, AVG) and assign a new field name if desired.
To add additional aggregations, click the + button again and repeat the configuration.
To remove an aggregation, use the trash can icon next to the aggregation.
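As a rough model of the steps above, each aggregation can be viewed as a triple of argument field, function, and new field name. The Python sketch below assumes that model; the helper summarize, the AGG_FUNCS lookup, and the sample data are hypothetical and only cover three of the available functions.

```python
# Hypothetical sample rows; field names are illustrative only.
rows = [
    {"region": "East", "sales": 100},
    {"region": "West", "sales": 80},
    {"region": "East", "sales": 50},
    {"region": "East", "sales": 70},
]

# A small subset of aggregation functions, keyed by name.
AGG_FUNCS = {
    "SUM": sum,
    "COUNT": len,
    "AVG": lambda values: sum(values) / len(values),
}

def summarize(rows, group_field, aggregations):
    """Apply (argument_field, function_name, output_name) triples per group."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_field], []).append(row)
    output = []
    for key, members in groups.items():
        record = {group_field: key}
        for field, func_name, out_name in aggregations:
            values = [m[field] for m in members]
            record[out_name] = AGG_FUNCS[func_name](values)
        output.append(record)
    return output

summary = summarize(
    rows,
    "region",
    [("sales", "SUM", "total_sales"), ("sales", "COUNT", "order_count")],
)
```

Each group yields one output row, with one new field per configured aggregation.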
Aggregation Functions
| Aggregation Function | Description |
| --- | --- |
| COUNT | Returns the total number of rows in a group or table. |
| COUNT DISTINCT | Returns the number of unique values within a column or set of columns. |
| MIN | Finds the minimum value in a column. |
| MAX | Finds the maximum value in a column. |
| AVG | Calculates the average value for numeric columns. |
| SUM | Adds up all numeric values in a column. |
| STDDEV | Returns the standard deviation of numeric values. |
| VAR | Calculates the variance of numeric values. |
| MEDIAN | Finds the middle value in an ordered numeric column; if the number of values is even, averages the two middle values. |
| CONCAT | Concatenates multiple strings together. |
| CONCAT DISTINCT | Concatenates unique string values. |
| FIRST | Retrieves the first value in a column. |
| LAST | Retrieves the last value in a column. |
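For reference, the functions listed above have rough equivalents in Python's standard library, sketched below on hypothetical sample values. Several details are assumptions, since this page does not specify them: whether STDDEV and VAR use the sample or population formula (the sample versions are used here), and which separator CONCAT places between strings (a comma is used here).

```python
import statistics

# Hypothetical column values.
values = [4, 8, 10, 15, 15, 20]
labels = ["a", "b", "b", "c"]

results = {
    "COUNT": len(values),
    "COUNT DISTINCT": len(set(values)),
    "MIN": min(values),
    "MAX": max(values),
    "AVG": statistics.mean(values),
    "SUM": sum(values),
    "STDDEV": statistics.stdev(values),   # sample formula (assumption)
    "VAR": statistics.variance(values),   # sample formula (assumption)
    "MEDIAN": statistics.median(values),  # averages the two middle values
    "CONCAT": ",".join(labels),           # comma separator (assumption)
    "CONCAT DISTINCT": ",".join(dict.fromkeys(labels)),  # keeps first-seen order
    "FIRST": values[0],
    "LAST": values[-1],
}
```

With six values, MEDIAN averages the third and fourth values (10 and 15) to give 12.5. FIRST and LAST depend on row order, so their results follow whatever sort order is in effect.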