Summarize Tool ⭐️

Easily perform customizable summary calculations and aggregations on data.

Updated over a week ago

The Summarize tool allows users to perform a variety of summary calculations and aggregations on their data, making it easy to analyze and understand large datasets.

Add a Summarize tool

1. Navigate to your analysis.

2. Click this icon.

3. Click Summarize.

4. Double-Click to open the Summarize tool in a separate tab. This allows for more in-depth analysis and configuration.


Configuration

  1. Add a Summarize tool to your analysis.

  2. Determine if you want to group the aggregations. To add a Group By field, click the plus button under Group By and select the field from the list. To remove a Group By field, click the trash can icon next to the selected field.

  3. Optionally, you can add sorting by selecting the plus button under Sort, then choose the field to sort by. Use the ascending and descending icons to change the sort order. Click the X button to delete the sorting configuration.

  4. If you wish to perform window aggregations, check the checkbox for Window Aggregations. Specify the From and To options (e.g., Unbounded, Current Row, Proceeding, Following) and the number of rows for each.

  5. To create an Aggregation column in your dataset, click the plus button under Aggregations.

    1. Select the field or use the expression option from the Argument dropdown.

    2. Choose the appropriate Calculation for the data type of the argument selected, and provide a New field name if desired.

  6. Click Apply to confirm the aggregation.

  7. To add another aggregated column, click the plus button under your existing aggregation, configure the new aggregation, and click Apply.

  8. Deleting Aggregations: To delete an aggregated column, click the trash can icon next to the aggregation.


Window Aggregations

Window aggregations, also known as window functions or window operations, allow you to perform aggregate calculations over a specific "window" or subset of data within a larger dataset. Unlike regular aggregate functions (e.g., SUM, COUNT, AVG) that operate on the entire dataset, window functions partition the data into smaller groups or "windows" based on specified criteria, and then perform calculations within each window.

In window aggregations, the FROM and TO options, along with the options of "Unbounded," "Current Row," "Preceding," and "Following," are used to define the range of rows within the partition to include in the window frame for each row being processed.

Option

Description

Unbounded

Includes all rows in the partition from the beginning

Current Row

Includes only the current row in the partition

Preceding

Includes a specified number of rows before the current row

Following

Includes a specified number of rows after the current row

FROM

TO

Description

Result

Unbounded

Unbounded

Includes all rows in the partition.

All rows in the partition.

Unbounded

Current Row

Includes all rows from the beginning of the partition up to and including the current row.

All rows from the start to the current row.

Unbounded

Preceding

Includes all rows from the beginning of the partition up to a specified number of rows before the current row.

All rows from the start to (current - N) rows.

Unbounded

Following

Includes all rows from the beginning of the partition up to a specified number of rows after the current row.

All rows from the start to (current + N) rows.

Current Row

Unbounded

Includes all rows from the end of the partition up to the and including the current row.

Only the current row.

Current Row

Current Row

Includes only the current row in the partition.

Only the current row.

Current Row

Following

Includes a specified number of rows after the current row.

(current + N) rows.

Preceding

Current Row

Includes a specified number of rows before the current row.

(current - N) rows.

Preceding

Preceding

Includes a specified number of rows before the current row.

(current - N1) rows.

Preceding

Following

Includes a specified number of rows before the current row and a specified number of rows after the current row.

(current - N1) rows to (current + N2) rows.

Following

Following

Includes a specified number of rows after the current row.

(current + N) rows.


Aggregations

Aggregation Function

Description

COUNT

Returns the number of rows in a group or the total number of rows in a table if used without grouping.

COUNT DISTINCT

Returns the number of unique values in a column or a combination of columns.

MIN

Returns the minimum value from a column.

MAX

Returns the maximum value from a column.

AVG

Returns the average value of a numeric column.

SUM

Returns the sum of numeric values in a column.

STDDEV

Returns the standard deviation of a set of numeric values in a column.

VAR

Returns the variance of a set of numeric values in a column.

MEDIAN

Returns the middle value of a numeric column when the data is ordered. If even values, returns the average.

CONCAT

Concatenates two or more strings together.

CONCAT DISTINCT

Concatenates two or more unique string values together.

Did this answer your question?