Skip to main content
Transform Tool Multi-Row Functions
Updated over 4 months ago

When using multi-row functions in the Transform tool, several configuration options are available to customize the behavior and output of these functions. Below is a detailed description of each option.

Configuration Options

  1. Field Name: The name of the field on which the function will operate.

  2. Multi-Row Radio Button: Select Multi-Row to open the Multi-Row builder.

  3. Data Type: The default is to infer the data type. You can select from the list to force a specific data type for the new field.

  4. Dense/Percent - RANK: This option is available in the RANK function to specify whether the rank should be dense or percentage-based.

  5. ntile - NTILE: This option is specific to the NTILE function and defines the number of buckets to divide the data into.

  6. Limit Rows in Calculation: Defines the range of rows to include in the calculation.

    • From:

      • First Row: Starts counting from the very first row in the dataset.

      • Current Row: Starts counting from the current row.

      • Row Above: Starts counting from the row immediately above the current row.

      • Row Below: Starts counting from the row immediately below the current row.

    • To:

      • Last Row: Ends counting at the very last row in the dataset.

      • Current Row: Ends counting at the current row.

      • Row Above: Ends counting at the row immediately above the current row.

      • Row Below: Ends counting at the row immediately below the current row.

  7. Partition: Defines the grouping of rows for which the function is calculated.

  8. Argument - Row: Specifies the row to consider for the function.

  9. Delimiter: Used in CONCAT and CONCAT DISTINCT functions to specify the delimiter for concatenated text.

  10. Row Offset: Used in LEAD and LAG functions to specify the number of rows to offset.

  11. Sort: Defines the order of rows before performing the function.


Function Documentation

AVG

Description

The AVG() function calculates the average value of a specified numeric field across a range or partition. This function is useful for determining the mean value in data analysis and reporting.

Output: numeric

Example

ID

Group

Value

AVG(Value) Over (Partition by Group)

1

A

10

15

2

A

20

15

3

A

15

15

4

B

30

35

5

B

40

35

This example demonstrates how the AVG() function can be used to calculate the average value within each group.

CONCAT

Description

The CONCAT() function combines unique text strings from multiple rows in a specified field into a single text string. It ensures that each string appears only once in the output.

Output: text

Example

ID

Tags

CONCAT(Tags) Delimiter (, )

1

Apple

Apple, Banana, Cherry, Apple

2

Banana

Apple, Banana, Cherry, Apple

3

Cherry

Apple, Banana, Cherry, Apple

4

Apple

Apple, Banana, Cherry, Apple

This example demonstrates how the CONCAT() function can be used to combine text strings from multiple rows into a single string with unique values.

CONCAT DISTINCT

Description

The CONCAT DISTINCT() function combines unique text strings from multiple rows in a specified field into a single text string. It ensures that each string appears only once in the output.

Output: text

Example

ID

Tags

CONCAT DISTINCT(Tags) Delimiter (, )

1

Apple

Apple, Banana, Cherry

2

Banana

Apple, Banana, Cherry

3

Cherry

Apple, Banana, Cherry

4

Apple

Apple, Banana, Cherry

This example demonstrates how the CONCAT DISTINCT() function can be used to combine unique text strings from multiple rows into a single string.

COUNT

Description

The COUNT() function counts the number of rows within a specified range or partition. This function is useful for summarizing data and generating row counts for reporting.

Output: integer

Example

ID

Group

COUNT(ID) Over (Partition by Group)

1

A

3

2

A

3

3

A

3

4

B

2

5

B

2

This example demonstrates how the COUNT() function can be used to count the number of rows within each group.

COUNT DISTINCT

Description

The COUNT DISTINCT() function counts the number of unique values in a specified field within a range or partition. This is useful for identifying the number of distinct entries.

Output: integer

Example

ID

Group

Value

COUNT DISTINCT(Value) Over (Partition by Group)

1

A

10

2

2

A

20

2

3

A

10

2

4

B

30

1

5

B

30

1

This example demonstrates how the COUNT DISTINCT() function can be used to count the number of unique values within each group.

CUM SUM

Description

The CUM SUM() function calculates the cumulative sum of a specified numeric field across a range or partition. It is useful for tracking the running total.

Output: numeric

Example

ID

Group

Value

CUM SUM(Value) Over (Partition by Group) Sort (ID)

1

A

10

10

2

A

20

30

3

A

15

45

4

B

30

30

5

B

40

70

This example demonstrates how the CUM SUM() function can be used to calculate the cumulative sum within each group.

FILL

Description

The FILL() function populates missing values in a specified field using a defined strategy, such as forward fill or backward fill. It is useful for handling missing data.

Output: varies based on input field type

Example

ID

Value

FILL(Value, 'forward')

1

10

10

2

NULL

10

3

20

20

4

NULL

20

5

30

30

This example demonstrates how the FILL() function can be used to populate missing values using forward fill.

FIRST

Description

The FIRST() function returns the first value of a specified field within a range or partition. This is useful for identifying the initial entry.

Output: varies based on input field type

Example

ID

Group

Value

FIRST(Value) Over (Partition by Group)

1

A

10

10

2

A

20

10

3

A

15

10

4

B

30

30

5

B

40

30

This example demonstrates how the FIRST() function can be used to return the first value within each group.

LAG

Description

The LAG() function returns the value of a specified field from a previous row in the dataset. It is useful for comparing current values to previous ones.

Output: varies based on input field type

Example

ID

Value

LAG(Value) Row Offset (1)

1

10

null

2

20

10

3

15

20

4

30

15

5

40

30

This example demonstrates how the LAG() function can be used to return the value from a previous row.

LAST

Description

The LAST() function returns the last value of a specified field within a range or partition. This is useful for identifying the final entry.

Output: varies based on input field type

Example

ID

Group

Value

LAST(Value) Over (Partition by Group)

1

A

10

15

2

A

20

15

3

A

15

15

4

B

30

40

5

B

40

40

This example demonstrates how the LAST() function can be used to return the last value within each group.

LEAD

Description

The LEAD() function returns the value of a specified field from a subsequent row in the dataset. It is useful for comparing current values to future ones.

Output: varies based on input field type

Example

ID

Value

LEAD(Value) Row Offset (1)

1

10

20

2

20

15

3

15

30

4

30

40

5

40

null

This example demonstrates how the LEAD() function can be used to return the value from a subsequent row.

MAX

Description

The MAX() function returns the maximum value of a specified field within a range or partition. This is useful for identifying the highest value.

Output: varies based on input field type

Example

ID

Group

Value

MAX(Value) Over (Partition by Group)

1

A

10

20

2

A

20

20

3

A

15

20

4

B

30

40

5

B

40

40

This example demonstrates how the MAX() function can be used to return the maximum value within each group.

MEDIAN

Description

The MEDIAN() function returns the median value of a specified numeric field within a range or partition. This is useful for identifying the central value.

Output: numeric

Example

ID

Group

Value

MEDIAN(Value) Over (Partition by Group)

1

A

10

15

2

A

20

15

3

A

15

15

4

B

30

35

5

B

40

35

This example demonstrates how the MEDIAN() function can be used to return the median value within each group.

MIN

Description

The MIN() function returns the minimum value of a specified field within a range or partition. This is useful for identifying the lowest value.

Output: varies based on input field type

Example

ID

Group

Value

MIN(Value) Over (Partition by Group)

1

A

10

10

2

A

20

10

3

A

15

10

4

B

30

30

5

B

40

30

This example demonstrates how the MIN() function can be used to return the minimum value within each group.

NTILE

Description

The NTILE() function assigns a bucket number to each row in a partition, with the number of buckets defined by the user. This is useful for data segmentation and ranking.

Output: integer

Example

ID

Group

Value

NTILE(3) Over (Partition by Group) Sort (ID)

1

A

10

1

2

A

20

2

3

A

15

3

4

B

30

1

5

B

40

3

This example demonstrates how the NTILE() function can be used to assign bucket numbers to each row within each group.

RANK

Description

The RANK() function assigns a rank to each row within a partition of the dataset, with ties receiving the same rank. This is useful for ranking data based on specified criteria.

Output: integer

Example

ID

Group

Value

RANK() Over (Partition by Group) Sort (Value)

1

A

10

1

2

A

20

3

3

A

15

2

4

B

30

1

5

B

40

2

This example demonstrates how the RANK() function can be used to assign ranks to rows based on their values within each group.

ROW NUM

Description

The ROW NUM() function assigns a unique sequential integer to rows within a partition of the dataset. This is useful for generating row numbers in ordered datasets.

Output: integer

Example

ID

Group

Value

ROW NUM() Over (Partition by Group) Sort (Value)

1

A

10

1

2

A

20

3

3

A

15

2

4

B

30

1

5

B

40

2

This example demonstrates how the ROW NUM() function can be used to assign unique row numbers within each group.

STDDEV

Description

The STDDEV() function calculates the standard deviation of a specified numeric field within a range or partition. This is useful for measuring the dispersion or variability of the data.

Output: numeric

Example

ID

Group

Value

STDDEV(Value) Over (Partition by Group)

1

A

10

5

2

A

20

5

3

A

15

5

4

B

30

7.071

5

B

40

7.071

This example demonstrates how the STDDEV() function can be used to calculate the standard deviation of values within each group.

SUM

Description

The SUM() function calculates the total sum of a specified numeric field within a range or partition. This is useful for aggregating data and calculating totals.

Output: numeric

Example

ID

Group

Value

SUM(Value) Over (Partition by Group)

1

A

10

45

2

A

20

45

3

A

15

45

4

B

30

70

5

B

40

70

This example demonstrates how the SUM() function can be used to calculate the total sum of values within each group.

VAR

Description

The VAR() function calculates the variance of a specified numeric field within a range or partition. This is useful for measuring the spread or dispersion of the data.

Output: numeric

Example

ID

Group

Value

VAR(Value) Over (Partition by Group)

1

A

10

25

2

A

20

25

3

A

15

25

4

B

30

50

5

B

40

50

This example demonstrates how the VAR() function can be used to calculate the variance of values within each group.

Did this answer your question?