make_set_if
This page explains how to use the make_set_if aggregation function in APL.
The make_set_if aggregation function in APL creates a set of distinct values from a column, including only the values from rows that satisfy a condition. This is especially useful when you analyze large datasets and want to extract the relevant, distinct values without duplicates.
You can use make_set_if in scenarios where you need to aggregate conditional data points, such as log analysis, tracing information, or security logs, to summarize distinct occurrences based on particular conditions.
For users of other query languages
If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.
Usage
Syntax
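A sketch of the call form, inferred from the parameters and return value described on this page. The exact notation is an assumption, and the square brackets mark the optional argument:

```kusto
// Assumed signature, reconstructed from the parameter list on this page
make_set_if(column, predicate, [max_size])
```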
Parameters
- column: The column from which distinct values will be aggregated.
- predicate: A condition that filters the values to be aggregated.
- [max_size]: (Optional) Specifies the maximum number of elements in the resulting set. If omitted, the default is 1048576.
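To illustrate the optional max_size argument, here is a minimal sketch that caps each set at 10 elements. The dataset name sample-http-logs and the fields geo.city, req_duration_ms, and method are assumptions used for illustration. They are not defined in this section.

```kusto
// Assumed dataset and field names, for illustration only
['sample-http-logs']
// Third argument (max_size) limits each resulting set to at most 10 cities
| summarize make_set_if(['geo.city'], req_duration_ms > 500, 10) by method
```

A small limit like this can keep the result readable when a column has many distinct values.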
Returns
The make_set_if function returns a dynamic array of distinct values from the specified column that satisfy the given condition.
Use case examples
In this use case, you’re analyzing HTTP logs and want to get the distinct cities from which requests originated, but only for requests that took longer than 500 ms.
Query
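A query along these lines would fit the scenario described. This is a sketch that assumes a dataset named sample-http-logs with a req_duration_ms field holding the request duration in milliseconds, alongside the geo.city and method fields that appear in the output:

```kusto
// Assumed dataset name and duration field, for illustration only
['sample-http-logs']
// Collect distinct cities, but only from requests slower than 500 ms,
// and produce one set per HTTP method
| summarize make_set_if(['geo.city'], req_duration_ms > 500) by method
```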
Output
| method | make_set_if_geo.city |
|---|---|
| GET | ['New York', 'San Francisco'] |
| POST | ['Berlin', 'Tokyo'] |
This query returns the distinct cities from which requests took more than 500 ms, grouped by HTTP request method.
List of related aggregations
- make_list_if: Similar to make_set_if, but returns a list that can include duplicates instead of a distinct set.
- make_set: Aggregates distinct values without a conditional filter.
- countif: Counts rows that satisfy a specific condition, useful for when you need to count rather than aggregate distinct values.
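To compare these aggregations side by side, the following sketch applies three of them with the same condition in a single summarize. The dataset and field names (sample-http-logs, geo.city, req_duration_ms, method) and the output column names are assumptions chosen for illustration.

```kusto
// Assumed dataset, field, and output column names, for illustration only
['sample-http-logs']
| summarize
    slow_cities_distinct = make_set_if(['geo.city'], req_duration_ms > 500),  // distinct values only
    slow_cities_all = make_list_if(['geo.city'], req_duration_ms > 500),      // may contain duplicates
    slow_requests = countif(req_duration_ms > 500)                            // number of matching rows
  by method
```

For each method, the length of the list should equal the count whenever every matching row has a city value, while the set can be shorter because duplicates are collapsed.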