Aggregate
Summarizes many rows into one row.
When applied:
- without
group
, it produces one row from the whole table, - within a
group
pipeline, it produces one row from each group.
Note
Currently, all declared aggregation functions are min
, max
, count
, average
, stddev
, avg
, sum
and count_distinct
. We are in the process of filling out std lib.
Examples
PRQL
from employees
aggregate [
average salary,
ct = count
]
SQL
SELECT
AVG(salary),
COUNT(*) AS ct
FROM
employees
PRQL
from employees
group [title, country] (
aggregate [
average salary,
ct = count
]
)
SQL
SELECT
title,
country,
AVG(salary),
COUNT(*) AS ct
FROM
employees
GROUP BY
title,
country
Aggregate is required
Unlike in SQL, using an aggregation function in derive
or select
(or any other transform except aggregate
) will not trigger aggregation. By default, PRQL will interpret such attempts functions as window functions:
PRQL
from employees
derive [avg_sal = average salary]
SQL
SELECT
*,
AVG(salary) OVER () AS avg_sal
FROM
employees
This ensures that derive
does not manipulate the number of rows, but only ever adds a column. For more information, see window transform.