Window
Applies a pipeline to segments of rows, producing one output value for every input value.
window rows:{range} range:{range} expanding:false rolling:0 {pipeline}
For each row, the segment over which the pipeline is applied is determined by one of:
- `rows`, which takes a range of rows relative to the current row position. `0` references the current row.
- `range`, which takes a range of values relative to the current row value.
The bounds of the range are inclusive. If a bound is omitted, the segment will extend until the edge of the table or group.
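The difference between the two modes can be sketched in plain Python. This is only an illustration of the semantics described above, not how the compiler works: `rows_segment` and `range_segment` are hypothetical helper names, `None` stands for an omitted bound, and the input is assumed to be already sorted.

```python
from typing import List, Optional, Sequence


def rows_segment(n: int, i: int, lo: Optional[int], hi: Optional[int]) -> List[int]:
    """Row positions selected by rows:lo..hi for the row at position i."""
    # An omitted bound extends to the edge of the table; both bounds are inclusive.
    first = 0 if lo is None else max(0, i + lo)
    last = n - 1 if hi is None else min(n - 1, i + hi)
    return list(range(first, last + 1))


def range_segment(
    values: Sequence[float], i: int, lo: Optional[float], hi: Optional[float]
) -> List[int]:
    """Row positions selected by range:lo..hi, compared by value rather than position."""
    cur = values[i]
    return [
        j
        for j, v in enumerate(values)
        if (lo is None or v >= cur + lo) and (hi is None or v <= cur + hi)
    ]


days = [1, 2, 4, 7, 8]  # hypothetical sort-key values, already ordered

by_rows = rows_segment(len(days), 2, -1, 1)      # one row either side of position 2
by_range = range_segment(days, 2, -2, 2)         # values within 2 of days[2] == 4
from_start = rows_segment(len(days), 2, None, 0)  # omitted lower bound

print(by_rows)     # [1, 2, 3]
print(by_range)    # [1, 2]
print(from_start)  # [0, 1, 2]
```

With the same offsets, `rows` counts positions while `range` compares values, so gaps in the sort key change which rows fall into a `range` segment.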
For ease of use, there are two flags that override `rows` or `range`:
- `expanding:true` is an alias for `rows:..0`. A sum using this window is also known as a "cumulative sum".
- `rolling:n` is an alias for `rows:(-n+1)..0`, where `n` is an integer. This will include the last `n` values, including the current row. An average using this window is also known as a Simple Moving Average.
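As a rough illustration of what the two flags compute, the following Python sketch applies an aggregation over each segment. The helpers `expanding` and `rolling` and the sample data are hypothetical and exist only for this example. Near the start of the table the rolling segment is simply shorter, matching the rule that a segment stops at the edge of the table or group.

```python
from typing import Callable, List, Sequence


def expanding(values: Sequence[float], agg: Callable) -> List[float]:
    """Model of expanding:true, i.e. rows:..0 (start of table up to current row)."""
    return [agg(values[: i + 1]) for i in range(len(values))]


def rolling(values: Sequence[float], n: int, agg: Callable) -> List[float]:
    """Model of rolling:n, i.e. rows:(-n+1)..0 (the last n rows, current included)."""
    return [agg(values[max(0, i - n + 1) : i + 1]) for i in range(len(values))]


def mean(xs: Sequence[float]) -> float:
    return sum(xs) / len(xs)


paychecks = [10, 20, 30, 40]  # hypothetical input column

cumulative_sum = expanding(paychecks, sum)
moving_average = rolling(paychecks, 3, mean)

print(cumulative_sum)  # [10, 30, 60, 100]
print(moving_average)  # [10.0, 15.0, 20.0, 30.0]
```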
Some examples:
Expression | Meaning |
---|---|
rows:0..2 | current row plus two following |
rows:-2..0 | two preceding rows plus current row |
rolling:3 | (same as previous) |
rows:-2..4 | two preceding rows plus current row plus four following rows |
rows:..0 | all rows from the start of the table up to & including current row |
expanding:true | (same as previous) |
rows:0.. | current row and all following rows until the end of the table |
rows:.. | all rows, which is the same as not having a window at all |
Example
PRQL
from employees
group employee_id (
  sort month
  window rolling:12 (
    derive [trail_12_m_comp = sum paycheck]
  )
)
SQL
SELECT
  *,
  SUM(paycheck) OVER (
    PARTITION BY employee_id
    ORDER BY
      month ROWS BETWEEN 11 PRECEDING AND CURRENT ROW
  ) AS trail_12_m_comp
FROM
  employees
PRQL
from orders
sort day
window rows:-3..3 (
  derive [centered_weekly_average = average value]
)
group [order_month] (
  sort day
  window expanding:true (
    derive [monthly_running_total = sum value]
  )
)
SQL
SELECT
  *,
  AVG(value) OVER (
    ORDER BY
      day ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
  ) AS centered_weekly_average,
  SUM(value) OVER (
    PARTITION BY order_month
    ORDER BY
      day ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS monthly_running_total
FROM
  orders
Windowing by default
If you use window functions without the `window` transform, they will be applied to the whole table. Unlike in SQL, they will remain window functions and will not trigger aggregation.
PRQL
from employees
sort age
derive rnk = rank
SQL
SELECT
  *,
  RANK() OVER (
    ORDER BY
      age ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS rnk
FROM
  employees
ORDER BY
  age
You can also apply only `group`:
PRQL
from employees
group department (
  sort age
  derive rnk = rank
)
SQL
SELECT
  *,
  RANK() OVER (
    PARTITION BY department
    ORDER BY
      age ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS rnk
FROM
  employees
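To see that these queries keep one output row per input row, the generated SQL can be tried against an in-memory SQLite database from Python. This is a sketch under assumptions: the table contents and the `name` column are made up, the frame clause is left out for brevity, and SQLite 3.25 or newer is required for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("ann", "eng", 30), ("bob", "eng", 41), ("cy", "ops", 25), ("dee", "ops", 41)],
)

# Rank over the whole table: no PARTITION BY, so every row competes with every other.
whole_table = conn.execute(
    "SELECT name, RANK() OVER (ORDER BY age) AS rnk FROM employees ORDER BY name"
).fetchall()

# Rank within each department, corresponding to the `group department (...)` form.
per_department = conn.execute(
    "SELECT name, RANK() OVER (PARTITION BY department ORDER BY age) AS rnk "
    "FROM employees ORDER BY name"
).fetchall()

print(whole_table)     # [('ann', 2), ('bob', 3), ('cy', 1), ('dee', 3)]
print(per_department)  # [('ann', 1), ('bob', 2), ('cy', 1), ('dee', 2)]
```

Both result sets contain all four employees, which is the "no aggregation" behaviour described above.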
Window functions as first class citizens
There are no limitations on where windowed expressions can be used:
PRQL
from employees
filter salary < (average salary)
SQL
WITH table_1 AS (
  SELECT
    *,
    AVG(salary) OVER () AS _expr_0
  FROM
    employees
)
SELECT
  *
FROM
  table_1 AS table_0
WHERE
  salary < _expr_0
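The reason for the intermediate CTE is that SQL does not allow window functions directly in a `WHERE` clause, so the windowed value is computed first and filtered afterwards. A minimal check of that query shape with Python's `sqlite3` module might look like this. The sample rows and the `name` column are hypothetical, and SQLite 3.25 or newer is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("ann", 50), ("bob", 70), ("cy", 90)],
)

# Same structure as the generated SQL: compute the windowed average in a CTE,
# then filter on it in the outer query.
query = """
WITH table_1 AS (
  SELECT *, AVG(salary) OVER () AS _expr_0
  FROM employees
)
SELECT name, salary
FROM table_1 AS table_0
WHERE salary < _expr_0
ORDER BY name
"""
below_average = conn.execute(query).fetchall()

print(below_average)  # [('ann', 50)]
```

The average of the three salaries is 70, so only the row strictly below it survives the filter.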