Standard library
The standard library currently contains commonly used functions that are used in SQL. It’s not yet as broad as we’d like, and we’re very open to expanding it.
Currently s-strings are an escape-hatch for any function that isn’t in our standard library. If we find ourselves using them for something frequently, raise an issue and we’ll add it to the stdlib.
Note
Currently the stdlib implementation doesn’t support different DB implementations itself; those need to be built deeper into the compiler. We’ll resolve this at some point. Until then, we’ll only add functions here that are broadly supported by most DBs.
Here’s the source of the current PRQL std:
# Aggregate Functions
func min <scalar || column> column -> null
func max <scalar || column> column -> null
func sum <scalar || column> column -> null
func avg <scalar || column> column -> null
func stddev <scalar || column> column -> null
func average <scalar || column> column -> null
func count <scalar || column> non_null:s"*" -> null
# TODO: Possibly make this into `count distinct:true` (or like `distinct:` as an
# abbreviation of that?)
func count_distinct <scalar || column> column -> null
# Window functions
func lag<column> offset column -> null
func lead<column> offset column -> null
func first<column> offset column -> null
func last<column> offset column -> null
func rank<column> -> null
func rank_dense<column> -> null
func row_number<column> -> null
# Other functions
func round<scalar> n_digits column -> null
func as<scalar> `noresolve.type` column -> null
func in<bool> pattern value -> null
# Transform type definitions
func from<table> `default_db.source`<table> -> null
func select<table> columns<column> tbl<table> -> null
func filter<table> condition<bool> tbl<table> -> null
func derive<table> columns<column> tbl<table> -> null
func aggregate<table> a<column> tbl<table> -> null
func sort<table> by tbl<table> -> null
func take<table> expr tbl<table> -> null
func join<table> `default_db.with`<table> filter `noresolve.side`:inner tbl<table> -> null
func group<table> by pipeline tbl<table> -> null
func window<table> rows:0..0 range:0..0 expanding:false rolling:0 pipeline tbl<table> -> null
func append<table> `default_db.bottom`<table> top<table> -> null
func intersect<table> `default_db.bottom`<table> top<table> -> (
from t = _param.top
join b = _param.bottom (all (map _eq (zip t.* b.*)))
select t.*
)
func remove<table> `default_db.bottom`<table> top<table> -> (
from t = _param.top
join side:left b = _param.bottom (all (map _eq (zip t.* b.*)))
filter (all (map _is_null b.*))
select t.*
)
func loop<table> pipeline top<table> -> null
# List functions
func all<bool> list<list> -> null
func map<list> fn list<list> -> null
func zip<list> a<list> b<list> -> null
func _eq<list> a<list> -> null
func _is_null a -> _param.a == null
# Misc
func from_text<table> input<text> `noresolve.format`:csv -> null
# String functions
func lower <text> column -> null
func upper <text> column -> null
# type primitives
type int
type float
type bool
type text
type date
type time
type timestamp
type table
type column
type list
type scalar
# Source-reading functions, primarily for DuckDB
func read_parquet<table> source<text> -> s"SELECT * FROM read_parquet({source})"
func read_csv<table> source<text> -> s"SELECT * FROM read_csv_auto({source})"
And a couple of examples:
PRQL
from employees
derive [
gross_salary = (salary + payroll_tax | as int),
gross_salary_rounded = (gross_salary | round 0),
time = s"NOW()", # an s-string, given no `now` function exists in PRQL
]
SQL
SELECT
*,
CAST(salary + payroll_tax AS int) AS gross_salary,
ROUND(CAST(salary + payroll_tax AS int), 0) AS gross_salary_rounded,
NOW() AS time
FROM
employees