S-strings
An s-string inserts SQL directly, as an escape hatch when there’s something that PRQL doesn’t yet implement. For example, there’s no version()
function in SQL that returns the Postgres version, so if we want to use that, we use an s-string:
PRQL
from my_table
select db_version = s"version()"
SQL
SELECT
version() AS db_version
FROM
my_table
Embed a column name in an s-string using braces. For example, PRQL’s standard library defines the average
function as:
So this compiles using the function:
PRQL
from employees
aggregate [average salary]
SQL
SELECT
AVG(salary)
FROM
employees
Here’s an example of a more involved use of an s-string:
PRQL
from de=dept_emp
join s=salaries side:left [
(s.emp_no == de.emp_no),
s"""({s.from_date}, {s.to_date})
OVERLAPS
({de.from_date}, {de.to_date})"""
]
SQL
SELECT
de.*,
s.*
FROM
dept_emp AS de
LEFT JOIN salaries AS s ON s.emp_no = de.emp_no
AND (s.from_date, s.to_date) OVERLAPS (de.from_date, de.to_date)
For those who have used Python, s-strings are similar to Python’s f-strings, but the result is SQL code, rather than a string literal. For example, a Python f-string of f"average({col})"
would produce "average(salary)"
, with quotes; while in PRQL, s"average({col})"
produces average(salary)
, without quotes.
Note that interpolations can only contain plain variable names and not whole expression like Python.
We can also use s-strings to produce a full table:
PRQL
from s"SELECT DISTINCT ON first_name, id, age FROM employees ORDER BY age ASC"
join s = s"SELECT * FROM salaries" [==id]
SQL
WITH table_0 AS (
SELECT
DISTINCT ON first_name,
id,
age
FROM
employees
ORDER BY
age ASC
),
table_1 AS (
SELECT
*
FROM
salaries
)
SELECT
table_2.*,
table_3.*
FROM
table_0 AS table_2
JOIN table_1 AS table_3 ON table_2.id = table_3.id
Note
S-strings in user code are intended as an escape-hatch for an unimplemented feature. If we often need s-strings to express something, that’s a sign we should implement it in PRQL or PRQL’s stdlib.
Braces
To output braces from an s-string, use double braces:
PRQL
from employees
derive [
has_valid_title = s"regexp_contains(title, '([a-z0-9]*-){{2,}}')"
]
SQL
SELECT
*,
regexp_contains(title, '([a-z0-9]*-){2,}') AS has_valid_title
FROM
employees
Precedence
The PRQL compiler simply places a literal copy of each variable into the resulting string, which means we may get surprising behavior when the variable is has multiple terms and the s-string isn’t parenthesized.
In this toy example, the salary + benefits / 365
gets precedence wrong:
PRQL
from employees
derive [
gross_salary = salary + benefits,
daily_rate = s"{gross_salary} / 365"
]
SQL
SELECT
*,
salary + benefits AS gross_salary,
salary + benefits / 365 AS daily_rate
FROM
employees
Instead, put the denominator {gross_salary}
in parentheses:
PRQL
from employees
derive [
gross_salary = salary + benefits,
daily_rate = s"({gross_salary}) / 365"
]
SQL
SELECT
*,
salary + benefits AS gross_salary,
(salary + benefits) / 365 AS daily_rate
FROM
employees