Common Table Expressions
Peewee supports the inclusion of common table expressions (CTEs) in all types of queries. CTEs may be useful for:
- Factoring out a common subquery.
- Grouping or filtering by a column derived in the CTE’s result set.
- Writing recursive queries.
To declare a Select
query for use as a CTE, use cte()
method, which wraps the query in a CTE
object. To indicate that a CTE
should be included as part of a query, use the Query.with_cte()
method, passing a list of CTE objects.
Simple Example
For an example, let’s say we have some data points that consist of a key and a floating-point value. Let’s define our model and populate some test data:
class Sample(Model):
key = TextField()
value = FloatField()
data = (
('a', (1.25, 1.5, 1.75)),
('b', (2.1, 2.3, 2.5, 2.7, 2.9)),
('c', (3.5, 3.5)))
# Populate data.
for key, values in data:
Sample.insert_many([(key, value) for value in values],
fields=[Sample.key, Sample.value]).execute()
Let’s use a CTE to calculate, for each distinct key, which values were above-average for that key.
# First we'll declare the query that will be used as a CTE. This query
# simply determines the average value for each key.
cte = (Sample
.select(Sample.key, fn.AVG(Sample.value).alias('avg_value'))
.group_by(Sample.key)
.cte('key_avgs', columns=('key', 'avg_value')))
# Now we'll query the sample table, using our CTE to find rows whose value
# exceeds the average for the given key. We'll calculate how far above the
# average the given sample's value is, as well.
query = (Sample
.select(Sample.key, Sample.value)
.join(cte, on=(Sample.key == cte.c.key))
.where(Sample.value > cte.c.avg_value)
.order_by(Sample.value)
.with_cte(cte))
We can iterate over the samples returned by the query to see which samples had above-average values for their given group:
>>> for sample in query:
... print(sample.key, sample.value)
# 'a', 1.75
# 'b', 2.7
# 'b', 2.9
Complex Example
For a more complete example, let’s consider the following query which uses multiple CTEs to find per-product sales totals in only the top sales regions. Our model looks like this:
class Order(Model):
region = TextField()
amount = FloatField()
product = TextField()
quantity = IntegerField()
Here is how the query might be written in SQL. This example can be found in the postgresql documentation.
WITH regional_sales AS (
SELECT region, SUM(amount) AS total_sales
FROM orders
GROUP BY region
), top_regions AS (
SELECT region
FROM regional_sales
WHERE total_sales > (SELECT SUM(total_sales) / 10 FROM regional_sales)
)
SELECT region,
product,
SUM(quantity) AS product_units,
SUM(amount) AS product_sales
FROM orders
WHERE region IN (SELECT region FROM top_regions)
GROUP BY region, product;
With Peewee, we would write:
reg_sales = (Order
.select(Order.region,
fn.SUM(Order.amount).alias('total_sales'))
.group_by(Order.region)
.cte('regional_sales'))
top_regions = (reg_sales
.select(reg_sales.c.region)
.where(reg_sales.c.total_sales > (
reg_sales.select(fn.SUM(reg_sales.c.total_sales) / 10)))
.cte('top_regions'))
query = (Order
.select(Order.region,
Order.product,
fn.SUM(Order.quantity).alias('product_units'),
fn.SUM(Order.amount).alias('product_sales'))
.where(Order.region.in_(top_regions.select(top_regions.c.region)))
.group_by(Order.region, Order.product)
.with_cte(regional_sales, top_regions))
Recursive CTEs
Peewee supports recursive CTEs. Recursive CTEs can be useful when, for example, you have a tree data-structure represented by a parent-link foreign key. Suppose, for example, that we have a hierarchy of categories for an online bookstore. We wish to generate a table showing all categories and their absolute depths, along with the path from the root to the category.
We’ll assume the following model definition, in which each category has a foreign-key to its immediate parent category:
class Category(Model):
name = TextField()
parent = ForeignKeyField('self', backref='children', null=True)
To list all categories along with their depth and parents, we can use a recursive CTE:
# Define the base case of our recursive CTE. This will be categories that
# have a null parent foreign-key.
Base = Category.alias()
level = Value(1).alias('level')
path = Base.name.alias('path')
base_case = (Base
.select(Base.name, Base.parent, level, path)
.where(Base.parent.is_null())
.cte('base', recursive=True))
# Define the recursive terms.
RTerm = Category.alias()
rlevel = (base_case.c.level + 1).alias('level')
rpath = base_case.c.path.concat('->').concat(RTerm.name).alias('path')
recursive = (RTerm
.select(RTerm.name, RTerm.parent, rlevel, rpath)
.join(base_case, on=(RTerm.parent == base_case.c.id)))
# The recursive CTE is created by taking the base case and UNION ALL with
# the recursive term.
cte = base_case.union_all(recursive)
# We will now query from the CTE to get the categories, their levels, and
# their paths.
query = (cte
.select_from(cte.c.name, cte.c.level, cte.c.path)
.order_by(cte.c.path))
# We can now iterate over a list of all categories and print their names,
# absolute levels, and path from root -> category.
for category in query:
print(category.name, category.level, category.path)
# Example output:
# root, 1, root
# p1, 2, root->p1
# c1-1, 3, root->p1->c1-1
# c1-2, 3, root->p1->c1-2
# p2, 2, root->p2
# c2-1, 3, root->p2->c2-1