Common Table Expressions
Peewee supports the inclusion of common table expressions (CTEs) in all typesof queries. CTEs may be useful for:
- Factoring out a common subquery.
- Grouping or filtering by a column derived in the CTE’s result set.
- Writing recursive queries.
To declare a Select
query for use as a CTE, usecte()
method, which wraps the query in a CTE
object. To indicate that a CTE
should be included as part of aquery, use the Query.with_cte()
method, passing a list of CTE objects.
Simple Example
For an example, let’s say we have some data points that consist of a key and afloating-point value. Let’s define our model and populate some test data:
- class Sample(Model):
- key = TextField()
- value = FloatField()
- data = (
- ('a', (1.25, 1.5, 1.75)),
- ('b', (2.1, 2.3, 2.5, 2.7, 2.9)),
- ('c', (3.5, 3.5)))
- # Populate data.
- for key, values in data:
- Sample.insert_many([(key, value) for value in values],
- fields=[Sample.key, Sample.value]).execute()
Let’s use a CTE to calculate, for each distinct key, which values wereabove-average for that key.
- # First we'll declare the query that will be used as a CTE. This query
- # simply determines the average value for each key.
- cte = (Sample
- .select(Sample.key, fn.AVG(Sample.value).alias('avg_value'))
- .group_by(Sample.key)
- .cte('key_avgs', columns=('key', 'avg_value')))
- # Now we'll query the sample table, using our CTE to find rows whose value
- # exceeds the average for the given key. We'll calculate how far above the
- # average the given sample's value is, as well.
- query = (Sample
- .select(Sample.key, Sample.value)
- .join(cte, on=(Sample.key == cte.c.key))
- .where(Sample.value > cte.c.avg_value)
- .order_by(Sample.value)
- .with_cte(cte))
We can iterate over the samples returned by the query to see which samples hadabove-average values for their given group:
- >>> for sample in query:
- ... print(sample.key, sample.value)
- # 'a', 1.75
- # 'b', 2.7
- # 'b', 2.9
Complex Example
For a more complete example, let’s consider the following query which usesmultiple CTEs to find per-product sales totals in only the top sales regions.Our model looks like this:
- class Order(Model):
- region = TextField()
- amount = FloatField()
- product = TextField()
- quantity = IntegerField()
Here is how the query might be written in SQL. This example can be found inthe postgresql documentation.
- WITH regional_sales AS (
- SELECT region, SUM(amount) AS total_sales
- FROM orders
- GROUP BY region
- ), top_regions AS (
- SELECT region
- FROM regional_sales
- WHERE total_sales > (SELECT SUM(total_sales) / 10 FROM regional_sales)
- )
- SELECT region,
- product,
- SUM(quantity) AS product_units,
- SUM(amount) AS product_sales
- FROM orders
- WHERE region IN (SELECT region FROM top_regions)
- GROUP BY region, product;
With Peewee, we would write:
- reg_sales = (Order
- .select(Order.region,
- fn.SUM(Order.amount).alias('total_sales'))
- .group_by(Order.region)
- .cte('regional_sales'))
- top_regions = (reg_sales
- .select(reg_sales.c.region)
- .where(reg_sales.c.total_sales > (
- reg_sales.select(fn.SUM(reg_sales.c.total_sales) / 10)))
- .cte('top_regions'))
- query = (Order
- .select(Order.region,
- Order.product,
- fn.SUM(Order.quantity).alias('product_units'),
- fn.SUM(Order.amount).alias('product_sales'))
- .where(Order.region.in_(top_regions.select(top_regions.c.region)))
- .group_by(Order.region, Order.product)
- .with_cte(regional_sales, top_regions))
Recursive CTEs
Peewee supports recursive CTEs. Recursive CTEs can be useful when, for example,you have a tree data-structure represented by a parent-link foreign key.Suppose, for example, that we have a hierarchy of categories for an onlinebookstore. We wish to generate a table showing all categories and theirabsolute depths, along with the path from the root to the category.
We’ll assume the following model definition, in which each category has aforeign-key to its immediate parent category:
- class Category(Model):
- name = TextField()
- parent = ForeignKeyField('self', backref='children', null=True)
To list all categories along with their depth and parents, we can use arecursive CTE:
- # Define the base case of our recursive CTE. This will be categories that
- # have a null parent foreign-key.
- Base = Category.alias()
- level = Value(1).alias('level')
- path = Base.name.alias('path')
- base_case = (Base
- .select(Base.name, Base.parent, level, path)
- .where(Base.parent.is_null())
- .cte('base', recursive=True))
- # Define the recursive terms.
- RTerm = Category.alias()
- rlevel = (base_case.c.level + 1).alias('level')
- rpath = base_case.c.path.concat('->').concat(RTerm.name).alias('path')
- recursive = (RTerm
- .select(RTerm.name, RTerm.parent, rlevel, rpath)
- .join(base_case, on=(RTerm.parent == base_case.c.id)))
- # The recursive CTE is created by taking the base case and UNION ALL with
- # the recursive term.
- cte = base_case.union_all(recursive)
- # We will now query from the CTE to get the categories, their levels, and
- # their paths.
- query = (cte
- .select_from(cte.c.name, cte.c.level, cte.c.path)
- .order_by(cte.c.path))
- # We can now iterate over a list of all categories and print their names,
- # absolute levels, and path from root -> category.
- for category in query:
- print(category.name, category.level, category.path)
- # Example output:
- # root, 1, root
- # p1, 2, root->p1
- # c1-1, 3, root->p1->c1-1
- # c1-2, 3, root->p1->c1-2
- # p2, 2, root->p2
- # c2-1, 3, root->p2->c2-1