Configuring how Relationship Joins

relationship() will normally create a join between two tables by examining the foreign key relationship between the two tables to determine which columns should be compared. There are a variety of situations where this behavior needs to be customized.

Handling Multiple Join Paths

One of the most common situations to deal with is when there is more than one foreign key path between two tables.

Consider a Customer class that contains two foreign keys to an Address class:

    from sqlalchemy import Integer, ForeignKey, String
    from sqlalchemy.orm import DeclarativeBase, mapped_column, relationship


    class Base(DeclarativeBase):
        pass


    class Customer(Base):
        __tablename__ = "customer"

        id = mapped_column(Integer, primary_key=True)
        name = mapped_column(String)

        billing_address_id = mapped_column(Integer, ForeignKey("address.id"))
        shipping_address_id = mapped_column(Integer, ForeignKey("address.id"))

        billing_address = relationship("Address")
        shipping_address = relationship("Address")


    class Address(Base):
        __tablename__ = "address"

        id = mapped_column(Integer, primary_key=True)
        street = mapped_column(String)
        city = mapped_column(String)
        state = mapped_column(String)
        zip = mapped_column(String)

The above mapping, when we attempt to use it, will produce the error:

    sqlalchemy.exc.AmbiguousForeignKeysError: Could not determine join
    condition between parent/child tables on relationship
    Customer.billing_address - there are multiple foreign key
    paths linking the tables. Specify the 'foreign_keys' argument,
    providing a list of those columns which should be
    counted as containing a foreign key reference to the parent table.

The above message is pretty long. There are many potential messages that relationship() can return, which have been carefully tailored to detect a variety of common configurational issues; most will suggest the additional configuration that’s needed to resolve the ambiguity or other missing information.

In this case, the message asks us to qualify each relationship() by stating which foreign key column it should consider. The appropriate form is as follows:

    class Customer(Base):
        __tablename__ = "customer"

        id = mapped_column(Integer, primary_key=True)
        name = mapped_column(String)

        billing_address_id = mapped_column(Integer, ForeignKey("address.id"))
        shipping_address_id = mapped_column(Integer, ForeignKey("address.id"))

        billing_address = relationship("Address", foreign_keys=[billing_address_id])
        shipping_address = relationship("Address", foreign_keys=[shipping_address_id])

Above, we specify the foreign_keys argument, which is a Column or list of Column objects which indicate those columns to be considered “foreign”, or in other words, the columns that contain a value referring to a parent table. Loading the Customer.billing_address relationship from a Customer object will use the value present in billing_address_id in order to identify the row in Address to be loaded; similarly, shipping_address_id is used for the shipping_address relationship. The linkage of the two columns also plays a role during persistence; the newly generated primary key of a just-inserted Address object will be copied into the appropriate foreign key column of an associated Customer object during a flush.

When specifying foreign_keys with Declarative, we can also use string names; however, when using a list, it is important that the list is part of the string:

    billing_address = relationship("Address", foreign_keys="[Customer.billing_address_id]")

In this specific example, the list is not necessary in any case as there’s only one Column we need:

    billing_address = relationship("Address", foreign_keys="Customer.billing_address_id")

Warning

When passed as a Python-evaluable string, the relationship.foreign_keys argument is interpreted using Python’s eval() function. DO NOT PASS UNTRUSTED INPUT TO THIS STRING. See Evaluation of relationship arguments for details on declarative evaluation of relationship() arguments.

Specifying Alternate Join Conditions

The default behavior of relationship() when constructing a join is that it equates the value of primary key columns on one side to that of foreign-key-referring columns on the other. We can change this criterion to be anything we’d like using the relationship.primaryjoin argument, as well as the relationship.secondaryjoin argument in the case when a “secondary” table is used.

In the example below, using the User class as well as an Address class which stores a street address, we create a relationship boston_addresses which will only load those Address objects which specify a city of “Boston”:

    from sqlalchemy import Integer, ForeignKey, String
    from sqlalchemy.orm import DeclarativeBase, mapped_column, relationship


    class Base(DeclarativeBase):
        pass


    class User(Base):
        __tablename__ = "user"

        id = mapped_column(Integer, primary_key=True)
        name = mapped_column(String)
        boston_addresses = relationship(
            "Address",
            primaryjoin="and_(User.id==Address.user_id, "
            "Address.city=='Boston')",
        )


    class Address(Base):
        __tablename__ = "address"

        id = mapped_column(Integer, primary_key=True)
        user_id = mapped_column(Integer, ForeignKey("user.id"))

        street = mapped_column(String)
        city = mapped_column(String)
        state = mapped_column(String)
        zip = mapped_column(String)

Within this string SQL expression, we made use of the and_() conjunction construct to establish two distinct predicates for the join condition - joining both the User.id and Address.user_id columns to each other, as well as limiting rows in Address to just city='Boston'. When using Declarative, rudimentary SQL functions like and_() are automatically available in the evaluated namespace of a string relationship() argument.

Warning

When passed as a Python-evaluable string, the relationship.primaryjoin argument is interpreted using Python’s eval() function. DO NOT PASS UNTRUSTED INPUT TO THIS STRING. See Evaluation of relationship arguments for details on declarative evaluation of relationship() arguments.

The custom criteria we use in a relationship.primaryjoin are generally only significant when SQLAlchemy is rendering SQL in order to load or represent this relationship. That is, they are used in the SQL statement that’s emitted in order to perform a per-attribute lazy load, or when a join is constructed at query time, such as via Select.join(), or via the eager “joined” or “subquery” styles of loading. When in-memory objects are being manipulated, we can place any Address object we’d like into the boston_addresses collection, regardless of what the value of the .city attribute is. The objects will remain present in the collection until the attribute is expired and re-loaded from the database, at which point the criterion is applied. When a flush occurs, the objects inside of boston_addresses will be flushed unconditionally, assigning the value of the primary key user.id column onto the foreign-key-holding address.user_id column for each row. The city criterion has no effect here, as the flush process only cares about synchronizing primary key values into referencing foreign key values.

Creating Custom Foreign Conditions

Another element of the primary join condition is how those columns considered “foreign” are determined. Usually, some subset of Column objects will specify ForeignKey, or otherwise be part of a ForeignKeyConstraint that’s relevant to the join condition. relationship() looks to this foreign key status as it decides how it should load and persist data for this relationship. However, the relationship.primaryjoin argument can be used to create a join condition that doesn’t involve any “schema” level foreign keys. We can combine relationship.primaryjoin along with relationship.foreign_keys and relationship.remote_side explicitly in order to establish such a join.

Below, a class HostEntry joins to itself, equating the string content column to the ip_address column, which is a PostgreSQL type called INET. We need to use cast() in order to cast one side of the join to the type of the other:

    from sqlalchemy import cast, String, Integer
    from sqlalchemy.orm import DeclarativeBase, mapped_column, relationship
    from sqlalchemy.dialects.postgresql import INET


    class Base(DeclarativeBase):
        pass


    class HostEntry(Base):
        __tablename__ = "host_entry"

        id = mapped_column(Integer, primary_key=True)
        ip_address = mapped_column(INET)
        content = mapped_column(String(50))

        # relationship() using explicit foreign_keys, remote_side
        parent_host = relationship(
            "HostEntry",
            primaryjoin=ip_address == cast(content, INET),
            foreign_keys=content,
            remote_side=ip_address,
        )

The above relationship will produce a join like:

    SELECT host_entry.id, host_entry.ip_address, host_entry.content
    FROM host_entry JOIN host_entry AS host_entry_1
    ON host_entry_1.ip_address = CAST(host_entry.content AS INET)

An alternative syntax to the above is to use the foreign() and remote() annotations, inline within the relationship.primaryjoin expression. This syntax represents the annotations that relationship() normally applies by itself to the join condition given the relationship.foreign_keys and relationship.remote_side arguments. These functions may be more succinct when an explicit join condition is present, and additionally serve to mark exactly the column that is “foreign” or “remote” independent of whether that column is stated multiple times or within complex SQL expressions:

    from sqlalchemy.orm import foreign, remote


    class HostEntry(Base):
        __tablename__ = "host_entry"

        id = mapped_column(Integer, primary_key=True)
        ip_address = mapped_column(INET)
        content = mapped_column(String(50))

        # relationship() using explicit foreign() and remote() annotations
        # in lieu of separate arguments
        parent_host = relationship(
            "HostEntry",
            primaryjoin=remote(ip_address) == cast(foreign(content), INET),
        )

Using custom operators in join conditions

Another use case for relationships is the use of custom operators, such as PostgreSQL’s “is contained within” << operator when joining with types such as INET and CIDR. For custom boolean operators we use the Operators.bool_op() function:

    inet_column.bool_op("<<")(cidr_column)

A comparison like the above may be used directly with relationship.primaryjoin when constructing a relationship():

    from sqlalchemy.dialects.postgresql import CIDR, INET


    class IPA(Base):
        __tablename__ = "ip_address"

        id = mapped_column(Integer, primary_key=True)
        v4address = mapped_column(INET)

        network = relationship(
            "Network",
            primaryjoin="IPA.v4address.bool_op('<<')"
            "(foreign(Network.v4representation))",
            viewonly=True,
        )


    class Network(Base):
        __tablename__ = "network"

        id = mapped_column(Integer, primary_key=True)
        v4representation = mapped_column(CIDR)

Above, a query such as:

    select(IPA).join(IPA.network)

Will render as:

    SELECT ip_address.id AS ip_address_id, ip_address.v4address AS ip_address_v4address
    FROM ip_address JOIN network ON ip_address.v4address << network.v4representation
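The custom operator used in this section can also be tried in isolation, without any mapping. The sketch below uses lightweight column() constructs as stand-ins for mapped attributes (the names are illustrative) and compares Operators.bool_op() with the longer Operators.op() spelling.

```python
from sqlalchemy import column

# Stand-in columns for the demo; in a mapping these would be mapped attributes.
inet_column = column("v4address")
cidr_column = column("v4representation")

# bool_op() builds a custom operator that is treated as a boolean comparison.
contained_in = inet_column.bool_op("<<")(cidr_column)

# The same expression spelled with op() and the is_comparison flag.
contained_in_long = inet_column.op("<<", is_comparison=True)(cidr_column)

rendered = str(contained_in)
rendered_long = str(contained_in_long)
print(rendered)
```

Both spellings render the same SQL; the comparison flag is what lets relationship() treat the expression as a join condition between two columns.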

Custom operators based on SQL functions

A variant to the use case for Operators.op.is_comparison (the flag that Operators.bool_op() sets on our behalf) is when we aren’t using an operator, but a SQL function. The typical example of this use case is the PostgreSQL PostGIS functions; however, any SQL function on any database that resolves to a binary condition may apply. To suit this use case, the FunctionElement.as_comparison() method can modify any SQL function, such as those invoked from the func namespace, to indicate to the ORM that the function produces a comparison of two expressions. The below example illustrates this with the Geoalchemy2 library:

    from geoalchemy2 import Geometry
    from sqlalchemy import Integer, func
    from sqlalchemy.orm import mapped_column, relationship, foreign


    class Polygon(Base):
        __tablename__ = "polygon"

        id = mapped_column(Integer, primary_key=True)
        geom = mapped_column(Geometry("POLYGON", srid=4326))
        points = relationship(
            "Point",
            primaryjoin="func.ST_Contains(foreign(Polygon.geom), Point.geom).as_comparison(1, 2)",
            viewonly=True,
        )


    class Point(Base):
        __tablename__ = "point"

        id = mapped_column(Integer, primary_key=True)
        geom = mapped_column(Geometry("POINT", srid=4326))

Above, the FunctionElement.as_comparison() indicates that the func.ST_Contains() SQL function is comparing the Polygon.geom and Point.geom expressions. The foreign() annotation additionally notes which column takes on the “foreign key” role in this particular relationship.

New in version 1.3: Added FunctionElement.as_comparison().
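The effect of FunctionElement.as_comparison() can be examined without PostGIS or Geoalchemy2. The sketch below uses plain column() constructs with illustrative names; the two integers passed to as_comparison() are the 1-based positions of the function arguments that act as the left and right sides of the comparison.

```python
from sqlalchemy import column, func

# Stand-in columns for the demo; the names are illustrative only.
poly_geom = column("poly_geom")
point_geom = column("point_geom")

# Mark argument 1 as the "left" side and argument 2 as the "right" side.
contains = func.ST_Contains(poly_geom, point_geom).as_comparison(1, 2)

rendered = str(contains)
left_side = str(contains.left)
right_side = str(contains.right)
print(rendered, "|", left_side, "|", right_side)
```

The SQL that is rendered is still the plain function call; what changes is that the construct now exposes a left and a right expression, which is the shape relationship() looks for in a join condition.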

Overlapping Foreign Keys

A rare scenario can arise when composite foreign keys are used, such that a single column may be the subject of more than one column referred to via foreign key constraint.

Consider an (admittedly complex) mapping such as the Magazine object, referred to both by the Writer object and the Article object using a composite primary key scheme that includes magazine_id for both; then to make Article refer to Writer as well, Article.magazine_id is involved in two separate relationships; Article.magazine and Article.writer:

    class Magazine(Base):
        __tablename__ = "magazine"

        id = mapped_column(Integer, primary_key=True)


    class Article(Base):
        __tablename__ = "article"

        article_id = mapped_column(Integer)
        magazine_id = mapped_column(ForeignKey("magazine.id"))
        writer_id = mapped_column()

        magazine = relationship("Magazine")
        writer = relationship("Writer")

        __table_args__ = (
            PrimaryKeyConstraint("article_id", "magazine_id"),
            ForeignKeyConstraint(
                ["writer_id", "magazine_id"], ["writer.id", "writer.magazine_id"]
            ),
        )


    class Writer(Base):
        __tablename__ = "writer"

        id = mapped_column(Integer, primary_key=True)
        magazine_id = mapped_column(ForeignKey("magazine.id"), primary_key=True)
        magazine = relationship("Magazine")

When the above mapping is configured, we will see this warning emitted:

    SAWarning: relationship 'Article.writer' will copy column
    writer.magazine_id to column article.magazine_id,
    which conflicts with relationship(s): 'Article.magazine'
    (copies magazine.id to article.magazine_id). Consider applying
    viewonly=True to read-only relationships, or provide a primaryjoin
    condition marking writable columns with the foreign() annotation.

What this refers to originates from the fact that Article.magazine_id is the subject of two different foreign key constraints; it refers to Magazine.id directly as a source column, but also refers to Writer.magazine_id as a source column in the context of the composite key to Writer. If we associate an Article with a particular Magazine, but then associate the Article with a Writer that’s associated with a different Magazine, the ORM will overwrite Article.magazine_id non-deterministically, silently changing which magazine we refer to; it may also attempt to place NULL into this column if we de-associate a Writer from an Article. The warning lets us know this is the case.

To solve this, we need to break out the behavior of Article to include all three of the following features:

  1. Article first and foremost writes to Article.magazine_id based on data persisted in the Article.magazine relationship only, that is a value copied from Magazine.id.

  2. Article can write to Article.writer_id on behalf of data persisted in the Article.writer relationship, but only the Writer.id column; the Writer.magazine_id column should not be written into Article.magazine_id as it ultimately is sourced from Magazine.id.

  3. Article takes Article.magazine_id into account when loading Article.writer, even though it doesn’t write to it on behalf of this relationship.

To get just #1 and #2, we could specify only Article.writer_id as the “foreign keys” for Article.writer:

    class Article(Base):
        # ...

        writer = relationship("Writer", foreign_keys="Article.writer_id")

However, this has the effect of Article.writer not taking Article.magazine_id into account when querying against Writer:

    SELECT article.article_id AS article_article_id,
        article.magazine_id AS article_magazine_id,
        article.writer_id AS article_writer_id
    FROM article
    JOIN writer ON writer.id = article.writer_id

Therefore, to get at all of #1, #2, and #3, we express the join condition as well as which columns to be written by combining relationship.primaryjoin fully, along with either the relationship.foreign_keys argument, or more succinctly by annotating with foreign():

    class Article(Base):
        # ...

        writer = relationship(
            "Writer",
            primaryjoin="and_(Writer.id == foreign(Article.writer_id), "
            "Writer.magazine_id == Article.magazine_id)",
        )

Changed in version 1.0.0: the ORM will attempt to warn when a column is used as the synchronization target from more than one relationship simultaneously.

Non-relational Comparisons / Materialized Path

Warning

This section details an experimental feature.

Using custom expressions means we can produce unorthodox join conditions that don’t obey the usual primary/foreign key model. One such example is the materialized path pattern, where we compare strings for overlapping path tokens in order to produce a tree structure.

Through careful use of foreign() and remote(), we can build a relationship that effectively produces a rudimentary materialized path system. Essentially, when foreign() and remote() are on the same side of the comparison expression, the relationship is considered to be “one to many”; when they are on different sides, the relationship is considered to be “many to one”. For the comparison we’ll use here, we’ll be dealing with collections so we keep things configured as “one to many”:

    class Element(Base):
        __tablename__ = "element"

        path = mapped_column(String, primary_key=True)

        descendants = relationship(
            "Element",
            primaryjoin=remote(foreign(path)).like(path.concat("/%")),
            viewonly=True,
            order_by=path,
        )

Above, if given an Element object with a path attribute of "/foo/bar2", we seek for a load of Element.descendants to look like:

    SELECT element.path AS element_path
    FROM element
    WHERE element.path LIKE ('/foo/bar2' || '/%') ORDER BY element.path

New in version 0.9.5: Support has been added to allow a single-column comparison to itself within a primaryjoin condition, as well as for primaryjoin conditions that use ColumnOperators.like() as the comparison operator.

Self-Referential Many-to-Many Relationship

See also

This section documents a two-table variant of the “adjacency list” pattern, which is documented at Adjacency List Relationships. Be sure to review the self-referential querying patterns in subsections Self-Referential Query Strategies and Configuring Self-Referential Eager Loading which apply equally well to the mapping pattern discussed here.

Many to many relationships can be customized by one or both of relationship.primaryjoin and relationship.secondaryjoin - the latter is significant for a relationship that specifies a many-to-many reference using the relationship.secondary argument. A common situation which involves the usage of relationship.primaryjoin and relationship.secondaryjoin is when establishing a many-to-many relationship from a class to itself, as shown below:

    from sqlalchemy import Integer, ForeignKey, String, Column, Table
    from sqlalchemy.orm import DeclarativeBase, mapped_column, relationship


    class Base(DeclarativeBase):
        pass


    node_to_node = Table(
        "node_to_node",
        Base.metadata,
        Column("left_node_id", Integer, ForeignKey("node.id"), primary_key=True),
        Column("right_node_id", Integer, ForeignKey("node.id"), primary_key=True),
    )


    class Node(Base):
        __tablename__ = "node"

        id = mapped_column(Integer, primary_key=True)
        label = mapped_column(String)
        right_nodes = relationship(
            "Node",
            secondary=node_to_node,
            primaryjoin=id == node_to_node.c.left_node_id,
            secondaryjoin=id == node_to_node.c.right_node_id,
            backref="left_nodes",
        )

Where above, SQLAlchemy can’t know automatically which columns should connect to which for the right_nodes and left_nodes relationships. The relationship.primaryjoin and relationship.secondaryjoin arguments establish how we’d like to join to the association table. In the Declarative form above, as we are declaring these conditions within the Python block that corresponds to the Node class, the id variable is available directly as the Column object we wish to join with.

Alternatively, we can define the relationship.primaryjoin and relationship.secondaryjoin arguments using strings, which is suitable in the case that our configuration does not have either the Node.id column object available yet or the node_to_node table perhaps isn’t yet available. When referring to a plain Table object in a declarative string, we use the string name of the table as it is present in the MetaData:

    class Node(Base):
        __tablename__ = "node"

        id = mapped_column(Integer, primary_key=True)
        label = mapped_column(String)
        right_nodes = relationship(
            "Node",
            secondary="node_to_node",
            primaryjoin="Node.id==node_to_node.c.left_node_id",
            secondaryjoin="Node.id==node_to_node.c.right_node_id",
            backref="left_nodes",
        )

Warning

When passed as a Python-evaluable string, the relationship.primaryjoin and relationship.secondaryjoin arguments are interpreted using Python’s eval() function. DO NOT PASS UNTRUSTED INPUT TO THESE STRINGS. See Evaluation of relationship arguments for details on declarative evaluation of relationship() arguments.

A classical mapping situation here is similar, where node_to_node can be joined to node.c.id:

    from sqlalchemy import Integer, ForeignKey, String, Column, Table, MetaData
    from sqlalchemy.orm import relationship, registry

    metadata_obj = MetaData()
    mapper_registry = registry()

    node_to_node = Table(
        "node_to_node",
        metadata_obj,
        Column("left_node_id", Integer, ForeignKey("node.id"), primary_key=True),
        Column("right_node_id", Integer, ForeignKey("node.id"), primary_key=True),
    )

    node = Table(
        "node",
        metadata_obj,
        Column("id", Integer, primary_key=True),
        Column("label", String),
    )


    class Node:
        pass


    mapper_registry.map_imperatively(
        Node,
        node,
        properties={
            "right_nodes": relationship(
                Node,
                secondary=node_to_node,
                primaryjoin=node.c.id == node_to_node.c.left_node_id,
                secondaryjoin=node.c.id == node_to_node.c.right_node_id,
                backref="left_nodes",
            )
        },
    )

Note that in both examples, the relationship.backref keyword specifies a left_nodes backref - when relationship() creates the second relationship in the reverse direction, it’s smart enough to reverse the relationship.primaryjoin and relationship.secondaryjoin arguments.

See also

Composite “Secondary” Joins

Note

This section features far edge cases that are somewhat supported by SQLAlchemy, however it is recommended to solve problems like these in simpler ways whenever possible, by using reasonable relational layouts and / or in-Python attributes.

Sometimes, when one seeks to build a relationship() between two tables there is a need for more than just two or three tables to be involved in order to join them. This is an area of relationship() where one seeks to push the boundaries of what’s possible, and often the ultimate solution to many of these exotic use cases needs to be hammered out on the SQLAlchemy mailing list.

In more recent versions of SQLAlchemy, the relationship.secondary parameter can be used in some of these cases in order to provide a composite target consisting of multiple tables. Below is an example of such a join condition (requires version 0.9.2 at least to function as is):

    class A(Base):
        __tablename__ = "a"

        id = mapped_column(Integer, primary_key=True)
        b_id = mapped_column(ForeignKey("b.id"))

        d = relationship(
            "D",
            secondary="join(B, D, B.d_id == D.id)."
            "join(C, C.d_id == D.id)",
            primaryjoin="and_(A.b_id == B.id, A.id == C.a_id)",
            secondaryjoin="D.id == B.d_id",
            uselist=False,
            viewonly=True,
        )


    class B(Base):
        __tablename__ = "b"

        id = mapped_column(Integer, primary_key=True)
        d_id = mapped_column(ForeignKey("d.id"))


    class C(Base):
        __tablename__ = "c"

        id = mapped_column(Integer, primary_key=True)
        a_id = mapped_column(ForeignKey("a.id"))
        d_id = mapped_column(ForeignKey("d.id"))


    class D(Base):
        __tablename__ = "d"

        id = mapped_column(Integer, primary_key=True)

In the above example, we provide all three of relationship.secondary, relationship.primaryjoin, and relationship.secondaryjoin, in the declarative style referring to the named tables a, b, c, d directly. A query from A to D looks like:

    sess.scalars(select(A).join(A.d)).all()

    SELECT a.id AS a_id, a.b_id AS a_b_id
    FROM a JOIN (
        b AS b_1 JOIN d AS d_1 ON b_1.d_id = d_1.id
            JOIN c AS c_1 ON c_1.d_id = d_1.id)
        ON a.b_id = b_1.id AND a.id = c_1.a_id JOIN d ON d.id = b_1.d_id

In the above example, we take advantage of being able to stuff multiple tables into a “secondary” container, so that we can join across many tables while still keeping things “simple” for relationship(), in that there’s just “one” table on both the “left” and the “right” side; the complexity is kept within the middle.

Warning

A relationship like the above is typically marked as viewonly=True and should be considered as read-only. While there are sometimes ways to make relationships like the above writable, this is generally complicated and error prone.

Relationship to Aliased Class

New in version 1.3: The AliasedClass construct can now be specified as the target of a relationship(), replacing the previous approach of using non-primary mappers, which had limitations such that they did not inherit sub-relationships of the mapped entity as well as that they required complex configuration against an alternate selectable. The recipes in this section are now updated to use AliasedClass.

In the previous section, we illustrated a technique where we used relationship.secondary in order to place additional tables within a join condition. There is one complex join case where even this technique is not sufficient; when we seek to join from A to B, making use of any number of C, D, etc. in between, however there are also join conditions between A and B directly. In this case, the join from A to B may be difficult to express with just a complex relationship.primaryjoin condition, as the intermediary tables may need special handling, and it is also not expressible with a relationship.secondary object, since the A->secondary->B pattern does not support any references between A and B directly. When this extremely advanced case arises, we can resort to creating a second mapping as a target for the relationship. This is where we use AliasedClass in order to make a mapping to a class that includes all the additional tables we need for this join. In order to produce this mapper as an “alternative” mapping for our class, we use the aliased() function to produce the new construct, then use relationship() against the object as though it were a plain mapped class.

Below illustrates a relationship() with a simple join from A to B, however the primaryjoin condition is augmented with two additional entities C and D, which also must have rows that line up with the rows in both A and B simultaneously:

    from sqlalchemy import join
    from sqlalchemy.orm import aliased


    class A(Base):
        __tablename__ = "a"

        id = mapped_column(Integer, primary_key=True)
        b_id = mapped_column(ForeignKey("b.id"))


    class B(Base):
        __tablename__ = "b"

        id = mapped_column(Integer, primary_key=True)


    class C(Base):
        __tablename__ = "c"

        id = mapped_column(Integer, primary_key=True)
        a_id = mapped_column(ForeignKey("a.id"))

        some_c_value = mapped_column(String)


    class D(Base):
        __tablename__ = "d"

        id = mapped_column(Integer, primary_key=True)
        c_id = mapped_column(ForeignKey("c.id"))
        b_id = mapped_column(ForeignKey("b.id"))

        some_d_value = mapped_column(String)


    # 1. set up the join() as a variable, so we can refer
    # to it in the mapping multiple times.
    j = join(B, D, D.b_id == B.id).join(C, C.id == D.c_id)

    # 2. Create an AliasedClass to B
    B_viacd = aliased(B, j, flat=True)

    A.b = relationship(B_viacd, primaryjoin=A.b_id == j.c.b_id)

With the above mapping, a simple join looks like:

    sess.scalars(select(A).join(A.b)).all()

    SELECT a.id AS a_id, a.b_id AS a_b_id
    FROM a JOIN (b JOIN d ON d.b_id = b.id JOIN c ON c.id = d.c_id) ON a.b_id = b.id

Using the AliasedClass target in Queries

In the previous example, the A.b relationship refers to the B_viacd entity as the target, and not the B class directly. To add additional criteria involving the A.b relationship, it’s typically necessary to reference the B_viacd directly rather than using B, especially in a case where the target entity of A.b is to be transformed into an alias or a subquery. Below illustrates the same relationship using a subquery, rather than a join:

    subq = select(B).join(D, D.b_id == B.id).join(C, C.id == D.c_id).subquery()

    B_viacd_subquery = aliased(B, subq)

    A.b = relationship(B_viacd_subquery, primaryjoin=A.b_id == subq.c.id)

A query using the above A.b relationship will render a subquery:

    sess.scalars(select(A).join(A.b)).all()

    SELECT a.id AS a_id, a.b_id AS a_b_id
    FROM a JOIN (SELECT b.id AS id, b.some_b_column AS some_b_column
    FROM b JOIN d ON d.b_id = b.id JOIN c ON c.id = d.c_id) AS anon_1 ON a.b_id = anon_1.id

If we want to add additional criteria based on the A.b join, we must do so in terms of B_viacd_subquery rather than B directly:

    sess.scalars(
        select(A)
        .join(A.b)
        .where(B_viacd_subquery.some_b_column == "some b")
        .order_by(B_viacd_subquery.id)
    ).all()

    SELECT a.id AS a_id, a.b_id AS a_b_id
    FROM a JOIN (SELECT b.id AS id, b.some_b_column AS some_b_column
    FROM b JOIN d ON d.b_id = b.id JOIN c ON c.id = d.c_id) AS anon_1 ON a.b_id = anon_1.id
    WHERE anon_1.some_b_column = ? ORDER BY anon_1.id

Row-Limited Relationships with Window Functions

Another interesting use case for relationships to AliasedClass objects is a relationship that needs to join to a specialized SELECT of any form. One scenario is when the use of a window function is desired, such as to limit how many rows should be returned for a relationship. The example below illustrates a relationship to an AliasedClass that loads only the first rows of each collection; because row_number() counts from 1, the criterion index < 10 keeps at most nine rows per parent:

    class A(Base):
        __tablename__ = "a"

        id = mapped_column(Integer, primary_key=True)


    class B(Base):
        __tablename__ = "b"

        id = mapped_column(Integer, primary_key=True)
        a_id = mapped_column(ForeignKey("a.id"))


    partition = select(
        B, func.row_number().over(order_by=B.id, partition_by=B.a_id).label("index")
    ).alias()

    partitioned_b = aliased(B, partition)

    A.partitioned_bs = relationship(
        partitioned_b, primaryjoin=and_(partitioned_b.a_id == A.id, partition.c.index < 10)
    )

We can use the above partitioned_bs relationship with most of the loader strategies, such as selectinload():

    for a1 in session.scalars(select(A).options(selectinload(A.partitioned_bs))):
        print(a1.partitioned_bs)  # <-- at most nine objects, since index < 10

Where above, the “selectinload” query looks like:

    SELECT
        a_1.id AS a_1_id, anon_1.id AS anon_1_id, anon_1.a_id AS anon_1_a_id,
        anon_1.data AS anon_1_data, anon_1.index AS anon_1_index
    FROM a AS a_1
    JOIN (
        SELECT b.id AS id, b.a_id AS a_id, b.data AS data,
        row_number() OVER (PARTITION BY b.a_id ORDER BY b.id) AS index
        FROM b) AS anon_1
    ON anon_1.a_id = a_1.id AND anon_1.index < %(index_1)s
    WHERE a_1.id IN ( ... primary key collection ...)
    ORDER BY a_1.id

Above, for each matching primary key in “a”, we will get the first nine “bs” as ordered by “b.id”, since row_number() counts from 1 and the criterion is index < 10. By partitioning on “a_id” we ensure that each “row number” is local to the parent “a_id”.

Such a mapping would ordinarily also include a “plain” relationship from “A” to “B”, for persistence operations as well as when the full set of “B” objects per “A” is desired.

Building Query-Enabled Properties

Very ambitious custom join conditions may fail to be directly persistable, and in some cases may not even load correctly. To remove the persistence part of the equation, use the flag relationship.viewonly on the relationship(), which establishes it as a read-only attribute (data written to the collection will be ignored on flush()). However, in extreme cases, consider using a regular Python property in conjunction with Query as follows:

    from sqlalchemy.orm import object_session


    class User(Base):
        __tablename__ = "user"

        id = mapped_column(Integer, primary_key=True)

        @property
        def addresses(self):
            return object_session(self).query(Address).with_parent(self).filter(...).all()

In other cases, the descriptor can be built to make use of existing in-Python data. See the section on Using Descriptors and Hybrids for more general discussion of special Python attributes.

See also

Using Descriptors and Hybrids