Mypy / PEP 484 Support for ORM Mappings

Support for PEP 484 typing annotations as well as the Mypy type checking tool.

Note

The Mypy plugin and typing annotations should be regarded as alpha level for the early 1.4 releases of SQLAlchemy. The plugin has not been tested in real-world scenarios and may have many unhandled cases and error conditions. Specifics of the new typing stubs are also subject to change during the 1.4 series.

Installation

The Mypy plugin depends upon new stubs for SQLAlchemy packaged at sqlalchemy2-stubs. These stubs necessarily fully replace the previous sqlalchemy-stubs typing annotations published by Dropbox, as they occupy the same sqlalchemy-stubs namespace as specified by PEP 561. The Mypy package itself is also a dependency.

Both packages may be installed using the “mypy” extras hook using pip:

    pip install sqlalchemy[mypy]

The plugin itself is configured as described in Configuring mypy to use Plugins, using the sqlalchemy.ext.mypy.plugin module name, such as within setup.cfg:

    [mypy]
    plugins = sqlalchemy.ext.mypy.plugin

What the Plugin Does

The primary purpose of the Mypy plugin is to intercept and alter the static definition of SQLAlchemy declarative mappings so that they match up to how they are structured after they have been instrumented by their Mapper objects. This allows both the class structure itself as well as code that uses the class to make sense to the Mypy tool, which otherwise would not be the case based on how declarative mappings currently function. The plugin is not unlike similar plugins that are required for libraries like dataclasses which alter classes dynamically at runtime.

To cover the major areas where this occurs, consider the following ORM mapping, using the typical example of the User class:

    from sqlalchemy import Column
    from sqlalchemy import Integer
    from sqlalchemy import String
    from sqlalchemy import select
    from sqlalchemy.orm import declarative_base

    # "Base" is a class that is created dynamically from the
    # declarative_base() function
    Base = declarative_base()

    class User(Base):
        __tablename__ = 'user'

        id = Column(Integer, primary_key=True)
        name = Column(String)

    # "some_user" is an instance of the User class, which
    # accepts "id" and "name" kwargs based on the mapping
    some_user = User(id=5, name='user')

    # it has an attribute called .name that's a string
    print(f"Username: {some_user.name}")

    # a select() construct makes use of SQL expressions derived from the
    # User class itself
    select_stmt = select(User).where(User.id.in_([3, 4, 5])).where(User.name.contains('s'))

Above, the steps that the Mypy extension can take include:

  • Interpretation of the Base dynamic class generated by declarative_base(), so that classes which inherit from it are known to be mapped. It also can accommodate the class decorator approach described at Declarative Mapping using a Decorator (no declarative base).

  • Type inference for ORM mapped attributes that are defined in declarative “inline” style; in the above example, these are the id and name attributes of the User class. An instance of User is understood to use int for id and str for name. In addition, when the User.id and User.name class-level attributes are accessed, as in the select() statement above, they are compatible with SQL expression behavior, which is derived from the InstrumentedAttribute attribute descriptor class.

  • Application of an __init__() method to mapped classes that do not already include an explicit constructor, which accepts keyword arguments of specific types for all mapped attributes detected.

When the Mypy plugin processes the above file, the resulting static class definition and Python code passed to the Mypy tool is equivalent to the following:

    from typing import Optional

    from sqlalchemy import Column
    from sqlalchemy import Integer
    from sqlalchemy import String
    from sqlalchemy import select
    from sqlalchemy.orm import declarative_base
    from sqlalchemy.orm.decl_api import DeclarativeMeta
    from sqlalchemy.orm import Mapped

    class Base(metaclass=DeclarativeMeta):
        __abstract__ = True

    class User(Base):
        __tablename__ = 'user'

        id: Mapped[Optional[int]] = Mapped._special_method(
            Column(Integer, primary_key=True)
        )
        name: Mapped[Optional[str]] = Mapped._special_method(
            Column(String)
        )

        def __init__(self, id: Optional[int] = ..., name: Optional[str] = ...) -> None:
            ...

    some_user = User(id=5, name='user')

    print(f"Username: {some_user.name}")

    select_stmt = select(User).where(User.id.in_([3, 4, 5])).where(User.name.contains('s'))

The key steps which have been taken above include:

  • The Base class is now defined in terms of the DeclarativeMeta class explicitly, rather than being a dynamic class.

  • The id and name attributes are defined in terms of the Mapped class, which represents a Python descriptor that exhibits different behaviors at the class vs. instance levels. The Mapped class is now the base class for the InstrumentedAttribute class that is used for all ORM mapped attributes.

    In sqlalchemy2-stubs, Mapped is defined as a generic class against arbitrary Python types, meaning specific occurrences of Mapped are associated with a specific Python type, such as Mapped[Optional[int]] and Mapped[Optional[str]] above.

  • The right-hand side of each declarative mapped attribute assignment is removed, as this resembles the operation that the Mapper class would normally be doing, which is that it would be replacing these attributes with specific instances of InstrumentedAttribute. The original expression is moved into a function call that will allow it to still be type-checked without conflicting with the left-hand side of the expression. For Mypy purposes, the left-hand typing annotation is sufficient for the attribute’s behavior to be understood.

  • A type stub for the User.__init__() method is added which includes the correct keywords and datatypes.
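
The class-level versus instance-level behavior attributed to Mapped above can be illustrated with an ordinary Python descriptor. The following is a minimal sketch and not SQLAlchemy's implementation; the names SketchMapped and SketchExpression are hypothetical and exist only for this illustration:

```python
from typing import Any, Generic, List, Optional, TypeVar, overload

T = TypeVar("T")


class SketchExpression:
    """Hypothetical stand-in for a class-level SQL expression object."""

    def __init__(self, label: str) -> None:
        self.label = label

    def in_(self, values: List[Any]) -> str:
        # Render a toy "IN" clause as a string, for demonstration only.
        return f"{self.label} IN ({', '.join(repr(v) for v in values)})"


class SketchMapped(Generic[T]):
    """Hypothetical descriptor; NOT SQLAlchemy's Mapped implementation."""

    def __set_name__(self, owner: type, name: str) -> None:
        self.name = name

    @overload
    def __get__(self, instance: None, owner: type) -> SketchExpression: ...

    @overload
    def __get__(self, instance: object, owner: type) -> Optional[T]: ...

    def __get__(self, instance: Any, owner: type) -> Any:
        if instance is None:
            # Class-level access, e.g. User.id: return an expression object.
            return SketchExpression(f"{owner.__name__}.{self.name}")
        # Instance-level access, e.g. some_user.id: return the plain value.
        return instance.__dict__.get(self.name)

    def __set__(self, instance: Any, value: Optional[T]) -> None:
        instance.__dict__[self.name] = value


class User:
    id: SketchMapped[int] = SketchMapped()
    name: SketchMapped[str] = SketchMapped()

    def __init__(self, id: Optional[int] = None, name: Optional[str] = None) -> None:
        self.id = id
        self.name = name


some_user = User(id=5, name="user")
class_level = User.id.in_([3, 4, 5])   # expression-like object at class level
instance_level = some_user.name        # plain Python value at instance level
```

The two overloads of __get__() are what let a type checker report one type for User.id and a different type for some_user.id, which is the same idea the annotation Mapped[Optional[int]] conveys.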

Usage

The following subsections address individual use cases that have so far been considered for PEP 484 compliance.

Introspection of Columns based on TypeEngine

For mapped columns that include an explicit datatype, when they are mapped as inline attributes, the mapped type will be introspected automatically:

    class MyClass(Base):
        # ...

        id = Column(Integer, primary_key=True)
        name = Column("employee_name", String(50), nullable=False)
        other_name = Column(String(50))

Above, the ultimate class-level datatypes of id, name and other_name will be introspected as Mapped[Optional[int]], Mapped[Optional[str]] and Mapped[Optional[str]]. The types are by default always considered to be Optional, even for the primary key and non-nullable column. The reason is that while the database columns “id” and “name” can’t be NULL, the Python attributes id and name can certainly be None in the absence of an explicit constructor:

    >>> m1 = MyClass()
    >>> print(m1.id)
    None

The types of the above columns can be stated explicitly, providing the two advantages of clearer self-documentation as well as being able to control which types are optional:

    class MyClass(Base):
        # ...

        id: int = Column(Integer, primary_key=True)
        name: str = Column("employee_name", String(50), nullable=False)
        other_name: Optional[str] = Column(String(50))

The Mypy plugin will accept the above int, str and Optional[str] and convert them to include the Mapped[] type surrounding them. The Mapped[] construct may also be used explicitly:

    from sqlalchemy.orm import Mapped

    class MyClass(Base):
        # ...

        id: Mapped[int] = Column(Integer, primary_key=True)
        name: Mapped[str] = Column("employee_name", String(50), nullable=False)
        other_name: Mapped[Optional[str]] = Column(String(50))

When the type is non-optional, it simply means that the attribute as accessed from an instance of MyClass will be considered to be non-None:

    mc = MyClass(...)

    # will pass mypy --strict
    name: str = mc.name

For optional attributes, Mypy considers that the type must include None or otherwise be Optional:

    mc = MyClass(...)

    # will pass mypy --strict
    other_name: Optional[str] = mc.other_name
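
Code that consumes an attribute typed as Optional[str], such as other_name, generally has to rule out None before treating the value as a str. A minimal sketch of that narrowing follows; the display_name() helper is hypothetical and not part of SQLAlchemy:

```python
from typing import Optional


def display_name(other_name: Optional[str]) -> str:
    """Return a printable form of a possibly-None attribute value."""
    if other_name is None:
        # In this branch the value is known to be None.
        return "(not set)"
    # Here the type has been narrowed from Optional[str] to str,
    # so str methods such as upper() are accepted by a type checker.
    return other_name.upper()
```

Calling other_name.upper() without the None check is the kind of usage a strict type checker is expected to reject for an Optional attribute.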

Whether or not the mapped attribute is typed as Optional, the generation of the __init__() method will still consider all keywords to be optional. This is again matching what the SQLAlchemy ORM actually does when it creates the constructor, and should not be confused with the behavior of a validating system such as Python dataclasses which will generate a constructor that matches the annotations in terms of optional vs. required attributes.
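
The contrast with dataclasses can be sketched using only the standard library. In the example below, StrictRecord is a real dataclass, while LenientRecord is a hypothetical class whose constructor only loosely imitates the keyword-based constructor described above; it is not SQLAlchemy's code:

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StrictRecord:
    # No default value: the generated __init__() requires this argument.
    id: int
    name: Optional[str] = None


class LenientRecord:
    """Hypothetical class mimicking an all-keywords-optional constructor."""

    id: Optional[int] = None
    name: Optional[str] = None

    def __init__(self, **kwargs: Any) -> None:
        for key, value in kwargs.items():
            # Only attributes that already exist on the class are accepted.
            if not hasattr(type(self), key):
                raise TypeError(f"{key!r} is an invalid keyword argument")
            setattr(self, key, value)


lenient = LenientRecord()  # accepted: every keyword may be omitted

try:
    StrictRecord()  # type: ignore[call-arg]
    strict_accepts_empty = True
except TypeError:
    # The dataclass-generated constructor insists on "id".
    strict_accepts_empty = False
```

This is why an attribute annotated as a non-optional int can still read as None on a freshly constructed mapped object, whereas a dataclass would refuse to construct the object at all.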

Tip

In the above examples the Integer and String datatypes are both TypeEngine subclasses. In sqlalchemy2-stubs, the Column object is a generic which subscribes to the type, e.g. above the column types are Column[Integer], Column[String], and Column[String]. The Integer and String classes are in turn generically subscribed to the Python types they correspond to, i.e. Integer(TypeEngine[int]), String(TypeEngine[str]).
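
The generic relationship described in this tip can be sketched with the standard typing module. The classes below are hypothetical stand-ins, named with a Sketch prefix to make clear they are not the real stubs. A type checker resolves the Python type statically; the runtime helper python_type_of() is included only to make the relationship visible:

```python
from typing import Generic, TypeVar, get_args

_T = TypeVar("_T")


class SketchTypeEngine(Generic[_T]):
    """Hypothetical stand-in for a SQL datatype tied to a Python type."""


class SketchInteger(SketchTypeEngine[int]):
    pass


class SketchString(SketchTypeEngine[str]):
    pass


def python_type_of(sql_type: type) -> type:
    """Return the Python type a sketch SQL datatype is subscripted with."""
    for base in getattr(sql_type, "__orig_bases__", ()):
        args = get_args(base)
        if args:
            return args[0]
    raise TypeError(f"{sql_type.__name__} is not subscripted with a Python type")
```

For example, python_type_of(SketchInteger) returns int and python_type_of(SketchString) returns str, mirroring how Column(Integer) leads to an int-typed attribute.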

Columns that Don’t have an Explicit Type

Columns that include a ForeignKey modifier do not need to specify a datatype in a SQLAlchemy declarative mapping. For this type of attribute, the Mypy plugin will inform the user that an explicit type needs to be specified:

    # .. other imports
    from sqlalchemy.sql.schema import ForeignKey

    Base = declarative_base()

    class User(Base):
        __tablename__ = 'user'

        id = Column(Integer, primary_key=True)
        name = Column(String)

    class Address(Base):
        __tablename__ = 'address'

        id = Column(Integer, primary_key=True)
        user_id = Column(ForeignKey("user.id"))

The plugin will deliver the message as follows:

    $ mypy test3.py --strict
    test3.py:20: error: [SQLAlchemy Mypy plugin] Can't infer type from
    ORM mapped expression assigned to attribute 'user_id'; please specify a
    Python type or Mapped[<python type>] on the left hand side.
    Found 1 error in 1 file (checked 1 source file)

To resolve, apply an explicit type annotation to the Address.user_id column:

    class Address(Base):
        __tablename__ = 'address'

        id = Column(Integer, primary_key=True)
        user_id: int = Column(ForeignKey("user.id"))

Mapping Columns with Imperative Table

In imperative table style, the Column definitions are given inside of a Table construct which is separate from the mapped attributes themselves. The Mypy plugin does not consider this Table; instead, the attributes may be stated explicitly with a complete annotation, which must use the Mapped class to identify them as mapped attributes:

    class MyClass(Base):
        __table__ = Table(
            "mytable",
            Base.metadata,
            # a Column inside of Table() must be given a name
            Column("id", Integer, primary_key=True),
            # key="name" exposes this column as the "name" attribute
            Column("employee_name", String(50), nullable=False, key="name"),
            Column("other_name", String(50)),
        )

        id: Mapped[int]
        name: Mapped[str]
        other_name: Mapped[Optional[str]]

The above Mapped annotations are considered as mapped columns and will be included in the default constructor, as well as provide the correct typing profile for MyClass both at the class level and the instance level.

Mapping Relationships

The plugin has limited support for using type inference to detect the types for relationships. For all those cases where it can’t detect the type, it will emit an informative error message, and in all cases the appropriate type may be provided explicitly, either with the Mapped class or optionally omitting it for an inline declaration. The plugin also needs to determine whether or not the relationship refers to a collection or a scalar, and for that it relies upon the explicit value of the relationship.uselist and/or relationship.collection_class parameters. An explicit type is needed if neither of these parameters is present, as well as if the target type of the relationship() is a string or callable, and not a class:

    class User(Base):
        __tablename__ = 'user'

        id = Column(Integer, primary_key=True)
        name = Column(String)

    class Address(Base):
        __tablename__ = 'address'

        id = Column(Integer, primary_key=True)
        user_id: int = Column(ForeignKey("user.id"))

        user = relationship(User)

The above mapping will produce the following error:

    test3.py:22: error: [SQLAlchemy Mypy plugin] Can't infer scalar or
    collection for ORM mapped expression assigned to attribute 'user'
    if both 'uselist' and 'collection_class' arguments are absent from the
    relationship(); please specify a type annotation on the left hand side.
    Found 1 error in 1 file (checked 1 source file)

The error can be resolved either by using relationship(User, uselist=False) or by providing the type, in this case the scalar User object:

    class Address(Base):
        __tablename__ = 'address'

        id = Column(Integer, primary_key=True)
        user_id: int = Column(ForeignKey("user.id"))

        user: User = relationship(User)

For collections, a similar pattern applies, where in the absence of uselist=True or a relationship.collection_class, a collection annotation such as List may be used. It is also fully appropriate to use the string name of the class in the annotation as supported by PEP 484, ensuring the class is imported within a TYPE_CHECKING block as appropriate:

    from typing import List, TYPE_CHECKING

    from sqlalchemy import Column, Integer, String
    from sqlalchemy.orm import relationship

    from .mymodel import Base

    if TYPE_CHECKING:
        # if the target of the relationship is in another module
        # that cannot normally be imported at runtime
        from .myaddressmodel import Address

    class User(Base):
        __tablename__ = 'user'

        id = Column(Integer, primary_key=True)
        name = Column(String)
        addresses: List["Address"] = relationship("Address")

As is the case with columns, the Mapped class may also be applied explicitly:

    class User(Base):
        __tablename__ = 'user'

        id = Column(Integer, primary_key=True)
        name = Column(String)

        addresses: Mapped[List["Address"]] = relationship("Address", back_populates="user")

    class Address(Base):
        __tablename__ = 'address'

        id = Column(Integer, primary_key=True)
        user_id: int = Column(ForeignKey("user.id"))

        user: Mapped[User] = relationship(User, back_populates="addresses")

Using @declared_attr

The declared_attr class allows Declarative mapped attributes to be declared in class level functions, and is particularly useful when using declarative mixins. For these functions, the return type of the function should be annotated using either the Mapped[] construct or by indicating the exact kind of object returned by the function:

    from sqlalchemy.orm.decl_api import declared_attr

    class HasUpdatedAt:
        @declared_attr
        def updated_at(cls) -> Column[DateTime]:  # uses Column
            return Column(DateTime)

    class HasCompany:
        @declared_attr
        def company_id(cls) -> Mapped[int]:  # uses Mapped
            return Column(ForeignKey("company.id"))

        @declared_attr
        def company(cls) -> Mapped["Company"]:
            return relationship("Company")

    class Employee(HasUpdatedAt, HasCompany, Base):
        __tablename__ = 'employee'

        id = Column(Integer, primary_key=True)
        name = Column(String)

Note the mismatch between the actual return type of a method like HasCompany.company vs. what is annotated. The Mypy plugin converts all @declared_attr functions into simple annotated attributes to avoid this complexity:

    # what Mypy sees
    class HasCompany:
        company_id: Mapped[int]
        company: Mapped["Company"]
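
Treating these functions as plain attributes is reasonable because, when accessed on a class, such an attribute yields the function's result rather than the function itself. The following is a rough sketch of that idea using a hypothetical descriptor, sketch_declared_attr; the real declared_attr takes part in the declarative mapping process and does considerably more than this:

```python
from typing import Any, Callable


class sketch_declared_attr:
    """Hypothetical descriptor: calls the wrapped function with the class."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn

    def __get__(self, instance: Any, owner: type) -> Any:
        # The function receives the class being accessed, not an instance.
        return self.fn(owner)


class HasTableLabel:
    @sketch_declared_attr
    def table_label(cls) -> str:
        # Evaluated separately for each class that inherits the mixin.
        return cls.__name__.lower() + "_table"


class Employee(HasTableLabel):
    pass


class Manager(HasTableLabel):
    pass
```

Here Employee.table_label evaluates to "employee_table" and Manager.table_label to "manager_table", so from the point of view of calling code each one behaves like a simple class attribute.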

Combining with Dataclasses or Other Type-Sensitive Attribute Systems

The examples of Python dataclasses integration at Declarative Mapping with Dataclasses and Attrs present a problem: Python dataclasses expect an explicit type that they will use to build the class, and the value given in each assignment statement is significant. That is, a class such as the following has to be stated exactly as it is in order to be accepted by dataclasses:

    mapper_registry: registry = registry()

    @mapper_registry.mapped
    @dataclass
    class User:
        __table__ = Table(
            "user",
            mapper_registry.metadata,
            Column("id", Integer, primary_key=True),
            Column("name", String(50)),
            Column("fullname", String(50)),
            Column("nickname", String(12)),
        )
        id: int = field(init=False)
        name: Optional[str] = None
        fullname: Optional[str] = None
        nickname: Optional[str] = None
        addresses: List[Address] = field(default_factory=list)

        __mapper_args__ = {  # type: ignore
            "properties": {
                "addresses": relationship("Address")
            }
        }

We can’t apply our Mapped[] types to the attributes id, name, etc. because they will be rejected by the @dataclass decorator. Additionally, Mypy has another plugin for dataclasses explicitly which can also get in the way of what we’re doing.

The above class will actually pass Mypy’s type checking without issue; the only thing we are missing is the ability for attributes on User to be used in SQL expressions, such as:

    stmt = select(User.name).where(User.id.in_([1, 2, 3]))

To provide a workaround for this, the Mypy plugin has an additional feature whereby we can specify an extra attribute _mypy_mapped_attrs, which is a list that encloses the class-level objects or their string names. This attribute can be placed inside an "if TYPE_CHECKING:" block so that it is only seen by the type checker:

    @mapper_registry.mapped
    @dataclass
    class User:
        __table__ = Table(
            "user",
            mapper_registry.metadata,
            Column("id", Integer, primary_key=True),
            Column("name", String(50)),
            Column("fullname", String(50)),
            Column("nickname", String(12)),
        )
        id: int = field(init=False)
        name: Optional[str] = None
        # defaults are required here, since dataclasses reject a field
        # without a default that follows a field with one
        fullname: Optional[str] = None
        nickname: Optional[str] = None
        addresses: List[Address] = field(default_factory=list)

        if TYPE_CHECKING:
            # entries may be class-level objects or their string names
            _mypy_mapped_attrs = [id, name, "fullname", "nickname", addresses]

        __mapper_args__ = {  # type: ignore
            "properties": {
                "addresses": relationship("Address")
            }
        }

With the above recipe, the attributes listed in _mypy_mapped_attrs will be applied with the Mapped typing information so that the User class will behave as a SQLAlchemy mapped class when used in a class-bound context.