SQL Expression Language Tutorial
The SQLAlchemy Expression Language presents a system of representing relational database structures and expressions using Python constructs. These constructs are modeled to resemble those of the underlying database as closely as possible, while providing a modicum of abstraction of the various implementation differences between database backends. While the constructs attempt to represent equivalent concepts between backends with consistent structures, they do not conceal useful concepts that are unique to particular subsets of backends. The Expression Language therefore presents a method of writing backend-neutral SQL expressions, but does not attempt to enforce that expressions are backend-neutral.
The Expression Language is in contrast to the Object Relational Mapper, which is a distinct API that builds on top of the Expression Language. Whereas the ORM, introduced in Object Relational Tutorial, presents a high level and abstracted pattern of usage, which itself is an example of applied usage of the Expression Language, the Expression Language presents a system of representing the primitive constructs of the relational database directly without opinion.
While there is overlap among the usage patterns of the ORM and the Expression Language, the similarities are more superficial than they may at first appear. One approaches the structure and content of data from the perspective of a user-defined domain model which is transparently persisted and refreshed from its underlying storage model. The other approaches it from the perspective of literal schema and SQL expression representations which are explicitly composed into messages consumed individually by the database.
A successful application may be constructed using the Expression Language exclusively, though the application will need to define its own system of translating application concepts into individual database messages and from individual database result sets. Alternatively, an application constructed with the ORM may, in advanced scenarios, make occasional usage of the Expression Language directly in certain areas where specific database interactions are required.
The following tutorial is in doctest format, meaning each >>> line represents something you can type at a Python command prompt, and the following text represents the expected return value. The tutorial has no prerequisites.
Version Check
A quick check to verify that we are on at least version 1.3 of SQLAlchemy:
- >>> import sqlalchemy
- >>> sqlalchemy.__version__
- 1.3.0
Connecting
For this tutorial we will use an in-memory-only SQLite database. This is an easy way to test things without needing to have an actual database defined anywhere. To connect we use create_engine():
- >>> from sqlalchemy import create_engine
- >>> engine = create_engine('sqlite:///:memory:', echo=True)
The echo flag is a shortcut to setting up SQLAlchemy logging, which is accomplished via Python’s standard logging module. With it enabled, we’ll see all the generated SQL produced. If you are working through this tutorial and want less output generated, set it to False. This tutorial will format the SQL behind a popup window so it doesn’t get in our way; just click the “SQL” links to see what’s being generated.
The return value of create_engine() is an instance of Engine, and it represents the core interface to the database, adapted through a dialect that handles the details of the database and DBAPI in use. In this case the SQLite dialect will interpret instructions to the Python built-in sqlite3 module.
Lazy Connecting
The Engine, when first returned by create_engine(), has not actually tried to connect to the database yet; that happens only the first time it is asked to perform a task against the database.
The first time a method like Engine.execute() or Engine.connect() is called, the Engine establishes a real DBAPI connection to the database, which is then used to emit the SQL.
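The idea of deferring the real connection until first use can be sketched with the standard library sqlite3 module alone. The LazyEngine class and its attribute names below are hypothetical and only illustrate the pattern; they are not how SQLAlchemy's Engine is implemented.

```python
import sqlite3


class LazyEngine:
    """Hypothetical stand-in for an engine; illustrates lazy connecting only."""

    def __init__(self, path):
        self.path = path
        self._conn = None  # nothing is opened at construction time

    def connect(self):
        # Open the DBAPI connection on first use and reuse it afterwards.
        if self._conn is None:
            self._conn = sqlite3.connect(self.path)
        return self._conn


engine_sketch = LazyEngine(":memory:")
connected_before = engine_sketch._conn is not None  # no connection yet

conn_sketch = engine_sketch.connect()               # first real connection
connected_after = engine_sketch._conn is not None
value = conn_sketch.execute("SELECT 1").fetchone()[0]
```

Constructing the object is cheap and cannot fail on connectivity; any connection error would surface at the first connect() call instead.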
See also
Database Urls - includes examples of create_engine() connecting to several kinds of databases with links to more information.
Define and Create Tables
The SQL Expression Language constructs its expressions in most cases against table columns. In SQLAlchemy, a column is most often represented by an object called Column, and in all cases a Column is associated with a Table. A collection of Table objects and their associated child objects is referred to as database metadata. In this tutorial we will explicitly lay out several Table objects, but note that SQLAlchemy can also “import” whole sets of Table objects automatically from an existing database (this process is called table reflection).
We define our tables all within a catalog called MetaData, using the Table construct, which resembles regular SQL CREATE TABLE statements. We’ll make two tables, one of which represents “users” in an application, and another which represents zero or more “email addresses” for each row in the “users” table:
- >>> from sqlalchemy import Table, Column, Integer, String, MetaData, ForeignKey
- >>> metadata = MetaData()
- >>> users = Table('users', metadata,
- ... Column('id', Integer, primary_key=True),
- ... Column('name', String),
- ... Column('fullname', String),
- ... )
- >>> addresses = Table('addresses', metadata,
- ... Column('id', Integer, primary_key=True),
- ... Column('user_id', None, ForeignKey('users.id')),
- ... Column('email_address', String, nullable=False)
- ... )
All about how to define Table objects, as well as how to create them from an existing database automatically, is described in Describing Databases with MetaData.
Next, to tell the MetaData we’d actually like to create our selection of tables for real inside the SQLite database, we use create_all(), passing it the engine instance which points to our database. This will check for the presence of each table first before creating, so it’s safe to call multiple times:
- >>> metadata.create_all(engine)
CREATE TABLE users (
    id INTEGER NOT NULL,
    name VARCHAR,
    fullname VARCHAR,
    PRIMARY KEY (id)
)
()
COMMIT
CREATE TABLE addresses (
    id INTEGER NOT NULL,
    user_id INTEGER,
    email_address VARCHAR NOT NULL,
    PRIMARY KEY (id),
    FOREIGN KEY(user_id) REFERENCES users (id)
)
()
COMMIT
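The "check for presence first, then create" behavior can be approximated at the DBAPI level with the standard library sqlite3 module. This is only an analogy: the create_if_missing helper is hypothetical, and the exact check that SQLAlchemy performs is not shown here.

```python
import sqlite3


def create_if_missing(conn, table_name, ddl):
    """Hypothetical helper: run ddl only if table_name does not exist yet."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    if exists is None:
        conn.execute(ddl)
        return True   # the table was created
    return False      # the table was already there; nothing emitted


dbapi_conn = sqlite3.connect(":memory:")
users_ddl = (
    "CREATE TABLE users ("
    "id INTEGER NOT NULL, name VARCHAR, fullname VARCHAR, PRIMARY KEY (id))"
)
first_call = create_if_missing(dbapi_conn, "users", users_ddl)
second_call = create_if_missing(dbapi_conn, "users", users_ddl)
```

Because the second call finds the table and skips the DDL, repeating the call is harmless, which is the property that makes calling create_all() multiple times safe.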
Note
Users familiar with the syntax of CREATE TABLE may notice that the VARCHAR columns were generated without a length; on SQLite and PostgreSQL, this is a valid datatype, but on others, it’s not allowed. So if running this tutorial on one of those databases, and you wish to use SQLAlchemy to issue CREATE TABLE, a “length” may be provided to the String type as below:
- Column('name', String(50))
The length field on String, as well as similar precision/scale fields available on Integer, Numeric, etc. are not referenced by SQLAlchemy other than when creating tables.
Additionally, Firebird and Oracle require sequences to generate new primary key identifiers, and SQLAlchemy doesn’t generate or assume these without being instructed. For that, you use the Sequence construct:
- from sqlalchemy import Sequence
- Column('id', Integer, Sequence('user_id_seq'), primary_key=True)
A full, foolproof Table is therefore:
- users = Table('users', metadata,
- Column('id', Integer, Sequence('user_id_seq'), primary_key=True),
- Column('name', String(50)),
- Column('fullname', String(50)),
- Column('nickname', String(50))
- )
We include this more verbose Table construct separately to highlight the difference between a minimal construct geared primarily towards in-Python usage only, versus one that will be used to emit CREATE TABLE statements on a particular set of backends with more stringent requirements.
Insert Expressions
The first SQL expression we’ll create is the Insert construct, which represents an INSERT statement. This is typically created relative to its target table:
- >>> ins = users.insert()
To see a sample of the SQL this construct produces, use the str() function:
- >>> str(ins)
- 'INSERT INTO users (id, name, fullname) VALUES (:id, :name, :fullname)'
Notice above that the INSERT statement names every column in the users table. This can be limited by using the values() method, which establishes the VALUES clause of the INSERT explicitly:
- >>> ins = users.insert().values(name='jack', fullname='Jack Jones')
- >>> str(ins)
- 'INSERT INTO users (name, fullname) VALUES (:name, :fullname)'
Above, while the values method limited the VALUES clause to just two columns, the actual data we placed in values didn’t get rendered into the string; instead we got named bind parameters. As it turns out, our data is stored within our Insert construct, but it typically only comes out when the statement is actually executed; since the data consists of literal values, SQLAlchemy automatically generates bind parameters for them. We can peek at this data for now by looking at the compiled form of the statement:
- >>> ins.compile().params
- {'fullname': 'Jack Jones', 'name': 'jack'}
Executing
The interesting part of an Insert is executing it. In this tutorial, we will generally focus on the most explicit method of executing a SQL construct, and later touch upon some “shortcut” ways to do it. The engine object we created is a repository for database connections capable of issuing SQL to the database. To acquire a connection, we use the connect() method:
- >>> conn = engine.connect()
- >>> conn
- <sqlalchemy.engine.base.Connection object at 0x...>
The Connection object represents an actively checked out DBAPI connection resource. Let’s feed it our Insert object and see what happens:
- >>> result = conn.execute(ins)
INSERT INTO users (name, fullname) VALUES (?, ?)
('jack', 'Jack Jones')
COMMIT
So the INSERT statement was now issued to the database, although we got positional “qmark” bind parameters instead of “named” bind parameters in the output. How come? Because when executed, the Connection used the SQLite dialect to help generate the statement; when we use the str() function, the statement isn’t aware of this dialect, and falls back onto a default which uses named parameters. We can view this manually as follows:
- >>> ins.bind = engine
- >>> str(ins)
- 'INSERT INTO users (name, fullname) VALUES (?, ?)'
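The difference between the two placeholder styles can also be seen directly at the DBAPI level. The following is a minimal sketch using only the standard library sqlite3 module, with no SQLAlchemy involved; the variable names are illustrative. It shows the module advertising "qmark" as its parameter style while also accepting named placeholders.

```python
import sqlite3

# The DBAPI module advertises its default placeholder style.
default_style = sqlite3.paramstyle

dbapi_conn = sqlite3.connect(":memory:")
dbapi_conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name VARCHAR, fullname VARCHAR)"
)

# Positional "qmark" placeholders take a sequence of values.
dbapi_conn.execute(
    "INSERT INTO users (name, fullname) VALUES (?, ?)",
    ("jack", "Jack Jones"),
)

# Named placeholders take a dictionary keyed by parameter name.
dbapi_conn.execute(
    "INSERT INTO users (name, fullname) VALUES (:name, :fullname)",
    {"name": "wendy", "fullname": "Wendy Williams"},
)

rows = dbapi_conn.execute(
    "SELECT name, fullname FROM users ORDER BY id"
).fetchall()
```

Other DBAPI modules may accept only one style, which is one reason the rendered statement depends on the dialect in use.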
What about the result variable we got when we called execute()? As the SQLAlchemy Connection object references a DBAPI connection, the result, known as a ResultProxy object, is analogous to the DBAPI cursor object. In the case of an INSERT, we can get important information from it, such as the primary key values which were generated from our statement using ResultProxy.inserted_primary_key:
- >>> result.inserted_primary_key
- [1]
The value of 1 was automatically generated by SQLite, but only because we did not specify the id column in our Insert statement; otherwise, our explicit value would have been used. In either case, SQLAlchemy always knows how to get at a newly generated primary key value, even though the method of generating them is different across different databases; each database’s Dialect knows the specific steps needed to determine the correct value (or values; note that ResultProxy.inserted_primary_key returns a list so that it supports composite primary keys). Methods here range from using cursor.lastrowid, to selecting from a database-specific function, to using INSERT..RETURNING syntax; this all occurs transparently.
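For the SQLite case, the cursor.lastrowid mechanism mentioned above can be observed with the standard library sqlite3 module. This is a sketch of the underlying DBAPI behavior only, not of SQLAlchemy's dialect code; the variable names are illustrative.

```python
import sqlite3

dbapi_conn = sqlite3.connect(":memory:")
dbapi_conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name VARCHAR, fullname VARCHAR)"
)

# No id supplied: SQLite generates one, exposed as cursor.lastrowid.
cursor = dbapi_conn.execute(
    "INSERT INTO users (name, fullname) VALUES (?, ?)",
    ("jack", "Jack Jones"),
)
generated_id = cursor.lastrowid

# Explicit id supplied: the explicit value is used instead.
cursor = dbapi_conn.execute(
    "INSERT INTO users (id, name, fullname) VALUES (?, ?, ?)",
    (7, "wendy", "Wendy Williams"),
)
explicit_id = cursor.lastrowid
```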
Executing Multiple Statements
Our insert example above was intentionally a little drawn out to show some various behaviors of expression language constructs. In the usual case, an Insert statement is compiled against the parameters sent to the execute() method on Connection, so that there’s no need to use the values keyword with Insert. Let’s create a generic Insert statement again and use it in the “normal” way:
- >>> ins = users.insert()
- >>> conn.execute(ins, id=2, name='wendy', fullname='Wendy Williams')
INSERT INTO users (id, name, fullname) VALUES (?, ?, ?)
(2, 'wendy', 'Wendy Williams')
COMMIT
- <sqlalchemy.engine.result.ResultProxy object at 0x...>
Above, because we specified all three columns in the execute() method, the compiled Insert included all three columns. The Insert statement is compiled at execution time based on the parameters we specified; if we specified fewer parameters, the Insert would have fewer entries in its VALUES clause.
To issue many inserts using DBAPI’s executemany() method, we can send in a list of dictionaries each containing a distinct set of parameters to be inserted, as we do here to add some email addresses:
- >>> conn.execute(addresses.insert(), [
- ... {'user_id': 1, 'email_address' : 'jack@yahoo.com'},
- ... {'user_id': 1, 'email_address' : 'jack@msn.com'},
- ... {'user_id': 2, 'email_address' : 'www@www.org'},
- ... {'user_id': 2, 'email_address' : 'wendy@aol.com'},
- ... ])
INSERT INTO addresses (user_id, email_address) VALUES (?, ?)
((1, 'jack@yahoo.com'), (1, 'jack@msn.com'), (2, 'www@www.org'), (2, 'wendy@aol.com'))
COMMIT
- <sqlalchemy.engine.result.ResultProxy object at 0x...>
Above, we again relied upon SQLite’s automatic generation of primary key identifiers for each addresses row.
When executing multiple sets of parameters, each dictionary must have the same set of keys; i.e. you can’t have fewer keys in some dictionaries than others. This is because the Insert statement is compiled against the first dictionary in the list, and it’s assumed that all subsequent argument dictionaries are compatible with that statement.
The “executemany” style of invocation is available for each of the insert(), update() and delete() constructs.
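At the DBAPI level, the "executemany" style corresponds to a single statement paired with a list of parameter sets. The sketch below uses only the standard library sqlite3 module to show that shape, including a quick check that every dictionary carries the same keys; the variable names are illustrative and SQLAlchemy is not involved.

```python
import sqlite3

dbapi_conn = sqlite3.connect(":memory:")
dbapi_conn.execute(
    "CREATE TABLE addresses ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, email_address VARCHAR NOT NULL)"
)

parameter_sets = [
    {"user_id": 1, "email_address": "jack@yahoo.com"},
    {"user_id": 1, "email_address": "jack@msn.com"},
    {"user_id": 2, "email_address": "www@www.org"},
    {"user_id": 2, "email_address": "wendy@aol.com"},
]

# One statement is reused for every set, so all sets need identical keys.
same_keys = len({frozenset(params) for params in parameter_sets}) == 1

dbapi_conn.executemany(
    "INSERT INTO addresses (user_id, email_address) "
    "VALUES (:user_id, :email_address)",
    parameter_sets,
)

# The primary keys were generated automatically for each row.
generated_ids = [
    row[0] for row in dbapi_conn.execute("SELECT id FROM addresses ORDER BY id")
]
```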
Selecting
We began with inserts just so that our test database had some data in it. The more interesting part of the data is selecting it! We’ll cover UPDATE and DELETE statements later. The primary construct used to generate SELECT statements is the select() function:
- >>> from sqlalchemy.sql import select
- >>> s = select([users])
- >>> result = conn.execute(s)
SELECT users.id, users.name, users.fullname FROM users ()
Above, we issued a basic select() call, placing the users table within the COLUMNS clause of the select, and then executing. SQLAlchemy expanded the users table into the set of each of its columns, and also generated a FROM clause for us. The result returned is again a ResultProxy object, which acts much like a DBAPI cursor, including methods such as fetchone() and fetchall(). The easiest way to get rows from it is to just iterate:
- >>> for row in result:
- ...     print(row)
- (1, u'jack', u'Jack Jones')
- (2, u'wendy', u'Wendy Williams')
Above, we see that printing each row produces a simple tuple-like result. We have more options at accessing the data in each row. One very common way is through dictionary access, using the string names of columns:
- >>> result = conn.execute(s)
SELECT users.id, users.name, users.fullname FROM users
()
- >>> row = result.fetchone()
- >>> print("name:", row['name'], "; fullname:", row['fullname'])
- name: jack ; fullname: Jack Jones
Integer indexes work as well:
- >>> row = result.fetchone()
- >>> print("name:", row[1], "; fullname:", row[2])
- name: wendy ; fullname: Wendy Williams
But another way, whose usefulness will become apparent later on, is to use the Column objects directly as keys:
- >>> for row in conn.execute(s):
- ...     print("name:", row[users.c.name], "; fullname:", row[users.c.fullname])
SELECT users.id, users.name, users.fullname FROM users
()
- name: jack ; fullname: Jack Jones
- name: wendy ; fullname: Wendy Williams
Result sets which have pending rows remaining should be explicitly closed before discarding. While the cursor and connection resources referenced by the ResultProxy will be respectively closed and returned to the connection pool when the object is garbage collected, it’s better to make it explicit as some database APIs are very picky about such things:
- >>> result.close()
If we’d like to more carefully control the columns which are placed in the COLUMNS clause of the select, we reference individual Column objects from our Table. These are available as named attributes off the c attribute of the Table object:
- >>> s = select([users.c.name, users.c.fullname])
- >>> result = conn.execute(s)
SELECT users.name, users.fullname FROM users
()
- >>> for row in result:
- ...     print(row)
- (u'jack', u'Jack Jones')
- (u'wendy', u'Wendy Williams')
Let’s observe something interesting about the FROM clause. Whereas the generated statement contains two distinct sections, a “SELECT columns” part and a “FROM table” part, our select() construct only has a list containing columns. How does this work? Let’s try putting two tables into our select() statement:
- >>> for row in conn.execute(select([users, addresses])):
- ...     print(row)
SELECT users.id, users.name, users.fullname, addresses.id, addresses.user_id, addresses.email_address FROM users, addresses
()
- (1, u'jack', u'Jack Jones', 1, 1, u'jack@yahoo.com')
- (1, u'jack', u'Jack Jones', 2, 1, u'jack@msn.com')
- (1, u'jack', u'Jack Jones', 3, 2, u'www@www.org')
- (1, u'jack', u'Jack Jones', 4, 2, u'wendy@aol.com')
- (2, u'wendy', u'Wendy Williams', 1, 1, u'jack@yahoo.com')
- (2, u'wendy', u'Wendy Williams', 2, 1, u'jack@msn.com')
- (2, u'wendy', u'Wendy Williams', 3, 2, u'www@www.org')
- (2, u'wendy', u'Wendy Williams', 4, 2, u'wendy@aol.com')
It placed both tables into the FROM clause. But also, it made a real mess. Those who are familiar with SQL joins know that this is a Cartesian product; each row from the users table is produced against each row from the addresses table. So to put some sanity into this statement, we need a WHERE clause. We do that using Select.where():
- >>> s = select([users, addresses]).where(users.c.id == addresses.c.user_id)
- >>> for row in conn.execute(s):
- ...     print(row)
SELECT users.id, users.name, users.fullname, addresses.id, addresses.user_id, addresses.email_address FROM users, addresses WHERE users.id = addresses.user_id
()
- (1, u'jack', u'Jack Jones', 1, 1, u'jack@yahoo.com')
- (1, u'jack', u'Jack Jones', 2, 1, u'jack@msn.com')
- (2, u'wendy', u'Wendy Williams', 3, 2, u'www@www.org')
- (2, u'wendy', u'Wendy Williams', 4, 2, u'wendy@aol.com')
So that looks a lot better; we added an expression to our select() which had the effect of adding WHERE users.id = addresses.user_id to our statement, and our results were managed down so that the join of users and addresses rows made sense. But let’s look at that expression. It’s using just a Python equality operator between two different Column objects. It should be clear that something is up. Saying 1 == 1 produces True, and 1 == 2 produces False, not a WHERE clause. So let’s see exactly what that expression is doing:
- >>> users.c.id == addresses.c.user_id
- <sqlalchemy.sql.elements.BinaryExpression object at 0x...>
Wow, surprise! This is neither a True nor a False. Well, what is it?
- >>> str(users.c.id == addresses.c.user_id)
- 'users.id = addresses.user_id'
As you can see, the == operator is producing an object that is very much like the Insert and select() objects we’ve made so far, thanks to Python’s __eq__() special method; you call str() on it and it produces SQL. By now, one can see that everything we are working with is ultimately the same type of object. SQLAlchemy terms the base class of all of these expressions as ColumnElement.
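The general Python mechanism at work, a class overriding __eq__() so that == returns an expression object rather than a boolean, can be shown with a toy. The ToyColumn and ToyBinaryExpression classes below are hypothetical and drastically simplified; they are not SQLAlchemy's classes and only mirror the behavior seen above.

```python
class ToyBinaryExpression:
    """Hypothetical expression node holding two operands and an operator."""

    def __init__(self, left, operator, right):
        self.left = left
        self.operator = operator
        self.right = right

    def __str__(self):
        # Render the stored pieces as a SQL-like string.
        return "%s %s %s" % (self.left, self.operator, self.right)


class ToyColumn:
    """Hypothetical column; a toy, not SQLAlchemy's Column."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Return an expression object instead of True or False.
        return ToyBinaryExpression(self, "=", other)

    def __str__(self):
        return self.name


expr = ToyColumn("users.id") == ToyColumn("addresses.user_id")
rendered = str(expr)
```

Here rendered holds the string users.id = addresses.user_id, while expr itself is neither True nor False.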
Operators
Since we’ve stumbled upon SQLAlchemy’s operator paradigm, let’s go through some of its capabilities. We’ve seen how to equate two columns to each other:
- >>> print(users.c.id == addresses.c.user_id)
- users.id = addresses.user_id
If we use a literal value (a literal meaning, not a SQLAlchemy clause object), we get a bind parameter:
- >>> print(users.c.id == 7)
- users.id = :id_1
The 7 literal is embedded in the resulting ColumnElement; we can use the same trick we did with the Insert object to see it:
- >>> (users.c.id == 7).compile().params
- {u'id_1': 7}
Most Python operators, as it turns out, produce a SQL expression here, like equals, not equals, etc.:
- >>> print(users.c.id != 7)
- users.id != :id_1
- >>> # None converts to IS NULL
- >>> print(users.c.name == None)
- users.name IS NULL
- >>> # reverse works too
- >>> print('fred' > users.c.name)
- users.name < :name_1
If we add two integer columns together, we get an addition expression:
- >>> print(users.c.id + addresses.c.id)
- users.id + addresses.id
Interestingly, the type of the Column is important! If we use + with two string based columns (recall we put types like Integer and String on our Column objects at the beginning), we get something different:
- >>> print(users.c.name + users.c.fullname)
- users.name || users.fullname
Where || is the string concatenation operator used on most databases. But not all of them. MySQL users, fear not:
- >>> print((users.c.name + users.c.fullname).
- ... compile(bind=create_engine('mysql://')))
- concat(users.name, users.fullname)
The above illustrates the SQL that’s generated for an Engine that’s connected to a MySQL database; the || operator now compiles as MySQL’s concat() function.
If you have come across an operator which really isn’t available, you can always use the Operators.op() method; this generates whatever operator you need:
- >>> print(users.c.name.op('tiddlywinks')('foo'))
- users.name tiddlywinks :name_1
This function can also be used to make bitwise operators explicit. For example:
- somecolumn.op('&')(0xff)
is a bitwise AND of the value in somecolumn.
When using Operators.op(), the return type of the expression may be important, especially when the operator is used in an expression that will be sent as a result column. For this case, be sure to make the type explicit, if not what’s normally expected, using type_coerce():
- from sqlalchemy import type_coerce
- expr = type_coerce(somecolumn.op('-%>')('foo'), MySpecialType())
- stmt = select([expr])
For boolean operators, use the Operators.bool_op() method, which will ensure that the return type of the expression is handled as boolean:
- somecolumn.bool_op('-->')('some value')
New in version 1.2.0b3: Added the Operators.bool_op() method.
Operator Customization
While Operators.op() is handy to get at a custom operator in a hurry, the Core supports fundamental customization and extension of the operator system at the type level. The behavior of existing operators can be modified on a per-type basis, and new operations can be defined which become available for all column expressions that are part of that particular type. See the section Redefining and Creating New Operators for a description.
Conjunctions
We’d like to show off some of our operators inside of select() constructs. But we need to lump them together a little more, so let’s first introduce some conjunctions. Conjunctions are those little words like AND and OR that put things together. We’ll also hit upon NOT. AND, OR, and NOT are available as the and_(), or_(), and not_() functions that SQLAlchemy provides (notice we also throw in a like()):
- >>> from sqlalchemy.sql import and_, or_, not_
- >>> print(and_(
- ... users.c.name.like('j%'),
- ... users.c.id == addresses.c.user_id,
- ... or_(
- ... addresses.c.email_address == 'wendy@aol.com',
- ... addresses.c.email_address == 'jack@yahoo.com'
- ... ),
- ... not_(users.c.id > 5)
- ... )
- ... )
- users.name LIKE :name_1 AND users.id = addresses.user_id AND
- (addresses.email_address = :email_address_1
- OR addresses.email_address = :email_address_2)
- AND users.id <= :id_1
And you can also use the re-jiggered bitwise AND, OR and NOT operators, although because of Python operator precedence you have to watch your parentheses:
- >>> print(users.c.name.like('j%') & (users.c.id == addresses.c.user_id) &
- ... (
- ... (addresses.c.email_address == 'wendy@aol.com') | \
- ... (addresses.c.email_address == 'jack@yahoo.com')
- ... ) \
- ... & ~(users.c.id>5)
- ... )
- users.name LIKE :name_1 AND users.id = addresses.user_id AND
- (addresses.email_address = :email_address_1
- OR addresses.email_address = :email_address_2)
- AND users.id <= :id_1
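The precedence concern is a property of Python itself and can be seen with plain integers, without any SQLAlchemy objects. In the sketch below, the variable names are illustrative; the point is that & binds more tightly than ==, so omitting the parentheses changes how the expression is grouped.

```python
x, y = 1, 2

# With parentheses, each comparison is evaluated first, then combined.
with_parens = (x == 1) & (y == 2)

# Without them, & binds tighter than ==, so Python reads this as the
# chained comparison  x == (1 & y) == 2,  where 1 & y is 0.
without_parens = x == 1 & y == 2
```

The two spellings give different results here, which is why each comparison in the example above is wrapped in its own parentheses before being combined with & or |.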
So with all of this vocabulary, let’s select all users who have an email address at AOL or MSN, whose name starts with a letter between “m” and “z”, and we’ll also generate a column containing their full name combined with their email address. We will add two new constructs to this statement, between() and label(). between() produces a BETWEEN clause, and label() is used in a column expression to produce labels using the AS keyword; it’s recommended when selecting from expressions that otherwise would not have a name:
- >>> s = select([(users.c.fullname +
- ... ", " + addresses.c.email_address).
- ... label('title')]).\
- ... where(
- ... and_(
- ... users.c.id == addresses.c.user_id,
- ... users.c.name.between('m', 'z'),
- ... or_(
- ... addresses.c.email_address.like('%@aol.com'),
- ... addresses.c.email_address.like('%@msn.com')
- ... )
- ... )
- ... )
- >>> conn.execute(s).fetchall()
- SELECT users.fullname || ? || addresses.email_address AS title
- FROM users, addresses
- WHERE users.id = addresses.user_id AND users.name BETWEEN ? AND ? AND
- (addresses.email_address LIKE ? OR addresses.email_address LIKE ?)
- (', ', 'm', 'z', '%@aol.com', '%@msn.com')
- [(u'Wendy Williams, wendy@aol.com',)]
Once again, SQLAlchemy figured out the FROM clause for our statement. In fact it will determine the FROM clause based on all of its other bits; the columns clause, the where clause, and also some other elements which we haven’t covered yet, which include ORDER BY, GROUP BY, and HAVING.
A shortcut to using and_() is to chain together multiple where() clauses. The above can also be written as:
- >>> s = select([(users.c.fullname +
- ... ", " + addresses.c.email_address).
- ... label('title')]).\
- ... where(users.c.id == addresses.c.user_id).\
- ... where(users.c.name.between('m', 'z')).\
- ... where(
- ... or_(
- ... addresses.c.email_address.like('%@aol.com'),
- ... addresses.c.email_address.like('%@msn.com')
- ... )
- ... )
- >>> conn.execute(s).fetchall()
- SELECT users.fullname || ? || addresses.email_address AS title
- FROM users, addresses
- WHERE users.id = addresses.user_id AND users.name BETWEEN ? AND ? AND
- (addresses.email_address LIKE ? OR addresses.email_address LIKE ?)
- (', ', 'm', 'z', '%@aol.com', '%@msn.com')
- [(u'Wendy Williams, wendy@aol.com',)]
The way that we can build up a select() construct through successive method calls is called method chaining.
Using Textual SQL
Our last example really became a handful to type. Going from what one understands to be a textual SQL expression into a Python construct which groups components together in a programmatic style can be hard. That’s why SQLAlchemy lets you just use strings, for those cases when the SQL is already known and there isn’t a strong need for the statement to support dynamic features. The text() construct is used to compose a textual statement that is passed to the database mostly unchanged. Below, we create a text() object and execute it:
- >>> from sqlalchemy.sql import text
- >>> s = text(
- ... "SELECT users.fullname || ', ' || addresses.email_address AS title "
- ... "FROM users, addresses "
- ... "WHERE users.id = addresses.user_id "
- ... "AND users.name BETWEEN :x AND :y "
- ... "AND (addresses.email_address LIKE :e1 "
- ... "OR addresses.email_address LIKE :e2)")
- >>> conn.execute(s, x='m', y='z', e1='%@aol.com', e2='%@msn.com').fetchall()
SELECT users.fullname || ', ' || addresses.email_address AS title FROM users, addresses WHERE users.id = addresses.user_id AND users.name BETWEEN ? AND ? AND (addresses.email_address LIKE ? OR addresses.email_address LIKE ?)
('m', 'z', '%@aol.com', '%@msn.com')
- [(u'Wendy Williams, wendy@aol.com',)]
Above, we can see that bound parameters are specified in text() using the named colon format; this format is consistent regardless of database backend. To send values in for the parameters, we passed them into the execute() method as additional arguments.
Specifying Bound Parameter Behaviors
The text() construct supports pre-established bound values using the TextClause.bindparams() method:
- stmt = text("SELECT * FROM users WHERE users.name BETWEEN :x AND :y")
- stmt = stmt.bindparams(x="m", y="z")
The parameters can also be explicitly typed:
- from sqlalchemy import bindparam
- stmt = stmt.bindparams(bindparam("x", type_=String), bindparam("y", type_=String))
- result = conn.execute(stmt, {"x": "m", "y": "z"})
Typing for bound parameters is necessary when the type requires Python-side or special SQL-side processing provided by the datatype.
See also
TextClause.bindparams() - full method description
Specifying Result-Column Behaviors
We may also specify information about the result columns using the TextClause.columns() method; this method can be used to specify the return types, based on name:
- stmt = stmt.columns(id=Integer, name=String)
or it can be passed full column expressions positionally, either typed or untyped. In this case it’s a good idea to list out the columns explicitly within our textual SQL, since the correlation of our column expressions to the SQL will be done positionally:
- stmt = text("SELECT id, name FROM users")
- stmt = stmt.columns(users.c.id, users.c.name)
When we call the TextClause.columns() method, we get back a TextAsFrom object that supports the full suite of TextAsFrom.c and other “selectable” operations:
- j = stmt.join(addresses, stmt.c.id == addresses.c.user_id)
- new_stmt = select([stmt.c.id, addresses.c.id]).\
- select_from(j).where(stmt.c.name == 'x')
The positional form of TextClause.columns() is particularly useful when relating textual SQL to existing Core or ORM models, because we can use column expressions directly without worrying about name conflicts or other issues with the result column names in the textual SQL:
- >>> stmt = text("SELECT users.id, addresses.id, users.id, "
- ... "users.name, addresses.email_address AS email "
- ... "FROM users JOIN addresses ON users.id=addresses.user_id "
- ... "WHERE users.id = 1").columns(
- ... users.c.id,
- ... addresses.c.id,
- ... addresses.c.user_id,
- ... users.c.name,
- ... addresses.c.email_address
- ... )
- >>> result = conn.execute(stmt)
SELECT users.id, addresses.id, users.id, users.name, addresses.email_address AS email FROM users JOIN addresses ON users.id=addresses.user_id WHERE users.id = 1
()
Above, there are three columns in the result that are named “id”, but since we’ve associated these with column expressions positionally, the names aren’t an issue when the result columns are fetched using the actual column object as a key. Fetching the email_address column would be:
- >>> row = result.fetchone()
- >>> row[addresses.c.email_address]
- 'jack@yahoo.com'
If on the other hand we used a string column key, the usual rules of name-based matching still apply, and we’d get an ambiguous column error for the id value:
- >>> row["id"]
- Traceback (most recent call last):
- ...
- InvalidRequestError: Ambiguous column name 'id' in result set column descriptions
It’s important to note that while accessing columns from a result set using Column objects may seem unusual, it is in fact the only system used by the ORM, which occurs transparently beneath the facade of the Query object; in this way, the TextClause.columns() method is typically very applicable to textual statements to be used in an ORM context. The example at Using Textual SQL illustrates a simple usage.
New in version 1.1: The TextClause.columns() method now accepts column expressions which will be matched positionally to a plain text SQL result set, eliminating the need for column names to match or even be unique in the SQL statement when matching table metadata or ORM models to textual SQL.
See also
TextClause.columns() - full method description
Using Textual SQL - integrating ORM-level queries with text()
Using text() fragments inside bigger statements
text() can also be used to produce fragments of SQL that can be freely used within a select() object, which accepts text() objects as an argument for most of its builder functions. Below, we combine the usage of text() within a select() object. The select() construct provides the “geometry” of the statement, and the text() construct provides the textual content within this form. We can build a statement without the need to refer to any pre-established Table metadata:
- >>> s = select([
- ... text("users.fullname || ', ' || addresses.email_address AS title")
- ... ]).\
- ... where(
- ... and_(
- ... text("users.id = addresses.user_id"),
- ... text("users.name BETWEEN 'm' AND 'z'"),
- ... text(
- ... "(addresses.email_address LIKE :x "
- ... "OR addresses.email_address LIKE :y)")
- ... )
- ... ).select_from(text('users, addresses'))
- >>> conn.execute(s, x='%@aol.com', y='%@msn.com').fetchall()
SELECT users.fullname || ', ' || addresses.email_address AS title FROM users, addresses WHERE users.id = addresses.user_id AND users.name BETWEEN 'm' AND 'z' AND (addresses.email_address LIKE ? OR addresses.email_address LIKE ?) ('%@aol.com', '%@msn.com')- [(u'Wendy Williams, wendy@aol.com',)]
Changed in version 1.0.0: The select() construct emits warnings when string SQL fragments are coerced to text(), and text() should be used explicitly. See Warnings emitted when coercing full SQL fragments into text() for background.
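As a rough cross-check of what the rendered statement does at the database level, the sketch below runs equivalent SQL through Python's standard-library sqlite3 module rather than through SQLAlchemy. The schema and sample rows are assumptions modeled on the data used earlier in this tutorial; the named placeholders :x and :y play the same role as the bind parameters in the text() fragments.

```python
import sqlite3

# In-memory database; the rows are assumed to match the tutorial's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, fullname TEXT);
    CREATE TABLE addresses (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users (id),
        email_address TEXT NOT NULL
    );
    INSERT INTO users VALUES
        (1, 'jack', 'Jack Jones'), (2, 'wendy', 'Wendy Williams');
    INSERT INTO addresses VALUES
        (1, 1, 'jack@yahoo.com'), (2, 1, 'jack@msn.com'),
        (3, 2, 'www@www.org'), (4, 2, 'wendy@aol.com');
""")

# The same textual fragments, joined into one statement by hand.
sql = (
    "SELECT users.fullname || ', ' || addresses.email_address AS title "
    "FROM users, addresses "
    "WHERE users.id = addresses.user_id "
    "AND users.name BETWEEN 'm' AND 'z' "
    "AND (addresses.email_address LIKE :x OR addresses.email_address LIKE :y)"
)

# Values for the named placeholders are supplied as a dictionary.
titles = conn.execute(sql, {"x": "%@aol.com", "y": "%@msn.com"}).fetchall()
print(titles)
```

Only wendy falls within the BETWEEN range, and only one of her addresses matches either LIKE pattern, so a single row comes back.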
Using More Specific Text with table(), literal_column(), and column()
We can move our level of structure back in the other direction too, by using column(), literal_column(), and table() for some of the key elements of our statement. Using these constructs, we can get some more expression capabilities than if we used text() directly, as they provide to the Core more information about how the strings they store are to be used, but still without the need to get into full Table based metadata. Below, we also specify the String datatype for the first of the literal_column() objects, so that the string-specific concatenation operator becomes available. We also use literal_column() in order to use table-qualified expressions, e.g. users.fullname, that will be rendered as is; using column() implies an individual column name that may be quoted:
- >>> from sqlalchemy import select, and_, text, String
- >>> from sqlalchemy.sql import table, literal_column
- >>> s = select([
- ... literal_column("users.fullname", String) +
- ... ', ' +
- ... literal_column("addresses.email_address").label("title")
- ... ]).\
- ... where(
- ... and_(
- ... literal_column("users.id") == literal_column("addresses.user_id"),
- ... text("users.name BETWEEN 'm' AND 'z'"),
- ... text(
- ... "(addresses.email_address LIKE :x OR "
- ... "addresses.email_address LIKE :y)")
- ... )
- ... ).select_from(table('users')).select_from(table('addresses'))
- >>> conn.execute(s, x='%@aol.com', y='%@msn.com').fetchall()
SELECT users.fullname || ? || addresses.email_address AS anon_1 FROM users, addresses WHERE users.id = addresses.user_id AND users.name BETWEEN 'm' AND 'z' AND (addresses.email_address LIKE ? OR addresses.email_address LIKE ?) (', ', '%@aol.com', '%@msn.com')- [(u'Wendy Williams, wendy@aol.com',)]
Ordering or Grouping by a Label
One place where we sometimes want to use a string as a shortcut is when our statement has some labeled column element that we want to refer to in a place such as the “ORDER BY” or “GROUP BY” clause; other candidates include fields within an “OVER” or “DISTINCT” clause. If we have such a label in our select() construct, we can refer to it directly by passing the string straight into select.order_by() or select.group_by(), among others. This will refer to the named label and also prevent the expression from being rendered twice. Label names that resolve to columns are rendered fully:
- >>> from sqlalchemy import func
- >>> stmt = select([
- ... addresses.c.user_id,
- ... func.count(addresses.c.id).label('num_addresses')]).\
- ... group_by("user_id").order_by("user_id", "num_addresses")
- >>> conn.execute(stmt).fetchall()
SELECT addresses.user_id, count(addresses.id) AS num_addresses FROM addresses GROUP BY addresses.user_id ORDER BY addresses.user_id, num_addresses ()- [(1, 2), (2, 2)]
We can use modifiers like asc() or desc() by passing the string name:
- >>> from sqlalchemy import func, desc
- >>> stmt = select([
- ... addresses.c.user_id,
- ... func.count(addresses.c.id).label('num_addresses')]).\
- ... group_by("user_id").order_by("user_id", desc("num_addresses"))
- >>> conn.execute(stmt).fetchall()
SELECT addresses.user_id, count(addresses.id) AS num_addresses FROM addresses GROUP BY addresses.user_id ORDER BY addresses.user_id, num_addresses DESC ()- [(1, 2), (2, 2)]
Note that the string feature here is very much tailored to when we have already used the label() method to create a specifically-named label. In other cases, we always want to refer to the ColumnElement object directly so that the expression system can make the most effective choices for rendering. Below, we illustrate how using the ColumnElement eliminates ambiguity when we want to order by a column name that appears more than once:
- >>> u1a, u1b = users.alias(), users.alias()
- >>> stmt = select([u1a, u1b]).\
- ... where(u1a.c.name > u1b.c.name).\
- ... order_by(u1a.c.name) # using "name" here would be ambiguous
- >>> conn.execute(stmt).fetchall()
SELECT users_1.id, users_1.name, users_1.fullname, users_2.id, users_2.name, users_2.fullname FROM users AS users_1, users AS users_2 WHERE users_1.name > users_2.name ORDER BY users_1.name ()- [(2, u'wendy', u'Wendy Williams', 1, u'jack', u'Jack Jones')]
Using Aliases and Subqueries
The alias in SQL corresponds to a “renamed” version of a table or SELECT statement, which occurs anytime you say “SELECT .. FROM sometable AS someothername”. The AS creates a new name for the table. Aliases are a key construct as they allow any table or subquery to be referenced by a unique name. In the case of a table, this allows the same table to be named in the FROM clause multiple times. In the case of a SELECT statement, it provides a parent name for the columns represented by the statement, allowing them to be referenced relative to this name.
In SQLAlchemy, any Table, select() construct, or other selectable can be turned into an alias or named subquery using the FromClause.alias() method, which produces an Alias construct. As an example, suppose we know that our user jack has two particular email addresses. How can we locate jack based on the combination of those two addresses? To accomplish this, we’d use a join to the addresses table, once for each address. We create two Alias constructs against addresses, and then use them both within a select() construct:
- >>> a1 = addresses.alias()
- >>> a2 = addresses.alias()
- >>> s = select([users]).\
- ... where(and_(
- ... users.c.id == a1.c.user_id,
- ... users.c.id == a2.c.user_id,
- ... a1.c.email_address == 'jack@msn.com',
- ... a2.c.email_address == 'jack@yahoo.com'
- ... ))
- >>> conn.execute(s).fetchall()
SELECT users.id, users.name, users.fullname FROM users, addresses AS addresses_1, addresses AS addresses_2 WHERE users.id = addresses_1.user_id AND users.id = addresses_2.user_id AND addresses_1.email_address = ? AND addresses_2.email_address = ? ('jack@msn.com', 'jack@yahoo.com')- [(1, u'jack', u'Jack Jones')]
Note that the Alias construct generated the names addresses_1 and addresses_2 in the final SQL result. The generation of these names is determined by the position of the construct within the statement. If we created a query using only the second a2 alias, the name would come out as addresses_1. The generation of the names is also deterministic, meaning the same SQLAlchemy statement construct will produce the identical SQL string each time it is rendered for a particular dialect.
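To see why the two aliases are needed, the following sketch runs the rendered SQL directly with the standard-library sqlite3 module instead of SQLAlchemy. The schema and rows are assumptions modeled on the tutorial's sample data. It compares the aliased form against a version that names addresses only once.

```python
import sqlite3

# In-memory database; the rows are assumed to match the tutorial's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, fullname TEXT);
    CREATE TABLE addresses (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users (id),
        email_address TEXT NOT NULL
    );
    INSERT INTO users VALUES
        (1, 'jack', 'Jack Jones'), (2, 'wendy', 'Wendy Williams');
    INSERT INTO addresses VALUES
        (1, 1, 'jack@yahoo.com'), (2, 1, 'jack@msn.com'),
        (3, 2, 'www@www.org'), (4, 2, 'wendy@aol.com');
""")
params = ("jack@msn.com", "jack@yahoo.com")

# Each alias ranges over the addresses rows independently, so one can match
# the msn address while the other matches the yahoo address.
aliased = conn.execute("""
    SELECT users.id, users.name, users.fullname
    FROM users, addresses AS addresses_1, addresses AS addresses_2
    WHERE users.id = addresses_1.user_id
      AND users.id = addresses_2.user_id
      AND addresses_1.email_address = ?
      AND addresses_2.email_address = ?
""", params).fetchall()

# With the table named once, a single row would have to equal both addresses.
unaliased = conn.execute("""
    SELECT users.id, users.name, users.fullname
    FROM users, addresses
    WHERE users.id = addresses.user_id
      AND addresses.email_address = ?
      AND addresses.email_address = ?
""", params).fetchall()

print(aliased)
print(unaliased)
```

The aliased query finds jack, while the unaliased one returns no rows because no single address row can satisfy both comparisons.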
Since on the outside, we refer to the alias using the Alias construct itself, we don’t need to be concerned about the generated name. However, for the purposes of debugging, it can be specified by passing a string name to the FromClause.alias() method:
- >>> a1 = addresses.alias('a1')
Aliases can of course be used for anything which you can SELECT from, including SELECT statements themselves, by converting the SELECT statement into a named subquery. The SelectBase.alias() method performs this role. We can self-join the users table back to the select() we’ve created by making an alias of the entire statement:
- >>> addresses_subq = s.alias()
- >>> s = select([users.c.name]).where(users.c.id == addresses_subq.c.id)
- >>> conn.execute(s).fetchall()
SELECT users.name FROM users, (SELECT users.id AS id, users.name AS name, users.fullname AS fullname FROM users, addresses AS addresses_1, addresses AS addresses_2 WHERE users.id = addresses_1.user_id AND users.id = addresses_2.user_id AND addresses_1.email_address = ? AND addresses_2.email_address = ?) AS anon_1 WHERE users.id = anon_1.id ('jack@msn.com', 'jack@yahoo.com')- [(u'jack',)]
Using Joins
We’re halfway along to being able to construct any SELECT expression. The next cornerstone of the SELECT is the JOIN expression. We’ve already been doing joins in our examples, by just placing two tables in either the columns clause or the where clause of the select() construct. But if we want to make a real “JOIN” or “OUTER JOIN” construct, we use the join() and outerjoin() methods, most commonly accessed from the left table in the join:
- >>> print(users.join(addresses))
- users JOIN addresses ON users.id = addresses.user_id
The alert reader will see more surprises; SQLAlchemy figured out how to JOIN the two tables! The ON condition of the join, as it’s called, was automatically generated based on the ForeignKey object which we placed on the addresses table way at the beginning of this tutorial. Already the join() construct is looking like a much better way to join tables.
Of course you can join on whatever expression you want, such as if we want to join on all users who use the same name in their email address as their username:
- >>> print(users.join(addresses,
- ... addresses.c.email_address.like(users.c.name + '%')
- ... )
- ... )
- users JOIN addresses ON addresses.email_address LIKE users.name || :name_1
When we create a select() construct, SQLAlchemy looks around at the tables we’ve mentioned and then places them in the FROM clause of the statement. When we use JOINs however, we know what FROM clause we want, so here we make use of the select_from() method:
- >>> s = select([users.c.fullname]).select_from(
- ... users.join(addresses,
- ... addresses.c.email_address.like(users.c.name + '%'))
- ... )
- >>> conn.execute(s).fetchall()
SELECT users.fullname FROM users JOIN addresses ON addresses.email_address LIKE users.name || ? ('%',)- [(u'Jack Jones',), (u'Jack Jones',), (u'Wendy Williams',)]
The outerjoin() method creates LEFT OUTER JOIN constructs, and is used in the same way as join():
- >>> s = select([users.c.fullname]).select_from(users.outerjoin(addresses))
- >>> print(s)
- SELECT users.fullname
- FROM users
- LEFT OUTER JOIN addresses ON users.id = addresses.user_id
That’s the output outerjoin() produces, unless, of course, you’re stuck in a gig using Oracle prior to version 9, and you’ve set up your engine (which would be using OracleDialect) to use Oracle-specific SQL:
- >>> from sqlalchemy.dialects.oracle import dialect as OracleDialect
- >>> print(s.compile(dialect=OracleDialect(use_ansi=False)))
- SELECT users.fullname
- FROM users, addresses
- WHERE users.id = addresses.user_id(+)
If you don’t know what that SQL means, don’t worry! The secret tribe of Oracle DBAs don’t want their black magic being found out ;).
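For readers who want to see what the LEFT OUTER JOIN rendered by outerjoin() changes relative to a plain JOIN, here is a minimal sketch that runs both forms with the standard-library sqlite3 module rather than SQLAlchemy. The schema follows the tutorial's tables; the user mary, who has no addresses, is a hypothetical row added only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, fullname TEXT);
    CREATE TABLE addresses (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users (id),
        email_address TEXT NOT NULL
    );
    -- mary is a hypothetical user with no addresses, added for illustration.
    INSERT INTO users VALUES
        (1, 'jack', 'Jack Jones'), (2, 'wendy', 'Wendy Williams'),
        (3, 'mary', 'Mary Contrary');
    INSERT INTO addresses VALUES
        (1, 1, 'jack@yahoo.com'), (2, 1, 'jack@msn.com'),
        (3, 2, 'www@www.org'), (4, 2, 'wendy@aol.com');
""")

template = """
    SELECT users.name, addresses.email_address
    FROM users {join} addresses ON users.id = addresses.user_id
    ORDER BY users.id, addresses.id
"""

# A plain JOIN keeps only users that have at least one matching address.
inner_rows = conn.execute(template.format(join="JOIN")).fetchall()

# LEFT OUTER JOIN keeps every user, filling missing address columns with NULL.
outer_rows = conn.execute(template.format(join="LEFT OUTER JOIN")).fetchall()

print(inner_rows)
print(outer_rows)
```

The outer join returns the same four rows as the inner join plus one extra row for mary, whose email_address comes back as None.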
Everything Else
The concepts of creating SQL expressions have been introduced. What’s left are more variants of the same themes. So now we’ll catalog the rest of the important things we’ll need to know.
Bind Parameter Objects
Throughout all these examples, SQLAlchemy is busy creating bind parameters wherever literal expressions occur. You can also specify your own bind parameters with your own names, and use the same statement repeatedly. The bindparam() construct is used to produce a bound parameter with a given name. While SQLAlchemy always refers to bound parameters by name on the API side, the database dialect converts to the appropriate named or positional style at execution time, as here where it converts to positional for SQLite:
- >>> from sqlalchemy.sql import bindparam
- >>> s = users.select(users.c.name == bindparam('username'))
- >>> conn.execute(s, username='wendy').fetchall()
SELECT users.id, users.name, users.fullname FROM users WHERE users.name = ? ('wendy',)- [(2, u'wendy', u'Wendy Williams')]
Another important aspect of bindparam() is that it may be assigned a type. The type of the bind parameter will determine its behavior within expressions and also how the data bound to it is processed before being sent off to the database:
- >>> s = users.select(users.c.name.like(bindparam('username', type_=String) + text("'%'")))
- >>> conn.execute(s, username='wendy').fetchall()
SELECT users.id, users.name, users.fullname FROM users WHERE users.name LIKE ? || '%' ('wendy',)- [(2, u'wendy', u'Wendy Williams')]
bindparam() constructs of the same name can also be used multiple times, where only a single named value is needed in the execute parameters:
- >>> s = select([users, addresses]).\
- ... where(
- ... or_(
- ... users.c.name.like(
- ... bindparam('name', type_=String) + text("'%'")),
- ... addresses.c.email_address.like(
- ... bindparam('name', type_=String) + text("'@%'"))
- ... )
- ... ).\
- ... select_from(users.outerjoin(addresses)).\
- ... order_by(addresses.c.id)
- >>> conn.execute(s, name='jack').fetchall()
SELECT users.id, users.name, users.fullname, addresses.id, addresses.user_id, addresses.email_address FROM users LEFT OUTER JOIN addresses ON users.id = addresses.user_id WHERE users.name LIKE ? || '%' OR addresses.email_address LIKE ? || '@%' ORDER BY addresses.id ('jack', 'jack')- [(1, u'jack', u'Jack Jones', 1, 1, u'jack@yahoo.com'), (1, u'jack', u'Jack Jones', 2, 1, u'jack@msn.com')]
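The difference between the named style used on the API side and the positional style shown in the SQLite output can be sketched at the driver level. The example below uses the standard-library sqlite3 module rather than SQLAlchemy, with a schema and rows assumed to match the tutorial's data. One named value feeds both occurrences of :name, whereas the positional form needs the value repeated.

```python
import sqlite3

# In-memory database; the rows are assumed to match the tutorial's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, fullname TEXT);
    CREATE TABLE addresses (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users (id),
        email_address TEXT NOT NULL
    );
    INSERT INTO users VALUES
        (1, 'jack', 'Jack Jones'), (2, 'wendy', 'Wendy Williams');
    INSERT INTO addresses VALUES
        (1, 1, 'jack@yahoo.com'), (2, 1, 'jack@msn.com'),
        (3, 2, 'www@www.org'), (4, 2, 'wendy@aol.com');
""")

template = """
    SELECT users.name, addresses.email_address
    FROM users LEFT OUTER JOIN addresses ON users.id = addresses.user_id
    WHERE users.name LIKE {p} || '%'
       OR addresses.email_address LIKE {p} || '@%'
    ORDER BY addresses.id
"""

# Named style: the placeholder appears twice but one value is supplied.
named_rows = conn.execute(template.format(p=":name"), {"name": "jack"}).fetchall()

# Positional style: one value per question mark, in order.
positional_rows = conn.execute(template.format(p="?"), ("jack", "jack")).fetchall()

print(named_rows)
print(positional_rows)
```

Both calls return the same rows; the positional tuple ('jack', 'jack') corresponds to the parameter list shown in the SQLite output above.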
Functions
SQL functions are created using the func keyword, which generates functions using attribute access:
- >>> from sqlalchemy.sql import func
- >>> print(func.now())
- now()
- >>> print(func.concat('x', 'y'))
- concat(:concat_1, :concat_2)
By “generates”, we mean that any SQL function is created based on the word you choose:
- >>> print(func.xyz_my_goofy_function())
- xyz_my_goofy_function()
Certain function names are known by SQLAlchemy, allowing special behavioral rules to be applied. Some for example are “ANSI” functions, which means they don’t get the parentheses added after them, such as CURRENT_TIMESTAMP:
- >>> print(func.current_timestamp())
- CURRENT_TIMESTAMP
Functions are most typically used in the columns clause of a select statement, and can also be labeled as well as given a type. Labeling a function is recommended so that the result can be targeted in a result row based on a string name, and assigning it a type is required when you need result-set processing to occur, such as for Unicode conversion and date conversions. Below, we use the result function scalar() to just read the first column of the first row and then close the result; the label, even though present, is not important in this case:
- >>> conn.execute(
- ... select([
- ... func.max(addresses.c.email_address, type_=String).
- ... label('maxemail')
- ... ])
- ... ).scalar()
SELECT max(addresses.email_address) AS maxemail FROM addresses ()- u'www@www.org'
Databases such as PostgreSQL and Oracle support functions that return whole result sets; these can be assembled into selectable units, which can be used in statements. For example, given a database function calculate() which takes the parameters x and y, and returns three columns which we’d like to name q, z and r, we can construct a selectable using “lexical” column objects as well as bind parameters:
- >>> from sqlalchemy.sql import column
- >>> calculate = select([column('q'), column('z'), column('r')]).\
- ... select_from(
- ... func.calculate(
- ... bindparam('x'),
- ... bindparam('y')
- ... )
- ... )
- >>> calc = calculate.alias()
- >>> print(select([users]).where(users.c.id > calc.c.z))
- SELECT users.id, users.name, users.fullname
- FROM users, (SELECT q, z, r
- FROM calculate(:x, :y)) AS anon_1
- WHERE users.id > anon_1.z
If we wanted to use our calculate statement twice with different bind parameters, the unique_params() function will create copies for us, and mark the bind parameters as “unique” so that conflicting names are isolated. Note we also make two separate aliases of our selectable:
- >>> calc1 = calculate.alias('c1').unique_params(x=17, y=45)
- >>> calc2 = calculate.alias('c2').unique_params(x=5, y=12)
- >>> s = select([users]).\
- ... where(users.c.id.between(calc1.c.z, calc2.c.z))
- >>> print(s)
- SELECT users.id, users.name, users.fullname
- FROM users,
- (SELECT q, z, r FROM calculate(:x_1, :y_1)) AS c1,
- (SELECT q, z, r FROM calculate(:x_2, :y_2)) AS c2
- WHERE users.id BETWEEN c1.z AND c2.z
- >>> s.compile().params
- {u'x_2': 5, u'y_2': 12, u'y_1': 45, u'x_1': 17}
Window Functions
Any FunctionElement, including functions generated by func, can be turned into a “window function”, that is an OVER clause, using the FunctionElement.over() method:
- >>> s = select([
- ... users.c.id,
- ... func.row_number().over(order_by=users.c.name)
- ... ])
- >>> print(s)
- SELECT users.id, row_number() OVER (ORDER BY users.name) AS anon_1
- FROM users
FunctionElement.over() also supports range specification using either the expression.over.rows or expression.over.range parameters:
- >>> s = select([
- ... users.c.id,
- ... func.row_number().over(
- ... order_by=users.c.name,
- ... rows=(-2, None))
- ... ])
- >>> print(s)
- SELECT users.id, row_number() OVER
- (ORDER BY users.name ROWS BETWEEN :param_1 PRECEDING AND UNBOUNDED FOLLOWING) AS anon_1
- FROM users
expression.over.rows and expression.over.range each accept a two-tuple which contains a combination of negative and positive integers for ranges, zero to indicate “CURRENT ROW” and None to indicate “UNBOUNDED”. See the examples at over() for more detail.
New in version 1.1: support for “rows” and “range” specification for window functions
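The two-tuple convention can be made concrete with a small pure-Python sketch. The helpers frame_bound() and rows_clause() below are hypothetical names written only for this illustration; they are not part of SQLAlchemy and do not reflect how it is implemented. In particular, SQLAlchemy renders the integer as a bound parameter, as in the output above, while this sketch inlines it for readability.

```python
def frame_bound(value, is_lower):
    """Render one end of a window frame (hypothetical helper, not SQLAlchemy API)."""
    if value is None:
        # None means UNBOUNDED; which side it is on decides the direction.
        return "UNBOUNDED PRECEDING" if is_lower else "UNBOUNDED FOLLOWING"
    if value == 0:
        return "CURRENT ROW"
    if value < 0:
        # Negative integers count rows before the current row.
        return "{} PRECEDING".format(-value)
    # Positive integers count rows after the current row.
    return "{} FOLLOWING".format(value)


def rows_clause(bounds):
    """Turn a (lower, upper) two-tuple into a ROWS BETWEEN clause."""
    lower, upper = bounds
    return "ROWS BETWEEN {} AND {}".format(
        frame_bound(lower, True), frame_bound(upper, False)
    )


print(rows_clause((-2, None)))
print(rows_clause((None, 0)))
print(rows_clause((-1, 1)))
```

The first call corresponds to the rows=(-2, None) example above and yields a frame from two rows before the current row through the end of the partition.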
Data Casts and Type Coercion
In SQL, we often need to indicate the datatype of an element explicitly, or we need to convert between one datatype and another within a SQL statement. The CAST SQL function performs this. In SQLAlchemy, the cast() function renders the SQL CAST keyword. It accepts a column expression and a data type object as arguments:
- >>> from sqlalchemy import cast
- >>> s = select([cast(users.c.id, String)])
- >>> conn.execute(s).fetchall()
SELECT CAST(users.id AS VARCHAR) AS anon_1 FROM users ()- [('1',), ('2',)]
The cast() function is used not just when converting between datatypes, but also in cases where the database needs to know that some particular value should be considered to be of a particular datatype within an expression.
The cast() function also tells SQLAlchemy itself that an expression should be treated as a particular type as well. The datatype of an expression directly impacts the behavior of Python operators upon that object, such as how the + operator may indicate integer addition or string concatenation, and it also impacts how a literal Python value is transformed or handled before being passed to the database as well as how result values of that expression should be transformed or handled.
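The database-side effect of CAST can be observed with a short sketch that uses the standard-library sqlite3 module rather than SQLAlchemy. The users table and its rows are assumptions modeled on the tutorial's data. The last query places integer addition next to string concatenation to show the two meanings that SQLAlchemy's + operator can take depending on the expression's type.

```python
import sqlite3

# In-memory database; the rows are assumed to match the tutorial's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, fullname TEXT);
    INSERT INTO users VALUES
        (1, 'jack', 'Jack Jones'), (2, 'wendy', 'Wendy Williams');
""")

# Without a cast, the integer primary key comes back as a Python int.
plain = conn.execute("SELECT users.id FROM users ORDER BY users.id").fetchall()

# With CAST ... AS VARCHAR, the same values come back as strings.
casted = conn.execute(
    "SELECT CAST(users.id AS VARCHAR) AS anon_1 FROM users ORDER BY users.id"
).fetchall()

# Integer addition versus string concatenation on the same column.
mixed = conn.execute(
    "SELECT users.id + 1, CAST(users.id AS VARCHAR) || '1' "
    "FROM users ORDER BY users.id"
).fetchall()

print(plain)
print(casted)
print(mixed)
```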
Sometimes there is the need to have SQLAlchemy know the datatype of an expression, for all the reasons mentioned above, but to not render the CAST expression itself on the SQL side, where it may interfere with a SQL operation that already works without it. For this fairly common use case there is another function type_coerce() which is closely related to cast(), in that it sets up a Python expression as having a specific SQL database type, but does not render the CAST keyword or datatype on the database side. type_coerce() is particularly important when dealing with the types.JSON datatype, which typically has an intricate relationship with string-oriented datatypes on different platforms and may not even be an explicit datatype, such as on SQLite and MariaDB. Below, we use type_coerce() to deliver a Python structure as a JSON string into one of MySQL’s JSON functions:
- >>> import json
- >>> from sqlalchemy import JSON
- >>> from sqlalchemy import type_coerce
- >>> from sqlalchemy.dialects import mysql
- >>> s = select([
- ... type_coerce(
- ... {'some_key': {'foo': 'bar'}}, JSON
- ... )['some_key']
- ... ])
- >>> print(s.compile(dialect=mysql.dialect()))
- SELECT JSON_EXTRACT(%s, %s) AS anon_1
Above, MySQL’s JSON_EXTRACT SQL function was invoked because we used type_coerce() to indicate that our Python dictionary should be treated as types.JSON. The Python __getitem__ operator, ['some_key'] in this case, became available as a result and allowed a JSON_EXTRACT path expression (not shown, however in this case it would ultimately be '$."some_key"') to be rendered.
Unions and Other Set Operations
Unions come in two flavors, UNION and UNION ALL, which are available via module level functions union() and union_all():
- >>> from sqlalchemy.sql import union
- >>> u = union(
- ... addresses.select().
- ... where(addresses.c.email_address == 'foo@bar.com'),
- ... addresses.select().
- ... where(addresses.c.email_address.like('%@yahoo.com')),
- ... ).order_by(addresses.c.email_address)
- >>> conn.execute(u).fetchall()
SELECT addresses.id, addresses.user_id, addresses.email_address FROM addresses WHERE addresses.email_address = ? UNION SELECT addresses.id, addresses.user_id, addresses.email_address FROM addresses WHERE addresses.email_address LIKE ? ORDER BY addresses.email_address ('foo@bar.com', '%@yahoo.com')- [(1, 1, u'jack@yahoo.com')]
Also available, though not supported on all databases, are intersect(), intersect_all(), except_(), and except_all():
- >>> from sqlalchemy.sql import except_
- >>> u = except_(
- ... addresses.select().
- ... where(addresses.c.email_address.like('%@%.com')),
- ... addresses.select().
- ... where(addresses.c.email_address.like('%@msn.com'))
- ... )
- >>> conn.execute(u).fetchall()
SELECT addresses.id, addresses.user_id, addresses.email_address FROM addresses WHERE addresses.email_address LIKE ? EXCEPT SELECT addresses.id, addresses.user_id, addresses.email_address FROM addresses WHERE addresses.email_address LIKE ? ('%@%.com', '%@msn.com')- [(1, 1, u'jack@yahoo.com'), (4, 2, u'wendy@aol.com')]
A common issue with so-called “compound” selectables arises due to the fact that they nest with parentheses. SQLite in particular doesn’t like a statement that starts with parentheses. So when nesting a “compound” inside a “compound”, it’s often necessary to apply .alias().select() to the first element of the outermost compound, if that element is also a compound. For example, to nest a “union” and a “select” inside of “except_”, SQLite will want the “union” to be stated as a subquery:
- >>> u = except_(
- ... union(
- ... addresses.select().
- ... where(addresses.c.email_address.like('%@yahoo.com')),
- ... addresses.select().
- ... where(addresses.c.email_address.like('%@msn.com'))
- ... ).alias().select(), # apply subquery here
- ... addresses.select(addresses.c.email_address.like('%@msn.com'))
- ... )
- >>> conn.execute(u).fetchall()
SELECT anon_1.id, anon_1.user_id, anon_1.email_address FROM (SELECT addresses.id AS id, addresses.user_id AS user_id, addresses.email_address AS email_address FROM addresses WHERE addresses.email_address LIKE ? UNION SELECT addresses.id AS id, addresses.user_id AS user_id, addresses.email_address AS email_address FROM addresses WHERE addresses.email_address LIKE ?) AS anon_1 EXCEPT SELECT addresses.id, addresses.user_id, addresses.email_address FROM addresses WHERE addresses.email_address LIKE ? ('%@yahoo.com', '%@msn.com', '%@msn.com')- [(1, 1, u'jack@yahoo.com')]
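The SQLite limitation described above can be tried out directly with the standard-library sqlite3 module, without SQLAlchemy. In this sketch the addresses table and its rows are assumptions modeled on the tutorial's data, and the LIKE patterns are inlined rather than bound for brevity. The parenthesized form is expected to be rejected by SQLite, so it is wrapped in a try block instead of being assumed to fail.

```python
import sqlite3

# In-memory database; the rows are assumed to match the tutorial's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE addresses (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        email_address TEXT NOT NULL
    );
    INSERT INTO addresses VALUES
        (1, 1, 'jack@yahoo.com'), (2, 1, 'jack@msn.com'),
        (3, 2, 'www@www.org'), (4, 2, 'wendy@aol.com');
""")

cols = "id, user_id, email_address"
union_sql = (
    "SELECT {c} FROM addresses WHERE email_address LIKE '%@yahoo.com' "
    "UNION "
    "SELECT {c} FROM addresses WHERE email_address LIKE '%@msn.com'"
).format(c=cols)
msn_sql = (
    "SELECT {c} FROM addresses WHERE email_address LIKE '%@msn.com'"
).format(c=cols)

# Nesting the union with bare parentheses at the start of the statement.
try:
    conn.execute("({}) EXCEPT {}".format(union_sql, msn_sql))
    parenthesized_ok = True
except sqlite3.OperationalError as exc:
    parenthesized_ok = False
    print("parenthesized form rejected:", exc)

# Stating the union as a named subquery, as .alias().select() renders it.
subquery_rows = conn.execute(
    "SELECT anon_1.id, anon_1.user_id, anon_1.email_address "
    "FROM ({}) AS anon_1 EXCEPT {}".format(union_sql, msn_sql)
).fetchall()

print(parenthesized_ok)
print(subquery_rows)
```

The subquery form returns the yahoo address only, since the msn address is removed by the EXCEPT.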
Scalar Selects
A scalar select is a SELECT that returns exactly one row and one column. It can then be used as a column expression. A scalar select is often a correlated subquery, which relies upon the enclosing SELECT statement in order to acquire at least one of its FROM clauses.
The select() construct can be modified to act as a column expression by calling either the as_scalar() or label() method:
- >>> stmt = select([func.count(addresses.c.id)]).\
- ... where(users.c.id == addresses.c.user_id).\
- ... as_scalar()
The above construct is now a ScalarSelect object, and is no longer part of the FromClause hierarchy; it instead is within the ColumnElement family of expression constructs. We can place this construct the same as any other column within another select():
- >>> conn.execute(select([users.c.name, stmt])).fetchall()
SELECT users.name, (SELECT count(addresses.id) AS count_1 FROM addresses WHERE users.id = addresses.user_id) AS anon_1 FROM users ()- [(u'jack', 2), (u'wendy', 2)]
To apply a non-anonymous column name to our scalar select, we create it using SelectBase.label() instead:
- >>> stmt = select([func.count(addresses.c.id)]).\
- ... where(users.c.id == addresses.c.user_id).\
- ... label("address_count")
- >>> conn.execute(select([users.c.name, stmt])).fetchall()
SELECT users.name, (SELECT count(addresses.id) AS count_1 FROM addresses WHERE users.id = addresses.user_id) AS address_count FROM users ()- [(u'jack', 2), (u'wendy', 2)]
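As a sketch of how the database evaluates such a statement, the rendered SQL can be run with the standard-library sqlite3 module in place of SQLAlchemy. The schema and rows are assumptions modeled on the tutorial's data. The inner SELECT names only addresses in its FROM clause and takes users.id from the enclosing row, which is what makes it a correlated scalar subquery.

```python
import sqlite3

# In-memory database; the rows are assumed to match the tutorial's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, fullname TEXT);
    CREATE TABLE addresses (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users (id),
        email_address TEXT NOT NULL
    );
    INSERT INTO users VALUES
        (1, 'jack', 'Jack Jones'), (2, 'wendy', 'Wendy Williams');
    INSERT INTO addresses VALUES
        (1, 1, 'jack@yahoo.com'), (2, 1, 'jack@msn.com'),
        (3, 2, 'www@www.org'), (4, 2, 'wendy@aol.com');
""")

# The parenthesized SELECT is evaluated once per users row and yields a
# single value, so it can sit in the columns clause like any other column.
cursor = conn.execute("""
    SELECT users.name,
           (SELECT count(addresses.id) AS count_1
            FROM addresses
            WHERE users.id = addresses.user_id) AS address_count
    FROM users
    ORDER BY users.id
""")

# The label becomes the column name reported for the result.
column_names = [description[0] for description in cursor.description]
counts = cursor.fetchall()

print(column_names)
print(counts)
```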
Correlated Subqueries
Notice in the examples on Scalar Selects, the FROM clause of each embedded select did not contain the users table. This is because SQLAlchemy automatically correlates embedded FROM objects to that of an enclosing query, if present, and if the inner SELECT statement would still have at least one FROM clause of its own. For example:
- >>> stmt = select([addresses.c.user_id]).\
- ... where(addresses.c.user_id == users.c.id).\
- ... where(addresses.c.email_address == 'jack@yahoo.com')
- >>> enclosing_stmt = select([users.c.name]).where(users.c.id == stmt)
- >>> conn.execute(enclosing_stmt).fetchall()
SELECT users.name FROM users WHERE users.id = (SELECT addresses.user_id FROM addresses WHERE addresses.user_id = users.id AND addresses.email_address = ?) ('jack@yahoo.com',)- [(u'jack',)]
Auto-correlation will usually do what’s expected, however it can also be controlled. For example, if we wanted a statement to correlate only to the addresses table but not the users table, even if both were present in the enclosing SELECT, we use the correlate() method to specify those FROM clauses that may be correlated:
- >>> stmt = select([users.c.id]).\
- ... where(users.c.id == addresses.c.user_id).\
- ... where(users.c.name == 'jack').\
- ... correlate(addresses)
- >>> enclosing_stmt = select(
- ... [users.c.name, addresses.c.email_address]).\
- ... select_from(users.join(addresses)).\
- ... where(users.c.id == stmt)
- >>> conn.execute(enclosing_stmt).fetchall()
SELECT users.name, addresses.email_address FROM users JOIN addresses ON users.id = addresses.user_id WHERE users.id = (SELECT users.id FROM users WHERE users.id = addresses.user_id AND users.name = ?) ('jack',) [(u'jack', u'jack@yahoo.com'), (u'jack', u'jack@msn.com')]
To entirely disable a statement from correlating, we can pass None
as the argument:
- >>> stmt = select([users.c.id]).\
- ... where(users.c.name == 'wendy').\
- ... correlate(None)
- >>> enclosing_stmt = select([users.c.name]).\
- ... where(users.c.id == stmt)
- >>> conn.execute(enclosing_stmt).fetchall()
SELECT users.name FROM users WHERE users.id = (SELECT users.id FROM users WHERE users.name = ?) ('wendy',)- [(u'wendy',)]
We can also control correlation via exclusion, using the Select.correlate_except() method. For example, we can write our SELECT for the users table by telling it to correlate all FROM clauses except for users:
- >>> stmt = select([users.c.id]).\
- ... where(users.c.id == addresses.c.user_id).\
- ... where(users.c.name == 'jack').\
- ... correlate_except(users)
- >>> enclosing_stmt = select(
- ... [users.c.name, addresses.c.email_address]).\
- ... select_from(users.join(addresses)).\
- ... where(users.c.id == stmt)
- >>> conn.execute(enclosing_stmt).fetchall()
SELECT users.name, addresses.email_address FROM users JOIN addresses ON users.id = addresses.user_id WHERE users.id = (SELECT users.id FROM users WHERE users.id = addresses.user_id AND users.name = ?) ('jack',) [(u'jack', u'jack@yahoo.com'), (u'jack', u'jack@msn.com')]
LATERAL correlation
LATERAL correlation is a special sub-category of SQL correlation which allows a selectable unit to refer to another selectable unit within a single FROM clause. This is an extremely special use case which, while part of the SQL standard, is only known to be supported by recent versions of PostgreSQL.
Normally, if a SELECT statement refers to table1 JOIN (some SELECT) AS subquery in its FROM clause, the subquery on the right side may not refer to the “table1” expression from the left side; correlation may only refer to a table that is part of another SELECT that entirely encloses this SELECT. The LATERAL keyword allows us to turn this behavior around, allowing an expression such as:
- SELECT people.people_id, people.age, people.name
- FROM people JOIN LATERAL (SELECT books.book_id AS book_id
- FROM books WHERE books.owner_id = people.people_id)
- AS book_subq ON true
Where above, the right side of the JOIN contains a subquery that refers not just to the “books” table but also the “people” table, correlating to the left side of the JOIN. SQLAlchemy Core supports a statement like the above using the Select.lateral() method as follows:
- >>> from sqlalchemy import table, column, select, true
- >>> people = table('people', column('people_id'), column('age'), column('name'))
- >>> books = table('books', column('book_id'), column('owner_id'))
- >>> subq = select([books.c.book_id]).\
- ... where(books.c.owner_id == people.c.people_id).lateral("book_subq")
- >>> print(select([people]).select_from(people.join(subq, true())))
- SELECT people.people_id, people.age, people.name
- FROM people JOIN LATERAL (SELECT books.book_id AS book_id
- FROM books WHERE books.owner_id = people.people_id)
- AS book_subq ON true
Above, we can see that the Select.lateral() method acts a lot like the Select.alias() method, including that we can specify an optional name. However the construct is the Lateral construct instead of an Alias, which provides for the LATERAL keyword as well as special instructions to allow correlation from inside the FROM clause of the enclosing statement.
The Select.lateral() method interacts normally with the Select.correlate() and Select.correlate_except() methods, except that the correlation rules also apply to any other tables present in the enclosing statement’s FROM clause. Correlation is “automatic” to these tables by default, is explicit if the table is specified to Select.correlate(), and is explicit to all tables except those specified to Select.correlate_except().
New in version 1.1: Support for the LATERAL keyword and lateral correlation.
Ordering, Grouping, Limiting, Offset…ing…
Ordering is done by passing column expressions to the order_by() method:
- >>> stmt = select([users.c.name]).order_by(users.c.name)
- >>> conn.execute(stmt).fetchall()
SELECT users.name FROM users ORDER BY users.name ()- [(u'jack',), (u'wendy',)]
Ascending or descending can be controlled using the asc() and desc() modifiers:
- >>> stmt = select([users.c.name]).order_by(users.c.name.desc())
- >>> conn.execute(stmt).fetchall()
SELECT users.name FROM users ORDER BY users.name DESC ()- [(u'wendy',), (u'jack',)]
Grouping refers to the GROUP BY clause, and is usually used in conjunction with aggregate functions to establish groups of rows to be aggregated. This is provided via the group_by() method:
- >>> stmt = select([users.c.name, func.count(addresses.c.id)]).\
- ... select_from(users.join(addresses)).\
- ... group_by(users.c.name)
- >>> conn.execute(stmt).fetchall()
SELECT users.name, count(addresses.id) AS count_1 FROM users JOIN addresses ON users.id = addresses.user_id GROUP BY users.name ()- [(u'jack', 2), (u'wendy', 2)]
HAVING can be used to filter results on an aggregate value, after GROUP BY has been applied. It’s available here via the having() method:
- >>> stmt = select([users.c.name, func.count(addresses.c.id)]).\
- ...     select_from(users.join(addresses)).\
- ...     group_by(users.c.name).\
- ...     having(func.length(users.c.name) > 4)
- >>> conn.execute(stmt).fetchall()
SELECT users.name, count(addresses.id) AS count_1
FROM users JOIN addresses ON users.id = addresses.user_id
GROUP BY users.name
HAVING length(users.name) > ?
(4,)
- [(u'wendy', 2)]
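The example above filters on length(users.name), which is not itself an aggregate. HAVING is more typically applied to the aggregate value. The sketch below bypasses SQLAlchemy and uses Python's standard-library sqlite3 module to show the raw SQL behavior; the table contents, including the extra user 'fred', are made up for illustration and only loosely mirror the tutorial's users and addresses tables.

```python
import sqlite3

# In-memory database with illustrative (hypothetical) data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE addresses (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        email_address TEXT
    );
    INSERT INTO users (id, name) VALUES (1, 'jack'), (2, 'wendy'), (3, 'fred');
    INSERT INTO addresses (user_id, email_address) VALUES
        (1, 'jack@yahoo.com'), (1, 'jack@msn.com'),
        (2, 'www@www.org'), (2, 'wendy@aol.com'),
        (3, 'fred@example.com');
""")

# GROUP BY forms one group per user name; HAVING then discards
# groups whose aggregate count is not greater than 1.
rows = conn.execute("""
    SELECT users.name, count(addresses.id) AS address_count
    FROM users JOIN addresses ON users.id = addresses.user_id
    GROUP BY users.name
    HAVING count(addresses.id) > 1
    ORDER BY users.name
""").fetchall()
print(rows)
```

Here 'fred' has a single address, so his group is removed after aggregation. A WHERE clause could not express this filter, since WHERE is evaluated before the groups are formed.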
A common system of dealing with duplicates in composed SELECT statements is the DISTINCT modifier. A simple DISTINCT clause can be added using the Select.distinct() method:
- >>> stmt = select([users.c.name]).\
- ...     where(addresses.c.email_address.
- ...         contains(users.c.name)).\
- ...     distinct()
- >>> conn.execute(stmt).fetchall()
SELECT DISTINCT users.name
FROM users, addresses
WHERE (addresses.email_address LIKE '%' || users.name || '%')
()
- [(u'jack',), (u'wendy',)]
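To see what DISTINCT removes in a query of this shape, the following sketch runs the same SQL with and without the modifier. It uses the standard-library sqlite3 module directly rather than SQLAlchemy, and the sample rows are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE addresses (
        id INTEGER PRIMARY KEY, user_id INTEGER, email_address TEXT
    );
    INSERT INTO users (id, name) VALUES (1, 'jack'), (2, 'wendy');
    INSERT INTO addresses (user_id, email_address) VALUES
        (1, 'jack@yahoo.com'), (1, 'jack@msn.com'),
        (2, 'www@www.org'), (2, 'wendy@aol.com');
""")

# The same statement text, with an optional DISTINCT keyword.
query = """
    SELECT {distinct} users.name
    FROM users, addresses
    WHERE addresses.email_address LIKE '%' || users.name || '%'
    ORDER BY users.name
"""

without_distinct = conn.execute(query.format(distinct="")).fetchall()
with_distinct = conn.execute(query.format(distinct="DISTINCT")).fetchall()

print(without_distinct)  # 'jack' appears once per matching address
print(with_distinct)     # one row per distinct name
```

Two addresses contain the string 'jack', so the plain query returns that name twice; DISTINCT collapses the repeated rows.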
Most database backends support a system of limiting how many rows are returned, and the majority also feature a means of starting to return rows after a given “offset”. While common backends like PostgreSQL, MySQL and SQLite support LIMIT and OFFSET keywords, other backends need to refer to more esoteric features such as “window functions” and row ids to achieve the same effect. The limit() and offset() methods provide an easy abstraction into the current backend’s methodology:
- >>> stmt = select([users.c.name, addresses.c.email_address]).\
- ...     select_from(users.join(addresses)).\
- ...     limit(1).offset(1)
- >>> conn.execute(stmt).fetchall()
SELECT users.name, addresses.email_address
FROM users JOIN addresses ON users.id = addresses.user_id
LIMIT ? OFFSET ?
(1, 1)
- [(u'jack', u'jack@msn.com')]
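LIMIT and OFFSET select rows by position, so the result is only predictable when the statement also has an ORDER BY; without one, the database may return rows in any order. The sketch below illustrates simple paging at the SQL level using the standard-library sqlite3 module. The page() helper and the sample names are hypothetical and are not part of SQLAlchemy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)",
    [("jack",), ("wendy",), ("fred",), ("mary",)],
)


def page(limit, offset):
    """Return one page of user names, ordered so paging is deterministic."""
    return conn.execute(
        "SELECT name FROM users ORDER BY name LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()


first_page = page(2, 0)   # the first two names in alphabetical order
second_page = page(2, 2)  # skip two rows, then take the next two
print(first_page, second_page)
```

In the expression language, the equivalent would be to chain order_by() ahead of limit() and offset().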
Inserts, Updates and Deletes
We’ve seen insert() demonstrated earlier in this tutorial. Where insert() produces INSERT, the update() method produces UPDATE. Both of these constructs feature a method called values() which specifies the VALUES or SET clause of the statement.
The values() method accommodates any column expression as a value:
- >>> stmt = users.update().\
- ...     values(fullname="Fullname: " + users.c.name)
- >>> conn.execute(stmt)
UPDATE users SET fullname=(? || users.name)
('Fullname: ',)
COMMIT
- <sqlalchemy.engine.result.ResultProxy object at 0x...>
When using insert() or update() in an “execute many” context, we may also want to specify named bound parameters which we can refer to in the argument list. The two constructs will automatically generate bound placeholders for any column names passed in the dictionaries sent to execute() at execution time. However, if we wish to use explicitly targeted named parameters with composed expressions, we need to use the bindparam() construct. When using bindparam() with insert() or update(), the names of the table’s columns themselves are reserved for the “automatic” generation of bind names. We can combine the usage of implicitly available bind names and explicitly named parameters as in the example below:
- >>> stmt = users.insert().\
- ...     values(name=bindparam('_name') + " .. name")
- >>> conn.execute(stmt, [
- ...     {'id':4, '_name':'name1'},
- ...     {'id':5, '_name':'name2'},
- ...     {'id':6, '_name':'name3'},
- ... ])
INSERT INTO users (id, name) VALUES (?, (? || ?))
((4, 'name1', ' .. name'), (5, 'name2', ' .. name'), (6, 'name3', ' .. name'))
COMMIT
- <sqlalchemy.engine.result.ResultProxy object at 0x...>
An UPDATE statement is emitted using the update() construct. This works much like an INSERT, except there is an additional WHERE clause that can be specified:
- >>> stmt = users.update().\
- ...     where(users.c.name == 'jack').\
- ...     values(name='ed')
- >>> conn.execute(stmt)
UPDATE users SET name=? WHERE users.name = ?
('ed', 'jack')
COMMIT
- <sqlalchemy.engine.result.ResultProxy object at 0x...>
When using update() in an “executemany” context, we may wish to also use explicitly named bound parameters in the WHERE clause. Again, bindparam() is the construct used to achieve this:
- >>> stmt = users.update().\
- ...     where(users.c.name == bindparam('oldname')).\
- ...     values(name=bindparam('newname'))
- >>> conn.execute(stmt, [
- ...     {'oldname':'jack', 'newname':'ed'},
- ...     {'oldname':'wendy', 'newname':'mary'},
- ...     {'oldname':'jim', 'newname':'jake'},
- ... ])
UPDATE users SET name=? WHERE users.name = ?
(('ed', 'jack'), ('mary', 'wendy'), ('jake', 'jim'))
COMMIT
- <sqlalchemy.engine.result.ResultProxy object at 0x...>
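For comparison, the sketch below shows roughly what an “executemany” call looks like one layer down, at the DBAPI level, using the standard-library sqlite3 module. This is not SQLAlchemy code. It uses sqlite3's named placeholder style (:oldname, :newname) so that the dictionary keys line up with the parameter names, whereas the tutorial output above shows positional ? placeholders. The sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)", [("jack",), ("wendy",)]
)

# One statement, executed once per parameter dictionary.
conn.executemany(
    "UPDATE users SET name = :newname WHERE name = :oldname",
    [
        {"oldname": "jack", "newname": "ed"},
        {"oldname": "wendy", "newname": "mary"},
        {"oldname": "jim", "newname": "jake"},  # matches no row
    ],
)

names = [row[0] for row in conn.execute("SELECT name FROM users ORDER BY id")]
print(names)
```

A parameter set that matches no row, such as 'jim' here, is not an error; that UPDATE simply affects nothing.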
Correlated Updates
A correlated update lets you update a table using selection from another table, or the same table:
- >>> stmt = select([addresses.c.email_address]).\
- ...     where(addresses.c.user_id == users.c.id).\
- ...     limit(1)
- >>> conn.execute(users.update().values(fullname=stmt))
UPDATE users SET fullname=(SELECT addresses.email_address
FROM addresses
WHERE addresses.user_id = users.id
LIMIT ? OFFSET ?)
(1, 0)
COMMIT
- <sqlalchemy.engine.result.ResultProxy object at 0x...>
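One consequence of a correlated UPDATE with no WHERE clause is that every row of the target table is updated, and rows for which the subquery finds nothing receive NULL. The sketch below demonstrates this at the SQL level with the standard-library sqlite3 module rather than SQLAlchemy. The sample data, including a user 'fred' with no address, is made up for illustration, and an ORDER BY has been added inside the subquery so that the row picked by LIMIT 1 is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, fullname TEXT);
    CREATE TABLE addresses (
        id INTEGER PRIMARY KEY, user_id INTEGER, email_address TEXT
    );
    INSERT INTO users (id, name, fullname) VALUES
        (1, 'jack', 'Jack Jones'),
        (2, 'wendy', 'Wendy Williams'),
        (3, 'fred', 'Fred Flintstone');
    INSERT INTO addresses (user_id, email_address) VALUES
        (1, 'jack@yahoo.com'), (2, 'www@www.org');
""")

# The subquery refers to users.id from the enclosing UPDATE, so it is
# evaluated once per row of users.
conn.execute("""
    UPDATE users SET fullname = (
        SELECT addresses.email_address
        FROM addresses
        WHERE addresses.user_id = users.id
        ORDER BY addresses.id
        LIMIT 1
    )
""")

rows = conn.execute("SELECT name, fullname FROM users ORDER BY id").fetchall()
print(rows)
```

The row for 'fred' ends up with a NULL fullname (None in Python) because the scalar subquery returns no rows for him. Adding a WHERE clause to the UPDATE is one way to restrict it to users that actually have an address.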
Multiple Table Updates
The PostgreSQL, Microsoft SQL Server, and MySQL backends all support UPDATE statements that refer to multiple tables. For PG and MSSQL, this is the “UPDATE FROM” syntax, which updates one table at a time, but can reference additional tables in an additional “FROM” clause that can then be referenced in the WHERE clause directly. On MySQL, multiple tables can be embedded into a single UPDATE statement separated by a comma. The SQLAlchemy update() construct supports both of these modes implicitly, by specifying multiple tables in the WHERE clause:
- stmt = users.update().\
-     values(name='ed wood').\
-     where(users.c.id == addresses.c.id).\
-     where(addresses.c.email_address.startswith('ed%'))
- conn.execute(stmt)
The resulting SQL from the above statement would render as:
- UPDATE users SET name=:name FROM addresses
- WHERE users.id = addresses.id AND
- addresses.email_address LIKE :email_address_1 || '%'
When using MySQL, columns from each table can be assigned to in the SET clause directly, using the dictionary form passed to Update.values():
- stmt = users.update().\
-     values({
-         users.c.name:'ed wood',
-         addresses.c.email_address:'ed.wood@foo.com'
-     }).\
-     where(users.c.id == addresses.c.id).\
-     where(addresses.c.email_address.startswith('ed%'))
The tables are referenced explicitly in the SET clause:
- UPDATE users, addresses SET addresses.email_address=%s,
- users.name=%s WHERE users.id = addresses.id
- AND addresses.email_address LIKE concat(%s, '%')
When the construct is used on a non-supporting database, the compiler will raise NotImplementedError. For convenience, when a statement is printed as a string without specification of a dialect, the “string SQL” compiler will be invoked which provides a non-working SQL representation of the construct.
Parameter-Ordered Updates
The default behavior of the update() construct when rendering the SET clauses is to render them using the column ordering given in the originating Table object. This is an important behavior, since it means that a particular UPDATE statement with particular columns will be rendered the same each time, which has an impact on query caching systems that rely on the form of the statement, either client side or server side. Since the parameters themselves are passed to the Update.values() method as Python dictionary keys, there is no other fixed ordering available.
However, in some cases, the order of parameters rendered in the SET clause of an UPDATE statement can be significant. The main example of this is when using MySQL and providing updates to column values based on the values of other columns. The end result of the following statement:
- UPDATE some_table SET x = y + 10, y = 20
will have a different result than:
- UPDATE some_table SET y = 20, x = y + 10
This is because on MySQL, the individual SET clauses are fully evaluated on a per-value basis, as opposed to on a per-row basis, and as each SET clause is evaluated, the values embedded in the row are changing.
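The difference between the two evaluation strategies can be modeled in a few lines of plain Python. This is only a toy simulation of the behavior described above, not database or SQLAlchemy code; the apply_set_clauses() function and the starting row values are hypothetical.

```python
def apply_set_clauses(row, assignments, sequential):
    """Apply (column, expression) pairs to a copy of ``row``.

    sequential=True models per-value evaluation: each expression sees
    the values written by the assignments before it.
    sequential=False models per-row evaluation: every expression sees
    the row as it was before the UPDATE started.
    """
    original = dict(row)
    updated = dict(row)
    for column, expression in assignments:
        source = updated if sequential else original
        updated[column] = expression(source)
    return updated


row = {"x": 1, "y": 5}

# SET x = y + 10, y = 20
x_then_y = [("x", lambda r: r["y"] + 10), ("y", lambda r: 20)]
# SET y = 20, x = y + 10
y_then_x = [("y", lambda r: 20), ("x", lambda r: r["y"] + 10)]

print(apply_set_clauses(row, x_then_y, sequential=True))   # x uses the old y
print(apply_set_clauses(row, y_then_x, sequential=True))   # x uses the new y
print(apply_set_clauses(row, y_then_x, sequential=False))  # order is irrelevant
```

Under per-value evaluation the two orderings leave x at 15 and 30 respectively, while under per-row evaluation both orderings give x = 15. This is why clause order only matters on a backend that evaluates SET clauses sequentially.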
To suit this specific use case, the preserve_parameter_order flag may be used. When using this flag, we supply a Python list of 2-tuples as the argument to the Update.values() method:
- stmt = some_table.update(preserve_parameter_order=True).\
-     values([(some_table.c.y, 20), (some_table.c.x, some_table.c.y + 10)])
The list of 2-tuples is essentially the same structure as a Python dictionary except it is ordered. Using the above form, we are assured that the “y” column’s SET clause will render first, then the “x” column’s SET clause.
New in version 1.0.10: Added support for explicit ordering of UPDATE parameters using the preserve_parameter_order flag.
See also
INSERT…ON DUPLICATE KEY UPDATE (Upsert) - background on the MySQL ON DUPLICATE KEY UPDATE clause and how to support parameter ordering.
Deletes
Finally, a delete. This is accomplished easily enough using the delete() construct:
- >>> conn.execute(addresses.delete())
DELETE FROM addresses
()
COMMIT
- <sqlalchemy.engine.result.ResultProxy object at 0x...>
- >>> conn.execute(users.delete().where(users.c.name > 'm'))
DELETE FROM users WHERE users.name > ?
('m',)
COMMIT
- <sqlalchemy.engine.result.ResultProxy object at 0x...>
Multiple Table Deletes
New in version 1.2.
The PostgreSQL, Microsoft SQL Server, and MySQL backends all support DELETE statements that refer to multiple tables within the WHERE criteria. For PG and MySQL, this is the “DELETE USING” syntax, and for SQL Server, it’s a “DELETE FROM” that refers to more than one table. The SQLAlchemy delete() construct supports both of these modes implicitly, by specifying multiple tables in the WHERE clause:
- stmt = users.delete().\
-     where(users.c.id == addresses.c.id).\
-     where(addresses.c.email_address.startswith('ed%'))
- conn.execute(stmt)
On a PostgreSQL backend, the resulting SQL from the above statement would render as:
- DELETE FROM users USING addresses
- WHERE users.id = addresses.id
- AND (addresses.email_address LIKE %(email_address_1)s || '%%')
When the construct is used on a non-supporting database, the compiler will raise NotImplementedError. For convenience, when a statement is printed as a string without specification of a dialect, the “string SQL” compiler will be invoked which provides a non-working SQL representation of the construct.
Matched Row Counts
Both update() and delete() are associated with matched row counts. This is a number indicating the number of rows that were matched by the WHERE clause. Note that by “matched”, this includes rows where no UPDATE actually took place. The value is available as rowcount:
- >>> result = conn.execute(users.delete())
DELETE FROM users
()
COMMIT
- >>> result.rowcount
- 1
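The distinction between “matched” and “changed” rows can be observed at the driver level. The sketch below uses the standard-library sqlite3 module rather than SQLAlchemy, so it only shows how SQLite reports counts through that one driver; other backends and drivers may report differently. The sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)", [("ed",), ("wendy",)]
)

# The WHERE clause matches one row, but the new value equals the old one.
cursor = conn.execute("UPDATE users SET name = 'ed' WHERE name = 'ed'")
matched_but_unchanged = cursor.rowcount

# The WHERE clause matches nothing.
cursor = conn.execute("UPDATE users SET name = 'x' WHERE name = 'nobody'")
matched_none = cursor.rowcount

# A DELETE reports the number of rows it matched and removed.
cursor = conn.execute("DELETE FROM users WHERE name = 'wendy'")
deleted = cursor.rowcount

print(matched_but_unchanged, matched_none, deleted)
```

The first UPDATE still counts its row even though the stored value did not change, which corresponds to the “matched” semantics described above.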
Further Reference
Expression Language Reference: SQL Statements and Expressions API
Database Metadata Reference: Describing Databases with MetaData
Engine Reference: Engine Configuration
Connection Reference: Working with Engines and Connections
Types Reference: Column and Data Types