Advanced features

Table inheritance

It is possible to create a table that contains all the fields from another table: it is sufficient to pass the other table to define_table in place of a field. For example:

  >>> db.define_table('person', Field('name'), Field('gender'))
  <Table person (id, name, gender)>
  >>> db.define_table('doctor', db.person, Field('specialization'))
  <Table doctor (id, name, gender, specialization)>

It is also possible to define a dummy table that is not stored in a database in order to reuse it in multiple other places. For example:

  signature = db.Table(db, 'signature',
                       Field('is_active', 'boolean', default=True),
                       Field('created_on', 'datetime', default=request.now),
                       Field('created_by', db.auth_user, default=auth.user_id),
                       Field('modified_on', 'datetime', update=request.now),
                       Field('modified_by', db.auth_user, update=auth.user_id))
  db.define_table('payment', Field('amount', 'double'), signature)

This example assumes that standard web2py authentication is enabled.

Notice that if you use Auth, web2py already creates one such table for you:

  auth = Auth(db)
  db.define_table('payment', Field('amount', 'double'), auth.signature)

When using table inheritance, if you want the inheriting table to inherit validators, be sure to define the validators of the parent table before defining the inheriting table.

filter_in and filter_out

It is possible to define a filter for each field to be called before a value is inserted into the database for that field and after a value is retrieved from the database.

Imagine, for example, that you want to store a serializable Python data structure in a field in JSON format. Here is how it could be accomplished:

  >>> import json
  >>> db.define_table('anyobj',
  ...                 Field('name'),
  ...                 Field('data', 'text'))
  <Table anyobj (id, name, data)>
  >>> db.anyobj.data.filter_in = lambda obj: json.dumps(obj)
  >>> db.anyobj.data.filter_out = lambda txt: json.loads(txt)
  >>> myobj = ['hello', 'world', 1, {2: 3}]
  >>> aid = db.anyobj.insert(name='myobjname', data=myobj)
  >>> row = db.anyobj[aid]
  >>> row.data
  ['hello', 'world', 1, {'2': 3}]
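The round-trip can be pictured in plain Python (a hypothetical stand-in for the DAL machinery, not web2py code — FilteredField and its methods are illustrative names):

```python
import json

# Hypothetical stand-in for a DAL field: filter_in runs before storage,
# filter_out runs after retrieval.
class FilteredField:
    def __init__(self, filter_in=None, filter_out=None):
        self.filter_in = filter_in
        self.filter_out = filter_out
        self._stored = None  # what would actually be written to the database

    def store(self, value):
        self._stored = self.filter_in(value) if self.filter_in else value

    def retrieve(self):
        return self.filter_out(self._stored) if self.filter_out else self._stored

data = FilteredField(filter_in=json.dumps, filter_out=json.loads)
data.store(['hello', 'world', 1, {2: 3}])
# data._stored now holds the JSON text written to the database;
# data.retrieve() decodes it back (note the integer key 2 became the string '2')
```

This also explains the output above: JSON object keys are always strings, so the round-trip turns {2: 3} into {'2': 3}.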

Another way to accomplish the same is to use a field of type SQLCustomType, as discussed in the Custom Field types section below.

callbacks on record insert, delete and update

Web2py provides a mechanism to register callbacks to be called before and/or after insert, update and delete of records.

Each table stores six lists of callbacks:

  db.mytable._before_insert
  db.mytable._after_insert
  db.mytable._before_update
  db.mytable._after_update
  db.mytable._before_delete
  db.mytable._after_delete

You can register a callback function by appending it to the corresponding list. The caveat is that the callbacks have different signatures depending on the operation.

This is best explained via some examples.

  >>> db.define_table('person', Field('name'))
  <Table person (id, name)>
  >>> def pprint(callback, *args):
  ...     print "%s%s" % (callback, args)
  ...
  >>> db.person._before_insert.append(lambda f: pprint('before_insert', f))
  >>> db.person._after_insert.append(lambda f, i: pprint('after_insert', f, i))
  >>> db.person.insert(name='John')
  before_insert(<OpRow {'name': 'John'}>,)
  after_insert(<OpRow {'name': 'John'}>, 1L)
  1L
  >>> db.person._before_update.append(lambda s, f: pprint('before_update', s, f))
  >>> db.person._after_update.append(lambda s, f: pprint('after_update', s, f))
  >>> db(db.person.id == 1).update(name='Tim')
  before_update(<Set ("person"."id" = 1)>, <OpRow {'name': 'Tim'}>)
  after_update(<Set ("person"."id" = 1)>, <OpRow {'name': 'Tim'}>)
  1
  >>> db.person._before_delete.append(lambda s: pprint('before_delete', s))
  >>> db.person._after_delete.append(lambda s: pprint('after_delete', s))
  >>> db(db.person.id == 1).delete()
  before_delete(<Set ("person"."id" = 1)>,)
  after_delete(<Set ("person"."id" = 1)>,)
  1

As you can see:

  • f gets passed the OpRow object with data for insert or update.
  • i gets passed the id of the newly inserted record.
  • s gets passed the Set object used for update or delete.

OpRow is a helper object specialized in storing (field, value) pairs; you can think of it as a dictionary that also supports attribute notation (that is, f.name and f['name'] are equivalent).
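That dict-with-attribute-access behaviour can be pictured with a small stand-in class (illustrative only, not web2py's implementation):

```python
# Illustrative stand-in for OpRow-like behaviour: a dict whose keys are
# also reachable as attributes.
class AttrDict(dict):
    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)

f = AttrDict(name='John')
# f.name and f['name'] are equivalent ways to read the same value
```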

The return value of these callbacks should be None or False. If any of the _before_* callbacks returns a True value, the actual insert/update/delete operation is aborted.
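The abort rule can be sketched with a toy dispatcher (assuming nothing about the DAL's internals; names are illustrative):

```python
# Toy sketch of the _before_*/_after_* contract: the operation runs only
# if no before-callback returns a truthy value; after-callbacks also
# receive the operation's result (e.g. the new record id for an insert).
def run_with_callbacks(before, after, operation, row):
    for callback in before:
        if callback(row):       # a truthy return value aborts the operation
            return None
    result = operation(row)
    for callback in after:
        callback(row, result)
    return result

insert = lambda row: 1          # a pretend insert that returns record id 1

run_with_callbacks([lambda row: False], [], insert, {'name': 'John'})  # -> 1
run_with_callbacks([lambda row: True], [], insert, {'name': 'John'})   # aborted -> None
```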

Sometimes a callback may need to perform an update in the same or a different table, and one wants to avoid firing other callbacks, which could cause an infinite loop.

For this purpose, Set objects have an update_naive method that works like update but ignores before and after callbacks.

Database cascades

A database schema can define relationships which trigger deletions of related records, known as cascading. The DAL is not informed when a record is deleted because of a cascade, so no *_delete callback will ever be called as a consequence of a cascade-deletion.

Record versioning

It is possible to ask web2py to save a copy of every version of a record when the record is individually modified. There are different ways to do it. It can be done for all tables at once using the syntax:

  auth.enable_record_versioning(db)

This requires Auth and is discussed in Chapter 9. Versioning can also be enabled for each individual table, as discussed below.

Consider the following table:

  db.define_table('stored_item',
                  Field('name'),
                  Field('quantity', 'integer'),
                  Field('is_active', 'boolean',
                        writable=False, readable=False, default=True))

Notice the hidden boolean field called is_active, which defaults to True.

We can tell web2py to create a new table (in the same or a different database) and store in it all previous versions of each record when they are modified.

This is done in the following way:

  db.stored_item._enable_record_versioning()

or in a more verbose syntax:

  db.stored_item._enable_record_versioning(archive_db=db,
                                           archive_name='stored_item_archive',
                                           current_record='current_record',
                                           is_active='is_active')

The archive_db=db tells web2py to store the archive table in the same database as the stored_item table. The archive_name sets the name for the archive table. The archive table has the same fields as the original table stored_item, except that unique fields are no longer unique (because it needs to store multiple versions), and it has an extra field, whose name is specified by current_record, which is a reference to the current record in the stored_item table.

When records are deleted, they are not really deleted. A deleted record is copied into the stored_item_archive table (as when it is modified) and the is_active field is set to False. By enabling record versioning, web2py sets a common_filter on this table that hides all records in table stored_item where the is_active field is set to False. The is_active parameter of the _enable_record_versioning method allows you to specify the name of the field used by the common_filter to determine whether the record has been deleted.

common_filters will be discussed in the Common filters section below.

Common fields and multi-tenancy

db._common_fields is a list of fields that should belong to all tables. This list can also contain tables, in which case all fields of the table are included.

For example, occasionally you may need to add a signature to all your tables except the Auth tables. In this case, after auth.define_tables() but before defining any other table, insert:

  db._common_fields.append(auth.signature)

One field name is special: request_tenant (you can set a different name via db._request_tenant). This field does not exist by default, but you can create it and add it to any of your tables (or all of them):

  db._common_fields.append(Field('request_tenant',
                                 default=request.env.http_host,
                                 writable=False))

For every table with such a field, all records for all queries are always automatically filtered by:

  db.table.request_tenant == db.table.request_tenant.default

and for every record inserted, this field is set to the default value. In the example above we have chosen:

  default = request.env.http_host

this means we have chosen to ask our app to filter all tables in all queries with:

  db.table.request_tenant == request.env.http_host

This simple trick allows us to turn any application into a multi-tenant application. Even though we run one instance of the application and use one single database, when the application is accessed under two or more domains the visitors will see different data depending on the domain (in the example the domain name is retrieved from request.env.http_host).
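The effect can be pictured with an in-memory sketch (hypothetical data and function names, not DAL code):

```python
# Each record carries the tenant it belongs to; every query is implicitly
# restricted to the current tenant, here identified by the host name.
records = [
    {'id': 1, 'body': 'hello', 'request_tenant': 'www.example.com'},
    {'id': 2, 'body': 'hi',    'request_tenant': 'www.other.org'},
]

def select_for(host):
    # the DAL applies the equivalent of this filter to every query
    return [r for r in records if r['request_tenant'] == host]

select_for('www.example.com')  # visitors of example.com see only record 1
select_for('www.other.org')    # visitors of other.org see only record 2
```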

You can turn off multi-tenancy filters by passing ignore_common_filters=True at Set creation time:

  db(query, ignore_common_filters=True)

Common filters

A common filter is a generalization of the above multi-tenancy idea. It provides an easy way to prevent repetition of the same query. Consider for example the following table:

  db.define_table('blog_post',
                  Field('subject'),
                  Field('post_text', 'text'),
                  Field('is_public', 'boolean'),
                  common_filter = lambda query: db.blog_post.is_public == True)

Any select, delete or update on this table will include only public blog posts. The attribute can also be modified at runtime:

  db.blog_post._common_filter = lambda query: ...

It serves both as a way to avoid repeating the "db.blog_post.is_public == True" phrase in each blog post search, and as a security enhancement that prevents you from forgetting to disallow viewing of non-public posts.

In case you actually do want items left out by the common filter (for example, allowing the admin to see non-public posts), you can either remove the filter:

  db.blog_post._common_filter = None

or ignore it:

  db(query, ignore_common_filters=True)

Note that common_filters are ignored by the appadmin interface.

Custom Field types

Aside from using filter_in and filter_out, it is possible to define new/custom field types. For example, suppose that you want to define a custom type to store an IP address:

  >>> def ip2int(sv):
  ...     "Convert an IPV4 address to an integer."
  ...     sp = sv.split('.'); assert len(sp) == 4  # IPV4 only
  ...     iip = 0
  ...     for i in map(int, sp): iip = (iip << 8) + i
  ...     return iip
  ...
  >>> def int2ip(iv):
  ...     "Convert an integer to an IPV4 address."
  ...     assert iv > 0
  ...     iv = (iv,); ov = []
  ...     for i in range(3):
  ...         iv = divmod(iv[0], 256)
  ...         ov.insert(0, iv[1])
  ...     ov.insert(0, iv[0])
  ...     return '.'.join(map(str, ov))
  ...
  >>> from gluon.dal import SQLCustomType
  >>> ipv4 = SQLCustomType(type='string', native='integer',
  ...                      encoder=lambda x: str(ip2int(x)), decoder=int2ip)
  >>> db.define_table('website',
  ...                 Field('name'),
  ...                 Field('ipaddr', type=ipv4))
  <Table website (id, name, ipaddr)>
  >>> db.website.insert(name='wikipedia', ipaddr='91.198.174.192')
  1
  >>> db.website.insert(name='google', ipaddr='172.217.11.174')
  2
  >>> db.website.insert(name='youtube', ipaddr='74.125.65.91')
  3
  >>> db.website.insert(name='github', ipaddr='207.97.227.239')
  4
  >>> rows = db(db.website.ipaddr > '100.0.0.0').select(orderby=~db.website.ipaddr)
  >>> for row in rows:
  ...     print row.name, row.ipaddr
  ...
  github 207.97.227.239
  google 172.217.11.174
SQLCustomType is a field type factory. Its type argument must be one of the standard web2py types. It tells web2py how to treat the field values at the web2py level. native is the type of the field as far as the database is concerned. Allowed names depend on the database engine. encoder is an optional transformation function applied when the data is stored and decoder is the optional reverse transformation function.
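For reference, an equivalent encoder/decoder pair can be written with the standard library's ipaddress module (a sketch independent of web2py, for Python 3):

```python
import ipaddress

def ip2int(s):
    # dotted-quad string -> integer (sortable and comparable in the database)
    return int(ipaddress.IPv4Address(s))

def int2ip(i):
    # integer -> dotted-quad string
    return str(ipaddress.IPv4Address(i))

ip2int('0.0.1.0')                  # -> 256
int2ip(ip2int('91.198.174.192'))   # round-trips back to the original string
```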

This feature is marked as experimental. In practice it has been in web2py for a long time and it works, but it can make the code non-portable, for example when the native type is database-specific.

It does not work on Google App Engine NoSQL.

Using the DAL without defining tables

The DAL can be used from any Python program simply by doing this:

  from gluon import DAL
  db = DAL('sqlite://storage.sqlite', folder='path/to/app/databases')

i.e. import the DAL, connect and specify the folder which contains the .table files (the app/databases folder).

To access the data and its attributes we still have to define all the tables we are going to access with db.define_table.

If we just need access to the data but not to the web2py table attributes, we can get away without re-defining the tables by simply asking web2py to read the necessary info from the metadata in the .table files:

  from gluon import DAL
  db = DAL('sqlite://storage.sqlite', folder='path/to/app/databases', auto_import=True)

This allows us to access any db.table without needing to re-define it.

PostGIS, SpatiaLite, and MS Geo (experimental)

The DAL supports geographical APIs using PostGIS (for PostgreSQL), SpatiaLite (for SQLite), and the spatial extensions of MS SQL. This feature was sponsored by the Sahana project and implemented by Denes Lengyel.

The DAL provides geometry and geography field types and the following functions:

  st_asgeojson (PostGIS only)
  st_astext
  st_contains
  st_distance
  st_equals
  st_intersects
  st_overlaps
  st_simplify (PostGIS only)
  st_touches
  st_within
  st_x
  st_y

Here are some examples:

  >>> from gluon.dal import DAL, Field, geoPoint, geoLine, geoPolygon
  >>> db = DAL("mssql://user:pass@host/db")
  >>> sp = db.define_table('spatial', Field('loc', 'geometry()'))

Below we insert a point, a line, and a polygon:

  >>> sp.insert(loc=geoPoint(1, 2))
  1
  >>> sp.insert(loc=geoLine((100, 100), (20, 180), (180, 180)))
  2
  >>> sp.insert(loc=geoPolygon((0, 0), (150, 0), (150, 150), (0, 150), (0, 0)))
  3

Notice that

  rows = db(sp).select()

always returns the geometry data serialized as text. You can also do the same more explicitly using st_astext():

  >>> print db(sp).select(sp.id, sp.loc.st_astext())
  spatial.id,spatial.loc.STAsText()
  1,"POINT (1 2)"
  2,"LINESTRING (100 100, 20 180, 180 180)"
  3,"POLYGON ((0 0, 150 0, 150 150, 0 150, 0 0))"

You can ask for the native representation by using st_asgeojson() (in PostGIS only):

  >>> print db(sp).select(sp.id, sp.loc.st_asgeojson().with_alias('loc'))
  spatial.id,loc
  1,[1, 2]
  2,[[100, 100], [20, 180], [180, 180]]
  3,[[[0, 0], [150, 0], [150, 150], [0, 150], [0, 0]]]

(Notice that an array is a point, an array of arrays is a line, and an array of arrays of arrays is a polygon.)

Here are examples of how to use the geographical functions:

  >>> query = sp.loc.st_intersects(geoLine((20, 120), (60, 160)))
  >>> query = sp.loc.st_overlaps(geoPolygon((1, 1), (11, 1), (11, 11), (1, 11), (1, 1)))
  >>> query = sp.loc.st_contains(geoPoint(1, 1))
  >>> print db(query).select(sp.id, sp.loc)
  spatial.id,spatial.loc
  3,"POLYGON ((0 0, 150 0, 150 150, 0 150, 0 0))"

Computed distances can also be retrieved as floating point numbers:

  >>> dist = sp.loc.st_distance(geoPoint(-1, 2)).with_alias('dist')
  >>> print db(sp).select(sp.id, dist)
  spatial.id,dist
  1,2.0
  2,140.714249456
  3,1.0
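As a sanity check, the first distance can be reproduced with plain math, assuming st_distance between two points reduces to Euclidean distance:

```python
import math

def euclidean(p, q):
    # straight-line distance between two 2D points
    return math.hypot(p[0] - q[0], p[1] - q[1])

# the stored point is (1, 2) and the query point is (-1, 2)
euclidean((1, 2), (-1, 2))  # -> 2.0, matching the dist reported for spatial.id 1
```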

Copy data from one db into another

Consider the situation in which you have been using the following database:

  db = DAL('sqlite://storage.sqlite')

and you wish to move to another database using a different connection string:

  db = DAL('postgres://username:password@localhost/mydb')

Before you switch, you want to move the data and rebuild all the metadata for the new database. We assume the new database exists and is empty.

Web2py provides a script that does this work for you:

  cd web2py
  python scripts/cpdb.py \
     -f applications/app/databases \
     -y 'sqlite://storage.sqlite' \
     -Y 'postgres://username:password@localhost/mydb' \
     -d ../gluon

After running the script you can simply switch the connection string in the model and everything should work out of the box. The new data should be there.

This script provides various command line options that allow you to move data from one application to another, move all tables or only some, and clear the data in the tables. For more info try:

  python scripts/cpdb.py -h

Note on new DAL and adapters

The source code of the Database Abstraction Layer was completely rewritten in 2010. While it stays backward compatible, the rewrite made it more modular and easier to extend. Here we explain the main logic.

The file “gluon/dal.py” defines, among others, the following classes:

  ConnectionPool
  BaseAdapter extends ConnectionPool
  Row
  DAL
  Reference
  Table
  Expression
  Field
  Query
  Set
  Rows

Their use has been explained in the previous sections, except for BaseAdapter. When the methods of a Table or Set object need to communicate with the database, they delegate to adapter methods the task of generating the SQL and/or the function call.

For example:

  db.mytable.insert(myfield='myvalue')

calls

  Table.insert(myfield='myvalue')

which delegates to the adapter by calling:

  db._adapter.insert(db.mytable, db.mytable._listify(dict(myfield='myvalue')))

Here db.mytable._listify converts the dict of arguments into a list of (field, value) pairs and calls the insert method of the adapter, which does more or less the following:

  query = db._adapter._insert(db.mytable, list_of_fields)
  db._adapter.execute(query)

where the first line builds the query and the second executes it.
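The delegation can be sketched with toy classes (illustrative names and simplified SQL generation, not the real DAL):

```python
# The table layer turns keyword arguments into (field, value) pairs;
# the adapter layer turns those pairs into SQL text.
class ToyAdapter:
    def _insert(self, table, items):
        fields = ', '.join(field for field, _ in items)
        values = ', '.join(repr(value) for _, value in items)
        return "INSERT INTO %s(%s) VALUES (%s);" % (table, fields, values)

class ToyTable:
    def __init__(self, name, adapter):
        self.name, self.adapter = name, adapter

    def _listify(self, kwargs):
        return sorted(kwargs.items())  # dict -> list of (field, value) pairs

    def insert(self, **kwargs):
        query = self.adapter._insert(self.name, self._listify(kwargs))
        return query  # a real adapter would execute the query here

ToyTable('mytable', ToyAdapter()).insert(myfield='myvalue')
# -> "INSERT INTO mytable(myfield) VALUES ('myvalue');"
```

Swapping in a different adapter class changes the SQL dialect without touching the table layer, which is the point of the design.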

BaseAdapter defines the interface for all adapters.

At the time of writing this book, “gluon/dal.py” contains the following adapters:

  SQLiteAdapter extends BaseAdapter
  JDBCSQLiteAdapter extends SQLiteAdapter
  MySQLAdapter extends BaseAdapter
  PostgreSQLAdapter extends BaseAdapter
  JDBCPostgreSQLAdapter extends PostgreSQLAdapter
  OracleAdapter extends BaseAdapter
  MSSQLAdapter extends BaseAdapter
  MSSQL2Adapter extends MSSQLAdapter
  MSSQL3Adapter extends MSSQLAdapter
  MSSQL4Adapter extends MSSQLAdapter
  FireBirdAdapter extends BaseAdapter
  FireBirdEmbeddedAdapter extends FireBirdAdapter
  InformixAdapter extends BaseAdapter
  DB2Adapter extends BaseAdapter
  IngresAdapter extends BaseAdapter
  IngresUnicodeAdapter extends IngresAdapter
  GoogleSQLAdapter extends MySQLAdapter
  NoSQLAdapter extends BaseAdapter
  GoogleDatastoreAdapter extends NoSQLAdapter
  CubridAdapter extends MySQLAdapter (experimental)
  TeradataAdapter extends DB2Adapter (experimental)
  SAPDBAdapter extends BaseAdapter (experimental)
  CouchDBAdapter extends NoSQLAdapter (experimental)
  IMAPAdapter extends NoSQLAdapter (experimental)
  MongoDBAdapter extends NoSQLAdapter (experimental)
  VerticaAdapter extends MSSQLAdapter (experimental)
  SybaseAdapter extends MSSQLAdapter (experimental)

which override the behavior of the BaseAdapter.

Each adapter has more or less this structure:

  class MySQLAdapter(BaseAdapter):

      # specify a driver to use
      driver = globals().get('pymysql', None)

      # map web2py types into database types
      types = {
          'boolean': 'CHAR(1)',
          'string': 'VARCHAR(%(length)s)',
          'text': 'LONGTEXT',
          ...
          }

      # connect to the database using driver
      def __init__(self, db, uri, pool_size=0, folder=None, db_codec='UTF-8',
                   credential_decoder=lambda x: x, driver_args={},
                   adapter_args={}):
          # parse uri string and store parameters in driver_args
          ...
          # define a connection function
          def connect(driver_args=driver_args):
              return self.driver.connect(**driver_args)
          # place it in the pool
          self.pool_connection(connect)
          # set optional parameters (after connection)
          self.execute('SET FOREIGN_KEY_CHECKS=1;')
          self.execute("SET sql_mode='NO_BACKSLASH_ESCAPES';")

      # override BaseAdapter methods as needed
      def lastrowid(self, table):
          self.execute('select last_insert_id();')
          return int(self.cursor.fetchone()[0])

Looking at the various adapters as examples, it should be easy to write new ones.

When a db instance is created:

  db = DAL('mysql://...')

the prefix in the uri string defines the adapter. The mapping is defined in the following dictionary (also in “gluon/dal.py”):

  ADAPTERS = {
      'sqlite': SQLiteAdapter,
      'spatialite': SpatiaLiteAdapter,
      'sqlite:memory': SQLiteAdapter,
      'spatialite:memory': SpatiaLiteAdapter,
      'mysql': MySQLAdapter,
      'postgres': PostgreSQLAdapter,
      'postgres:psycopg2': PostgreSQLAdapter,
      'postgres2:psycopg2': NewPostgreSQLAdapter,
      'oracle': OracleAdapter,
      'mssql': MSSQLAdapter,
      'mssql2': MSSQL2Adapter,
      'mssql3': MSSQL3Adapter,
      'mssql4': MSSQL4Adapter,
      'vertica': VerticaAdapter,
      'sybase': SybaseAdapter,
      'db2': DB2Adapter,
      'teradata': TeradataAdapter,
      'informix': InformixAdapter,
      'informix-se': InformixSEAdapter,
      'firebird': FireBirdAdapter,
      'firebird_embedded': FireBirdAdapter,
      'ingres': IngresAdapter,
      'ingresu': IngresUnicodeAdapter,
      'sapdb': SAPDBAdapter,
      'cubrid': CubridAdapter,
      'jdbc:sqlite': JDBCSQLiteAdapter,
      'jdbc:sqlite:memory': JDBCSQLiteAdapter,
      'jdbc:postgres': JDBCPostgreSQLAdapter,
      'gae': GoogleDatastoreAdapter,  # discouraged, for backward compatibility
      'google:datastore': GoogleDatastoreAdapter,
      'google:datastore+ndb': GoogleDatastoreAdapter,
      'google:sql': GoogleSQLAdapter,
      'couchdb': CouchDBAdapter,
      'mongodb': MongoDBAdapter,
      'imap': IMAPAdapter
  }

The uri string is then parsed in more detail by the adapter itself.

For any adapter you can replace the driver with a different one:

  import MySQLdb as mysqldb
  from gluon.dal import MySQLAdapter
  MySQLAdapter.driver = mysqldb

i.e., mysqldb has to be a module with a .connect() method. You can specify optional driver arguments and adapter arguments:

  db = DAL(..., driver_args={}, adapter_args={})