Deploying on Google App Engine

It is possible to run web2py code on Google App Engine (GAE) [gae], including DAL code.

GAE supports two versions of Python, 2.5 and 2.7, but web2py requires 2.7. See the “app.yaml” file described below for configuration details.

GAE also supports a Google SQL database (compatible with MySQL) and a Google NoSQL database (referred to as “Datastore”).

web2py supports both, and indeed, can connect to both at the same time, using the connection strings detailed in Chapter 6.

The GAE platform provides several advantages over normal hosting solutions:

  • Ease of deployment. Google completely abstracts the underlying architecture.
  • Scalability. Google will replicate your app as many times as it takes to serve all concurrent requests.
  • One can choose between a SQL and a NoSQL database (or both together).

But also some disadvantages:

  • No read or write access to the file system.
  • Not all Python libraries are supported (you can deploy any pure Python library but not binary ones, although PIL and numpy are already installed).

While Google Cloud SQL is a regular MySQL database, Google Datastore has some specific disadvantages:

  • No typical transactions; eventual consistency rather than strong consistency for queries.
  • No complex datastore queries. In particular there are no JOIN, LIKE, and DATE/DATETIME operators.

Here we provide a quick overview of GAE and focus on web2py-specific issues; we refer you to the official GAE documentation online for details.

Attention: You must run the web2py source distribution, not a binary distribution.

Configuration

There are three configuration files to be aware of:

  1. web2py/app.yaml
  2. web2py/queue.yaml
  3. web2py/index.yaml

app.yaml and queue.yaml are most easily created by using the template files app.example.yaml and queue.example.yaml as starting points. index.yaml is created automatically by the Google deployment software.

app.yaml has the following structure (it has been shortened using …):

  application: web2py
  version: 1
  api_version: 1
  runtime: python
  handlers:
  - url: /_ah/stats.*
    ...
  - url: /(?P<a>.+?)/static/(?P<b>.+)
    ...
  - url: /_ah/admin/.*
    ...
  - url: /_ah/queue/default
    ...
  - url: .*
    ...
  skip_files:
  ...

app.example.yaml (when copied to app.yaml) is configured to deploy the web2py welcome application, but not the admin or example applications. You must replace web2py with the application id that you used when registering with Google App Engine.

url: /(.+?)/static/(.+) instructs GAE to serve your app static files directly, without calling web2py logic, for speed.

url: .* instructs GAE to pass every other request to gaehandler.py, which hands it to web2py.

The skip_files: section is a list of regular expressions for files that do not need to be deployed on GAE. In particular the lines:

  (applications/(admin|examples)/.*)|
  ((admin|examples|welcome).(w2p|tar))|

tell GAE not to deploy the default applications, except for the unpacked welcome scaffolding application. You can add more applications to be ignored here.
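To see which paths the two patterns quoted above would exclude, they can be tried out locally with Python's re module. This is only a sketch: the is_skipped helper and the sample paths are made up for illustration, and anchoring the combined pattern with ^ and $ is an assumption about how the patterns are applied, not something taken from the GAE documentation.

```python
import re

# The two alternatives quoted above, joined into a single pattern.
# The trailing "|" of the last line is dropped because no further
# alternatives follow in this sketch. The ^...$ anchoring is an assumption.
SKIP = re.compile(
    r"^("
    r"(applications/(admin|examples)/.*)|"
    r"((admin|examples|welcome).(w2p|tar))"
    r")$"
)

def is_skipped(path):
    """Return True if the path (relative to the web2py folder) matches."""
    return SKIP.match(path) is not None

# Hypothetical sample paths, relative to the web2py folder.
samples = [
    "applications/admin/controllers/default.py",
    "applications/welcome/models/db.py",
    "welcome.w2p",
    "examples.tar",
]
for path in samples:
    print(path, is_skipped(path))
```

Under these assumptions the unpacked welcome application is the only sample that is kept, which agrees with the description above.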

Except for the application id and version, you probably do not need to edit app.yaml, though you may wish to exclude the welcome application.

The file queue.yaml is used to configure GAE task queues.

The file index.yaml is automatically generated when you run your application locally using the GAE appserver (the web server that comes with the Google SDK). It contains something like this:

  indexes:
  - kind: person
    properties:
    - name: name
      direction: desc

In this example it tells GAE to create an index for table “person” that will be used to sort by “name” in reverse alphabetical order. You will not be able to search and sort records in your app without corresponding indexes.

It is important to always run your apps locally with the appserver and exercise every feature of your app before deployment. This matters for testing, and also because it automatically generates the “index.yaml” file. Occasionally you may want to edit this file and perform cleanup, such as removing duplicate entries.
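The cleanup of duplicate entries can be pictured with a few lines of Python. This is a sketch under stated assumptions: the index entries are written by hand as dictionaries rather than parsed from “index.yaml”, the helper names index_key and dedupe_indexes are hypothetical, and a missing direction is treated as ascending.

```python
def index_key(index):
    """Build a hashable key from one index entry: kind plus ordered properties."""
    props = tuple(
        # Assumption: an entry without an explicit direction sorts ascending.
        (p["name"], p.get("direction", "asc"))
        for p in index.get("properties", [])
    )
    return (index["kind"], props)

def dedupe_indexes(indexes):
    """Return the entries with duplicates removed, keeping the first of each."""
    seen = set()
    unique = []
    for index in indexes:
        key = index_key(index)
        if key not in seen:
            seen.add(key)
            unique.append(index)
    return unique

# Hand-written entries mirroring the index.yaml example above.
indexes = [
    {"kind": "person", "properties": [{"name": "name", "direction": "desc"}]},
    {"kind": "person", "properties": [{"name": "name", "direction": "desc"}]},
    {"kind": "person", "properties": [{"name": "name"}]},
]
cleaned = dedupe_indexes(indexes)
print(len(cleaned))
```

The order of properties is kept as part of the key because two indexes with the same properties in a different order are not the same index.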

Running and deployment

Linux

Here we assume you have installed the GAE SDK. At the time of writing, web2py for GAE requires Python 2.7. You can run your app from inside the “web2py” folder by using the appserver command:

  python dev_appserver.py ../web2py

This will start the appserver and you can run your application at the URL:

  http://127.0.0.1:8080/

In order to upload your app on GAE, make sure you have edited the “app.yaml” file as explained before and set the proper application id, then run:

  python appcfg.py update ../web2py

Mac, Windows

On Mac and Windows, you can also use the Google App Engine Launcher. You can download the software from ref. [gae].

Choose [File][Add Existing Application], set the path to the path of the top-level web2py folder, and press the [Run] button in the toolbar. After you have tested that it works locally, you can deploy it on GAE by simply clicking on the [Deploy] button on the toolbar (assuming you have an account).

On GAE, the web2py tickets/errors are also logged into the GAE administration console where logs can be accessed and searched online.

Configuring the handler

The file gaehandler.py is responsible for serving files on GAE and it has a few options. Here are their default values:

  LOG_STATS = False
  APPSTATS = True
  DEBUG = False

LOG_STATS will log the time to serve pages in the GAE logs.

APPSTATS will enable GAE appstats which provides profiling statistics. They will be made available at the URL:

  http://localhost:8080/_ah/stats

DEBUG sets debug mode. It makes no difference in practice unless checked explicitly in your code via gluon.settings.web2py_runtime.

Avoid the filesystem

On GAE you have no access to the filesystem. You cannot open any file for writing.

For this reason, on GAE, web2py automatically stores all uploaded files in the datastore, whether or not “upload” Field(s) have an uploadfield attribute.

You should also store sessions and tickets in the database, and you have to be explicit:

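  if request.env.web2py_runtime_gae:
      # running on GAE: use the datastore and keep sessions there
      db = DAL('gae')
      session.connect(request, response, db)
  else:
      # running elsewhere: fall back to a local sqlite file
      db = DAL('sqlite://storage.sqlite')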

The above code checks whether you are running on GAE, connects to BigTable, and instructs web2py to store sessions and tickets in there. It connects to a sqlite database otherwise. This code is already in the scaffolding app in the file “db.py”.

Memcache

If you prefer, you can store sessions in memcache too:

  from gluon.contrib.gae_memcache import MemcacheClient
  from gluon.contrib.memdb import MEMDB
  cache.memcache = MemcacheClient(request)
  cache.ram = cache.disk = cache.memcache
  session.connect(request, response, db=MEMDB(cache.memcache.client))

Notice that on GAE cache.ram and cache.disk should not be used, so we make them point to cache.memcache.

Datastore issues

While Google Cloud SQL functions as a regular SQL database, and at the time of writing is indeed based on MySQL, Google Datastore presents significant differences.

Lack of JOINs

The lack of JOIN operations and typical relational functionality of the Datastore requires removing JOINs from web2py queries and de-normalizing the database.

Google App Engine supports some special field types, such as ListProperty and StringListProperty. You can use these types with web2py using the following old syntax:

  from gluon.dal import gae
  db.define_table('product',
      Field('name'),
      Field('tags', type=gae.StringListProperty()))

or the equivalent new syntax:

  db.define_table('product',
      Field('name'),
      Field('tags', 'list:string'))

In both cases the “tags” field is a StringListProperty, so its values must be lists of strings, as described in the GAE documentation. The second notation is preferred because web2py will treat the field in a smarter way in the context of forms and because it will work with relational databases too.

Similarly, web2py supports list:integer and list:reference, which map to a ListProperty(int).

list types are discussed in more detail in Chapter 6.
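The idea behind de-normalizing with a list field can be illustrated without any database. The sketch below uses plain Python dictionaries as a stand-in for “product” records; it does not call web2py or GAE, and the products_with_tag helper and the sample data are hypothetical. Each record carries its own list of tags, so finding the products for a tag needs no JOIN against a separate tag table.

```python
# Plain-Python stand-in for de-normalized "product" records.
# Each record stores its tags directly as a list of strings.
products = [
    {"id": 1, "name": "shirt", "tags": ["red", "cotton"]},
    {"id": 2, "name": "scarf", "tags": ["red", "wool"]},
    {"id": 3, "name": "hat", "tags": ["blue"]},
]

def products_with_tag(records, tag):
    """Return the names of records whose 'tags' list contains the given tag."""
    return [r["name"] for r in records if tag in r["tags"]]

print(products_with_tag(products, "red"))
```

The trade-off is that renaming a tag means updating every record that carries it, which is the usual cost of de-normalization.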

Database migrations

A good practice for migrations using Google AppEngine is the following. AppEngine supports multiple code versions. Use one code version (e.g., version 1) for the user-visible site, and another code version (e.g., version 2) for the admin code. In app.yaml for version 2, declare the handler as follows (assuming Python 2.7 is used):

  - url: .*
    script: gaehandler.wsgiapp    # WSGI (Python 2.7 only)
    secure: optional
    login: admin

The login: admin clause ensures that only admins can use version 2. In the database connection string, specify migrate_enabled=False. To perform a migration, it is best to disable database access concurrent to the migration. Proceed as follows:

  • Add a file named DISABLED to the top directory of your version 1 application (the parent directory of the /controllers, /views, etc. directories), and upload the new version to GAE. This will disable version 1, and display a message “The site is temporarily down for maintenance”.
  • Upload code to version 2 with migrate_enabled=True in the db connection string, and visit it from an admin account, triggering the migration.
  • Upload code to version 2 again with migrate_enabled=False, to disable further migrations.
  • Remove the file named DISABLED from version 1, and upload the code to version 1. This makes the site again visible to all.
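The first and last steps above only add or remove a marker file in your local copy of the application before uploading. A minimal sketch of that part follows; the helper names disable_app, enable_app and is_disabled are hypothetical, and a temporary directory stands in for the application folder. It runs on your development machine, not on GAE, where the filesystem is not writable.

```python
import os
import tempfile

MARKER = "DISABLED"  # file name taken from the steps above

def disable_app(app_dir):
    """Create an empty DISABLED file in the application's top directory."""
    marker = os.path.join(app_dir, MARKER)
    with open(marker, "w"):
        pass  # the file only needs to exist
    return marker

def enable_app(app_dir):
    """Remove the DISABLED file if it is present."""
    marker = os.path.join(app_dir, MARKER)
    if os.path.exists(marker):
        os.remove(marker)

def is_disabled(app_dir):
    """Report whether the application folder currently holds the marker."""
    return os.path.exists(os.path.join(app_dir, MARKER))

# A temporary directory stands in for the local application folder.
app_dir = tempfile.mkdtemp()
disable_app(app_dir)
state_during = is_disabled(app_dir)   # marker present while migrating
enable_app(app_dir)
state_after = is_disabled(app_dir)    # marker gone after the migration
print(state_during, state_after)
```

Uploading the code after each change is still a separate step, done with the deployment command described earlier.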

GAE and https

If your application has id “myapp”, your GAE domain is

  http://myapp.appspot.com/

and it can also be accessed via HTTPS

  https://myapp.appspot.com/

In this case it will use an “appspot.com” certificate provided by Google.

You can register a DNS entry and use any other domain name you own for your app but you will not be able to use HTTPS on it. At the time of writing, this is a GAE limitation.