The uWSGI Caching Cookbook

This is a cookbook of various caching techniques using uWSGI internal routing, the uWSGI caching framework and uWSGI transformations.

The examples assume a modular uWSGI build. You can ignore the ‘plugins’ option if you are using a monolithic build.

Recipes are tested on uWSGI 1.9.7. Older versions may not work.

Let’s start

This is a simple Perl/PSGI Dancer app that we deploy on an http-socket with 4 processes.

    use Dancer;

    get '/' => sub {
        "Hello World!"
    };

    dance;

This is the uWSGI config. Pay attention to the log-micros directive. The objective of uWSGI in-memory caching is to generate a response in less than 1 millisecond (yes, this is true), so we want response times logged in microseconds (thousandths of a millisecond).

    [uwsgi]
    ; load the PSGI plugin as the default one
    plugins = 0:psgi
    ; load the Dancer app
    psgi = myapp.pl
    ; enable the master process
    master = true
    ; spawn 4 processes
    processes = 4
    ; bind an http socket to port 9090
    http-socket = :9090
    ; log response time with microseconds resolution
    log-micros = true

Run the uWSGI instance in your terminal and just make a bunch of requests to it.

    curl -D /dev/stdout http://localhost:9090/

If all goes well you should see something similar in your uWSGI logs:

    [pid: 26586|app: 0|req: 1/1] 192.168.173.14 () {24 vars in 327 bytes} [Wed Apr 17 09:06:58 2013] GET / => generated 12 bytes in 3497 micros (HTTP/1.1 200) 4 headers in 126 bytes (0 switches on core 0)
    [pid: 26586|app: 0|req: 2/2] 192.168.173.14 () {24 vars in 327 bytes} [Wed Apr 17 09:07:14 2013] GET / => generated 12 bytes in 1134 micros (HTTP/1.1 200) 4 headers in 126 bytes (0 switches on core 0)
    [pid: 26586|app: 0|req: 3/3] 192.168.173.14 () {24 vars in 327 bytes} [Wed Apr 17 09:07:16 2013] GET / => generated 12 bytes in 1249 micros (HTTP/1.1 200) 4 headers in 126 bytes (0 switches on core 0)
    [pid: 26586|app: 0|req: 4/4] 192.168.173.14 () {24 vars in 327 bytes} [Wed Apr 17 09:07:17 2013] GET / => generated 12 bytes in 953 micros (HTTP/1.1 200) 4 headers in 126 bytes (0 switches on core 0)
    [pid: 26586|app: 0|req: 5/5] 192.168.173.14 () {24 vars in 327 bytes} [Wed Apr 17 09:07:18 2013] GET / => generated 12 bytes in 1016 micros (HTTP/1.1 200) 4 headers in 126 bytes (0 switches on core 0)

while cURL will return:

    HTTP/1.1 200 OK
    Server: Perl Dancer 1.3112
    Content-Length: 12
    Content-Type: text/html
    X-Powered-By: Perl Dancer 1.3112

    Hello World!

The first request on a process took about 3.5 milliseconds (this is normal, as a lot of code is executed for the first request), but the following ones run in about 1 millisecond.

Now we want to store the response in the uWSGI cache.

The first recipe

We first create a uWSGI cache named ‘mycache’ with 100 slots of 64 KiB each (new options are at the end of the config) and for each request for ‘/’ we search in it for a specific item named ‘myhome’.

This time we load the router_cache plugin too (though it is built-in by default in monolithic servers).

    [uwsgi]
    ; load the PSGI plugin as the default one
    plugins = 0:psgi,router_cache
    ; load the Dancer app
    psgi = myapp.pl
    ; enable the master process
    master = true
    ; spawn 4 processes
    processes = 4
    ; bind an http socket to port 9090
    http-socket = :9090
    ; log response time with microseconds resolution
    log-micros = true

    ; create a cache with 100 items (default size per-item is 64k)
    cache2 = name=mycache,items=100
    ; at each request for / check for a 'myhome' item in the 'mycache' cache
    ; 'route' applies a regexp to the PATH_INFO request var
    route = ^/$ cache:key=myhome,name=mycache

Restart uWSGI and re-run the previous test with cURL. Sadly nothing will change. Why?

Because you did not instruct uWSGI to store the plugin response in the cache. You need to use the cachestore routing action…

    [uwsgi]
    ; load the PSGI plugin as the default one
    plugins = 0:psgi,router_cache
    ; load the Dancer app
    psgi = myapp.pl
    ; enable the master process
    master = true
    ; spawn 4 processes
    processes = 4
    ; bind an http socket to port 9090
    http-socket = :9090
    ; log response time with microseconds resolution
    log-micros = true

    ; create a cache with 100 items (default size per-item is 64k)
    cache2 = name=mycache,items=100
    ; at each request for / check for a 'myhome' item in the 'mycache' cache
    ; 'route' applies a regexp to the PATH_INFO request var
    route = ^/$ cache:key=myhome,name=mycache
    ; store each successful request (200 http status code) for '/' in the 'myhome' item
    route = ^/$ cachestore:key=myhome,name=mycache

Now re-run the test, and you should see response times drop to a range of 100-300 microseconds. The gain depends on various factors, but response time should improve by at least 60%.

The log line reports -1 as the app id:

    [pid: 26703|app: -1|req: -1/2] 192.168.173.14 () {24 vars in 327 bytes} [Wed Apr 17 09:24:52 2013] GET / => generated 12 bytes in 122 micros (HTTP/1.1 200) 2 headers in 64 bytes (0 switches on core 0)

This is because when a response is served from the cache your app/plugin is not touched (in this case, no perl call is involved).
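
To check from the logs which requests were answered by the cache, one option is to look at the app id and the timing in each line. The following is a minimal Python sketch, not part of uWSGI: the regular expression and the function names are assumptions made for illustration, and the two sample lines are copied from the log output shown in this cookbook.

```python
import re

# Matches the app id and the response time of a uWSGI request log line
# (format as shown in this cookbook, with log-micros enabled).
LINE_RE = re.compile(r"\|app: (-?\d+)\|.*? in (\d+) micros")

APP_LINE = (
    "[pid: 26586|app: 0|req: 2/2] 192.168.173.14 () {24 vars in 327 bytes} "
    "[Wed Apr 17 09:07:14 2013] GET / => generated 12 bytes in 1134 micros "
    "(HTTP/1.1 200) 4 headers in 126 bytes (0 switches on core 0)"
)
CACHE_LINE = (
    "[pid: 26703|app: -1|req: -1/2] 192.168.173.14 () {24 vars in 327 bytes} "
    "[Wed Apr 17 09:24:52 2013] GET / => generated 12 bytes in 122 micros "
    "(HTTP/1.1 200) 2 headers in 64 bytes (0 switches on core 0)"
)


def parse_log_line(line):
    """Return (app_id, micros) for a request log line, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def served_from_cache(line):
    """An app id of -1 means no app/plugin handled the request."""
    parsed = parse_log_line(line)
    return parsed is not None and parsed[0] == -1


_, miss_micros = parse_log_line(APP_LINE)
_, hit_micros = parse_log_line(CACHE_LINE)
# Fraction of the response time saved by the cached response.
saving = 1 - hit_micros / miss_micros
print(served_from_cache(CACHE_LINE), round(saving, 2))
```

For these two sample lines the saving is roughly 89%, which is consistent with the "at least 60%" estimate; real numbers will vary with the app and the hardware.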

You will notice fewer headers too:

    HTTP/1.1 200 OK
    Content-Type: text/html
    Content-Length: 12

    Hello World!

This is because only the body of a response is cached. By default the generated response is set as text/html, but you can change it or let the MIME type engine do the work for you (see later).

Cache them all !!!

We want to cache all of our requests. Some of them return images and CSS, while the others are always text/html.

    [uwsgi]
    ; load the PSGI plugin as the default one
    plugins = 0:psgi,router_cache
    ; load the Dancer app
    psgi = myapp.pl
    ; enable the master process
    master = true
    ; spawn 4 processes
    processes = 4
    ; bind an http socket to port 9090
    http-socket = :9090
    ; log response time with microseconds resolution
    log-micros = true

    ; create a cache with 100 items (default size per-item is 64k)
    cache2 = name=mycache,items=100
    ; load the mime types engine
    mime-file = /etc/mime.types

    ; at each request starting with /img check it in the cache (use mime types engine for the content type)
    route = ^/img/(.+) cache:key=/img/$1,name=mycache,mime=1

    ; at each request ending with .css check it in the cache
    route = \.css$ cache:key=${REQUEST_URI},name=mycache,content_type=text/css

    ; fall back to text/html for all other requests
    route = .* cache:key=${REQUEST_URI},name=mycache
    ; store each successful request (200 http status code) in the 'mycache' cache using the REQUEST_URI as key
    route = .* cachestore:key=${REQUEST_URI},name=mycache
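
The three `cache` rules above decide which content type a cached item is served with. As a rough mental model only, and assuming the requested item is already in the cache so that the first matching rule answers the request, the choice can be sketched in Python. The function name and the use of Python's `mimetypes` module in place of the uWSGI MIME engine are assumptions for illustration.

```python
import mimetypes
import re

# (pattern, content type) pairs in the same order as the 'cache' rules.
# The string "mime" stands in for the mime=1 option.
CACHE_RULES = [
    (re.compile(r"^/img/(.+)"), "mime"),
    (re.compile(r"\.css$"), "text/css"),
    (re.compile(r".*"), "text/html"),
]


def content_type_for(path_info):
    """Return the content type of the first rule matching path_info."""
    for pattern, content_type in CACHE_RULES:
        if pattern.search(path_info):
            if content_type == "mime":
                # Stand-in for the MIME engine: guess from the file extension.
                guessed, _encoding = mimetypes.guess_type(path_info)
                return guessed
            return content_type
    return None


print(content_type_for("/img/logo.png"))
print(content_type_for("/static/site.css"))
print(content_type_for("/about"))
```

If an item is not in the cache, the `cache` action lets the request continue down the routing chain, so the model only describes cache hits.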

Multiple caches

You may want or need to store items in different caches. We can change the previous recipe to use three different caches for images, CSS and HTML responses.

    [uwsgi]
    ; load the PSGI plugin as the default one
    plugins = 0:psgi,router_cache
    ; load the Dancer app
    psgi = myapp.pl
    ; enable the master process
    master = true
    ; spawn 4 processes
    processes = 4
    ; bind an http socket to port 9090
    http-socket = :9090
    ; log response time with microseconds resolution
    log-micros = true

    ; create a cache with 100 items (default size per-item is 64k)
    cache2 = name=mycache,items=100

    ; create a cache for images with dynamic size (images can be big, so do not waste memory)
    cache2 = name=images,items=20,bitmap=1,blocks=100

    ; a cache for css (20k per-item is more than enough)
    cache2 = name=stylesheets,items=30,blocksize=20000

    ; load the mime types engine
    mime-file = /etc/mime.types

    ; at each request starting with /img check it in the 'images' cache (use mime types engine for the content type)
    route = ^/img/(.+) cache:key=/img/$1,name=images,mime=1

    ; at each request ending with .css check it in the 'stylesheets' cache
    route = \.css$ cache:key=${REQUEST_URI},name=stylesheets,content_type=text/css

    ; fall back to text/html for all other requests
    route = .* cache:key=${REQUEST_URI},name=mycache

    ; store each successful request (200 http status code) in the 'mycache' cache using the REQUEST_URI as key
    route = .* cachestore:key=${REQUEST_URI},name=mycache
    ; store images and stylesheets in the corresponding caches
    route = ^/img/ cachestore:key=${REQUEST_URI},name=images
    route = ^/css/ cachestore:key=${REQUEST_URI},name=stylesheets

Important: every matched ‘cachestore’ overwrites the previous one, which is why .* is added as the first rule.
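
The "last match wins" behaviour of the `cachestore` rules can be illustrated with a short Python sketch. It models only the rule ordering described here, not uWSGI internals; the function name `target_cache` is hypothetical.

```python
import re

# 'cachestore' rules in the order used by the recipe: catch-all first.
STORE_RULES = [
    (re.compile(r".*"), "mycache"),
    (re.compile(r"^/img/"), "images"),
    (re.compile(r"^/css/"), "stylesheets"),
]


def target_cache(path_info, rules=STORE_RULES):
    """Return the cache named by the last rule that matches path_info."""
    chosen = None
    for pattern, cache_name in rules:
        if pattern.search(path_info):
            chosen = cache_name  # a later match overwrites an earlier one
    return chosen


print(target_cache("/img/logo.png"))
print(target_cache("/css/site.css"))
print(target_cache("/"))

# With the catch-all placed last, it would override the specific rules.
WRONG_ORDER = STORE_RULES[1:] + STORE_RULES[:1]
print(target_cache("/img/logo.png", WRONG_ORDER))
```

With the catch-all rule first, specific rules get the final say; if it came last, every response would end up in 'mycache'.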

Being more aggressive, the Expires HTTP header

You can set an expiration for each cache item. If an item has an expiration, it is translated to an HTTP Expires header. This means that once you have sent a cache item to the browser, it will not request it again until it expires!

We reuse the previous recipe, simply adding different expirations to the items.

    [uwsgi]
    ; load the PSGI plugin as the default one
    plugins = 0:psgi,router_cache
    ; load the Dancer app
    psgi = myapp.pl
    ; enable the master process
    master = true
    ; spawn 4 processes
    processes = 4
    ; bind an http socket to port 9090
    http-socket = :9090
    ; log response time with microseconds resolution
    log-micros = true

    ; create a cache with 100 items (default size per-item is 64k)
    cache2 = name=mycache,items=100

    ; create a cache for images with dynamic size (images can be big, so do not waste memory)
    cache2 = name=images,items=20,bitmap=1,blocks=100

    ; a cache for css (20k per-item is more than enough)
    cache2 = name=stylesheets,items=30,blocksize=20000

    ; load the mime types engine
    mime-file = /etc/mime.types

    ; at each request starting with /img check it in the 'images' cache (use mime types engine for the content type)
    route = ^/img/(.+) cache:key=/img/$1,name=images,mime=1

    ; at each request ending with .css check it in the 'stylesheets' cache
    route = \.css$ cache:key=${REQUEST_URI},name=stylesheets,content_type=text/css

    ; fall back to text/html for all other requests
    route = .* cache:key=${REQUEST_URI},name=mycache

    ; store each successful request (200 http status code) in the 'mycache' cache using the REQUEST_URI as key
    route = .* cachestore:key=${REQUEST_URI},name=mycache,expires=60
    ; store images and stylesheets in the corresponding caches
    route = ^/img/ cachestore:key=${REQUEST_URI},name=images,expires=3600
    route = ^/css/ cachestore:key=${REQUEST_URI},name=stylesheets,expires=3600

Images and stylesheets are cached for 1 hour, while HTML responses are cached for 1 minute.
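
To get a feel for what the browser receives, the Expires value can be computed by hand. This is a minimal Python sketch under the assumption that the header carries the absolute expiry time of the item (store time plus the `expires` value in seconds); the function name is hypothetical and this is not uWSGI's own code.

```python
from email.utils import formatdate


def expires_header(stored_at, ttl_seconds):
    """Format the expiry time (Unix seconds) as an HTTP date in GMT."""
    return formatdate(stored_at + ttl_seconds, usegmt=True)


# Using the Unix epoch as the store time keeps the output easy to verify.
html_expires = expires_header(0, 60)      # expires=60
image_expires = expires_header(0, 3600)   # expires=3600
print(html_expires)
print(image_expires)
```

The two values differ by 59 minutes, matching the 1 minute and 1 hour lifetimes configured in the recipe.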

Monitoring Caches

The stats server exposes cache information.

There is an ncurses-based tool (https://pypi.python.org/pypi/uwsgicachetop) using that information.

Storing GZIP variant of an object

Back to the first recipe. We may want to store two copies of a response: the “clean” one and a gzipped one for clients supporting gzip encoding.

To enable the gzip copy you only need to choose a name for the item and pass it as the ‘gzip’ option of the cachestore action.

Then check the HTTP_ACCEPT_ENCODING request variable. If it contains the word ‘gzip’, you can send the gzip variant.

    [uwsgi]
    ; load the PSGI plugin as the default one
    plugins = 0:psgi,router_cache
    ; load the Dancer app
    psgi = myapp.pl
    ; enable the master process
    master = true
    ; spawn 4 processes
    processes = 4
    ; bind an http socket to port 9090
    http-socket = :9090
    ; log response time with microseconds resolution
    log-micros = true

    ; create a cache with 100 items (default size per-item is 64k)
    cache2 = name=mycache,items=100
    ; if the client supports gzip, give it the gzip body
    route-if = contains:${HTTP_ACCEPT_ENCODING};gzip cache:key=gzipped_myhome,name=mycache,content_encoding=gzip
    ; else give it the clear version
    route = ^/$ cache:key=myhome,name=mycache

    ; store each successful request (200 http status code) for '/' in the 'myhome' item in gzip too
    route = ^/$ cachestore:key=myhome,gzip=gzipped_myhome,name=mycache
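
The selection between the two variants boils down to a substring test on the Accept-Encoding value. The Python sketch below mirrors that logic with a plain dict standing in for the cache; the function name and the dict are assumptions for illustration, not uWSGI APIs.

```python
import gzip

BODY = b"Hello World!"

# Stand-in for 'mycache': the clean body and its gzip variant under two keys.
cache = {
    "myhome": BODY,
    "gzipped_myhome": gzip.compress(BODY),
}


def pick_variant(accept_encoding):
    """Return (cache key, Content-Encoding) for an Accept-Encoding value."""
    # Same idea as 'contains:${HTTP_ACCEPT_ENCODING};gzip': a substring test.
    if "gzip" in (accept_encoding or ""):
        return "gzipped_myhome", "gzip"
    return "myhome", None


key, encoding = pick_variant("gzip, deflate")
print(key, encoding, gzip.decompress(cache[key]) == BODY)
print(pick_variant(None))
```

Note that a plain substring test also matches a header such as `gzip;q=0`, where the client explicitly refuses gzip; whether that matters depends on your clients.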

Storing static files in the cache for fast serving

You can populate a uWSGI cache on server startup with static files to serve them quickly. The --load-file-in-cache option is the right tool for the job:

    [uwsgi]
    plugins = 0:notfound,router_cache
    http-socket = :9090
    cache2 = name=files,bitmap=1,items=1000,blocksize=10000,blocks=2000
    load-file-in-cache = files /usr/share/doc/socat/index.html
    route-run = cache:key=${REQUEST_URI},name=files

You can specify as many --load-file-in-cache directives as you need, but a better approach would be:

    [uwsgi]
    plugins = router_cache
    http-socket = :9090
    cache2 = name=files,bitmap=1,items=1000,blocksize=10000,blocks=2000
    for-glob = /usr/share/doc/socat/*.html
    load-file-in-cache = files %(_)
    endfor =
    route-run = cache:key=${REQUEST_URI},name=files

This will store all of the HTML files in /usr/share/doc/socat.

Items are stored with the path as the key.
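
Because the key is the file path, a request is only answered from the cache when its REQUEST_URI equals that path. The following Python sketch imitates the glob-and-load step with a dict in a temporary directory; it is an illustration of the keying scheme, with hypothetical names, rather than a description of uWSGI internals.

```python
import glob
import os
import tempfile


def load_files(pattern):
    """Read every file matching pattern into a dict keyed by its path."""
    loaded = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, "rb") as handle:
            loaded[path] = handle.read()
    return loaded


with tempfile.TemporaryDirectory() as docroot:
    page = os.path.join(docroot, "index.html")
    with open(page, "wb") as handle:
        handle.write(b"<h1>socat</h1>")
    with open(os.path.join(docroot, "notes.txt"), "wb") as handle:
        handle.write(b"not matched by the glob")

    files = load_files(os.path.join(docroot, "*.html"))
    found = files.get(page)             # lookup with the full path as key
    missing = files.get("/index.html")  # a shorter URI is a different key

print(len(files), found, missing)
```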

When a non-existent item is requested the connection is closed and you should get an ugly message:

    -- unavailable modifier requested: 0 --

This is because the internal routing system failed to manage the request, and no request plugin is available to manage the request.

You can build a better infrastructure using the simple ‘notfound’ plugin (it will always return a 404)

    [uwsgi]
    plugins = 0:notfound,router_cache
    http-socket = :9090
    cache2 = name=files,bitmap=1,items=1000,blocksize=10000,blocks=2000
    for-glob = /usr/share/doc/socat/*.html
    load-file-in-cache = files %(_)
    endfor =
    route-run = cache:key=${REQUEST_URI},name=files

You can store files in the cache as gzip too, using --load-file-in-cache-gzip.

This option does not allow setting the name of the cache item, so to support clients with and without gzip support we can use 2 different caches:

    [uwsgi]
    plugins = 0:notfound,router_cache
    http-socket = :9090
    cache2 = name=files,bitmap=1,items=1000,blocksize=10000,blocks=2000
    cache2 = name=compressedfiles,bitmap=1,items=1000,blocksize=10000,blocks=2000
    for-glob = /usr/share/doc/socat/*.html
    load-file-in-cache = files %(_)
    load-file-in-cache-gzip = compressedfiles %(_)
    endfor =
    ; take the item from the compressed cache
    route-if = contains:${HTTP_ACCEPT_ENCODING};gzip cache:key=${REQUEST_URI},name=compressedfiles,content_encoding=gzip
    ; fallback to the uncompressed one
    route-run = cache:key=${REQUEST_URI},name=files

Caching for authenticated users

If you authenticate users with HTTP basic auth, you can differentiate caching for each one using the ${REMOTE_USER} request variable:

    [uwsgi]
    ; load the PSGI plugin as the default one
    plugins = 0:psgi,router_cache
    ; load the Dancer app
    psgi = myapp.pl
    ; enable the master process
    master = true
    ; spawn 4 processes
    processes = 4
    ; bind an http socket to port 9090
    http-socket = :9090
    ; log response time with microseconds resolution
    log-micros = true

    ; create a cache with 100 items (default size per-item is 64k)
    cache2 = name=mycache,items=100
    ; check if the user is authenticated
    route-if-not = empty:${REMOTE_USER} goto:cacheme
    route-run = break:

    ; the following rules are executed only if REMOTE_USER is defined
    route-label = cacheme
    route = ^/$ cache:key=myhome_for_${REMOTE_USER},name=mycache
    ; store each successful request (200 http status code) for '/'
    route = ^/$ cachestore:key=myhome_for_${REMOTE_USER},name=mycache

Cookie-based authentication is generally more complex, but in the vast majority of cases a session id is passed as a cookie.

You may want to use this session id as the key:

    [uwsgi]
    ; load the PHP plugin as the default one
    plugins = 0:php,router_cache
    ; enable the master process
    master = true
    ; spawn 4 processes
    processes = 4
    ; bind an http socket to port 9090
    http-socket = :9090
    ; log response time with microseconds resolution
    log-micros = true

    ; create a cache with 100 items (default size per-item is 64k)
    cache2 = name=mycache,items=100
    ; check if the user is authenticated
    route-if-not = empty:${cookie[PHPSESSID]} goto:cacheme
    route-run = break:

    ; the following rules are executed only if the PHPSESSID cookie is defined
    route-label = cacheme
    route = ^/$ cache:key=myhome_for_${cookie[PHPSESSID]},name=mycache
    ; store each successful request (200 http status code) for '/'
    route = ^/$ cachestore:key=myhome_for_${cookie[PHPSESSID]},name=mycache

Obviously a malicious user could build a fake session id and potentially fill your cache. You should always check the session id. There is no single solution, but a good example for file-based PHP sessions is the following one:

    [uwsgi]
    ; load the PHP plugin as the default one
    plugins = 0:php,router_cache
    ; enable the master process
    master = true
    ; spawn 4 processes
    processes = 4
    ; bind an http socket to port 9090
    http-socket = :9090
    ; log response time with microseconds resolution
    log-micros = true

    ; create a cache with 100 items (default size per-item is 64k)
    cache2 = name=mycache,items=100
    ; check if the user is authenticated
    route-if-not = empty:${cookie[PHPSESSID]} goto:cacheme
    route-run = break:

    ; the following rules are executed only if the PHPSESSID cookie is defined
    route-label = cacheme
    ; stop if the session file does not exist
    route-if-not = isfile:/var/lib/php5/sessions/sess_${cookie[PHPSESSID]} break:
    route = ^/$ cache:key=myhome_for_${cookie[PHPSESSID]},name=mycache
    ; store each successful request (200 http status code) for '/'
    route = ^/$ cachestore:key=myhome_for_${cookie[PHPSESSID]},name=mycache
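
The decision made by these rules (no cookie: no caching; unknown session: no caching; otherwise a per-session key) can be traced with a small Python sketch. It uses a temporary directory in place of /var/lib/php5/sessions, and the function name is hypothetical; treat it as a model of the rule chain, not as uWSGI code.

```python
import os
import tempfile


def cache_key_for(session_id, session_dir):
    """Return the per-session cache key, or None when caching is skipped."""
    if not session_id:
        return None  # no PHPSESSID cookie: skip the cache rules
    session_file = os.path.join(session_dir, "sess_" + session_id)
    if not os.path.isfile(session_file):
        return None  # fake or expired session id: stop here
    return "myhome_for_" + session_id


with tempfile.TemporaryDirectory() as session_dir:
    # Create one valid session file, as PHP would for a logged-in user.
    with open(os.path.join(session_dir, "sess_abc123"), "w") as handle:
        handle.write("")

    valid_key = cache_key_for("abc123", session_dir)
    fake_key = cache_key_for("forged", session_dir)
    anonymous_key = cache_key_for("", session_dir)

print(valid_key, fake_key, anonymous_key)
```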

Caching to files

Sometimes, instead of caching in memory you want to store static files.

The transformation_tofile plugin allows you to store responses in files:

    [uwsgi]
    ; load the PSGI plugin as the default one
    plugins = 0:psgi,transformation_tofile,router_static
    ; load the Dancer app
    psgi = myapp.pl
    ; enable the master process
    master = true
    ; spawn 4 processes
    processes = 4
    ; bind an http socket to port 9090
    http-socket = :9090
    ; log response time with microseconds resolution
    log-micros = true

    ; check if a file exists
    route-if = isfile:/var/www/cache/${hex[PATH_INFO]}.html static:/var/www/cache/${hex[PATH_INFO]}.html
    ; otherwise store the response in it
    route-run = tofile:/var/www/cache/${hex[PATH_INFO]}.html

The hex[] routing var takes the content of a request variable and encodes it in hexadecimal. As PATH_INFO tends to contain /, this is a better approach than storing full path names (or using another encoding scheme such as base64, which can include slashes too).
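
To see why hexadecimal gives safe file names, the encoding can be reproduced in a few lines of Python. This is a sketch for illustration: the function name is hypothetical, and the lowercase digits produced by Python are an assumption about the exact spelling uWSGI emits.

```python
import base64


def cache_file_for(path_info, cache_dir="/var/www/cache"):
    """Build the cache file name from the hex-encoded PATH_INFO."""
    return f"{cache_dir}/{path_info.encode('utf-8').hex()}.html"


root_file = cache_file_for("/")
post_file = cache_file_for("/blog/post")
print(root_file)
print(post_file)

# base64, by contrast, has '/' in its alphabet, so the encoded name
# could still be split into directories.
b64_name = base64.b64encode(b"/a?").decode("ascii")
print(b64_name)
```

Hex output only ever contains the characters 0-9 and a-f, so every PATH_INFO maps to a flat file name inside the cache directory.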