Using Libcloud in multi-threaded and async environments

Libcloud’s primary task is to communicate with different provider APIs over HTTP. Most of this work is not CPU intensive, but performing all those HTTP requests involves a lot of waiting, which makes the library I/O bound.

Often you want to perform several operations in parallel, or you simply want your code to finish faster (for example when starting a lot of servers or periodically polling for node status).

Problems like this are usually solved using threads or async libraries such as Twisted, Tornado or gevent.

This page contains some information and tips about how to use Libcloud in such environments.

Libcloud and thread-safety

An important thing to keep in mind when dealing with threads is thread-safety. A Libcloud driver instance is not thread-safe. If you don’t want to deal with complex (and usually inefficient) locking, the easiest solution is to create a new driver instance inside each thread.
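The "one driver per thread" advice can be sketched with the standard library's threading.local and a thread pool. This is a minimal sketch rather than an official Libcloud recipe: make_driver, get_thread_driver and run_in_threads are hypothetical helper names, and the provider and credentials are placeholders.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Each thread sees its own attributes on this object.
_thread_state = threading.local()


def make_driver():
    """Build a fresh driver (placeholder provider and credentials)."""
    # Imported lazily so the helpers below can be tried with any factory.
    from libcloud.compute.types import Provider
    from libcloud.compute.providers import get_driver

    cls = get_driver(Provider.RACKSPACE)
    return cls('username', 'api key')


def get_thread_driver(factory=make_driver):
    """Return the calling thread's driver, creating it on first use."""
    if not hasattr(_thread_state, 'driver'):
        _thread_state.driver = factory()
    return _thread_state.driver


def run_in_threads(task, items, factory=make_driver, max_workers=4):
    """Call task(driver, item) for every item, never sharing a driver
    between threads. Results are returned in the order of items."""
    def call(item):
        return task(get_thread_driver(factory), item)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(call, items))


# Hypothetical usage: fetch image lists for several queries in parallel.
# results = run_in_threads(lambda driver, _: driver.list_images(), range(4))
```

Because the driver is stored in thread-local state, a worker thread that handles several items reuses its own driver but never touches one that belongs to another thread, so no locking is needed.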

Using Libcloud with gevent

gevent can monkey patch functions in Python’s socket, urllib2, httplib and time modules, replacing them with its own versions which don’t block.

You need to do two things when you want to use Libcloud with gevent:

  • Enable monkey patching:

        from gevent import monkey
        monkey.patch_all()

  • Create a separate driver instance for each Greenlet. This is necessary because a driver instance reuses the same Connection class.

For an example see Efficiently download multiple files using gevent.

Using Libcloud with Twisted

Libcloud has no Twisted support included in the core, which means you need to be careful when you use it with Twisted and some other async frameworks.

If you don’t use it properly, it can block the whole reactor (as can any other blocking library or a long CPU-intensive task), which means the execution of other pending tasks in the event queue will be blocked.

A simple way to avoid blocking the reactor is to run Libcloud calls inside a thread. In Twisted this can be achieved using threads.deferToThread, which runs the provided function inside the Twisted thread pool.

The example below demonstrates how to create a new node inside a thread without blocking the whole reactor.

    from __future__ import absolute_import

    from pprint import pprint

    # pylint: disable=import-error
    from twisted.internet import defer, threads, reactor

    from libcloud.compute.types import Provider
    from libcloud.compute.providers import get_driver


    @defer.inlineCallbacks
    def create_node(name):
        # Run the blocking Libcloud calls in the Twisted thread pool and
        # resume here once the result is available.
        node = yield threads.deferToThread(_thread_create_node,
                                           name=name)
        pprint(node)


    def _thread_create_node(name):
        # Runs inside a worker thread, so it creates its own driver instance.
        Driver = get_driver(Provider.RACKSPACE)
        conn = Driver('username', 'api key')
        image = conn.list_images()[0]
        size = conn.list_sizes()[0]
        node = conn.create_node(name=name, image=image, size=size)
        return node


    def stop(*args, **kwargs):
        # Stop the reactor exactly once, on success or on failure.
        reactor.stop()


    d = create_node(name='my-lc-node')
    d.addCallback(stop)  # pylint: disable=no-member
    d.addErrback(stop)  # pylint: disable=no-member

    reactor.run()
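The same idea of pushing blocking Libcloud calls into a worker thread should carry over to other event loops. As an illustration only, here is a minimal sketch for the standard library's asyncio, assuming Python 3.9 or newer for asyncio.to_thread. The helper names blocking_create_node and create_nodes are hypothetical, and make_driver stands for any callable that returns a new driver instance.

```python
import asyncio


def blocking_create_node(name, make_driver):
    """Runs in a worker thread and builds its own driver instance."""
    driver = make_driver()
    image = driver.list_images()[0]
    size = driver.list_sizes()[0]
    return driver.create_node(name=name, image=image, size=size)


async def create_nodes(names, make_driver):
    """Create several nodes concurrently without blocking the event loop."""
    jobs = [
        asyncio.to_thread(blocking_create_node, name, make_driver)
        for name in names
    ]
    # gather() returns the results in the same order as names.
    return await asyncio.gather(*jobs)


# Hypothetical usage, with make_driver returning e.g. a Rackspace driver:
# nodes = asyncio.run(create_nodes(['my-lc-node-1', 'my-lc-node-2'],
#                                  make_driver))
```

Each call builds its own driver inside the worker thread, so no driver instance is shared between threads.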