Request Content Checksums

Various pieces of code can consume the request data and preprocess it. For instance, JSON data ends up on the request object already read and processed, and form data ends up there as well but goes through a different code path. This is inconvenient when you want to calculate the checksum of the incoming request data, which some APIs require.

Fortunately, this is very simple to change by wrapping the input stream.

The following example calculates the SHA1 checksum of the incoming data as it gets read. The wrapping stream replaces wsgi.input in the WSGI environment, and the hash object is returned to the caller:

    import hashlib


    class ChecksumCalcStream(object):
        """Wraps an input stream and hashes every chunk read from it."""

        def __init__(self, stream):
            self._stream = stream
            self._hash = hashlib.sha1()

        def read(self, size=-1):
            # Feed whatever was read into the hash before handing it on.
            rv = self._stream.read(size)
            self._hash.update(rv)
            return rv

        def readline(self, size_hint=-1):
            rv = self._stream.readline(size_hint)
            self._hash.update(rv)
            return rv


    def generate_checksum(request):
        # Replace wsgi.input so later reads go through the wrapper.
        env = request.environ
        stream = ChecksumCalcStream(env['wsgi.input'])
        env['wsgi.input'] = stream
        return stream._hash

To use this, all you need to do is hook the calculating stream in before the request starts consuming data. (For example, be careful about accessing request.form or anything of that nature. before_request_handlers, for instance, should be careful not to access it.)
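
Why the ordering matters can be seen without a running application. The following is only a sketch: it uses a simplified copy of the wrapper, an in-memory `io.BytesIO` in place of the real WSGI input stream, and a made-up request body. It compares wrapping the stream before and after something else has already read it.

```python
import hashlib
import io


class ChecksumCalcStream:
    """Simplified copy of the wrapper, repeated so the sketch runs standalone."""

    def __init__(self, stream):
        self._stream = stream
        self._hash = hashlib.sha1()

    def read(self, size=-1):
        rv = self._stream.read(size)
        self._hash.update(rv)
        return rv


body = b'{"name": "example"}'  # hypothetical request body

# Wrapped before anything reads the stream: the hash sees every byte.
early = ChecksumCalcStream(io.BytesIO(body))
early.read()
early_digest = early._hash.hexdigest()

# Wrapped after the stream was already consumed: the hash sees nothing.
raw = io.BytesIO(body)
raw.read()  # stands in for an early access to the request data
late = ChecksumCalcStream(raw)
late.read()
late_digest = late._hash.hexdigest()

print(early_digest == hashlib.sha1(body).hexdigest())  # True
print(late_digest == hashlib.sha1(b"").hexdigest())    # True
```

In the second case the digest is simply that of empty input, so a late hook fails silently rather than raising an error.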

Example usage:

    @app.route('/special-api', methods=['POST'])
    def special_api():
        # Hook the checksum stream in before anything reads the input.
        hasher = generate_checksum(request)
        # Accessing this parses the input stream.
        files = request.files
        # At this point the hash is fully constructed.
        checksum = hasher.hexdigest()
        return 'Hash was: %s' % checksum
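
An API that needs the checksum will typically compare it with a value the client supplied. How that value arrives is not covered here, so the sketch below assumes it has already been extracted as a hex string. `checksum_matches` is a hypothetical helper, not part of Flask, and the payload is made up for illustration.

```python
import hashlib
import hmac


def checksum_matches(hasher, expected_hex):
    """Return True if the finished hash equals the hex digest the client claimed."""
    actual = hasher.hexdigest()
    expected = expected_hex.strip().lower()
    # Comparing as bytes avoids a TypeError on non-ASCII input, and
    # compare_digest does not leak where the two values first differ.
    return hmac.compare_digest(actual.encode("utf-8"), expected.encode("utf-8"))


hasher = hashlib.sha1()
hasher.update(b"payload")  # stands in for the stream having been fully read
claimed = hashlib.sha1(b"payload").hexdigest()

print(checksum_matches(hasher, claimed))   # True
print(checksum_matches(hasher, "0" * 40))  # False
```

The comparison is only meaningful once the input stream has been read to the end, as in the view above.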