GridFS Example

This example shows how to use gridfs to store large binary objects (e.g. files) in MongoDB.

See also

The API docs for gridfs.

This blog post for some motivation behind this API.

Setup

We start by creating a GridFS instance to use:

  >>> from pymongo import MongoClient
  >>> import gridfs
  >>>
  >>> db = MongoClient().gridfs_example
  >>> fs = gridfs.GridFS(db)

Every GridFS instance is created with, and operates on, a specific Database instance.

Saving and Retrieving Data

The simplest way to work with gridfs is to use its key/value interface (the put() and get() methods). To write data to GridFS, use put():

  >>> a = fs.put(b"hello world")

put() creates a new file in GridFS, and returns the value of the file document’s "_id" key. Given that "_id" we can use get() to get back the contents of the file:

  >>> fs.get(a).read()
  b'hello world'

get() returns a file-like object, so we get the file’s contents by calling read().

In addition to putting a bytes instance as a GridFS file, we can also put any file-like object (an object with a read() method). GridFS will handle reading the file in chunk-sized segments automatically. We can also add additional attributes to the file as keyword arguments:

  >>> b = fs.put(fs.get(a), filename="foo", bar="baz")
  >>> out = fs.get(b)
  >>> out.read()
  b'hello world'
  >>> out.filename
  'foo'
  >>> out.bar
  'baz'
  >>> out.upload_date
  datetime.datetime(...)

The attributes we set in put() are stored in the file document and can be retrieved after calling get(). Some attributes (like "filename") are special and are defined in the GridFS specification; see that document for more details.
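
As noted earlier, GridFS reads a file-like object in chunk-sized segments. The standalone sketch below illustrates that idea without needing a MongoDB server. It is not how gridfs is implemented: iter_chunks is a hypothetical helper written for this example, and the 255 KiB figure is an assumption meant to match the driver's default chunk size.

```python
import io

# Assumption: 255 KiB, intended to mirror the default GridFS chunk size.
DEFAULT_CHUNK_SIZE = 255 * 1024


def iter_chunks(fileobj, chunk_size=DEFAULT_CHUNK_SIZE):
    """Yield (n, data) pairs by reading fileobj in chunk_size pieces.

    Hypothetical helper for illustration only; it is not part of gridfs.
    """
    n = 0
    while True:
        data = fileobj.read(chunk_size)
        if not data:
            # An empty read means the file-like object is exhausted.
            break
        yield n, data
        n += 1


# Any object with a read() method works; BytesIO stands in for a real file.
payload = b"x" * (DEFAULT_CHUNK_SIZE * 2 + 10)
chunks = list(iter_chunks(io.BytesIO(payload)))
sizes = [len(data) for _, data in chunks]
```

Here a payload slightly larger than two chunks splits into two full segments and one short final segment, and joining the segments in order reproduces the original bytes. This is why put() can accept a file-like object of any size without loading it into memory all at once.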