GridFS Example
This example shows how to use gridfs to store large binary objects (e.g. files) in MongoDB.
See also
The API docs for gridfs.
See also
This blog post for some motivation behind this API.
Setup
We start by creating a GridFS instance to use:
>>> from pymongo import MongoClient
>>> import gridfs
>>>
>>> db = MongoClient().gridfs_example
>>> fs = gridfs.GridFS(db)
Every GridFS instance is created with and will operate on a specific Database instance.
Saving and Retrieving Data
The simplest way to work with gridfs is to use its key/value interface (the put() and get() methods). To write data to GridFS, use put():
>>> a = fs.put(b"hello world")
put() creates a new file in GridFS, and returns the value of the file document's "_id" key. Given that "_id" we can use get() to get back the contents of the file:
>>> fs.get(a).read()
b'hello world'
get() returns a file-like object, so we get the file's contents by calling read().
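Because get() returns a file-like object, a large file does not have to be read in a single call. The following is a minimal sketch that assumes only that the source object has a read(size) method. The helper name copy_stream and its bufsize parameter are hypothetical, and io.BytesIO stands in for the object returned by get() so that the snippet runs without a MongoDB server.

```python
import io


def copy_stream(src, dst, bufsize=64 * 1024):
    """Copy src to dst in pieces of at most bufsize bytes.

    Returns the total number of bytes copied. copy_stream is a
    hypothetical helper, not part of gridfs.
    """
    total = 0
    while True:
        piece = src.read(bufsize)
        if not piece:  # an empty result signals end of file
            break
        dst.write(piece)
        total += len(piece)
    return total


# io.BytesIO is a stand-in for the file-like object returned by fs.get(a).
source = io.BytesIO(b"hello world")
target = io.BytesIO()
copied = copy_stream(source, target, bufsize=4)
```

With a real GridFS instance, the same helper could be called with fs.get(a) as the source and an open binary file as the destination.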
In addition to putting a bytes object as a GridFS file, we can also put any file-like object (an object with a read() method). GridFS will handle reading the file in chunk-sized segments automatically. We can also add additional attributes to the file as keyword arguments:
>>> b = fs.put(fs.get(a), filename="foo", bar="baz")
>>> out = fs.get(b)
>>> out.read()
b'hello world'
>>> out.filename
'foo'
>>> out.bar
'baz'
>>> out.upload_date
datetime.datetime(...)
The attributes we set in put() are stored in the file document, and retrievable after calling get(). Some attributes (like "filename") are special and are defined in the GridFS specification; see that document for more details.
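The chunk-sized reading of file-like objects described earlier can be pictured roughly as follows. This is an illustrative sketch, not PyMongo's implementation: iter_chunks is a hypothetical helper, it assumes each read() call returns a full chunk until the end of the stream, and it omits the "files_id" field that links real chunk documents to their file document. The 255 KiB default chunk size and the "n" and "data" field names follow the GridFS specification.

```python
import io

DEFAULT_CHUNK_SIZE = 255 * 1024  # default GridFS chunk size, in bytes


def iter_chunks(fileobj, chunk_size=DEFAULT_CHUNK_SIZE):
    """Yield simplified chunk documents read from a file-like object.

    Each document holds the chunk's sequence number ("n") and its
    payload ("data"). iter_chunks is a hypothetical helper.
    """
    n = 0
    while True:
        data = fileobj.read(chunk_size)
        if not data:  # end of stream
            break
        yield {"n": n, "data": data}
        n += 1


# A tiny chunk size makes the splitting visible on a short payload.
chunks = list(iter_chunks(io.BytesIO(b"hello world"), chunk_size=4))
```

Joining the "data" values in order of "n" reproduces the original payload, which is in essence what reading the file back does.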