gridfs – Tools for working with GridFS

GridFS is a specification for storing large objects in MongoDB.

The gridfs package is an implementation of GridFS on top of pymongo, exposing a file-like interface.

See also

The MongoDB documentation on gridfs.

delete(file_id, session=None)

Given a file_id, deletes the stored file’s files collection document and its associated chunks from the GridFS bucket.

For example:
```
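from pymongo import MongoClient
from gridfs import GridFSBucket

my_db = MongoClient().test
fs = GridFSBucket(my_db)
# Get _id of file to delete (the source must be bytes on Python 3)
file_id = fs.upload_from_stream("test_file", b"data I want to store!")
fs.delete(file_id)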
```
Raises NoFile if no file with file_id exists.
- Parameters
  - file_id: The _id of the file to be deleted.
  - session (optional): a ClientSession
Changed in version 3.6: Added session parameter.
download_to_stream(file_id, destination, session=None)

Downloads the contents of the stored file specified by file_id and writes the contents to destination.

For example:
```
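from pymongo import MongoClient
from gridfs import GridFSBucket

my_db = MongoClient().test
fs = GridFSBucket(my_db)
# Get _id of file to read (the source must be bytes on Python 3)
file_id = fs.upload_from_stream("test_file", b"data I want to store!")
# Open a local file to write to, then read the contents back
with open('myfile', 'wb+') as file:
    fs.download_to_stream(file_id, file)
    file.seek(0)
    contents = file.read()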
```
Raises NoFile if no file with file_id exists.
- Parameters
  - file_id: The _id of the file to be downloaded.
  - destination: a file-like object implementing write().
  - session (optional): a ClientSession
Changed in version 3.6: Added session parameter.
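
Because destination only needs to implement write(), any object with that method can act as the sink, not just a file on disk. The following is a minimal sketch that runs without a server. HashingWriter and copy_chunks are hypothetical names introduced for illustration, and copy_chunks merely stands in for the download loop; it is not the library's implementation.

```python
import hashlib
import io


class HashingWriter:
    """Hypothetical destination that hashes the data instead of storing it."""

    def __init__(self):
        self._digest = hashlib.sha256()
        self.bytes_written = 0

    def write(self, data):
        # The only method a destination is required to provide.
        self._digest.update(data)
        self.bytes_written += len(data)
        return len(data)

    def hexdigest(self):
        return self._digest.hexdigest()


def copy_chunks(chunks, destination):
    """Stand-in for a download: pass each chunk to destination.write()."""
    for chunk in chunks:
        destination.write(chunk)


# Both an in-memory buffer and the custom writer satisfy the same contract.
buffer = io.BytesIO()
copy_chunks([b"data", b" I want"], buffer)

hasher = HashingWriter()
copy_chunks([b"data", b" I want"], hasher)
```

With a real bucket, an object such as buffer or hasher could be passed as the destination argument in place of an open file.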
download_to_stream_by_name(filename, destination, revision=-1, session=None)

Write the contents of filename (with optional revision) to destination.

For example:
```
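from pymongo import MongoClient
from gridfs import GridFSBucket

my_db = MongoClient().test
fs = GridFSBucket(my_db)
# Open a local file to write to; it is closed when the block exits
with open('myfile', 'wb') as file:
    fs.download_to_stream_by_name("test_file", file)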
```
Raises NoFile if no such version of that file exists.

Raises ValueError if filename is not a string.
- Parameters
  - filename: The name of the file to read from.
  - destination: A file-like object that implements write().
  - revision (optional): Which revision (documents with the same filename and different uploadDate) of the file to retrieve. Defaults to -1 (the most recent revision).
  - session (optional): a ClientSession
  Note
  
  Revision numbers are defined as follows:
  - 0 = the original stored file
  - 1 = the first revision
  - 2 = the second revision
  - etc…
  - -2 = the second most recent revision
  - -1 = the most recent revision
Changed in version 3.6: Added session parameter.
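
The revision numbering in the note above works like Python list indexing over a file's versions sorted by uploadDate, oldest first. The following is a minimal sketch on in-memory data. pick_revision is a hypothetical helper, and LookupError is only a stand-in for NoFile; the sketch illustrates the numbering and does not query GridFS.

```python
from datetime import datetime


def pick_revision(upload_dates, revision=-1):
    """Return the upload date selected by a GridFS-style revision number."""
    ordered = sorted(upload_dates)  # oldest first: index 0 is the original file
    # Negative revisions count back from the most recent version.
    position = revision if revision >= 0 else len(ordered) + revision
    if not 0 <= position < len(ordered):
        # Stand-in for NoFile, which the real methods raise.
        raise LookupError("no such revision: %d" % revision)
    return ordered[position]


# Three versions of the same filename, listed in arbitrary order.
dates = [datetime(2024, 1, 3), datetime(2024, 1, 1), datetime(2024, 1, 2)]
original = pick_revision(dates, 0)        # the original stored file
latest = pick_revision(dates)             # default -1: most recent revision
second_latest = pick_revision(dates, -2)  # second most recent revision
```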
find(*args, **kwargs)

Find and return the files collection documents that match filter.

Returns a cursor that iterates across files matching arbitrary queries on the files collection. Can be combined with other modifiers for additional control.

For example:
```
for grid_data in fs.find({"filename": "lisa.txt"},
                         no_cursor_timeout=True):
    data = grid_data.read()
```
would iterate through all versions of “lisa.txt” stored in GridFS. Note that setting no_cursor_timeout to True may be important to prevent the cursor from timing out during long multi-file processing work.

As another example, the call:
```
most_recent_three = fs.find().sort("uploadDate", -1).limit(3)
```
would return a cursor to the three most recently uploaded files in GridFS.

Follows a similar interface to find() in Collection.

If a ClientSession is passed to find(), all returned GridOut instances are associated with that session.
- Parameters
  - filter: Search query.
  - batch_size (optional): The number of documents to return per batch.
  - limit (optional): The maximum number of documents to return.
  - no_cursor_timeout (optional): The server normally times out idle cursors after an inactivity period (10 minutes) to prevent excess memory use. Set this option to True to prevent that.
  - skip (optional): The number of documents to skip before returning.
  - sort (optional): The order by which to sort results. Defaults to None.
open_download_stream(file_id, session=None)

Opens a Stream from which the application can read the contents of the stored file specified by file_id.

For example:
```
from pymongo import MongoClient
from gridfs import GridFSBucket

my_db = MongoClient().test
fs = GridFSBucket(my_db)
# Get _id of file to read (the source must be bytes on Python 3)
file_id = fs.upload_from_stream("test_file", b"data I want to store!")
grid_out = fs.open_download_stream(file_id)
contents = grid_out.read()
```
Returns an instance of GridOut.

Raises NoFile if no file with file_id exists.
- Parameters
  - file_id: The _id of the file to be downloaded.
  - session (optional): a ClientSession
Changed in version 3.6: Added session parameter.
open_download_stream_by_name(filename, revision=-1, session=None)

Opens a Stream from which the application can read the contents of filename and optional revision.

For example:
```
from pymongo import MongoClient
from gridfs import GridFSBucket

my_db = MongoClient().test
fs = GridFSBucket(my_db)
grid_out = fs.open_download_stream_by_name("test_file")
contents = grid_out.read()
```
Returns an instance of GridOut.

Raises NoFile if no such version of that file exists.

Raises ValueError if filename is not a string.
- Parameters
  - filename: The name of the file to read from.
  - revision (optional): Which revision (documents with the same filename and different uploadDate) of the file to retrieve. Defaults to -1 (the most recent revision).
  - session (optional): a ClientSession
  Note
  
  Revision numbers are defined as follows:
  - 0 = the original stored file
  - 1 = the first revision
  - 2 = the second revision
  - etc…
  - -2 = the second most recent revision
  - -1 = the most recent revision
Changed in version 3.6: Added session parameter.
open_upload_stream(filename, chunk_size_bytes=None, metadata=None, session=None)

Opens a Stream that the application can write the contents of the file to.

The user must specify the filename, and can choose to add any additional information in the metadata field of the file document or modify the chunk size. For example:
```
from pymongo import MongoClient
from gridfs import GridFSBucket

my_db = MongoClient().test
fs = GridFSBucket(my_db)
grid_in = fs.open_upload_stream(
    "test_file", chunk_size_bytes=4,
    metadata={"contentType": "text/plain"})
grid_in.write(b"data I want to store!")  # write() expects bytes on Python 3
grid_in.close()  # uploaded on close
```
Returns an instance of GridIn.

Raises NoFile if no such version of that file exists. Raises ValueError if filename is not a string.
- Parameters
  - filename: The name of the file to upload.
  - chunk_size_bytes (optional): The number of bytes per chunk of this file. Defaults to the chunk_size_bytes in GridFSBucket.
  - metadata (optional): User data for the ‘metadata’ field of the files collection document. If not provided the metadata field will be omitted from the files collection document.
  - session (optional): a ClientSession
Changed in version 3.6: Added session parameter.
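
To make the effect of chunk_size_bytes concrete, the following minimal sketch shows only the byte arithmetic of splitting the example payload with chunk_size_bytes=4. split_into_chunks is a hypothetical helper introduced for illustration; it does not talk to MongoDB and is not how the library is implemented.

```python
def split_into_chunks(data, chunk_size_bytes):
    """Split a bytes payload into consecutive pieces of at most chunk_size_bytes."""
    if chunk_size_bytes <= 0:
        raise ValueError("chunk_size_bytes must be positive")
    return [
        data[start:start + chunk_size_bytes]
        for start in range(0, len(data), chunk_size_bytes)
    ]


payload = b"data I want to store!"  # 21 bytes
chunks = split_into_chunks(payload, 4)
# Five full 4-byte pieces plus a final 1-byte piece; joining restores the payload.
restored = b"".join(chunks)
```

A chunk size this small is useful only for demonstration, since every chunk is stored separately.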
open_upload_stream_with_id(file_id, filename, chunk_size_bytes=None, metadata=None, session=None)

Opens a Stream that the application can write the contents of the file to.

The user must specify the file id and filename, and can choose to add any additional information in the metadata field of the file document or modify the chunk size. For example:
```
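from bson.objectid import ObjectId
from pymongo import MongoClient
from gridfs import GridFSBucket

my_db = MongoClient().test
fs = GridFSBucket(my_db)
grid_in = fs.open_upload_stream_with_id(
    ObjectId(),
    "test_file",
    chunk_size_bytes=4,
    metadata={"contentType": "text/plain"})
grid_in.write(b"data I want to store!")  # write() expects bytes on Python 3
grid_in.close()  # uploaded on close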
```
Returns an instance of GridIn.

Raises NoFile if no such version of that file exists. Raises ValueError if filename is not a string.
- Parameters
  - file_id: The id to use for this file. The id must not have already been used for another file.
  - filename: The name of the file to upload.
  - chunk_size_bytes (optional): The number of bytes per chunk of this file. Defaults to the chunk_size_bytes in GridFSBucket.
  - metadata (optional): User data for the ‘metadata’ field of the files collection document. If not provided the metadata field will be omitted from the files collection document.
  - session (optional): a ClientSession
Changed in version 3.6: Added session parameter.
rename(file_id, new_filename, session=None)

Renames the stored file with the specified file_id.

For example:
```
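from pymongo import MongoClient
from gridfs import GridFSBucket

my_db = MongoClient().test
fs = GridFSBucket(my_db)
# Get _id of file to rename (the source must be bytes on Python 3)
file_id = fs.upload_from_stream("test_file", b"data I want to store!")
fs.rename(file_id, "new_test_name")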
```
Raises NoFile if no file with file_id exists.
- Parameters
  - file_id: The _id of the file to be renamed.
  - new_filename: The new name of the file.
  - session (optional): a ClientSession
Changed in version 3.6: Added session parameter.
upload_from_stream(filename, source, chunk_size_bytes=None, metadata=None, session=None)

Uploads a user file to a GridFS bucket.

Reads the contents of the user file from source and uploads it under the name filename. The source can be a byte string (bytes) or a file-like object. For example:
```
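from pymongo import MongoClient
from gridfs import GridFSBucket

my_db = MongoClient().test
fs = GridFSBucket(my_db)
file_id = fs.upload_from_stream(
    "test_file",
    b"data I want to store!",  # the source must be bytes on Python 3
    chunk_size_bytes=4,
    metadata={"contentType": "text/plain"})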
```
Returns the _id of the uploaded file.

Raises NoFile if no such version of that file exists. Raises ValueError if filename is not a string.
- Parameters
  - filename: The name of the file to upload.
  - source: The source stream of the content to be uploaded. Must be a file-like object that implements read() or a byte string (bytes).
  - chunk_size_bytes (optional): The number of bytes per chunk of this file. Defaults to the chunk_size_bytes of GridFSBucket.
  - metadata (optional): User data for the ‘metadata’ field of the files collection document. If not provided the metadata field will be omitted from the files collection document.
  - session (optional): a ClientSession
Changed in version 3.6: Added session parameter.
upload_from_stream_with_id(file_id, filename, source, chunk_size_bytes=None, metadata=None, session=None)

Uploads a user file to a GridFS bucket with a custom file id.

Reads the contents of the user file from source and uploads it under the name filename. The source can be a byte string (bytes) or a file-like object. For example:
```
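from bson.objectid import ObjectId
from pymongo import MongoClient
from gridfs import GridFSBucket

my_db = MongoClient().test
fs = GridFSBucket(my_db)
# Choose the _id up front and pass it to the upload call
file_id = ObjectId()
fs.upload_from_stream_with_id(
    file_id,
    "test_file",
    b"data I want to store!",  # the source must be bytes on Python 3
    chunk_size_bytes=4,
    metadata={"contentType": "text/plain"})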
```
Raises NoFile if no such version of that file exists. Raises ValueError if filename is not a string.
- Parameters
  - file_id: The id to use for this file. The id must not have already been used for another file.
  - filename: The name of the file to upload.
  - source: The source stream of the content to be uploaded. Must be a file-like object that implements read() or a byte string (bytes).
  - chunk_size_bytes (optional): The number of bytes per chunk of this file. Defaults to the chunk_size_bytes of GridFSBucket.
  - metadata (optional): User data for the ‘metadata’ field of the files collection document. If not provided the metadata field will be omitted from the files collection document.
  - session (optional): a ClientSession
Changed in version 3.6: Added session parameter.

Sub-modules: