Collations
See also
The API docs for collation.
Collations are a new feature in MongoDB version 3.4. They provide a set of rules to use when comparing strings that comply with the conventions of a particular language, such as Spanish or German. If no collation is specified, the server sorts strings based on a binary comparison. Many languages have specific ordering rules, and collations allow users to build applications that adhere to language-specific comparison rules.
In French, for example, the last accent in a given word determines the sorting order. The correct sorting order for the following four words in French is:
- cote < côte < coté < côté
Specifying a French collation allows users to sort string fields using the French sort order.
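To make the ordering above concrete, the sketch below imitates it in plain Python. It is a toy model only, not ICU and not what the server runs: the helper accent_key is hypothetical, it uses the code points of combining marks as stand-in accent weights, and its backwards flag merely mimics the idea of comparing accents starting from the end of the word.

```python
import unicodedata


def accent_key(word, backwards=False):
    """Toy two-level sort key: base letters first, then accents.

    Hypothetical helper for illustration; real collation is done by ICU.
    """
    base = []
    accents = []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch):
            # Attach the combining mark to the preceding base letter.
            if accents:
                accents[-1] += (ord(ch),)
        else:
            base.append(ch.lower())
            accents.append(())
    if backwards:
        # Compare accents from the last letter to the first.
        accents.reverse()
    return ("".join(base), accents)


words = ["côté", "coté", "côte", "cote"]

# Accents compared from the end of the word, as in the French example.
french_like = sorted(words, key=lambda w: accent_key(w, backwards=True))

# Accents compared from the start of the word, for contrast.
forward = sorted(words, key=accent_key)

print(french_like)  # ['cote', 'côte', 'coté', 'côté']
print(forward)      # ['cote', 'coté', 'côte', 'côté']
```

The two results differ only in where côte and coté land, which is exactly the effect of letting the last accent decide.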
Usage
Users can specify a collation for a collection, an index, or a CRUD command.
Collation Parameters:
Collations can be specified with the Collation model or with plain Python dictionaries. The structure is the same:
- Collation(locale=<string>,
- caseLevel=<bool>,
- caseFirst=<string>,
- strength=<int>,
- numericOrdering=<bool>,
- alternate=<string>,
- maxVariable=<string>,
- backwards=<bool>)
The only required parameter is locale, which the server parses as an ICU format locale ID. For example, set locale to en_US to represent US English or fr_CA to represent Canadian French.
For a complete description of the available parameters, see the MongoDB manual.
Assign a Default Collation to a Collection
The following example demonstrates how to create a new collection called contacts and assign a default collation with the fr_CA locale. This operation ensures that all queries that are run against the contacts collection use the fr_CA collation unless another collation is explicitly specified:
- from pymongo import MongoClient
- from pymongo.collation import Collation
- db = MongoClient().test
- collection = db.create_collection('contacts',
- collation=Collation(locale='fr_CA'))
Assign a Default Collation to an Index
When creating a new index, you can specify a default collation.
The following example shows how to create an index on the name field of the contacts collection, with the unique parameter enabled and a default collation with locale set to fr_CA:
- from pymongo import MongoClient
- from pymongo.collation import Collation
- contacts = MongoClient().test.contacts
- contacts.create_index('name',
- unique=True,
- collation=Collation(locale='fr_CA'))
Specify a Collation for a Query
Individual queries can specify a collation to use when sorting results. The following example demonstrates a query that runs on the contacts collection in the test database. It matches documents that contain New York in the city field, and sorts on the name field with the fr_CA collation:
- from pymongo import MongoClient
- from pymongo.collation import Collation
- collection = MongoClient().test.contacts
- docs = collection.find({'city': 'New York'}).sort('name').collation(
- Collation(locale='fr_CA'))
Other Query Types
You can use collations to control document matching rules for several different types of queries. All of the update and delete methods (update_one(), update_many(), delete_one(), etc.) support collation, and you can create query filters that employ collations to comply with any of the languages and variants available to the locale parameter.
The following example uses a collation with strength set to SECONDARY, which compares only base characters and accents and ignores differences such as case. All documents in the contacts collection with jürgen (case-insensitive) in the first_name field are updated:
- from pymongo import MongoClient
- from pymongo.collation import Collation, CollationStrength
- contacts = MongoClient().test.contacts
- result = contacts.update_many(
- {'first_name': 'jürgen'},
- {'$set': {'verified': 1}},
- collation=Collation(locale='de',
- strength=CollationStrength.SECONDARY))
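To show which values such a filter would treat as equal, the sketch below imitates the comparison levels in plain Python. It is a rough approximation only: the helpers primary_key and secondary_key are hypothetical, they rely on Unicode decomposition rather than ICU collation data, and the actual matching is performed by the server.

```python
import unicodedata


def primary_key(text):
    """Base letters only: ignores accents and case (toy model)."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    return stripped.lower()


def secondary_key(text):
    """Base letters plus accents: ignores case only (toy model)."""
    return unicodedata.normalize("NFD", text).lower()


names = ["jürgen", "Jürgen", "JÜRGEN", "jurgen", "Jurgen"]

# Accent-sensitive, case-insensitive: comparable to SECONDARY strength.
matches_secondary = [n for n in names
                     if secondary_key(n) == secondary_key("jürgen")]

# Accent- and case-insensitive: comparable to PRIMARY strength.
matches_primary = [n for n in names
                   if primary_key(n) == primary_key("jürgen")]

print(matches_secondary)  # ['jürgen', 'Jürgen', 'JÜRGEN']
print(matches_primary)    # all five names
```

In this model the update above would touch the first three names but leave jurgen and Jurgen alone, because the missing umlaut is a secondary-level difference.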