Development Checklist
The following checklist, along with the Operations Checklist, provides recommendations to help you avoid issues in your production MongoDB deployment.
Data Durability
- Ensure that your replica set includes at least three data-bearing nodes with w:majority write concern. Three data-bearing nodes are required for replica-set wide data durability.
- Ensure that all instances use journaling.
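As a rough illustration of the first item, the sketch below builds a connection string that asks for majority-acknowledged, journaled writes. The host names and the replica set name rs0 are hypothetical placeholders. The `journal` option only requests journal acknowledgment for writes; whether journaling is enabled on each instance remains a server-side setting.

```python
from urllib.parse import urlencode

# Hypothetical host names, used only for illustration.
HOSTS = ["db0.example.net:27017", "db1.example.net:27017", "db2.example.net:27017"]

def build_uri(hosts, replica_set):
    """Return a connection string that requests majority, journaled write acknowledgment."""
    options = {
        "replicaSet": replica_set,
        "w": "majority",    # wait for a majority of data-bearing voting members
        "journal": "true",  # wait until the write has reached the on-disk journal
    }
    return "mongodb://" + ",".join(hosts) + "/?" + urlencode(options)

uri = build_uri(HOSTS, "rs0")  # "rs0" is an assumed replica set name
print(uri)
```

Setting the write concern in the connection string makes it the default for every operation issued through that client, so individual writes do not have to repeat it.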
Schema Design
Data in MongoDB has a dynamic schema. Collections do not enforce document structure. This facilitates iterative development and polymorphism. Nevertheless, collections often hold documents with highly homogeneous structures. See Data Modeling Concepts for more information.
- Determine the set of collections that you will need and the indexes required to support your queries. With the exception of the _id index, you must create all indexes explicitly: MongoDB does not automatically create any indexes other than _id.
- Ensure that your schema design supports your deployment type: if you are planning to use sharded clusters for horizontal scaling, design your schema to include a strong shard key. The shard key affects read and write performance by determining how MongoDB partitions data. See: Impacts of Shard Keys on Cluster Operations for information about what qualities a shard key should possess. You cannot change the shard key once it is set.
- Ensure that your schema design does not rely on indexed arrays that grow in length without bound. Typically, best performance can be achieved when such indexed arrays have fewer than 1000 elements.
- Consider the document size limits when designing your schema. The BSON Document Size limit is 16MB per document. If you require larger documents, use GridFS.
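The array guideline above can be checked during development with a small helper like the following. This is only a sketch: the function name, the sample document, and its field names are hypothetical, and the helper cannot tell which arrays are actually indexed, so it simply flags every array at or above the threshold.

```python
ARRAY_LIMIT = 1000  # guideline from the checklist, not a hard server limit

def oversized_arrays(doc, limit=ARRAY_LIMIT, prefix=""):
    """Return dotted paths of list fields in doc that hold limit or more elements."""
    found = []
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, list):
            if len(value) >= limit:
                found.append(path)
        elif isinstance(value, dict):
            # Recurse into embedded documents (documents inside arrays are not scanned).
            found.extend(oversized_arrays(value, limit, prefix=path + "."))
    return found

# Hypothetical document with one small array and one that has grown too large.
order = {"_id": 1, "tags": ["a", "b"], "audit": {"events": list(range(1500))}}
print(oversized_arrays(order))  # ['audit.events']
```

A field that shows up here is a candidate for moving into its own collection or for capping its length at write time.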
Replication
- Use an odd number of replica set members to ensure that elections proceed successfully. If you have an even number of members, use an arbiter to ensure an odd number of votes.
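The arithmetic behind the odd-number recommendation can be sketched as follows. Electing a primary takes more than half of the votes, so an even member count raises the required majority without raising the number of failures the set can tolerate. The function names are illustrative only.

```python
def majority(voting_members):
    """Votes needed to elect a primary: more than half of the voting members."""
    return voting_members // 2 + 1

def tolerated_failures(voting_members):
    """Voting members that can be unavailable while a majority remains."""
    return voting_members - majority(voting_members)

for n in range(3, 7):
    print(n, majority(n), tolerated_failures(n))
# 3 2 1
# 4 3 1
# 5 3 2
# 6 4 2
```

Four voting members tolerate one failure, the same as three, while an even split during a network partition leaves neither side with a majority.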
Note
For the following MongoDB versions, pv1 increases the likelihood of w:1 rollbacks compared to pv0 (no longer supported in MongoDB 4.0+) for replica sets with arbiters:
- MongoDB 3.4.1
- MongoDB 3.4.0
- MongoDB 3.2.11 or earlier
See Replica Set Protocol Version.
- Ensure that your secondaries remain up-to-date by using monitoring tools and by specifying appropriate write concern.
- Do not use secondary reads to scale overall read throughput. See: Can I use more replica nodes to scale for an overview of read scaling. For information about secondary reads, see: Read Preference.
Sharding
- Ensure that your shard key distributes the load evenly on your shards. See: Shard Keys for more information.
- Use targeted operations for workloads that need to scale with the number of shards.
- For MongoDB 3.4 and earlier, read from the primary nodes for non-targeted or broadcast queries as these queries may be sensitive to stale or orphaned data.
- For MongoDB 3.6 and later, secondaries no longer return orphaned data unless using read concern "available" (which is the default read concern for reads against secondaries when not associated with causally consistent sessions).
  Starting in MongoDB 3.6, all members of the shard replica set maintain chunk metadata, allowing them to filter out orphans when not using "available". As such, non-targeted or broadcast queries that are not using "available" can be safely run on any member and will not return orphaned data.
  The "available" read concern can return orphaned documents from secondary members since it does not check for updated chunk metadata. However, if the return of orphaned documents is immaterial to an application, the "available" read concern provides the lowest latency reads possible among the various read concerns.
- Pre-split and manually balance chunks when inserting large data sets into a new non-hashed sharded collection. Pre-splitting and manually balancing enables the insert load to be distributed among the shards, increasing performance for the initial load.
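One way to plan a pre-split is to compute the chunk boundaries ahead of the load. The sketch below assumes a hypothetical integer shard key whose values are spread roughly evenly over a known range; real keys often need boundaries derived from a sample of the data instead. Each resulting boundary could then be applied with a split command such as sh.splitAt() in the shell, and the chunks distributed with sh.moveChunk().

```python
def split_points(low, high, chunks):
    """Return chunks - 1 evenly spaced boundaries over the half-open range [low, high)."""
    if chunks < 1 or high <= low:
        raise ValueError("need chunks >= 1 and high > low")
    step = (high - low) / chunks
    return [low + round(step * i) for i in range(1, chunks)]

# Hypothetical key range of 0 to 1,000,000 divided into four chunks.
points = split_points(0, 1_000_000, 4)
print(points)  # [250000, 500000, 750000]
```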
Drivers
- Make use of connection pooling. Most MongoDB drivers support connection pooling. Adjust the connection pool size to suit your use case, beginning at 110-115% of the typical number of concurrent database requests.
- Ensure that your applications handle transient write and read errors during replica set elections.
- Ensure that your applications handle failed requests and retry them if applicable. Drivers do not automatically retry failed requests.
- Use exponential backoff logic for database request retries.
- Use cursor.maxTimeMS() for reads and wtimeout for writes if you need to cap execution time for database operations.
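The pool sizing and retry items above can be sketched in driver-independent Python. Everything here is illustrative: the helper names, the default delays, and the use of ConnectionError as the retryable error are assumptions, and the random jitter is an addition that the checklist does not call for. In a real application the retryable tuple would hold the driver's own transient error types, for example PyMongo's AutoReconnect.

```python
import random
import time

def initial_pool_size(typical_concurrent_requests, percent=110):
    """Starting pool size: the given percentage of typical concurrency, rounded up."""
    return (typical_concurrent_requests * percent + 99) // 100

def retry_with_backoff(operation, retryable=(ConnectionError,), attempts=5,
                       base_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Call operation(), retrying retryable errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # The delay ceiling doubles on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))  # jitter spreads out simultaneous retries

# Hypothetical operation that fails twice, as it might during an election.
calls = {"count": 0}

def flaky_write():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("primary stepped down")
    return "ok"

result = retry_with_backoff(flaky_write, sleep=lambda seconds: None)
print(result, calls["count"])  # ok 3
print(initial_pool_size(200), initial_pool_size(200, percent=115))  # 220 230
```

Only operations that are safe to repeat should be wrapped this way, since a write that timed out may still have been applied.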