- Is sharding appropriate for a new deployment?
- Can I select a different shard key after sharding a collection?
- Why are my documents not distributed across the shards?
- How does mongos detect changes in the sharded cluster configuration?
- What does writebacklisten in the log mean?
- How does mongos use connections?
FAQ: Sharding with MongoDB
This document answers common questions about Sharding. See also the Sharding section in the manual, which provides an overview of sharding, including details on:
- Shard Keys and Considerations for Shard Key Selection
- Query Routing
- High Availability
- Data Partitioning with Chunks and Chunk Migration Process
- Troubleshoot Sharded Clusters
Is sharding appropriate for a new deployment?
Sometimes. However, if your data set fits on a single server, you should begin with an unsharded deployment, as sharding while your data set is small provides little advantage.
Can I select a different shard key after sharding a collection?
No.
There is no automatic support in MongoDB for choosing a different shard key after sharding a collection. This reality underscores the importance of choosing a good shard key. If you must change a shard key after sharding a collection, the best option is to:
- dump all data from MongoDB into an external format.
- drop the original sharded collection.
- configure sharding using a more ideal shard key.
- pre-split the shard key range to ensure an even initial distribution.
- restore the dumped data into MongoDB.
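The drop, re-shard, and pre-split steps above can be sketched in the shell roughly as follows. This is a minimal sketch rather than an official procedure: the function name `recreateShardedCollection`, its parameters, and the example namespace and split points are hypothetical. It assumes the dump has already been taken (for example with mongodump), that sharding is still enabled on the database, and that `db` is a database handle obtained through a mongos.

```javascript
// Hypothetical helper: drop a sharded collection, shard it again on a new key,
// and pre-split the key range. Take the dump BEFORE calling this.
function recreateShardedCollection(db, collName, newKey, splitPoints) {
  const ns = db.getName() + "." + collName; // full namespace, e.g. "shop.orders"
  const results = [];

  // Drop the original sharded collection.
  results.push(db.runCommand({ drop: collName }));

  // Shard the now-empty collection on the new key.
  results.push(db.adminCommand({ shardCollection: ns, key: newKey }));

  // Pre-split at each chosen key value so chunks exist before the restore.
  for (const point of splitPoints) {
    results.push(db.adminCommand({ split: ns, middle: point }));
  }
  return results;
}

// Example call from a shell connected to a mongos (all names are made up):
// recreateShardedCollection(db.getSiblingDB("shop"), "orders",
//   { customerId: 1 }, [{ customerId: 1000 }, { customerId: 2000 }]);
```

Once the commands succeed, the dumped data can be restored into the re-sharded collection. Reasonable split points depend on the actual distribution of the new key's values.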
Although you cannot select a different shard key for a sharded collection, starting in MongoDB 4.2, you can update a document’s shard key value unless the shard key field is the immutable _id field. For details on updating the shard key values, see Change a Document’s Shard Key Value.
Before MongoDB 4.2, a document’s shard key field value is immutable.
Why are my documents not distributed across the shards?
The balancer starts distributing data across the shards once the distribution of chunks has reached certain thresholds. See Migration Thresholds.
In addition, MongoDB cannot move a chunk if the number of documents in the chunk exceeds a certain number. See Maximum Number of Documents Per Chunk to Migrate and Indivisible Chunks.
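To check how chunks are currently spread before suspecting the balancer, one option is to count chunk documents per shard. The sketch below assumes that each document in the `config.chunks` collection carries a `shard` field naming the owning shard; the helper name `chunksPerShard` is hypothetical. The built-in shell helpers `sh.status()` and `db.collection.getShardDistribution()` report similar information.

```javascript
// Hypothetical helper: count chunks per shard from an array of chunk documents.
// Assumes each document has a `shard` field, as in the config.chunks collection.
function chunksPerShard(chunkDocs) {
  const counts = {};
  for (const chunk of chunkDocs) {
    counts[chunk.shard] = (counts[chunk.shard] || 0) + 1;
  }
  return counts;
}

// Example from a shell connected to a mongos (counts chunks of all collections):
// chunksPerShard(db.getSiblingDB("config").chunks.find().toArray());
```

If one shard holds nearly all chunks but the difference is still below the migration threshold, the balancer is behaving as described above.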
How does mongos detect changes in the sharded cluster configuration?
mongos instances maintain a cache of the config database that holds the metadata for the sharded cluster. mongos updates its cache lazily by issuing a request to a shard and discovering that its metadata is out of date. To force the mongos to reload its cache, you can run the flushRouterConfig command against each mongos directly.
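As a rough illustration of running flushRouterConfig against each mongos, the sketch below sends the command through a list of admin database handles, one per router. The helper name `flushAllRouters` and the host names in the usage comment are hypothetical.

```javascript
// Hypothetical helper: ask each mongos to reload its cached routing metadata.
// `adminDbs` is an array of handles to the admin database, one per mongos.
function flushAllRouters(adminDbs) {
  return adminDbs.map((adminDb) => adminDb.runCommand({ flushRouterConfig: 1 }));
}

// Example from the shell (host names are placeholders):
// flushAllRouters(["router1:27017", "router2:27017"]
//   .map((host) => new Mongo(host).getDB("admin")));
```

Connecting to each mongos separately matters here because each instance keeps its own cache.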
What does writebacklisten in the log mean?
The writeback listener is a process that opens a long poll to relay writes back from a mongod or mongos after migrations to make sure they have not gone to the wrong server. The writeback listener sends writes back to the correct server if necessary.
These messages are a key part of the sharding infrastructure and should not cause concern.
How does mongos use connections?
Each mongos instance maintains a pool of connections to the members of the sharded cluster. Client requests use these connections one at a time; i.e. requests are not multiplexed or pipelined.
When client requests complete, the mongos returns the connection to the pool. These pools do not shrink when the number of clients decreases. This can lead to an unused mongos with a large number of open connections. If the mongos is no longer in use, it is safe to restart the process to close existing connections.
To return aggregated statistics related to all of the outgoing connection pools used by the mongos, connect a mongo shell to the mongos, and run the connPoolStats command:
- db.adminCommand("connPoolStats");
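To spot a mongos that is holding many idle connections, the totals in the command's output can be condensed into a short summary. The sketch below assumes the output contains the numeric fields `totalInUse`, `totalAvailable`, and `totalCreated`, which may differ between MongoDB versions; the helper name `summarizeConnPool` is hypothetical.

```javascript
// Hypothetical helper: pick the aggregate counters out of connPoolStats output.
// Missing fields are reported as 0 because field names can vary by version.
function summarizeConnPool(stats) {
  return {
    inUse: stats.totalInUse || 0,
    available: stats.totalAvailable || 0,
    created: stats.totalCreated || 0,
  };
}

// Example from a shell connected to a mongos:
// summarizeConnPool(db.adminCommand("connPoolStats"));
```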
See the System Resource Utilization section of the UNIX ulimit Settings document.