Build Indexes on Sharded Clusters
To minimize the impact of building an index on sharded clusterswith replica set shards, use the following procedure to build indexesin a rolling fashion. For building an index on replica set deployments,see Build Indexes on Replica Sets instead.
The following procedure for sharded clusters deployments does take outone member of the shard replica set at a time. However, thisprocedure will only affect one member at a time rather than _all_secondaries at the same time.
Considerations
Unique Indexes
To create unique indexes using the followingprocedure, you must stop all writes to the collection during thisprocedure.
If you cannot stop all writes to the collection during this procedure,do not use the procedure on this page. Instead, build your unique indexon the collection by issuing db.collection.createIndex()
onthe mongos
for a sharded cluster.
Oplog Size
Ensure that your oplog is large enough to permit the indexingor re-indexing operation to complete without falling too far behind tocatch up. See the oplog sizingdocumentation for additional information.
Prerequisites
- For building unique indexes
- To create unique indexes using the followingprocedure, you must stop all writes to the collection during the indexbuild. Otherwise, you may end up with inconsistent data across thereplica set members. If you cannot stop all writes to the collection,do not use the following procedure to create unique indexes.
Warning
If you cannot stop all writes to the collection, do not use thefollowing procedure to create unique indexes.
Procedure
A. Stop the Balancer
Connect a mongo
shell to a mongos
instance in the sharded cluster, and run sh.stopBalancer()
todisable the balancer: [1]
- sh.stopBalancer()
Note
If a migration is in progress, the system will complete thein-progress migration before stopping the balancer.
To verify that the balancer is disabled, runsh.getBalancerState()
, which returns false if the balanceris disabled:
- sh.getBalancerState()
[1] | Starting in MongoDB 4.2, sh.stopBalancer() also disablesauto-splitting for the sharded cluster. |
B. Determine the Distribution of the Collection
From the mongo
shell connected to themongos
, refresh the cached routing table for thatmongos
to avoid returning stale distribution informationfor the collection. Once refreshed, rundb.collection.getShardDistribution()
for the collection youwish to build the index.
For example, if you want to an ascending indexon the records
collection in the test
database:
- db.adminCommand( { flushRouterConfig: "test.records" } );
- db.records.getShardDistribution();
The method outputs the shard distribution. For example, consider asharded cluster with 3 shards shardA
, shardB
, and shardC
and the db.collection.getShardDistribution()
returns thefollowing:
- Shard shardA at shardA/s1-mongo1.example.net:27018,s1-mongo2.example.net:27018,s1-mongo3.example.net:27018
- data : 1KiB docs : 50 chunks : 1
- estimated data per chunk : 1KiB
- estimated docs per chunk : 50
- Shard shardC at shardC/s3-mongo1.example.net:27018,s3-mongo2.example.net:27018,s3-mongo3.example.net:27018
- data : 1KiB docs : 50 chunks : 1
- estimated data per chunk : 1KiB
- estimated docs per chunk : 50
- Totals
- data : 3KiB docs : 100 chunks : 2
- Shard shardA contains 50% data, 50% docs in cluster, avg obj size on shard : 40B
- Shard shardC contains 50% data, 50% docs in cluster, avg obj size on shard : 40B
From the output, you only build the indexes for test.records
onshardA
and shardC
.
C. Build Indexes on the Shards That Contain Collection Chunks
For each shard that contains chunks for the collection, follow theprocedure to build the index on the shard.
C1. Stop One Secondary and Restart as a Standalone
For an affected shard, stop the mongod
processassociated with one of its secondary. Restart after making the followingconfiguration updates:
- Configuration File
- Command-line Options
If you are using a configuration file, make the followingconfiguration updates:
- Change the
net.port
to a different port. [2]Make a note of the original port setting as a comment. - Comment out the
replication.replSetName
option. - Comment out the
sharding.clusterRole
option. - Set parameter
skipShardingConfigurationChecks
(also available for MongoDB 3.6.3+, 3.4.11+, 3.2.19+) totrue
in thesetParameter
section. - Set parameter
disableLogicalSessionCacheRefresh
totrue
in thesetParameter
section.
For example, for a shard replica set member, theupdated configuration file will include content likethe following example:
- net:
- bindIp: localhost,<hostname(s)|ip address(es)>
- port: 27218
- # port: 27018
- #replication:
- # replSetName: shardA
- #sharding:
- # clusterRole: shardsvr
- setParameter:
- skipShardingConfigurationChecks: true
- disableLogicalSessionCacheRefresh: true
And restart:
- mongod --config <path/To/ConfigFile>
Other settings (e.g. storage.dbPath
, etc.) remain the same.
If using command-line options, make the followingconfiguration updates:
- Modify
—port
to a different port. [2] - Remove
—replSet
. - Remove
—shardsvr
if ashard member and—configsvr
if a config server member. - Set parameter
skipShardingConfigurationChecks
(alsoavailable for MongoDB 3.6.3+, 3.4.11+, 3.2.19+) totrue
in the—setParameter
option. - Set parameter
disableLogicalSessionCacheRefresh
totrue
in the—setParameter
option.
For example, restart your shard replica set memberwithout the —replSet
and—shardsvr
options.Specify a new port number and set both theskipShardingConfigurationChecks
anddisableLogicalSessionCacheRefresh
parameters totrue:
- mongod --port 27218 --setParameter skipShardingConfigurationChecks=true --setParameter disableLogicalSessionCacheRefresh=true
Other settings (e.g. —dbpath
, etc.) remain the same.
[2] | (1, 2) By running the mongod on a differentport, you ensure that the other members of the replica set and allclients will not contact the member while you are building theindex. |
C2. Build the Index
Connect directly to the mongod
instance running as astandalone on the new port and create the new index for thisinstance.
For example, connect a mongo
shell to the instance,and use the db.collection.createIndex()
method to createan ascending index on the username
field of the records
collection:
- db.records.createIndex( { username: 1 } )
C3. Restart the Program mongod as a Replica Set Member
When the index build completes, shutdown the mongod
instance. Undo the configuration changes made when starting as astandalone to return to its original configuration and restart.
Important
Be sure to remove theskipShardingConfigurationChecks
parameter anddisableLogicalSessionCacheRefresh
parameter.
For example, to restart your replica set shard member:
- Configuration File
- Command-line Options
If you are using a configuration file:
- Revert to the original port number.
- Uncomment the
replication.replSetName
. - Uncomment the
sharding.clusterRole
. - Remove parameter
skipShardingConfigurationChecks
in thesetParameter
section. - Remove parameter
disableLogicalSessionCacheRefresh
in thesetParameter
section.
- net:
- bindIp: localhost,<hostname(s)|ip address(es)>
- port: 27018
- replication:
- replSetName: shardA
- sharding:
- clusterRole: shardsvr
Other settings (e.g. storage.dbPath
, etc.) remain the same.
And restart:
- mongod --config <path/To/ConfigFile>
If you are using command-line options:
- Revert to the original port number.
- Include
—replSet
. - Include
—shardsvr
ifa shard member or—configsvr
if a config server member. - Remove parameter
skipShardingConfigurationChecks
. - Remove parameter
disableLogicalSessionCacheRefresh
.
For example:
- mongod --port 27018 --replSet shardA --shardsvr
Other settings (e.g. —dbpath
, etc.) remain the same.
Allow replication to catch up on this member.
C4. Repeat the Procedure for the Remaining Secondaries for the Shard
Once the member catches up with the other members of the set, repeatthe procedure one member at a time for the remaining secondarymembers for the shard:
- C1. Stop One Secondary and Restart as a Standalone
- C2. Build the Index
- C3. Restart the Program mongod as a Replica Set Member
C5. Build the Index on the Primary
When all the secondaries for the shard have the new index, step downthe primary for the shard, restart it as a standalone using theprocedure described above, and build the index on the former primary:
- Use the
rs.stepDown()
method in themongo
shellto step down the primary. Upon successful stepdown, the current primarybecomes a secondary and the replica set members elect a new primary. - C1. Stop One Secondary and Restart as a Standalone
- C2. Build the Index
- C3. Restart the Program mongod as a Replica Set Member
D. Repeat for the Other Affected Shards
Once you finish building the index for a shard, repeatC. Build Indexes on the Shards That Contain Collection Chunks for the otheraffected shards.
E. Restart the Balancer
Once you finish the rolling index build for the affected shards,restart the balancer.
Connect a mongo
shell to a mongos
instance in the sharded cluster, and run sh.startBalancer()
: [3]
- sh.startBalancer()
[3] | Starting in MongoDB 4.2, sh.startBalancer() also enablesauto-splitting for the sharded cluster. |