Convert a Replica Set to a Sharded Cluster

Overview

This tutorial converts a single three-member replica set to a sharded cluster with two shards. Each shard is an independent three-member replica set. This tutorial is specific to MongoDB 4.2. For other versions of MongoDB, refer to the corresponding version of the MongoDB Manual.

The procedure consists of the following steps: set up the initial replica set, restart it as a shard, deploy the config server replica set and the mongos, add the initial replica set as a shard, add a second shard, and shard a collection.

Prerequisites

This tutorial uses a total of ten servers: one server for the mongos and three servers each for the first replica set, the second replica set, and the config server replica set.

Each server must have a resolvable domain, hostname, or IP address within your system.

The tutorial uses the default data directories (e.g. /data/db and /data/configdb). Create the appropriate directories with appropriate permissions. To use different paths, see Configuration File Options.
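
For example, on a Linux system you could create the default directories with commands like the following. This is a sketch; the mongodb user and group are placeholders for whatever account runs mongod in your deployment:

   # Create the default data directories and hand ownership to the mongod user
   mkdir -p /data/db /data/configdb
   chown -R mongodb:mongodb /data/db /data/configdb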

Procedures

Set Up Initial Replica Set

This procedure creates the initial three-member replica set rs0. The replica set members are on the following hosts: mongodb0.example.net, mongodb1.example.net, and mongodb2.example.net.

Start each member of the replica set with the appropriate options.

For each member, start a mongod instance with the following settings:

  • Set the replication.replSetName option to the replica set name. If your application connects to more than one replica set, each set should have a distinct name. Some drivers group replica set connections by replica set name.

  • Set the net.bindIp option to the hostname/IP or a comma-delimited list of hostnames/IPs, and

  • Set any other settings as appropriate for your deployment.

In this tutorial, the three mongod instances are associated with the following hosts:

Replica Set Member    Hostname
Member 0              mongodb0.example.net
Member 1              mongodb1.example.net
Member 2              mongodb2.example.net

The following example specifies the replica set name and the IP binding through the --replSet and --bind_ip command-line options:

Warning

Before binding to a non-localhost (e.g. publicly accessible) IP address, ensure you have secured your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist. At minimum, consider enabling authentication and hardening network infrastructure.

   mongod --replSet "rs0" --bind_ip localhost,<hostname(s)|ip address(es)>

For <hostname(s)|ip address(es)>, specify the hostname(s) and/or IP address(es) for your mongod instance that remote clients (including the other members of the replica set) can use to connect to the instance.

Alternatively, you can specify the replica set name and the IP addresses in a configuration file:

   replication:
     replSetName: "rs0"
   net:
     bindIp: localhost,<hostname(s)|ip address(es)>

To start mongod with a configuration file, specify the configuration file's path with the --config option:

   mongod --config <path-to-config>

In production deployments, you can configure an init script to manage this process. Init scripts are beyond the scope of this document.

Connect a mongo shell to one of the mongod instances.

From the same machine where one of the mongod instances is running (in this tutorial, mongodb0.example.net), start the mongo shell. To connect to the mongod listening to localhost on the default port of 27017, simply issue:

   mongo

Depending on your path, you may need to specify the path to the mongo binary.

Initiate the replica set.

From the mongo shell, run rs.initiate() on replica set member 0.

Important

Run rs.initiate() on one and only one mongod instance for the replica set.

Tip

When possible, use a logical DNS hostname instead of an IP address, particularly when configuring replica set members or sharded cluster members. The use of logical DNS hostnames avoids configuration changes due to IP address changes.

   rs.initiate( {
      _id : "rs0",
      members: [
         { _id: 0, host: "mongodb0.example.net:27017" },
         { _id: 1, host: "mongodb1.example.net:27017" },
         { _id: 2, host: "mongodb2.example.net:27017" }
      ]
   })

MongoDB initiates a replica set, using the default replica set configuration.
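
To verify the configuration and see which member becomes the primary, you can run rs.status() from the same mongo shell. The following one-liner is a quick sketch, not part of the original procedure:

   // Print each member's host and state; one member should report PRIMARY
   rs.status().members.forEach( function (m) { print( m.name + " : " + m.stateStr ); } )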

Create and populate a new collection.

The following step adds one million documents to the collection test_collection and can take several minutes depending on your system.

To determine the primary, use rs.status().

Issue the following operations on the primary of the replica set:

   use test
   var bulk = db.test_collection.initializeUnorderedBulkOp();
   people = ["Marc", "Bill", "George", "Eliot", "Matt", "Trey", "Tracy", "Greg", "Steve", "Kristina", "Katie", "Jeff"];
   for(var i=0; i<1000000; i++){
      user_id = i;
      name = people[Math.floor(Math.random()*people.length)];
      number = Math.floor(Math.random()*10001);
      bulk.insert( { "user_id":user_id, "name":name, "number":number });
   }
   bulk.execute();

For more information on deploying a replica set, see Deploy a Replica Set.

Restart the Replica Set as a Shard

Changed in version 3.4: For MongoDB 3.4 sharded clusters, mongod instances for the shards must explicitly specify their role as a shardsvr, either via the configuration file setting sharding.clusterRole or via the command line option --shardsvr.

Note

The default port for mongod instances with the shardsvr role is 27018. To use a different port, specify the net.port setting or the --port option.
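
If you manage your deployment through configuration files rather than command-line options, the same role can be expressed as follows. This is a sketch that assumes a member of rs0 keeps its original port 27017; adjust the values for your deployment:

   sharding:
     clusterRole: shardsvr
   replication:
     replSetName: "rs0"
   net:
     port: 27017
     bindIp: localhost,<hostname(s)|ip address(es)>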

Determine the primary and secondary members.

Connect a mongo shell to one of the members and run rs.status() to determine the primary and secondary members.

Restart secondary members with the --shardsvr option.

One secondary at a time, restart each secondary with the --shardsvr option. To continue to use the same port, include the --port option. Include additional options, such as --bind_ip, as appropriate for your deployment.

Warning

Before binding to a non-localhost (e.g. publicly accessible) IP address, ensure you have secured your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist. At minimum, consider enabling authentication and hardening network infrastructure.

  1. mongod --replSet "rs0" --shardsvr --port 27017 --bind_ip localhost,<hostname(s)|ip address(es)>

Include any other options as appropriate for your deployment. Repeat this step for the other secondary.

Step down the primary.

Connect a mongo shell to the primary and step down the primary.

   rs.stepDown()

Restart the primary with the --shardsvr option.

Restart the primary with the --shardsvr option. To continue to use the same port, include the --port option.

   mongod --replSet "rs0" --shardsvr --port 27017 --bind_ip localhost,<hostname(s)|ip address(es)>

Include any other options as appropriate for your deployment.

Deploy Config Server Replica Set and mongos

This procedure deploys the three-member replica set for the config servers and the mongos.

  • The config servers use the following hosts: mongodb7.example.net, mongodb8.example.net, and mongodb9.example.net.
  • The mongos uses mongodb6.example.net.

Deploy the config servers as a three-member replica set.

Start a config server on mongodb7.example.net, mongodb8.example.net, and mongodb9.example.net. Specify the same replica set name. The config servers use the default data directory /data/configdb and the default port 27019.

Warning

Before binding to a non-localhost (e.g. publicly accessible) IP address, ensure you have secured your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist. At minimum, consider enabling authentication and hardening network infrastructure.

   mongod --configsvr --replSet configReplSet --bind_ip localhost,<hostname(s)|ip address(es)>

To modify the default settings or to include additional options specific to your deployment, see mongod or Configuration File Options.
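
Alternatively, the config server options can be placed in a configuration file. The following sketch assumes the defaults described above (port 27019 and the /data/configdb data directory), so it only sets the role, the replica set name, and the bind addresses:

   sharding:
     clusterRole: configsvr
   replication:
     replSetName: configReplSet
   net:
     bindIp: localhost,<hostname(s)|ip address(es)>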

Connect a mongo shell to one of the config servers and run rs.initiate() to initiate the replica set.

Important

Run rs.initiate() on one and only one mongod instance for the replica set.

Tip

When possible, use a logical DNS hostname instead of an IP address, particularly when configuring replica set members or sharded cluster members. The use of logical DNS hostnames avoids configuration changes due to IP address changes.

   rs.initiate( {
      _id: "configReplSet",
      configsvr: true,
      members: [
         { _id: 0, host: "mongodb7.example.net:27019" },
         { _id: 1, host: "mongodb8.example.net:27019" },
         { _id: 2, host: "mongodb9.example.net:27019" }
      ]
   } )

Start a mongos instance.

On mongodb6.example.net, start the mongos, specifying the config server replica set name followed by a slash / and at least one of the config server hostnames and ports.

   mongos --configdb configReplSet/mongodb7.example.net:27019,mongodb8.example.net:27019,mongodb9.example.net:27019 --bind_ip localhost,<hostname(s)|ip address(es)>
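
The mongos can also read these settings from a configuration file. A minimal sketch, assuming the same config server replica set as above:

   sharding:
     configDB: configReplSet/mongodb7.example.net:27019,mongodb8.example.net:27019,mongodb9.example.net:27019
   net:
     bindIp: localhost,<hostname(s)|ip address(es)>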

Add Initial Replica Set as a Shard

The following procedure adds the initial replica set rs0 as a shard.

Connect a mongo shell to the mongos.

   mongo mongodb6.example.net:27017/admin

Add the shard.

Add a shard to the cluster with the sh.addShard() method:

  1. sh.addShard( "rs0/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net:27017" )

Add Second Shard

The following procedure deploys a new replica set rs1 for the second shard and adds it to the cluster. The replica set members are on the following hosts: mongodb3.example.net, mongodb4.example.net, and mongodb5.example.net.

Changed in version 3.4: For MongoDB 3.4 sharded clusters, mongod instances for the shards must explicitly specify their role as a shardsvr, either via the configuration file setting sharding.clusterRole or via the command line option --shardsvr.

Note

The default port for mongod instances with the shardsvr role is 27018. To use a different port, specify the net.port setting or the --port option.

Start each member of the replica set with the appropriate options.

For each member, start a mongod, specifying the replica set name through the --replSet option and its role as a shard with the --shardsvr option. Specify additional options, such as --bind_ip, as appropriate.

Warning

Before binding to a non-localhost (e.g. publicly accessible) IP address, ensure you have secured your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist. At minimum, consider enabling authentication and hardening network infrastructure.

For replication-specific parameters, see Replication Options.

  1. mongod --replSet "rs1" --shardsvr --port 27017 --bind_ip localhost,<hostname(s)|ip address(es)>

Repeat this step for the other two members of the rs1 replica set.

Connect a mongo shell to a replica set member.

Connect a mongo shell to one member of the replica set (e.g. mongodb3.example.net):

   mongo mongodb3.example.net

Initiate the replica set.

From the mongo shell, run rs.initiate() to initiate the replica set:

Important

Run rs.initiate() on one and only one mongod instance for the replica set.

Tip

When possible, use a logical DNS hostname instead of an IP address, particularly when configuring replica set members or sharded cluster members. The use of logical DNS hostnames avoids configuration changes due to IP address changes.

   rs.initiate( {
      _id : "rs1",
      members: [
         { _id: 0, host: "mongodb3.example.net:27017" },
         { _id: 1, host: "mongodb4.example.net:27017" },
         { _id: 2, host: "mongodb5.example.net:27017" }
      ]
   })

Connect a mongo shell to the mongos.

   mongo mongodb6.example.net:27017/admin

Add the shard.

In a mongo shell connected to the mongos, add the shard to the cluster with the sh.addShard() method:

  1. sh.addShard( "rs1/mongodb3.example.net:27017,mongodb4.example.net:27017,mongodb5.example.net:27017" )

Shard a Collection

Connect a mongo shell to the mongos.

   mongo mongodb6.example.net:27017/admin

Enable sharding for a database.

Before you can shard a collection, you must first enable sharding for the collection's database. Enabling sharding for a database does not redistribute data but makes it possible to shard the collections in that database.

The following operation enables sharding on the test database:

   sh.enableSharding( "test" )

mongos uses"majority" for theenableSharding command and its helpersh.enableSharding().

The operation returns the status of the operation:

  1. { "ok" : 1 }

Determine the shard key.

For the collection to shard, determine the shard key. The shard key determines how MongoDB distributes the documents between shards. Good shard keys:

  • have values that are evenly distributed among all documents,
  • group documents that are often accessed at the same time into contiguous chunks, and
  • allow for effective distribution of activity among shards.

Once you shard a collection with the specified shard key, you cannot change the shard key. For more information on shard keys, see Shard Keys.

This procedure will use the number field as the shard key for test_collection.
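
Before settling on a shard key, it can be useful to inspect how its values are distributed. The following aggregation is a sketch, not part of the original procedure; run it against the rs0 primary to count documents for the most frequent values of the candidate number field:

   use test
   // Count documents per value of the candidate shard key; show the five most frequent values
   db.test_collection.aggregate( [
      { $group: { _id: "$number", count: { $sum: 1 } } },
      { $sort: { count: -1 } },
      { $limit: 5 }
   ] )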

Create an index on the shard key.

Before sharding a non-empty collection, create an index on the shard key.

   use test
   db.test_collection.createIndex( { number : 1 } )

Shard the collection.

In the test database, shard the test_collection, specifying number as the shard key.

   use test
   sh.shardCollection( "test.test_collection", { "number" : 1 } )

mongos uses "majority" for thewrite concern of theshardCollection command and its helpersh.shardCollection().

The method returns the status of the operation:

  1. { "collectionsharded" : "test.test_collection", "ok" : 1 }

The balancer redistributes chunks of documents when it next runs. As clients insert additional documents into this collection, the mongos routes the documents to the appropriate shard.

Confirm the shard is balancing.

To confirm balancing activity, run db.stats() or db.printShardingStatus() in the test database.

   use test
   db.stats()
   db.printShardingStatus()

Example output of db.stats():

   {
     "raw" : {
       "rs0/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net:27017" : {
         "db" : "test",
         "collections" : 1,
         "views" : 0,
         "objects" : 640545,
         "avgObjSize" : 70.83200339949052,
         "dataSize" : 45370913,
         "storageSize" : 50438144,
         "numExtents" : 0,
         "indexes" : 2,
         "indexSize" : 24502272,
         "ok" : 1,
         "$gleStats" : {
           "lastOpTime" : Timestamp(0, 0),
           "electionId" : ObjectId("7fffffff0000000000000003")
         }
       },
       "rs1/mongodb3.example.net:27017,mongodb4.example.net:27017,mongodb5.example.net:27017" : {
         "db" : "test",
         "collections" : 1,
         "views" : 0,
         "objects" : 359455,
         "avgObjSize" : 70.83259935179647,
         "dataSize" : 25461132,
         "storageSize" : 8630272,
         "numExtents" : 0,
         "indexes" : 2,
         "indexSize" : 8151040,
         "ok" : 1,
         "$gleStats" : {
           "lastOpTime" : Timestamp(0, 0),
           "electionId" : ObjectId("7fffffff0000000000000001")
         }
       }
     },
     "objects" : 1000000,
     "avgObjSize" : 70,
     "dataSize" : 70832045,
     "storageSize" : 59068416,
     "numExtents" : 0,
     "indexes" : 4,
     "indexSize" : 32653312,
     "fileSize" : 0,
     "extentFreeList" : {
       "num" : 0,
       "totalSize" : 0
     },
     "ok" : 1
   }

Example output of db.printShardingStatus():

   --- Sharding Status ---
     sharding version: {
       "_id" : 1,
       "minCompatibleVersion" : 5,
       "currentVersion" : 6,
       "clusterId" : ObjectId("5be0a488039b1964a7208c60")
     }
     shards:
       { "_id" : "rs0", "host" : "rs0/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net:27017", "state" : 1 }
       { "_id" : "rs1", "host" : "rs1/mongodb3.example.net:27017,mongodb4.example.net:27017,mongodb5.example.net:27017", "state" : 1 }
     active mongoses:
       "3.6.8" : 1
     autosplit:
       Currently enabled: yes
     balancer:
       Currently enabled: yes
       Currently running: yes
       Collections with active migrations:
         test.test_collection started at Mon Nov 05 2018 15:16:45 GMT-0500
       Failed balancer rounds in last 5 attempts: 0
       Migration Results for the last 24 hours:
         1 : Success
     databases:
       { "_id" : "test", "primary" : "rs0", "partitioned" : true }
         test.test_collection
           shard key: { "number" : 1 }
           unique: false
           balancing: true
           chunks:
             rs0 5
             rs1 1
           { "number" : { "$minKey" : 1 } } -->> { "number" : 1195 } on : rs1 Timestamp(2, 0)
           { "number" : 1195 } -->> { "number" : 2394 } on : rs0 Timestamp(2, 1)
           { "number" : 2394 } -->> { "number" : 3596 } on : rs0 Timestamp(1, 5)
           { "number" : 3596 } -->> { "number" : 4797 } on : rs0 Timestamp(1, 6)
           { "number" : 4797 } -->> { "number" : 9588 } on : rs0 Timestamp(1, 1)
           { "number" : 9588 } -->> { "number" : { "$maxKey" : 1 } } on : rs0 Timestamp(1, 2)

Run these commands a second time to demonstrate that chunks are migrating from rs0 to rs1.
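
Another way to watch the data spread across the shards is the getShardDistribution() helper, which reports per-shard data size, document counts, and chunk counts for the collection. This quick check is a sketch, not part of the original procedure; run it from a mongo shell connected to the mongos:

   use test
   // Reports per-shard totals and percentages for test_collection
   db.test_collection.getShardDistribution()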