- Restore a Sharded Cluster
- Considerations
- A. (Optional) Review Replica Set Configurations
- B. Prepare the Target Host for Restoration
- C. Restore Config Server Replica Set
- Restore the CSRS primary mongod data files.
- Drop the local database.
- For any planned or completed shard hostname or replica set name changes, update the metadata in config.shards.
- Restart the mongod as a new single-node replica set.
- Initiate the new replica set.
- Add additional replica set members.
- Configure any additional required replication settings.
- D. Restore Each Shard Replica Set
- Restore the shard primary mongod data files.
- Create a temporary user with the __system role.
- Drop the local database.
- Remove the minOpTimeRecovery document from the admin.system.version collection.
- Optional: For any CSRS hostname or replica set name changes, update shard metadata in each shard’s identity document.
- Restart the mongod as a new single-node replica set.
- Initiate the new replica set.
- Add additional replica set members.
- Configure any additional required replication settings.
- Remove the temporary privileged user.
- E. Restart Each mongos
- F. Validate Cluster Accessibility
Restore a Sharded Cluster
This procedure restores a sharded cluster from an existing backup snapshot, such as LVM snapshots. The source and target sharded cluster must have the same number of shards. For information on creating LVM snapshots for all components of a sharded cluster, see Back Up a Sharded Cluster with File System Snapshots.
Note
mongodump and mongorestore cannot be part of a backup strategy for 4.2+ sharded clusters that have sharded transactions in progress, as these tools cannot guarantee the atomicity of data across the shards.
For 4.2+ sharded clusters with in-progress sharded transactions, see the following for coordinated backup and restore processes that maintain the atomicity guarantees of transactions across shards:
For MongoDB 4.0 and earlier deployments, refer to the corresponding versions of the manual. For example:
Considerations
For encrypted storage engines that use the AES256-GCM encryption mode, AES256-GCM requires that every process use a unique counter block value with the key.

For encrypted storage engines configured with the AES256-GCM cipher:
- Restoring from Hot Backup
- Starting in 4.2, if you restore from files taken via "hot" backup (i.e. the mongod is running), MongoDB can detect "dirty" keys on startup and automatically roll over the database key to avoid IV (Initialization Vector) reuse.
- Restoring from Cold Backup
- However, if you restore from files taken via "cold" backup (i.e. the mongod is not running), MongoDB cannot detect "dirty" keys on startup, and reuse of IVs voids confidentiality and integrity guarantees.
Starting in 4.2, to avoid reuse of the keys after restoring from a cold filesystem snapshot, MongoDB adds a new command-line option --eseDatabaseKeyRollover. When started with the --eseDatabaseKeyRollover option, the mongod instance rolls over the database keys configured with the AES256-GCM cipher and exits.
Tip
- In general, if using filesystem-based backups for MongoDB Enterprise 4.2+, use the "hot" backup feature, if possible.
- For MongoDB Enterprise versions 4.0 and earlier, if you use AES256-GCM encryption mode, do not make copies of your data files or restore from filesystem snapshots ("hot" or "cold").
A. (Optional) Review Replica Set Configurations
This procedure initiates a new replica set for the Config Server Replica Set (CSRS) and each shard replica set using the default configuration. To use a different replica set configuration for your restored CSRS and shards, you must reconfigure the replica set(s).
If your source cluster is healthy and accessible, connect a mongo shell to the primary replica set member in each replica set and run rs.conf() to view the replica configuration document.
If you cannot access one or more components of the source sharded cluster, please reference any existing internal documentation to reconstruct the configuration requirements for each shard replica set and the config server replica set.
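When the source cluster is reachable, the review step can be scripted. The sketch below uses hypothetical hostnames and only prints the command you would run against each primary; in a real review, replace `cat` with a `mongo --host "$primary"` connection.

```shell
# Hypothetical primaries; substitute your CSRS and shard replica set members.
review=$(for primary in config1.example.net:27019 repl1.example.net:27018; do
  echo "== $primary =="
  # Real usage: mongo --host "$primary" <<'EOF'
  cat <<'EOF'
rs.conf()
EOF
done)
printf '%s\n' "$review"
```

Save each rs.conf() output; step "Configure any additional required replication settings" later in this procedure relies on it.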
B. Prepare the Target Host for Restoration
- Storage Space Requirements
- Ensure the target host hardware has sufficient open storage space for the restored data. If the target host contains existing sharded cluster data that you want to keep, ensure that you have enough storage space for both the existing data and the restored data.
- LVM Requirements
- For LVM snapshots, you must have at least one LVM-managed volume group and a logical volume with enough free space for the extracted snapshot data.
- MongoDB Version Requirements
- Ensure the target host and source host have the same MongoDB Server version. To check the version of MongoDB available on a host machine, run mongod --version from the terminal or shell.
For complete documentation on installation, see Install MongoDB.
- Shut Down Running MongoDB Processes
- If restoring to an existing cluster, shut down the mongod or mongos process on the target host.
For hosts running mongos, connect a mongo shell to the mongos and run db.shutdownServer() from the admin database:
- use admin
- db.shutdownServer()
For hosts running a mongod, connect a mongo shell to the mongod and run db.isMaster():
If ismaster is false, the mongod is a secondary member of a replica set. You can shut it down by running db.shutdownServer() from the admin database.

If ismaster is true, the mongod is the primary member of a replica set. Shut down the secondary members of the replica set first. Use rs.status() to identify the other members of the replica set.
The primary automatically steps down after it detects a majority of members are offline. After it steps down (db.isMaster returns ismaster: false), you can safely shut down the mongod.
- Prepare Data Directory
- Create a directory on the target host for the restored database files. Ensure that the user that runs the mongod has read, write, and execute permissions for all files and subfolders in that directory:
- mkdir /path/to/mongodb
- chown -R mongodb:mongodb /path/to/mongodb
- chmod -R 770 /path/to/mongodb
Substitute /path/to/mongodb with the path to the data directory you created.
- Prepare Log Directory
- Create a directory on the target host for the mongod log files. Ensure that the user that runs the mongod has read, write, and execute permissions for all files and subfolders in that directory:
- mkdir /path/to/mongodb/logs
- chown -R mongodb:mongodb /path/to/mongodb/logs
- chmod -R 770 /path/to/mongodb/logs
Substitute /path/to/mongodb/logs with the path to the log directory you created.
- Create Configuration File
- This procedure assumes starting a mongod with a configuration file.
Create the configuration file in your preferred location. Ensure that the user that runs the mongod has read and write permissions on the configuration file:
- touch /path/to/mongodb/mongod.conf
- chown mongodb:mongodb /path/to/mongodb/mongod.conf
- chmod 644 /path/to/mongodb/mongod.conf
Open the configuration file in your preferred text editor and modify it as required by your deployment. Alternatively, if you have access to the original configuration file for the mongod, copy it to your preferred location on the target host.
Important
Validate that your configuration file includes the followingsettings:
- storage.dbPath must be set to the path to your preferred data directory.
- systemLog.path must be set to the path to your preferred log directory.
- net.bindIp must include the IP address of the host machine.
- replication.replSetName has the same value across each member in any given replica set.
- sharding.clusterRole has the same value across each member in any given replica set.
C. Restore Config Server Replica Set
Restore the CSRS primary mongod data files.
Select the tab that corresponds to your preferred backup method:
- LVM Snapshot
- Other Backup Files (NOT mongodump)
- Mount the LVM snapshot on the target host machine. The specific steps for mounting an LVM snapshot depend on your LVM configuration.
The following example assumes an LVM snapshot created using the Create a Snapshot step in the Back Up and Restore with Filesystem Snapshots procedure.
- lvcreate --size 250GB --name mongod-datafiles-snapshot vg0
- gzip -d -c mongod-datafiles-snapshot.gz | dd of=/dev/vg0/mongod-datafiles-snapshot
- mount /dev/vg0/mongod-datafiles-snapshot /snap/mongodb
This example may not apply to all possible LVM configurations. Refer to the LVM documentation for your system for more complete guidance on LVM restoration.
- Copy the mongod data files from the snapshot mount to the data directory created in B. Prepare the Target Host for Restoration:
- cp -a /snap/mongodb/path/to/mongodb /path/to/mongodb
The -a option recursively copies the contents of the source path to the destination path while preserving folder and file permissions.
- Comment out or omit the following configuration file settings:
- #replication
- # replSetName: myCSRSName
- #sharding
- # clusterRole: configsvr
To start the mongod using a configuration file, specify the --config option in the command line, specifying the full path to the configuration file:
- mongod --config /path/to/mongodb/mongod.conf
If you have mongod configured to run as a system service, start it using the recommended process for your system service manager.
After the mongod starts, connect to it using the mongo shell.
Make the data files stored in your selected backup medium accessible on the host. This may require mounting the backup volume, opening the backup in a software utility, or using another tool to extract the data to disk. Refer to the documentation for your preferred backup tool for instructions on accessing the data contained in the backup.
Copy the mongod data files from the backup data location to the data directory created in B. Prepare the Target Host for Restoration:
- cp -a /backup/mongodb/path/to/mongodb /path/to/mongodb
The -a option recursively copies the contents of the source path to the destination path while preserving folder and file permissions.
- Comment out or omit the following configuration file settings:
- #replication
- # replSetName: myCSRSName
- #sharding
- # clusterRole: configsvr
- To start the mongod using a configuration file, specify the --config option in the command line, specifying the full path to the configuration file:
- mongod --config /path/to/mongodb/mongod.conf
Cloud Manager or Ops Manager Only
If performing a manual restoration of a Cloud Manager or Ops Manager backup, you must specify the disableLogicalSessionCacheRefresh server parameter prior to startup.
- mongod --config /path/to/mongodb/mongod.conf \
- --setParameter disableLogicalSessionCacheRefresh=true
If you have mongod configured to run as a system service, start it using the recommended process for your system service manager.
After the mongod starts, connect to it using the mongo shell.
Drop the local database.
Use db.dropDatabase() to drop the local database:
- use local
- db.dropDatabase()
For any planned or completed shard hostname or replica set name changes, update the metadata in config.shards.
You can skip this step if all of the following are true:
- No shard member host machine hostname has or will change during this procedure.
- No shard replica set name has or will change during this procedure.
Issue the following find() method on the shards collection in the Config Database. Replace <shardName> with the name of the shard. By default the shard name is its replica set name. If you added the shard using the addShard command and specified a custom name, you must specify that name for <shardName>.
- use config
- db.shards.find( { "_id" : "<shardName>" } )
This operation returns a document that resembles the following:
- {
- "_id" : "shard1",
- "host" : "myShardName/alpha.example.net:27018,beta.example.net:27018,charlie.example.net:27018",
- "state" : 1
- }
Important
The _id value must match the shardName value in the _id : "shardIdentity" document on the corresponding shard. When restoring the shards later in this procedure, validate that the _id field in shards matches the shardName value on the shard.
Use the updateOne() method to update the host string to reflect the planned replica set name and hostname list for the shard. For example, the following operation updates the host connection string for the shard with "_id" : "shard1":
- db.shards.updateOne(
- { "_id" : "shard1" },
- { $set : { "host" : "myNewShardName/repl1.example.net:27018,repl2.example.net:27018,repl3.example.net:27018" } }
- )
Repeat this process until all shard metadata accurately reflects the planned replica set name and hostname list for each shard in the cluster.
Note
If you do not know the shard name, issue the find() method on the shards collection with an empty filter document {}:
- use config
- db.shards.find({})
Each document in the result set represents one shard in the cluster. For each document, check the host field for a connection string that matches the shard in question, i.e. a matching replica set name and member hostname list. Use the _id of that document in place of <shardName>.
Restart the mongod as a new single-node replica set.
Shut down the mongod. Uncomment or add the following configuration file options:
- replication
- replSetName: myNewCSRSName
- sharding
- clusterRole: configsvr
If you want to change the replica set name, you must update the replSetName field with the new name before proceeding.
Start the mongod with the updated configuration file:
- mongod --config /path/to/mongodb/mongod.conf
If you have mongod configured to run as a system service, start it using the recommended process for your system service manager.
After the mongod starts, connect to it using the mongo shell.
Initiate the new replica set.
Initiate the replica set using rs.initiate() with the default settings.
- rs.initiate()
Once the operation completes, use rs.status() to check that the member has become the primary.
Add additional replica set members.
For each replica set member in the CSRS, start the mongod on its host machine. Once you have started up all remaining members of the cluster successfully, connect a mongo shell to the primary replica set member. From the primary, use the rs.add() method to add each member of the replica set. Include the replica set name as the prefix, followed by the hostname and port of the member's mongod process:
- rs.add("myNewCSRSName/config2.example.net:27019")
- rs.add("myNewCSRSName/config3.example.net:27019")
If you want to add the member with specific replica member configuration settings, you can pass a document to rs.add() that defines the member hostname and any members[n] settings your deployment requires.
- rs.add(
- {
- "host" : "myNewCSRSName/config2.example.net:27019",
- priority: <int>,
- votes: <int>,
- tags: <document>
- }
- )
Each new member performs an initial sync to catch up to the primary. Depending on factors such as the amount of data to sync, your network topology and health, and the power of each host machine, initial sync may take an extended period of time to complete.
The replica set may elect a new primary while you add additional members. Use rs.status() to identify which member is the current primary. You can only run rs.add() from the primary.
Configure any additional required replication settings.
The rs.reconfig() method updates the replica set configuration based on a configuration document passed in as a parameter. You must run rs.reconfig() against the primary member of the replica set.
Reference the original configuration file output of the replica set as identified in step A. Review Replica Set Configurations and apply settings as needed.
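For example, restoring a non-default priority or vote count recorded in step A follows the standard read-modify-write pattern. The member indexes and values below are hypothetical, and the block prints the snippet rather than executing it; pipe it to the primary's mongo shell in practice.

```shell
# Sketch of the rs.conf() / rs.reconfig() read-modify-write pattern.
reconfig_js=$(cat <<'EOF'
cfg = rs.conf()
cfg.members[1].priority = 0.5
cfg.members[2].priority = 0
cfg.members[2].votes = 0
rs.reconfig(cfg)
EOF
)
printf '%s\n' "$reconfig_js"
```

Note that a member with votes: 0 must also have priority: 0, which is why the sketch sets both.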
D. Restore Each Shard Replica Set
Restore the shard primary mongod data files.
Select the tab that corresponds to your preferred backup method:
- LVM Snapshot
- Other Backup Files (NOT mongodump)
- Mount the LVM snapshot on the target host machine. The specific steps for mounting an LVM snapshot depend on your LVM configuration.
The following example assumes an LVM snapshot created using the Create a Snapshot step in the Back Up and Restore with Filesystem Snapshots procedure.
- lvcreate --size 250GB --name mongod-datafiles-snapshot vg0
- gzip -d -c mongod-datafiles-snapshot.gz | dd of=/dev/vg0/mongod-datafiles-snapshot
- mount /dev/vg0/mongod-datafiles-snapshot /snap/mongodb
This example may not apply to all possible LVM configurations. Refer to the LVM documentation for your system for more complete guidance on LVM restoration.
- Copy the mongod data files from the snapshot mount to the data directory created in B. Prepare the Target Host for Restoration:
- cp -a /snap/mongodb/path/to/mongodb /path/to/mongodb
The -a option recursively copies the contents of the source path to the destination path while preserving folder and file permissions.
- Comment out or omit the following configuration file settings:
- #replication
- # replSetName: myShardName
- #sharding
- # clusterRole: shardsvr
To start the mongod using a configuration file, specify the --config option in the command line, specifying the full path to the configuration file:
- mongod --config /path/to/mongodb/mongod.conf
If you have mongod configured to run as a system service, start it using the recommended process for your system service manager.
After the mongod starts, connect to it using the mongo shell.
Make the data files stored in your selected backup medium accessible on the host. This may require mounting the backup volume, opening the backup in a software utility, or using another tool to extract the data to disk. Refer to the documentation for your preferred backup tool for instructions on accessing the data contained in the backup.
Copy the mongod data files from the backup data location to the data directory created in B. Prepare the Target Host for Restoration:
- cp -a /backup/mongodb/path/to/mongodb /path/to/mongodb
The -a option recursively copies the contents of the source path to the destination path while preserving folder and file permissions.
- Comment out or omit the following configuration file settings:
- #replication
- # replSetName: myShardName
- #sharding
- # clusterRole: shardsvr
- To start the mongod using a configuration file, specify the --config option in the command line, specifying the full path to the configuration file:
- mongod --config /path/to/mongodb/mongod.conf
Cloud Manager or Ops Manager Only
If performing a manual restoration of a Cloud Manager or Ops Manager backup, you must specify the disableLogicalSessionCacheRefresh server parameter prior to startup:
- mongod --config /path/to/mongodb/mongod.conf \
- --setParameter disableLogicalSessionCacheRefresh=true
If you have mongod configured to run as a system service, start it using the recommended process for your system service manager.
After the mongod starts, connect to it using the mongo shell.
Create a temporary user with the __system role.
During this procedure you will modify documents in the admin.system.version collection. For clusters enforcing authentication, only the __system role grants permission to modify this collection. You can skip this step if the cluster does not enforce authentication.
Warning
The __system role entitles its holder to take any action against any object in the database. This procedure includes instructions for removing the user created in this step. Do not keep this user active beyond the scope of this procedure.
Consider creating this user with the clientSource authentication restriction configured such that only the specified hosts can authenticate as the privileged user.
- Authenticate as a user with the userAdmin role on the admin database or the userAdminAnyDatabase role:
- use admin
- db.auth("myUserAdmin","mySecurePassword")
- Create a user with the __system role:
- db.createUser(
- {
- user: "mySystemUser",
- pwd: "<replaceMeWithAStrongPassword>",
- roles: [ "__system" ]
- }
- )
Passwords should be random, long, and complex to ensure system security and to prevent or delay malicious access.
- Authenticate as the privileged user:
- db.auth("mySystemUser","<replaceMeWithAStrongPassword>")
Drop the local database.
Use db.dropDatabase() to drop the local database:
- use local
- db.dropDatabase()
Remove the minOpTimeRecovery document from the admin.system.version collection.
Issue the following deleteOne() method on the system.version collection in the admin database:
- use admin
- db.system.version.deleteOne( { _id: "minOpTimeRecovery" } )
Optional: For any CSRS hostname or replica set name changes, update shard metadata in each shard’s identity document.
You can skip this step if all of the following are true:
- The hostnames for any CSRS host did not change during this procedure.
- The CSRS replica set name did not change during this procedure.
The system.version collection on the admin database contains metadata related to the shard, including the CSRS connection string. If either the CSRS name or any member hostnames changed while restoring the CSRS, you must update this metadata.
Issue the following find() method on the system.version collection in the admin database:
- use admin
- db.system.version.find( {"_id" : "shardIdentity" } )
The find() method returns a document that resembles the following:
- {
- "_id" : "shardIdentity",
- "clusterId" : ObjectId("2bba123c6eeedcd192b19024"),
- "shardName" : "shard1",
- "configsvrConnectionString" : "myCSRSName/alpha.example.net:27019,beta.example.net:27019,charlie.example.net:27019" }
The following updateOne method updates the document such that the configsvrConnectionString represents the most current CSRS connection string:
- db.system.version.updateOne(
- { "_id" : "shardIdentity" },
- { $set :
- { "configsvrConnectionString" : "myNewCSRSName/config1.example.net:27019,config2.example.net:27019,config3.example.net:27019"}
- }
- )
Important
The shardName value must match the _id value in the shards collection on the CSRS. Validate that the metadata on the CSRS matches the metadata for the shard. Refer to substep 3 in the C. Restore Config Server Replica Set portion of this procedure for instructions on viewing the CSRS metadata.
Restart the mongod as a new single-node replica set.
Shut down the mongod. Uncomment or add the following configuration file options:
- replication
- replSetName: myNewShardName
- sharding
- clusterRole: shardsvr
If you want to change the replica set name, you must update the replSetName field with the new name before proceeding.
Start the mongod with the updated configuration file:
- mongod --config /path/to/mongodb/mongod.conf
If you have mongod configured to run as a system service, start it using the recommended process for your system service manager.
After the mongod starts, connect to it using the mongo shell.
Initiate the new replica set.
Initiate the replica set using rs.initiate() with the default settings.
- rs.initiate()
Once the operation completes, use rs.status() to check that the member has become the primary.
Add additional replica set members.
For each replica set member in the shard replica set, start the mongod on its host machine. Once you have started up all remaining members of the cluster successfully, connect a mongo shell to the primary replica set member. From the primary, use the rs.add() method to add each member of the replica set. Include the replica set name as the prefix, followed by the hostname and port of the member's mongod process:
- rs.add("myNewShardName/repl2.example.net:27018")
- rs.add("myNewShardName/repl3.example.net:27018")
If you want to add the member with specific replica member configuration settings, you can pass a document to rs.add() that defines the member hostname and any members[n] settings your deployment requires.
- rs.add(
- {
- "host" : "myNewShardName/repl2.example.net:27018",
- priority: <int>,
- votes: <int>,
- tags: <document>
- }
- )
Each new member performs an initial sync to catch up to the primary. Depending on factors such as the amount of data to sync, your network topology and health, and the power of each host machine, initial sync may take an extended period of time to complete.
The replica set may elect a new primary while you add additional members. Use rs.status() to identify which member is the current primary. You can only run rs.add() from the primary.
Configure any additional required replication settings.
The rs.reconfig() method updates the replica set configuration based on a configuration document passed in as a parameter. You must run rs.reconfig() against the primary member of the replica set.
Reference the original configuration file output of the replica set as identified in step A. Review Replica Set Configurations and apply settings as needed.
Remove the temporary privileged user.
For clusters enforcing authentication, remove the privileged user created earlier in this procedure:
- Authenticate as a user with the userAdmin role on the admin database or the userAdminAnyDatabase role:
- use admin
- db.auth("myUserAdmin","mySecurePassword")
- Delete the privileged user:
- db.dropUser("mySystemUser")
E. Restart Each mongos
Restart each mongos in the cluster.
- mongos --config /path/to/config/mongos.conf
Include all other command line options as required by your deployment.
If the CSRS replica set name or any member hostname changed, update the mongos configuration file setting sharding.configDB with the updated configuration server connection string:
- sharding:
- configDB: "myNewCSRSName/config1.example.net:27019,config2.example.net:27019,config3.example.net:27019"
F. Validate Cluster Accessibility
Connect a mongo shell to one of the mongos processes for the cluster. Use sh.status() to check the overall cluster status. If sh.status() indicates that the balancer is not running, use sh.startBalancer() to restart the balancer. [1]
To confirm that all shards are accessible and communicating, insert test data into a temporary sharded collection. Confirm that data is being split and migrated between each shard in your cluster. You can connect a mongo shell to each shard primary and use db.collection.find() to validate that the data was sharded as expected.
[1] Starting in MongoDB 4.2, sh.startBalancer() also enables auto-splitting for the sharded cluster.