Replace a master node

ENTERPRISE

Replacing a master node in an existing DC/OS cluster

You can replace a master node in an existing DC/OS cluster, but replace only one master at a time. The following steps summarize how to replace a master node in a DC/OS cluster.

To replace a master node:

  1. Back up ZooKeeper.

  2. Back up the DC/OS identity and access management CockroachDB® database to a file by running a command similar to the following on the master node:

    dcos-shell iam-database-backup > ~/iam-backup.sql

    For more information about backing up the DC/OS identity and access management CockroachDB database, see How do I backup the IAM database?

  3. Back up /var/lib/dcos/exhibitor-tls-artifacts if it exists.

    tar czf exhibitor-tls-artifacts.tar.gz /var/lib/dcos/exhibitor-tls-artifacts

  4. Shut down the master node you want to replace.

  5. Add the new master node to replace the one taken offline in the previous step.

    Static master discovery

    If you have configured static master discovery in your config.yaml file (master_discovery: static):

    • Verify that the new server has the same internal IP address as the old master node.
    • Verify that the old server is completely unreachable from the cluster.
    • Copy exhibitor-tls-artifacts.tar.gz to the new master node.

      scp exhibitor-tls-artifacts.tar.gz root@<new-master-host>:/root

    • Extract the archive on the new master node.

      tar xzf /root/exhibitor-tls-artifacts.tar.gz -C /

    • Install the new master as you would normally.

    Dynamic master discovery

    If you have configured dynamic master discovery in your config.yaml file (master_discovery: master_http_loadbalancer):

    • Install the new master as you would normally.

  6. Check that the new master is healthy.

    IMPORTANT: This step is required. Be sure to confirm that the new master has joined the cluster successfully before replacing any additional master nodes or performing any additional administrative tasks.

    To validate that the master node replacement completed successfully, follow the steps to Validate the upgrade as described in Upgrading a master.
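
Step 3 applies only when /var/lib/dcos/exhibitor-tls-artifacts exists, so the backup can be wrapped in a small check. The following is a minimal sketch, not part of DC/OS: the function name backup_dir_if_present and its two arguments are hypothetical, chosen for illustration. It archives a directory when it is present and reports a skip otherwise.

```shell
#!/bin/sh
# Hypothetical helper (not a DC/OS command): archive a directory only if it exists.
#   $1 = directory to archive
#   $2 = path of the archive to create
backup_dir_if_present() {
    src_dir=$1
    archive=$2
    if [ -d "$src_dir" ]; then
        # Same tar invocation as in step 3; tar stores the path without the
        # leading "/", which is why step 5 extracts with "-C /".
        tar czf "$archive" "$src_dir" && echo "archived $src_dir to $archive"
    else
        echo "skipped: $src_dir does not exist"
    fi
}

# Assumed usage on the master being replaced, using the paths from the steps above:
# backup_dir_if_present /var/lib/dcos/exhibitor-tls-artifacts exhibitor-tls-artifacts.tar.gz
```

If the directory is missing, nothing is created, and the copy and extract bullets under static master discovery have no archive to act on.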