Manually Upgrading a Cluster Deployment
This page guides you through manually upgrading a cluster deployment. The nodes of a cluster can be upgraded one at a time, so the cluster as a whole incurs no downtime and each individual node is unavailable only for a very short time.
The manual upgrade procedure described in this section can be used to upgrade to a new hotfix version or to a new minor version of ArangoDB. Please refer to the Upgrade Paths section for detailed information.
It is highly recommended to upgrade 3.6.x deployments to at least 3.6.15 and 3.7.x deployments to at least 3.7.13 because of a technical problem (see Technical Alert #6).
Preparations
The ArangoDB installation packages (e.g. for Debian or Ubuntu) set up a convenient standalone instance of arangod. During installation, this instance’s database will be upgraded (see --database.auto-upgrade) and the service will be (re)started.
You have to make sure that your cluster deployment is independent of this standalone instance. Specifically, make sure that the database directory as well as the socket used by the standalone instance provided by the package are separate from the ones in your cluster configuration. Also make sure that you haven’t modified the init script or systemd unit file for the standalone instance in a way that would make it start or stop your cluster instance instead.
You can read about the details on how to deploy your cluster independently of the standalone instance in the cluster deployment preliminary.
In the following, we assume that you don’t use the standalone instance from the package but only a manually started cluster instance. We will move the standalone instance out of the way where necessary, so that you have to make as few changes as possible to the running cluster.
Install the new ArangoDB version binary
The first step is to install the new ArangoDB package.
Note: You do not have to stop the cluster (arangod) processes before installing the new package.
For example, if you want to upgrade to 3.7.13 on Debian or Ubuntu, either call
$ apt install arangodb=3.7.13
(apt-get on older versions) if you have added the ArangoDB repository, or install a specific package using
$ dpkg -i arangodb3-3.7.13-1_amd64.deb
after you have downloaded the corresponding file from download.arangodb.com.
Stop the Standalone Instance
The package automatically starts the standalone instance. Because you start the cluster processes manually, you do not need this instance, and leaving it running on your machine can cause confusion later. You can therefore stop it now:
$ service arangodb3 stop
You might also want to remove the standalone instance from the default runlevels to prevent it from starting on the next reboot of your machine. How this is done depends on your distribution and init system. For example, on older Debian and Ubuntu systems using a SystemV-compatible init, you can use:
$ update-rc.d -f arangodb3 remove
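On distributions that use systemd, the same two steps are usually done with systemctl. The following is a minimal sketch that prints the commands for either init system instead of running them. It assumes that the packaged service is registered under the name arangodb3 in both cases; the function names are hypothetical and not part of ArangoDB.

```shell
#!/bin/sh
# Sketch: print the commands that stop and disable the packaged standalone
# instance. Assumption: the service is called "arangodb3" under both systemd
# and SysV init. The function names are made up for this example.

# Report which init system is running. The directory /run/systemd/system
# only exists when systemd is the active init system.
detect_init() {
  if [ -d /run/systemd/system ]; then
    echo systemd
  else
    echo sysv
  fi
}

# Print (do not execute) the stop and disable commands for an init system.
standalone_disable_cmds() {
  case "$1" in
    systemd)
      echo "systemctl stop arangodb3"
      echo "systemctl disable arangodb3"
      ;;
    sysv)
      echo "service arangodb3 stop"
      echo "update-rc.d -f arangodb3 remove"
      ;;
    *)
      echo "unknown init system: $1" >&2
      return 2
      ;;
  esac
}
```

Calling standalone_disable_cmds "$(detect_init)" prints the two commands, which you can review and then run with root privileges.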
Set supervision in maintenance mode
Cluster supervision has to be disabled while you upgrade your cluster. The following API calls activate and deactivate the maintenance mode of the supervision. You can use curl to send them.
Activate Maintenance mode
curl -u username:password <coordinator>/_admin/cluster/maintenance -XPUT -d'"on"'
For example:
curl http://localhost:7002/_admin/cluster/maintenance -XPUT -d'"on"'
{"error":false,"warning":"Cluster supervision deactivated. It will be reactivated automatically in 60 minutes unless this call is repeated until then."}
Note: If the manual upgrade takes longer than 60 minutes, the API call has to be sent again before the 60 minutes have elapsed.
Deactivate Maintenance mode
The cluster supervision is reactivated automatically 60 minutes after it was disabled. It can also be reactivated manually with the following API call:
curl -u username:password <coordinator>/_admin/cluster/maintenance -XPUT -d'"off"'
For example:
curl http://localhost:7002/_admin/cluster/maintenance -XPUT -d'"off"'
{"error":false,"warning":"Cluster supervision reactivated."}
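The two calls above can be wrapped in a small helper so that the same command line is used for switching maintenance mode on and off. This is a sketch under stated assumptions: the variables COORDINATOR, CREDENTIALS and CURL and the function name set_maintenance are hypothetical, and the default endpoint is the test Coordinator from the example above.

```shell
#!/bin/sh
# Sketch: toggle the supervision maintenance mode through a Coordinator.
# Assumed settings (not part of ArangoDB), adjust them to your deployment:
COORDINATOR="${COORDINATOR:-http://localhost:7002}"
CREDENTIALS="${CREDENTIALS:-}"   # "username:password", empty if authentication is off
CURL="${CURL:-curl}"             # HTTP client to call, overridable for dry runs

# Usage: set_maintenance on|off
set_maintenance() {
  state="$1"
  case "$state" in
    on|off) ;;
    *) echo "usage: set_maintenance on|off" >&2; return 2 ;;
  esac

  # The request body is the JSON string "on" or "off", including the quotes.
  set -- -sS -X PUT -d "\"$state\""
  if [ -n "$CREDENTIALS" ]; then
    set -- "$@" -u "$CREDENTIALS"
  fi
  "$CURL" "$@" "$COORDINATOR/_admin/cluster/maintenance"
}
```

Run set_maintenance on before stopping the first process, repeat it if the upgrade approaches the 60 minute limit, and run set_maintenance off once all nodes are upgraded.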
Upgrade the cluster processes
Now the arangod processes of all cluster roles (Agents, DB-Servers and Coordinators) have to be upgraded on each node.
Note: Maintenance mode has to be active while you do this.
To stop an arangod process, send it the SIGTERM signal with a command like kill -15:
kill -15 <pid-of-arangod-process>
The PIDs of the arangod processes that belong to your cluster can be looked up with a command like ps:
ps -C arangod -fww
The output of the command above shows not only the PIDs of all arangod processes but also the commands they were started with, which is useful when you restart the processes later.
The output below is from a test machine where three Agents, two DB-Servers and two Coordinators are running locally. In a more production-like scenario, you will find only one instance of each role running per machine:
ps -C arangod -fww
UID PID PPID C STIME TTY TIME CMD
max 29075 8072 0 13:50 pts/2 00:00:42 arangod --server.endpoint tcp://0.0.0.0:5001 --agency.my-address=tcp://127.0.0.1:5001 --server.authentication false --agency.activate true --agency.size 3 --agency.endpoint tcp://127.0.0.1:5001 --agency.supervision true --log.file a1 --javascript.app-path /tmp --database.directory agent1
max 29208 8072 2 13:51 pts/2 00:02:08 arangod --server.endpoint tcp://0.0.0.0:5002 --agency.my-address=tcp://127.0.0.1:5002 --server.authentication false --agency.activate true --agency.size 3 --agency.endpoint tcp://127.0.0.1:5001 --agency.supervision true --log.file a2 --javascript.app-path /tmp --database.directory agent2
max 29329 16224 0 13:51 pts/3 00:00:42 arangod --server.endpoint tcp://0.0.0.0:5003 --agency.my-address=tcp://127.0.0.1:5003 --server.authentication false --agency.activate true --agency.size 3 --agency.endpoint tcp://127.0.0.1:5001 --agency.supervision true --log.file a3 --javascript.app-path /tmp --database.directory agent3
max 29461 16224 1 13:53 pts/3 00:01:11 arangod --server.authentication=false --server.endpoint tcp://0.0.0.0:6001 --cluster.my-address tcp://127.0.0.1:6001 --cluster.my-role PRIMARY --cluster.agency-endpoint tcp://127.0.0.1:5001 --cluster.agency-endpoint tcp://127.0.0.1:5002 --cluster.agency-endpoint tcp://127.0.0.1:5003 --log.file db1 --javascript.app-path /tmp --database.directory dbserver1
max 29596 8072 0 13:54 pts/2 00:00:56 arangod --server.authentication=false --server.endpoint tcp://0.0.0.0:6002 --cluster.my-address tcp://127.0.0.1:6002 --cluster.my-role PRIMARY --cluster.agency-endpoint tcp://127.0.0.1:5001 --cluster.agency-endpoint tcp://127.0.0.1:5002 --cluster.agency-endpoint tcp://127.0.0.1:5003 --log.file db2 --javascript.app-path /tmp --database.directory dbserver2
max 29824 16224 1 13:55 pts/3 00:01:53 arangod --server.authentication=false --server.endpoint tcp://0.0.0.0:7001 --cluster.my-address tcp://127.0.0.1:7001 --cluster.my-role COORDINATOR --cluster.agency-endpoint tcp://127.0.0.1:5001 --cluster.agency-endpoint tcp://127.0.0.1:5002 --cluster.agency-endpoint tcp://127.0.0.1:5003 --log.file c1 --javascript.app-path /tmp --database.directory coordinator1
max 29938 16224 2 13:56 pts/3 00:02:13 arangod --server.authentication=false --server.endpoint tcp://0.0.0.0:7002 --cluster.my-address tcp://127.0.0.1:7002 --cluster.my-role COORDINATOR --cluster.agency-endpoint tcp://127.0.0.1:5001 --cluster.agency-endpoint tcp://127.0.0.1:5002 --cluster.agency-endpoint tcp://127.0.0.1:5003 --log.file c2 --javascript.app-path /tmp --database.directory coordinator2
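Because the start commands are needed again later, it can help to save them before stopping anything. The following sketch derives the role of each process from its command line, using the options visible in the output above, and writes every command to a file. The function names and the file naming scheme are hypothetical.

```shell
#!/bin/sh
# Sketch: record PID, role and start command of every running arangod.
# Function names and file names are made up for this example.

# Guess the cluster role from an arangod command line, based on the
# options that appear in the ps output above.
role_of() {
  case "$1" in
    *"--agency.activate true"*|*"--agency.activate=true"*) echo agent ;;
    *"--cluster.my-role PRIMARY"*|*"--cluster.my-role=PRIMARY"*) echo dbserver ;;
    *"--cluster.my-role COORDINATOR"*|*"--cluster.my-role=COORDINATOR"*) echo coordinator ;;
    *) echo unknown ;;
  esac
}

# Print "<pid> <role>" for each arangod process and save its command line
# to <outdir>/arangod-<role>-<pid>.cmd (outdir defaults to the current directory).
save_commands() {
  outdir="${1:-.}"
  ps -ww -C arangod -o pid= -o args= | while read -r pid args; do
    role="$(role_of "$args")"
    printf '%s\n' "$args" > "$outdir/arangod-$role-$pid.cmd"
    printf '%s %s\n' "$pid" "$role"
  done
}
```

Run save_commands once per machine before the first kill -15. The saved files then contain exactly the commands to reuse in the steps that follow.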
Upgrade a cluster node
The following procedure upgrades the Agent, DB-Server and Coordinator on one node.
Note: Reuse the exact commands that were used to start the Agent, DB-Server and Coordinator.
Stop the Agent
kill -15 <pid-of-agent>
Upgrade the Agent
To upgrade the Agent, start its arangod process using the same command that was used before, with the additional option:
--database.auto-upgrade=true
The Agent will stop automatically after the upgrade.
Restart the Agent
The arangod process of the Agent has to be restarted using the same command that has been used before (without the additional option).
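The stop, upgrade and restart steps for one process can be sketched as a single shell function. The same cycle applies to the DB-Server and the Coordinator described next. This is only an illustration: the function name upgrade_process and the 60 second timeout are assumptions, the upgrade run is assumed to exit with status 0 on success, and the restart simply puts the process into the background of the current shell, which may differ from how you normally start your processes.

```shell
#!/bin/sh
# Sketch of one stop / upgrade / restart cycle for a single arangod process.
# Usage: upgrade_process <pid> <original start command...>
# The function name and the 60 second timeout are assumptions.
upgrade_process() {
  pid="$1"
  shift

  # 1. Stop: send SIGTERM and wait until the process has gone away.
  kill -15 "$pid" || return 1
  waited=0
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge 60 ]; then
      echo "process $pid did not stop within 60 seconds" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done

  # 2. Upgrade: same command plus the additional option. The process runs
  #    in the foreground and stops by itself after the upgrade.
  "$@" --database.auto-upgrade=true || return 1

  # 3. Restart: same command without the additional option, started in the
  #    background of the current shell.
  "$@" &
}
```

For the first Agent in the ps output above, the call would be upgrade_process 29075 followed by the complete arangod command line shown for PID 29075.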
Stop the DB-Server
kill -15 <pid-of-dbserver>
Upgrade the DB-Server
To upgrade the DB-Server, start its arangod process using the same command that was used before, with the additional option:
--database.auto-upgrade=true
The DB-Server will stop automatically after the upgrade.
Restart the DB-Server
The arangod process of the DB-Server has to be restarted using the same command that has been used before (without the additional option).
Stop the Coordinator
kill -15 <pid-of-coordinator>
Upgrade the Coordinator
To upgrade the Coordinator, start its arangod process using the same command that was used before, with the additional option:
--database.auto-upgrade=true
The Coordinator will stop automatically after the upgrade.
Restart the Coordinator
The arangod process of the Coordinator has to be restarted using the same command that has been used before (without the additional option).
After repeating this process on every node, all Agents, DB-Servers and Coordinators are upgraded and the manual upgrade is complete.
Finally, reactivate the cluster supervision with the API call:
curl -u username:password <coordinator>/_admin/cluster/maintenance -XPUT -d'"off"'
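To confirm that every server now runs the new version, you can query the version endpoint of each process. The sketch below uses GET /_api/version and extracts the version attribute from the JSON response with sed. It assumes that the endpoints can be reached without authentication, as in the test setup above, and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: check which version each server reports after the upgrade.
# Assumptions: the endpoints answer GET /_api/version without
# authentication; the function names are made up for this example.

# Extract the value of the "version" attribute from a JSON response on stdin.
extract_version() {
  sed -n 's/.*"version":"\([^"]*\)".*/\1/p'
}

# Compare the version reported by each endpoint with the expected one.
# Usage: check_versions <expected-version> <endpoint>...
check_versions() {
  expected="$1"
  shift
  status=0
  for endpoint in "$@"; do
    actual="$(curl -sS "$endpoint/_api/version" | extract_version)"
    if [ "$actual" = "$expected" ]; then
      echo "$endpoint: $actual"
    else
      echo "$endpoint: expected $expected, got '${actual:-no answer}'" >&2
      status=1
    fi
  done
  return "$status"
}
```

For the test machine above, a call such as check_versions 3.7.13 http://localhost:7001 http://localhost:7002 would check both Coordinators and return a non-zero status if either of them reports a different version.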