Active Failover Administration

This section includes information related to the administration of an Active Failover setup.

For a general introduction to the ArangoDB Active Failover setup, please refer to the Active Failover chapter.

Introduction

The Active Failover setup requires almost no manual administration.

You may still need to replace, upgrade or remove individual nodes in an Active Failover setup.

Determining the current Leader

It is possible to determine the leader by asking any of the involved single-server instances. Just send a request to the /_api/cluster/endpoints REST API.

curl http://server.domain.org:8530/_api/cluster/endpoints
{
  "error": false,
  "code": 200,
  "endpoints": [
    {
      "endpoint": "tcp://[::1]:8530"
    },
    {
      "endpoint": "tcp://[::1]:8531"
    }
  ]
}

This API returns all available endpoints; the first endpoint in the list is defined to be the current Leader. This REST endpoint is always available and is not blocked with an HTTP/1.1 503 Service Unavailable response on a Follower.
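As a minimal sketch of the rule above, the following Python reads the response and picks the first entry as the Leader. The function names are hypothetical, the server address would come from your own setup, and authentication is deliberately left out; only the /_api/cluster/endpoints path and the response shape are taken from the example above.

```python
import json
import urllib.request


def leader_from_endpoints(payload: dict) -> str:
    """Return the first endpoint of the response, i.e. the current Leader."""
    if payload.get("error"):
        raise RuntimeError(f"endpoints request failed: {payload}")
    endpoints = payload.get("endpoints") or []
    if not endpoints:
        raise RuntimeError("response contains no endpoints")
    return endpoints[0]["endpoint"]


def fetch_endpoints(base_url: str, timeout: float = 5.0) -> dict:
    """GET /_api/cluster/endpoints from any server in the setup.

    Assumption: the server accepts unauthenticated requests. Add
    credentials here if your deployment requires them.
    """
    url = base_url.rstrip("/") + "/_api/cluster/endpoints"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling leader_from_endpoints(fetch_endpoints("http://server.domain.org:8530")) against the setup shown above would be expected to yield the first listed endpoint, tcp://[::1]:8530.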

Reading from Follower

Followers in the Active Failover setup are in read-only mode. It is possible to read from these Followers by adding an X-Arango-Allow-Dirty-Read: true header to each request. Responses will then automatically contain the X-Arango-Potential-Dirty-Read header so that clients can reject accidental dirty reads.

Whether you can enable this option from client code depends on the driver support for your programming language.
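If your driver does not expose this option, the headers can also be handled by hand. The sketch below uses only the Python standard library; the two helper names are hypothetical, and the only facts taken from this section are the two header names and the value true.

```python
import urllib.request

ALLOW_DIRTY_READ = "X-Arango-Allow-Dirty-Read"
POTENTIAL_DIRTY_READ = "X-Arango-Potential-Dirty-Read"


def dirty_read_request(url: str) -> urllib.request.Request:
    """Build a GET request that explicitly allows a dirty read."""
    return urllib.request.Request(
        url, headers={ALLOW_DIRTY_READ: "true"}, method="GET"
    )


def is_potential_dirty_read(headers) -> bool:
    """Return True if the response headers carry the dirty-read marker.

    `headers` is any mapping with a .get() method, for example the
    .headers attribute of the object returned by urllib.request.urlopen.
    """
    return headers.get(POTENTIAL_DIRTY_READ) is not None
```

A client that must not act on possibly stale data could call is_potential_dirty_read(response.headers) after each read and discard or retry the result when it returns True.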

Upgrading / Replacing / Removing a Leader

A Leader is the active server which can receive all read and write operations in an Active-Failover setup.

Upgrading or removing a Leader can be a little tricky, because stopping the Leader’s process triggers a failover. This may be intended, but you will probably want to halt all writes to the Leader for a certain amount of time beforehand, so that the Follower can catch up on all operations.

After you have ensured that the Follower is sufficiently caught up, you can stop the Leader process via the shutdown API or by sending a SIGTERM signal to the process (e.g. kill <process-id>). This will trigger an orderly shutdown, and should trigger an immediate switch to the Follower. If your client drivers are configured correctly, you should notice almost no interruption in your applications.
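One way to confirm that the switch has happened is to poll a surviving server until it reports a different Leader. The following is only a sketch: wait_for_new_leader and its fetch_leader parameter are hypothetical names, and fetch_leader stands for any function of your own that returns the first endpoint from /_api/cluster/endpoints, or None if no answer is available yet.

```python
import time
from typing import Callable, Optional


def wait_for_new_leader(
    fetch_leader: Callable[[], Optional[str]],
    old_leader: str,
    timeout: float = 60.0,
    interval: float = 1.0,
) -> str:
    """Poll until the reported Leader differs from old_leader.

    Raises TimeoutError if no different Leader is reported in time.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            leader = fetch_leader()
        except OSError:
            # Connection problems (urllib's URLError is an OSError) are
            # expected while the old Leader shuts down, so retry.
            leader = None
        if leader is not None and leader != old_leader:
            return leader
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no new Leader within {timeout} seconds")
        time.sleep(interval)
```

The timeout and interval defaults are arbitrary placeholders; pick values that match how long a failover takes in your environment.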

Once you have upgraded the local server via the --database.auto-upgrade option, you can add it to the Active Failover setup again. The server will resync automatically with the new Leader and become a Follower.

Upgrading / Replacing / Removing a Follower

A Follower is the passive server which tries to mirror all the data stored in the Leader.

To upgrade a Follower, you only need to stop the process and start it with --database.auto-upgrade. The server process will automatically resync with the Leader after a restart.

The clean way of removing a Follower is to first start a replacement Follower (otherwise you will lose resiliency). To start a Follower, please refer to the deployment guide. Once the replacement is ready, you can stop the old Follower's process and remove it.