Starting Manually

An ArangoDB Active Failover setup consists of several running tasks or processes.

This section describes how to start an Active Failover by manually starting all the needed processes.

Before continuing, be sure to read the Architecture section to get a basic understanding of the underlying architecture and the involved roles in an ArangoDB Active Failover setup.

We will include commands for a local test (all processes running on a single machine) and for a more realistic production scenario, which makes use of 3 different machines.

Local Tests

In this paragraph we will include commands to manually start an Active Failover with 3 Agents and two single server instances.

We will assume that all processes run on the same machine (127.0.0.1). Such a scenario should be used for testing only.

Local Test Agency

To start up an Agency you first have to activate it. This is done by providing the option --agency.activate true.

To start up the Agency in its fault-tolerant mode, set --agency.size to 3. You will then have to start at least 3 Agents before the Agency will start operating.

During initialization the Agents have to find each other. To do so, provide at least one common --agency.endpoint. The Agents will then coordinate startup themselves. They will announce themselves with their external address, which may be specified using --agency.my-address. This is required in bridged Docker setups or NATed environments.

So in summary these are the commands to start an Agency of size 3:

    arangod --server.endpoint tcp://0.0.0.0:5001 \
      --agency.my-address=tcp://127.0.0.1:5001 \
      --server.authentication false \
      --agency.activate true \
      --agency.size 3 \
      --agency.endpoint tcp://127.0.0.1:5001 \
      --agency.supervision true \
      --agency.supervision-grace-period 30 \
      --database.directory agent1 &

    arangod --server.endpoint tcp://0.0.0.0:5002 \
      --agency.my-address=tcp://127.0.0.1:5002 \
      --server.authentication false \
      --agency.activate true \
      --agency.size 3 \
      --agency.endpoint tcp://127.0.0.1:5001 \
      --agency.supervision true \
      --agency.supervision-grace-period 30 \
      --database.directory agent2 &

    arangod --server.endpoint tcp://0.0.0.0:5003 \
      --agency.my-address=tcp://127.0.0.1:5003 \
      --server.authentication false \
      --agency.activate true \
      --agency.size 3 \
      --agency.endpoint tcp://127.0.0.1:5001 \
      --agency.supervision true \
      --agency.supervision-grace-period 30 \
      --database.directory agent3 &

Note that to avoid unnecessary failovers, it may make sense to increase the value of the startup option --agency.supervision-grace-period beyond 30 seconds.
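The three Agent commands differ only in the port and the database directory, so they can be generated rather than typed out. The following is a minimal dry-run sketch under that assumption: the helper name agent_cmd is hypothetical, and the script only prints the commands, it does not start any process.

```shell
#!/bin/sh
# Print (not run) the arangod command for one local test Agent.
# $1 = port the Agent listens on, $2 = database directory.
# agent_cmd is an illustrative helper, not part of ArangoDB.
agent_cmd() {
  printf 'arangod --server.endpoint tcp://0.0.0.0:%s' "$1"
  printf ' --agency.my-address=tcp://127.0.0.1:%s' "$1"
  printf ' --server.authentication false'
  printf ' --agency.activate true'
  printf ' --agency.size 3'
  # All Agents share the same common endpoint so they can find each other.
  printf ' --agency.endpoint tcp://127.0.0.1:5001'
  printf ' --agency.supervision true'
  printf ' --agency.supervision-grace-period 30'
  printf ' --database.directory %s\n' "$2"
}

# One line per Agent: ports 5001-5003, directories agent1-agent3.
i=1
for port in 5001 5002 5003; do
  agent_cmd "$port" "agent$i"
  i=$((i + 1))
done
```

Each printed line corresponds to one of the commands above, written on a single line and without the trailing & that puts the process in the background.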

Single Server Test Instances

To start the two single server instances, you can use the following commands:

    arangod --server.authentication false \
      --server.endpoint tcp://127.0.0.1:6001 \
      --cluster.my-address tcp://127.0.0.1:6001 \
      --cluster.my-role SINGLE \
      --cluster.agency-endpoint tcp://127.0.0.1:5001 \
      --cluster.agency-endpoint tcp://127.0.0.1:5002 \
      --cluster.agency-endpoint tcp://127.0.0.1:5003 \
      --replication.automatic-failover true \
      --database.directory singleserver6001 &

    arangod --server.authentication false \
      --server.endpoint tcp://127.0.0.1:6002 \
      --cluster.my-address tcp://127.0.0.1:6002 \
      --cluster.my-role SINGLE \
      --cluster.agency-endpoint tcp://127.0.0.1:5001 \
      --cluster.agency-endpoint tcp://127.0.0.1:5002 \
      --cluster.agency-endpoint tcp://127.0.0.1:5003 \
      --replication.automatic-failover true \
      --database.directory singleserver6002 &

Multiple Machines

The method from the previous paragraph can be extended to a more realistic production scenario in order to start an Active Failover on multiple machines. The only change is that all local addresses 127.0.0.1 have to be replaced by the actual IP address of the corresponding server. It is then no longer necessary to use different port numbers on different servers.

Let's assume that you want to start your Active Failover with 3 Agents and two single servers on three different machines with IP addresses:

    192.168.1.1
    192.168.1.2
    192.168.1.3

Let's also suppose that each of the above machines runs an Agent, and that the first and second machines also run a single server instance.

If we use:

  • 8531 as port of the Agents
  • 8529 as port of the single server instances

then the commands you have to use are listed in the following subparagraphs.

Agency

On 192.168.1.1:

    arangod --server.endpoint tcp://0.0.0.0:8531 \
      --agency.my-address tcp://192.168.1.1:8531 \
      --server.authentication false \
      --agency.activate true \
      --agency.size 3 \
      --agency.supervision true \
      --agency.supervision-grace-period 30 \
      --database.directory agent

On 192.168.1.2:

    arangod --server.endpoint tcp://0.0.0.0:8531 \
      --agency.my-address tcp://192.168.1.2:8531 \
      --server.authentication false \
      --agency.activate true \
      --agency.size 3 \
      --agency.supervision true \
      --agency.supervision-grace-period 30 \
      --database.directory agent

On 192.168.1.3:

    arangod --server.endpoint tcp://0.0.0.0:8531 \
      --agency.my-address tcp://192.168.1.3:8531 \
      --server.authentication false \
      --agency.activate true \
      --agency.size 3 \
      --agency.endpoint tcp://192.168.1.1:8531 \
      --agency.endpoint tcp://192.168.1.2:8531 \
      --agency.endpoint tcp://192.168.1.3:8531 \
      --agency.supervision true \
      --agency.supervision-grace-period 30 \
      --database.directory agent

Note that to avoid unnecessary failovers, it may make sense to increase the value of the startup option --agency.supervision-grace-period beyond 30 seconds.

Single Server Instances

On 192.168.1.1:

    arangod --server.authentication=false \
      --server.endpoint tcp://0.0.0.0:8529 \
      --cluster.my-address tcp://192.168.1.1:8529 \
      --cluster.my-role SINGLE \
      --cluster.agency-endpoint tcp://192.168.1.1:8531 \
      --cluster.agency-endpoint tcp://192.168.1.2:8531 \
      --cluster.agency-endpoint tcp://192.168.1.3:8531 \
      --replication.automatic-failover true \
      --database.directory singleserver &

On 192.168.1.2:

Wait until the previous server is fully started, then start the second single server instance:

    arangod --server.authentication=false \
      --server.endpoint tcp://0.0.0.0:8529 \
      --cluster.my-address tcp://192.168.1.2:8529 \
      --cluster.my-role SINGLE \
      --cluster.agency-endpoint tcp://192.168.1.1:8531 \
      --cluster.agency-endpoint tcp://192.168.1.2:8531 \
      --cluster.agency-endpoint tcp://192.168.1.3:8531 \
      --replication.automatic-failover true \
      --database.directory singleserver &
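Waiting for the first single server before starting the second can be scripted. The sketch below is one possible approach, not an official procedure: it assumes curl is installed and treats a successful HTTP response from /_api/version as a sign that the server is up, which is only an approximation of "fully started". The function name wait_until_up and the PROBE variable are hypothetical.

```shell
#!/bin/sh
# Poll a server until it answers, or give up after a number of attempts.
# $1 = base URL, $2 = max attempts (default 30), $3 = delay in seconds (default 1).
# PROBE is the command used to test the URL; using curl here is an assumption.
wait_until_up() {
  url=$1
  attempts=${2:-30}
  delay=${3:-1}
  n=0
  while [ "$n" -lt "$attempts" ]; do
    # curl -f makes HTTP error statuses (e.g. 503 during startup) count as failure.
    if ${PROBE:-curl -fsS -o /dev/null} "$url/_api/version"; then
      return 0
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 1
}

# Example (on 192.168.1.2, before starting the second instance):
#   wait_until_up http://192.168.1.1:8529 60 2 && echo "first server is up"
```

The function returns a non-zero status if the server never answers, so the start of the second instance can be chained with && and is skipped on timeout.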

Note: in the above commands, you can use host names, if they can be resolved, instead of IP addresses.
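Because every single server repeats the same list of Agency endpoints, the flags can be derived from a list of hosts. This is a small illustrative sketch; the helper name agency_endpoint_flags is hypothetical, and the hosts passed to it may be IP addresses or resolvable host names.

```shell
#!/bin/sh
# Print one --cluster.agency-endpoint flag per Agent host, on a single line.
# $1 = Agent port, remaining arguments = host names or IP addresses.
# agency_endpoint_flags is an illustrative helper, not part of ArangoDB.
agency_endpoint_flags() {
  port=$1
  shift
  flags=
  for host in "$@"; do
    flags="$flags --cluster.agency-endpoint tcp://$host:$port"
  done
  # Strip the leading space left over from the first iteration.
  printf '%s\n' "${flags# }"
}

agency_endpoint_flags 8531 192.168.1.1 192.168.1.2 192.168.1.3
```

The printed flags can be pasted into the arangod command line of each single server instance.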

Manual Start in Docker

Manually starting an Active Failover via Docker is basically the same as described in the paragraphs above.

Some extra care is needed because of the way Docker isolates its network. By default it fully isolates the network, so an endpoint like --server.endpoint tcp://0.0.0.0:8529 will only bind to all interfaces inside the Docker container, which does not include any external interface on the host machine. This may be sufficient if you just want to access it locally, but if you want to expose it to the outside you must use Docker's port forwarding via the -p command line option. Be sure to check the official Docker documentation.

You can simply use the -p flag in Docker to make the individual processes available on the host machine, or you could use Docker's links to enable process intercommunication.

An example configuration might look like this:

    docker run -e ARANGO_NO_AUTH=1 -p 192.168.1.1:10000:8529 arangodb/arangodb arangod \
      --server.endpoint tcp://0.0.0.0:8529 \
      --cluster.my-address tcp://192.168.1.1:10000 \
      --cluster.my-role SINGLE \
      --cluster.agency-endpoint tcp://192.168.1.1:9001 \
      --cluster.agency-endpoint tcp://192.168.1.2:9001 \
      --cluster.agency-endpoint tcp://192.168.1.3:9001 \
      --replication.automatic-failover true

This will start a single server within a Docker container with an isolated network. Within the Docker container it will bind to all interfaces (this will be 127.0.0.1:8529 and some internal Docker IP on port 8529). By supplying -p 192.168.1.1:10000:8529 we establish a port forwarding from our local IP (192.168.1.1, port 10000 in this example) to port 8529 inside the container. Within the command we tell arangod how it can be reached from the outside: --cluster.my-address tcp://192.168.1.1:10000.
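The key point is that three values have to agree: the -p mapping, the endpoint inside the container, and the externally announced address. The sketch below derives all three from the same inputs so they cannot drift apart. The helper name docker_address_args is hypothetical, and it prints only the address-related arguments, not a complete command.

```shell
#!/bin/sh
# Print the address-related arguments for a single server in Docker.
# $1 = host IP, $2 = host port, $3 = port inside the container.
# docker_address_args is an illustrative helper, not part of Docker or ArangoDB.
docker_address_args() {
  host_ip=$1
  host_port=$2
  container_port=$3
  # Port forwarding: host_ip:host_port on the host -> container_port in the container.
  printf '%s\n' "-p $host_ip:$host_port:$container_port"
  # Inside the container, arangod binds to all interfaces on the container port.
  printf '%s\n' "--server.endpoint tcp://0.0.0.0:$container_port"
  # To the outside, arangod announces the forwarded host address.
  printf '%s\n' "--cluster.my-address tcp://$host_ip:$host_port"
}

docker_address_args 192.168.1.1 10000 8529
```

With the values from the example above, the output reproduces the -p, --server.endpoint, and --cluster.my-address arguments used there.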

Authentication

To start the official Docker container you will have to decide on an authentication method, otherwise the container will not start.

Provide one of the following arguments to Docker as an environment variable. There are three options:

  • ARANGO_NO_AUTH=1

Disable authentication completely. Useful for local testing or for operating in a trusted network (without a public interface).

  • ARANGO_ROOT_PASSWORD=password

Start ArangoDB with the given password for root.

  • ARANGO_RANDOM_ROOT_PASSWORD=1

Let ArangoDB generate a random root password.
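Since exactly one of these variables has to be chosen, a start script might select it in one place. The following is a minimal sketch; the function name auth_env_flag and the mode names none, password, and random are hypothetical labels introduced for illustration, while the environment variables are the ones listed above.

```shell
#!/bin/sh
# Map a chosen authentication mode to the matching docker -e argument.
# $1 = mode (none | password | random), $2 = root password (password mode only).
# The mode names are illustrative; only the variable names come from the image.
auth_env_flag() {
  case $1 in
    none)     printf '%s\n' "-e ARANGO_NO_AUTH=1" ;;
    password) printf '%s\n' "-e ARANGO_ROOT_PASSWORD=$2" ;;
    random)   printf '%s\n' "-e ARANGO_RANDOM_ROOT_PASSWORD=1" ;;
    *)        echo "unknown authentication mode: $1" >&2; return 1 ;;
  esac
}

auth_env_flag random
```

The printed argument would be placed directly after docker run, in the same position as -e ARANGO_NO_AUTH=1 in the example configuration.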

For an in-depth guide about Docker and ArangoDB please check the official documentation: hub.docker.com/r/arangodb/arangodb/. Note that we are using the image arangodb/arangodb here, which is always the most current one. There is also the "official" one called arangodb, whose documentation is here: hub.docker.com/_/arangodb/