Automatic Rebalancing
This page walks you through a simple demonstration of how CockroachDB automatically rebalances data as you scale. Starting with a 3-node local cluster, you'll run a sample workload and watch the replica count increase. You'll then add 2 more nodes and watch how CockroachDB automatically rebalances replicas to efficiently use all available capacity.
Before you begin
Make sure you have already installed CockroachDB.
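If you want to confirm that the cockroach binary is installed and on your PATH, you can optionally check its version:
$ cockroach version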
Step 1. Start a 3-node cluster
Use the cockroach start command to start 3 nodes:
# In a new terminal, start node 1:
$ cockroach start \
--insecure \
--store=scale-node1 \
--listen-addr=localhost:26257 \
--http-addr=localhost:8080 \
--join=localhost:26257,localhost:26258,localhost:26259
# In a new terminal, start node 2:
$ cockroach start \
--insecure \
--store=scale-node2 \
--listen-addr=localhost:26258 \
--http-addr=localhost:8081 \
--join=localhost:26257,localhost:26258,localhost:26259
# In a new terminal, start node 3:
$ cockroach start \
--insecure \
--store=scale-node3 \
--listen-addr=localhost:26259 \
--http-addr=localhost:8082 \
--join=localhost:26257,localhost:26258,localhost:26259
Step 2. Initialize the cluster
In a new terminal, use the cockroach init command to perform a one-time initialization of the cluster:
$ cockroach init \
--insecure \
--host=localhost:26257
Step 3. Verify that the cluster is live
In a new terminal, connect the built-in SQL shell to any node to verify that the cluster is live:
$ cockroach sql --insecure --host=localhost:26257
> SHOW DATABASES;
database_name
+---------------+
defaultdb
postgres
system
(3 rows)
Exit the SQL shell:
> \q
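Optionally, you can also confirm that all 3 nodes have joined the cluster with the cockroach node status command, pointed at any node:
$ cockroach node status --insecure --host=localhost:26257
The output lists each node's ID, address, and liveness.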
Step 4. Run a sample workload
CockroachDB comes with built-in load generators for simulating different types of client workloads. These generators print per-operation statistics every second and totals after a specified duration or maximum number of operations. In this tutorial, you'll use the tpcc workload to simulate transaction processing using a rich schema of multiple tables.
- Load the initial schema and data:
$ cockroach workload init tpcc \
'postgresql://root@localhost:26257?sslmode=disable'
- The initial data is enough for the purposes of this tutorial, but you can run the workload for as long as you like to increase the data size, adjusting the --duration flag as appropriate:
$ cockroach workload run tpcc \
--duration=30s \
'postgresql://root@localhost:26257?sslmode=disable'
You'll see per-operation statistics print to standard output every second.
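If you'd like to verify that the initial data was loaded, you can list the tables from the built-in SQL shell (this assumes the workload created the default tpcc database):
$ cockroach sql --insecure --host=localhost:26257
> SHOW TABLES FROM tpcc;
> \q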
Step 5. Watch the replica count increase
Open the Admin UI at http://localhost:8080 and you'll see the replica count increase as the tpcc workload writes data.
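If you prefer the command line to the Admin UI, a similar per-node view of range and replica counts is available from cockroach node status with the --ranges flag (the exact columns vary by version):
$ cockroach node status --ranges --insecure --host=localhost:26257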
Step 6. Add 2 more nodes
Adding capacity is as simple as starting more nodes and joining them to the running cluster:
# In a new terminal, start node 4:
$ cockroach start \
--insecure \
--store=scale-node4 \
--listen-addr=localhost:26260 \
--http-addr=localhost:8083 \
--join=localhost:26257,localhost:26258,localhost:26259
# In a new terminal, start node 5:
$ cockroach start \
--insecure \
--store=scale-node5 \
--listen-addr=localhost:26261 \
--http-addr=localhost:8084 \
--join=localhost:26257,localhost:26258,localhost:26259
Step 7. Watch data rebalance across all 5 nodes
Back in the Admin UI, you'll now see 5 nodes listed. At first, the replica count will be lower for nodes 4 and 5. Very soon, however, you'll see those numbers even out across all nodes, indicating that data is being automatically rebalanced to utilize the additional capacity of the new nodes.
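As an optional cross-check from SQL, you can count replicas per store by aggregating the crdb_internal.ranges table; note that crdb_internal is an internal, unsupported schema whose columns may change between versions:
$ cockroach sql --insecure --host=localhost:26257
> SELECT store_id, count(*) AS replica_count FROM (SELECT unnest(replicas) AS store_id FROM crdb_internal.ranges) AS r GROUP BY store_id ORDER BY store_id;
As rebalancing completes, the per-store counts should converge.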
Note:
After scaling to 5 nodes, the Admin UI will call out a number of under-replicated ranges. This is because the cluster prefers 5 replicas for important internal system data by default. While the cluster has fewer than 5 nodes, this preference is ignored in reporting, but as soon as there are more than 3 nodes, the cluster recognizes the preference and reports the under-replicated state in the UI. As those ranges are up-replicated, the under-replicated range count will decrease to 0.
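If you want to see where the 5-replica preference comes from, some versions let you inspect the zone configuration for internal ranges from the SQL shell; the statement syntax and output format vary by version, so treat this as a sketch:
$ cockroach sql --insecure --host=localhost:26257
> SHOW ZONE CONFIGURATION FOR RANGE meta;
> \q
The num_replicas field in the output reflects the replication factor the cluster is trying to reach for that internal data.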
Step 8. Stop the cluster
Once you're done with your test cluster, stop each node by switching to its terminal and pressing CTRL-C.
Tip:
For the last node, the shutdown process will take longer (about a minute) and will eventually force kill the node. This is because, with only 1 node still online, a majority of replicas are no longer available (2 of 3), and so the cluster is not operational. To speed up the process, press CTRL-C a second time.
If you do not plan to restart the cluster, you may want to remove the nodes' data stores:
$ rm -rf scale-node1 scale-node2 scale-node3 scale-node4 scale-node5
What's next?
Explore other core CockroachDB benefits and features: