Large datasets

Large datasets

Use this benchmark to understand the performance, failure, and scaling characteristics of YugabyteDB with a massive dataset (multiple TB per node). In order to accomplish that, we will do the following:

Load 30 billion key-value records
Each write operation inserts a single record
Perform a read-heavy workload that does random reads in the presence of some writes
Perform a read-heavy workload that does reads of a subset of data in the presence of some writes

Each record is a key-value record of size almost 300 bytes.

Key size: 50 Bytes
Value size: 256 Bytes (chosen to be not very compressible)

Recommended configuration

Note that the load tester was run from a separate machine in the same availability zone (AZ).

Machine types

A machine in the Amazon Web Services (AWS) cloud with the following specifications was chosen: 32-vcpus, 240 GB RAM, 4 x 1.9TB nvme SSD.

Cloud: AWS
Node type: i3.8xlarge

Cluster creation

Create a standard four-node cluster, with replication factor (RF) of 3. Pass the following option to the YugabyteDB processes.

--yb_num_shards_per_tserver=20

The yb_num_shards_per_tserver was set to 20 (default value is 8). This is done because the i3.8xlarge nodes have four disks. In future, YugabyteDB will automatically pick better defaults for nodes with multiple disks.

Create the YCQL_ADDRS environment variable using the export command:

$ export YCQL_ADDRS="<ip1>:9042,<ip2>:9042,<ip3>:9042,<ip4>:9042"

Initial load phase

The data was loaded at a steady rate over about 4 days using the CassandraKeyValue workload. To load the data, run the following command:

$ java -jar yb-sample-apps.jar        \
      --workload CassandraKeyValue    \
      --nouuid --nodes $YCQL_ADDRS    \
      --value_size 256                \
      --num_unique_keys 30000000000   \
      --num_writes 30000000000        \
      --num_threads_write 256         \
      --num_threads_read 1

Write IOPS

You should see a steady 85,000 inserts per second with write latencies of around 2.5 milliseconds. This is shown graphically below.

Data set size growth rate

The graph below shows the steady growth in SSTables size at a node from Sep 4 to Sep 7 beyond which it stabilizes at 6.5 TB.

Final data set size

The figure below is from the yb-master Admin UI that shows the tablet servers, number of tablets on each, number of tablet leaders, and size of the on-disk SSTable files.

Note

The uncompressed dataset size per node is 8 TB, while the compressed size is 6.5 TB. This is because the load generator generates random bytes, which are not very compressible.

Real world workloads generally have much more compressible data.

Expected results

The results you see should be similar to the observations shown below.

Load phase results

Name	Observation
Records inserted	30 Billion
Size of each record	~ 300 bytes
Time taken to insert data	4.4 days
Sustained insert Rate	85K inserts/second
Final dataset in cluster	26TB across 4 nodes
Final dataset size per node	6.5TB / node

Read-heavy workload results

Name	Observation
Random-data read heavy workload	185K reads/sec and 1K writes/sec
Recent-data read heavy Workload	385K reads/sec and 6.5K writes/sec

Cluster expansion and induced failures

Expanded from four to five nodes in about eight hours
- Deliberately rate limited at 200 MB/sec
New node takes traffic as soon the first tablet arrives
- Pressure relieved from old nodes very quickly
Induced one node failure in five-node cluster
- Cluster rebalanced in 2 hrs 10 minutes