Testing tools
Testing is one of the most important part during the development of a distributed system. We have the following type of test.
This page includes our existing test tool which are part of the Ozone source base.
Note: we have more tests (like TCP-DS, TCP-H tests via Spark or Hive) which are not included here because they use external tools only.
Unit test
As every almost every java project we have the good old unit tests inside each of our projects.
Integration test (JUnit)
Traditional unit tests are supposed to test only one unit, but we also have higher level unit tests. They use MiniOzoneCluster
which is a helper method to start real daemons (scm,om,datanodes) during the unit test.
From maven/java point of view they are just simple unit tests (JUnit library is used) but to separate them (and solve some dependency problems) we moved all of these tests to hadoop-ozone/integration-test
Smoketest
We use docker-compose based pseudo-cluster to run different configuration of Ozone. To be sure that the different configuration can be started we implemented acceptance tests with the help of https://robotframework.org/.
The smoketests are available from the distribution (./smoketest
) but the robot files defines only the tests: usually they start CLI and check the output.
To run the tests in different environment (docker-compose, kubernetes) you need a definition to start the containers and execute the right tests in the right containers.
These definition of the tests are included in the compose
directory (check ./compose/*/test.sh
or ./compose/test-all.sh
).
For example a simple way to test the distribution package:
cd compose/ozone
./test.sh
Blockade
Blockade is a tool to test network failures and partitions (it’s inspired by the legendary Jepsen tests).
Blockade tests are implemented with the help of tests and can be started from the ./blockade
directory of the distribution.
cd blocakde
pip install pytest==2.8.7,blockade
python -m pytest -s .
See the README in the blockade directory for more details.
MiniChaosOzoneCluster
This is a way to get chaos in your machine. It can be started from the source code and a MiniOzoneCluster (which starts real daemons) will be started and killed randomly.
Freon
Freon is a command line application which is included in the Ozone distribution. It’s a load generator which is used in our stress tests.
Random keys:
In randomkeys mode, the data written into ozone cluster is randomly generated. Each key will be of size 10 KB.
The number of volumes/buckets/keys can be configured. The replication type and factor (eg. replicate with ratis to 3 nodes) also can be configured.
For more information use:
bin/ozone freon –help
For example:
ozone freon randomkeys --num-of-volumes=10 --num-of-buckets 10 --num-of-keys 10 --replication-type=RATIS --factor=THREE
***************************************************
Status: Success
Git Base Revision: 48aae081e5afacbb3240657556b26c29e61830c3
Number of Volumes created: 10
Number of Buckets created: 100
Number of Keys added: 1000
Ratis replication factor: THREE
Ratis replication type: RATIS
Average Time spent in volume creation: 00:00:00,035
Average Time spent in bucket creation: 00:00:00,319
Average Time spent in key creation: 00:00:03,659
Average Time spent in key write: 00:00:10,894
Total bytes written: 10240000
Total Execution time: 00:00:16,898
***********************
Genesis
Genesis is a microbenchmarking tool. It’s also included in the distribution (ozone genesis
) but it doesn’t require real cluster. It measures different part of the code in an isolated way (eg. the code which saves the data to the local RocksDB based key value stores)
Example run:
ozone genesis -benchmark=BenchMarkRocksDbStore
# JMH version: 1.19
# VM version: JDK 11.0.1, VM 11.0.1+13-LTS
# VM invoker: /usr/lib/jvm/java-11-openjdk-11.0.1.13-3.el7_6.x86_64/bin/java
# VM options: -Dproc_genesis -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/var/log/hadoop -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/hadoop -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender
# Warmup: 2 iterations, 1 s each
# Measurement: 20 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 4 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: org.apache.hadoop.ozone.genesis.BenchMarkRocksDbStore.test
# Parameters: (backgroundThreads = 4, blockSize = 8, maxBackgroundFlushes = 4, maxBytesForLevelBase = 512, maxOpenFiles = 5000, maxWriteBufferNumber = 16, writeBufferSize = 64)
# Run progress: 0.00% complete, ETA 00:00:22
# Fork: 1 of 1
# Warmup Iteration 1: 213775.360 ops/s
# Warmup Iteration 2: 32041.633 ops/s
Iteration 1: 196342.348 ops/s
?stack: <delayed till summary>
Iteration 2: 41926.816 ops/s
?stack: <delayed till summary>
Iteration 3: 210433.231 ops/s
?stack: <delayed till summary>
Iteration 4: 46941.951 ops/s
?stack: <delayed till summary>
Iteration 5: 212825.884 ops/s
?stack: <delayed till summary>
Iteration 6: 145914.351 ops/s
?stack: <delayed till summary>
Iteration 7: 141838.469 ops/s
?stack: <delayed till summary>
Iteration 8: 205334.438 ops/s
?stack: <delayed till summary>
Iteration 9: 163709.519 ops/s
?stack: <delayed till summary>
Iteration 10: 162494.608 ops/s
?stack: <delayed till summary>
Iteration 11: 199155.793 ops/s
?stack: <delayed till summary>
Iteration 12: 209679.298 ops/s
?stack: <delayed till summary>
Iteration 13: 193787.574 ops/s
?stack: <delayed till summary>
Iteration 14: 127004.147 ops/s
?stack: <delayed till summary>
Iteration 15: 145511.080 ops/s
?stack: <delayed till summary>
Iteration 16: 223433.864 ops/s
?stack: <delayed till summary>
Iteration 17: 169752.665 ops/s
?stack: <delayed till summary>
Iteration 18: 165217.191 ops/s
?stack: <delayed till summary>
Iteration 19: 191038.476 ops/s
?stack: <delayed till summary>
Iteration 20: 196335.579 ops/s
?stack: <delayed till summary>
Result "org.apache.hadoop.ozone.genesis.BenchMarkRocksDbStore.test":
167433.864 ?(99.9%) 43530.883 ops/s [Average]
(min, avg, max) = (41926.816, 167433.864, 223433.864), stdev = 50130.230
CI (99.9%): [123902.981, 210964.748] (assumes normal distribution)
Secondary result "org.apache.hadoop.ozone.genesis.BenchMarkRocksDbStore.test:?stack":
Stack profiler:
....[Thread state distributions]....................................................................
78.9% RUNNABLE
20.0% TIMED_WAITING
1.1% WAITING
....[Thread state: RUNNABLE]........................................................................
59.8% 75.8% org.rocksdb.RocksDB.put
16.5% 20.9% org.rocksdb.RocksDB.get
0.7% 0.9% java.io.UnixFileSystem.delete0
0.7% 0.9% org.rocksdb.RocksDB.disposeInternal
0.3% 0.4% java.lang.Long.formatUnsignedLong0
0.1% 0.2% org.apache.hadoop.ozone.genesis.BenchMarkRocksDbStore.test
0.1% 0.1% java.lang.Long.toUnsignedString0
0.1% 0.1% org.apache.hadoop.ozone.genesis.generated.BenchMarkRocksDbStore_test_jmhTest.test_thrpt_jmhStub
0.0% 0.1% java.lang.Object.clone
0.0% 0.0% java.lang.Thread.currentThread
0.4% 0.5% <other>
....[Thread state: TIMED_WAITING]...................................................................
20.0% 100.0% java.lang.Object.wait
....[Thread state: WAITING].........................................................................
1.1% 100.0% jdk.internal.misc.Unsafe.park
# Run complete. Total time: 00:00:38
Benchmark (backgroundThreads) (blockSize) (maxBackgroundFlushes) (maxBytesForLevelBase) (maxOpenFiles) (maxWriteBufferNumber) (writeBufferSize) Mode Cnt Score Error Units
BenchMarkRocksDbStore.test 4 8 4 512 5000 16 64 thrpt 20 167433.864 ? 43530.883 ops/s
BenchMarkRocksDbStore.test:?stack 4 8 4 512 5000 16 64 thrpt NaN ---