compare

The compare command analyzes the differences between two benchmark tests. This can help you measure the performance impact of changes made since a previous test run against a specific Git revision.

Usage

You can compare two different workload tests using their TestExecution IDs. To find the TestExecution IDs of tests run for a specific workload, use the list command:
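
  opensearch-benchmark list test_executions

You should receive an output similar to the following: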

     ____                  _____                      __       ____                  __                         __
    / __ \____  ___  ____ / ___/___  ____ ___________/ /_     / __ )___  ____  _____/ /_  ____ ___  ____ ______/ /__
   / / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \   / __  / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
  / /_/ / /_/ /  __/ / / /__/ /  __/ /_/ / /  / /__/ / / /  / /_/ /  __/ / / / /__/ / / / / / / / / /_/ / /  / ,<
  \____/ .___/\___/_/ /_/____/\___/\__,_/_/   \___/_/ /_/  /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/  /_/|_|
      /_/

  Recent test_executions:

  TestExecution ID                      TestExecution Timestamp    Workload    Workload Parameters    TestProcedure        ProvisionConfigInstance    User Tags    workload Revision    Provision Config Revision
  ------------------------------------  -------------------------  ----------  ---------------------  -------------------  -------------------------  -----------  -------------------  ---------------------------
  729291a0-ee87-44e5-9b75-cc6d50c89702  20230524T181718Z           geonames                           append-no-conflicts  4gheap                                  30260cf
  f91c33d0-ec93-48e1-975e-37476a5c9fe5  20230524T170134Z           geonames                           append-no-conflicts  4gheap                                  30260cf
  d942b7f9-6506-451d-9dcf-ef502ab3e574  20230524T144827Z           geonames                           append-no-conflicts  4gheap                                  30260cf
  a33845cc-c2e5-4488-a2db-b0670741ff9b  20230523T213145Z           geonames                           append-no-conflicts

Then, use the compare command, passing the TestExecution ID of your baseline test to --baseline and the ID of the test you want to compare against to --contender:

  opensearch-benchmark compare --baseline=729291a0-ee87-44e5-9b75-cc6d50c89702 --contender=a33845cc-c2e5-4488-a2db-b0670741ff9b

You should receive the following response comparing the final benchmark metrics for both tests:

     ____                  _____                      __       ____                  __                         __
    / __ \____  ___  ____ / ___/___  ____ ___________/ /_     / __ )___  ____  _____/ /_  ____ ___  ____ ______/ /__
   / / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \   / __  / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
  / /_/ / /_/ /  __/ / / /__/ /  __/ /_/ / /  / /__/ / / /  / /_/ /  __/ / / / /__/ / / / / / / / / /_/ / /  / ,<
  \____/ .___/\___/_/ /_/____/\___/\__,_/_/   \___/_/ /_/  /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/  /_/|_|
      /_/

  Comparing baseline
    TestExecution ID: 729291a0-ee87-44e5-9b75-cc6d50c89702
    TestExecution timestamp: 2023-05-24 18:17:18

  with contender
    TestExecution ID: a33845cc-c2e5-4488-a2db-b0670741ff9b
    TestExecution timestamp: 2023-05-23 21:31:45

  ------------------------------------------------------
      _______            __   _____
     / ____(_)___  ____ _/ /  / ___/_________  ________
    / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
   / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
  /_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
  ------------------------------------------------------
  Metric                                                      Baseline    Contender          Diff
  --------------------------------------------------------  ----------  -----------  ------------
  Min Indexing Throughput [docs/s]                                 19501        19118    -383.00000
  Median Indexing Throughput [docs/s]                              20232      19927.5    -304.45833
  Max Indexing Throughput [docs/s]                                 21172        20849    -323.00000
  Total indexing time [min]                                      55.7989       56.335      +0.53603
  Total merge time [min]                                         12.9766      13.3115      +0.33495
  Total refresh time [min]                                       5.20067      5.20097      +0.00030
  Total flush time [min]                                       0.0648667    0.0681833      +0.00332
  Total merge throttle time [min]                               0.796417     0.879267      +0.08285
  Query latency term (50.0 percentile) [ms]                      2.10049      2.15421      +0.05372
  Query latency term (90.0 percentile) [ms]                      2.77537      2.84168      +0.06630
  Query latency term (100.0 percentile) [ms]                     4.52081      5.15368      +0.63287
  Query latency country_agg (50.0 percentile) [ms]               112.049      110.385      -1.66392
  Query latency country_agg (90.0 percentile) [ms]               128.426      124.005      -4.42138
  Query latency country_agg (100.0 percentile) [ms]              155.989      133.797     -22.19185
  Query latency scroll (50.0 percentile) [ms]                    16.1226      14.4974      -1.62519
  Query latency scroll (90.0 percentile) [ms]                    17.2383      15.4079      -1.83043
  Query latency scroll (100.0 percentile) [ms]                   18.8419      18.4241      -0.41784
  Query latency country_agg_cached (50.0 percentile) [ms]        1.70223      1.64502      -0.05721
  Query latency country_agg_cached (90.0 percentile) [ms]        2.34819      2.04318      -0.30500
  Query latency country_agg_cached (100.0 percentile) [ms]       3.42547      2.86814      -0.55732
  Query latency default (50.0 percentile) [ms]                   5.89058      5.83409      -0.05648
  Query latency default (90.0 percentile) [ms]                   6.71282      6.64662      -0.06620
  Query latency default (100.0 percentile) [ms]                  7.65307       7.3701      -0.28297
  Query latency phrase (50.0 percentile) [ms]                    1.82687      1.83193      +0.00506
  Query latency phrase (90.0 percentile) [ms]                    2.63714      2.46286      -0.17428
  Query latency phrase (100.0 percentile) [ms]                   5.39892      4.22367      -1.17525
  Median CPU usage (index) [%]                                   668.025       679.15     +11.12499
  Median CPU usage (stats) [%]                                    143.75        162.4     +18.64999
  Median CPU usage (search) [%]                                    223.1        229.2      +6.10000
  Total Young Gen GC time [s]                                     39.447       40.456      +1.00900
  Total Young Gen GC count                                            10           11      +1.00000
  Total Old Gen GC time [s]                                        7.108        7.703      +0.59500
  Total Old Gen GC count                                              10           11      +1.00000
  Index size [GB]                                                3.25475      3.25098      -0.00377
  Total written [GB]                                             17.8434      18.3143      +0.47083
  Heap used for segments [MB]                                    21.7504      21.5901      -0.16037
  Heap used for doc values [MB]                                  0.16436      0.13905      -0.02531
  Heap used for terms [MB]                                       20.0293      19.9159      -0.11345
  Heap used for norms [MB]                                      0.105469    0.0935669      -0.01190
  Heap used for points [MB]                                     0.773487     0.772155      -0.00133
  Heap used for stored fields [MB]                              0.677795     0.669426      -0.00837
  Segment count                                                      136          121     -15.00000
  Indices Stats(90.0 percentile) [ms]                            3.16053      3.21023      +0.04969
  Indices Stats(99.0 percentile) [ms]                            5.29526      3.94132      -1.35393
  Indices Stats(100.0 percentile) [ms]                           5.64971      7.02374      +1.37403
  Nodes Stats(90.0 percentile) [ms]                              3.19611      3.15251      -0.04360
  Nodes Stats(99.0 percentile) [ms]                              4.44111      4.87003      +0.42892
  Nodes Stats(100.0 percentile) [ms]                             5.22527      5.66977      +0.44450

Options

You can use the following options to customize the results of your test comparison. A combined example follows the list:

  • --baseline: The TestExecution ID of the baseline test that the contender is compared against.
  • --contender: The TestExecution ID for the contender being compared to the baseline.
  • --results-format: Defines the output format for the command line results, either markdown or csv. Default is markdown.
  • --results-number-align: Defines the number alignment in the output columns of the comparison results. Default is right.
  • --results-file: When provided a file path, writes the comparison results to that file.
  • --show-in-results: Determines whether or not to include the comparison in the results file.
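
For example, the following invocation is a sketch of how several of these options can be combined. It reuses the baseline and contender TestExecution IDs from the listing above; the output file name final_comparison.csv is only an illustrative placeholder:

  opensearch-benchmark compare --baseline=729291a0-ee87-44e5-9b75-cc6d50c89702 --contender=a33845cc-c2e5-4488-a2db-b0670741ff9b --results-format=csv --results-file=final_comparison.csv

This renders the comparison in CSV format and writes it to final_comparison.csv.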