TOPN queries refer to queries that involve ORDER BY LIMIT operations, which are common in log retrieval and other detailed query scenarios. Doris automatically optimizes this type of query.

  1. SELECT * FROM tablex WHERE xxx ORDER BY c1,c2 ... LIMIT n

Optimization Points

  1. During execution, dynamic range filters are built for the sorting columns (e.g., c1 >= 1000), which automatically apply the preceding conditions when reading data, leveraging zonemap indexes to filter out some rows or even entire files.
  2. If the sorting fields c1, c2 are exactly the prefix of the table key, further optimization is applied. When reading data, only the header or tail of the data files is read, reducing the amount of data read to just the n rows needed.
  3. SELECT * deferred materialization, during the data reading and sorting process, only the sorting columns are read, not the other columns. After obtaining the row numbers that meet the conditions, the entire data of those n rows needed is read, significantly reducing the amount of data read and sorted.

Limitations

  1. It only applies to DUP and MOW tables, not to MOR and AGG tables.
  2. Due to the high memory consumption on very large n, it will not take effect if n is greater than topn_opt_limit_threshold.

Configuration and Query Analysis

The following two parameters are session variables that can be set for a specific SQL or globally.

  1. topn_opt_limit_threshold: This session variable determines whether TOPN optimization is applied. It defaults to 1024, and setting it to 0 disables the optimization.

  2. enable_two_phase_read_optimization: This session variable determines whether to enable this optimization. It defaults to true, and setting it to false disables the optimization.

Checking if TOPN Query Optimization is Enabled

To confirm if TOPN query optimization is enabled for a particular SQL, you can use the EXPLAIN statement to get the query plan. An example is as follows:

  • TOPN OPT indicates that optimization point 1 is applied.
  • VOlapScanNode with SORT LIMIT indicates optimization point 2 is applied.
  • OPT TWO PHASE indicates optimization point 3 is applied.
  1. 1:VTOP-N(137)
  2. | order by: @timestamp18 DESC
  3. | TOPN OPT
  4. | OPT TWO PHASE
  5. | offset: 0
  6. | limit: 10
  7. | distribute expr lists: applicationName5
  8. |
  9. 0:VOlapScanNode(106)
  10. TABLE: log_db.log_core_all_no_index(log_core_all_no_index), PREAGGREGATION: ON
  11. SORT INFO:
  12. @timestamp18
  13. SORT LIMIT: 10
  14. TOPN OPT:1
  15. PREDICATES: ZYCFC-TRACE-ID4 like '%flowId-1720055220933%'
  16. partitions=1/8 (p20240704), tablets=250/250, tabletList=1727094,1727096,1727098 ...
  17. cardinality=345472780, avgRowSize=0.0, numNodes=1
  18. pushAggOp=NONE

Checking the Effectiveness of TOPN Query Optimization During Execution

First, set topn_opt_limit_threshold to 0 to disable TOPN query optimization and compare the execution time of the SQL with and without optimization enabled.

After enabling TOPN query optimization, search for RuntimePredicate in the query profile and focus on the following metrics:

  • RowsZonemapRuntimePredicateFiltered: The number of rows filtered out, the higher the better.
  • NumSegmentFiltered: The number of data files filtered out, the higher the better.
  • BlockConditionsFilteredZonemapRuntimePredicateTime: The time taken to filter data, the lower the better.

Before version 2.0.3, the RuntimePredicate metrics were not separated out, and the Zonemap metrics can be used as a rough guide.

  1. SegmentIterator:
  2. - BitmapIndexFilterTimer: 46.54us
  3. - BlockConditionsFilteredBloomFilterTime: 10.352us
  4. - BlockConditionsFilteredDictTime: 7.299us
  5. - BlockConditionsFilteredTime: 202.23ms
  6. - BlockConditionsFilteredZonemapRuntimePredicateTime: 0ns
  7. - BlockConditionsFilteredZonemapTime: 402.917ms
  8. - BlockInitSeekCount: 399
  9. - BlockInitSeekTime: 11.309ms
  10. - BlockInitTime: 215.59ms
  11. - BlockLoadTime: 7s567ms
  12. - BlocksLoad: 392.97K (392970)
  13. - CachedPagesNum: 0
  14. - CollectIteratorMergeTime: 0ns
  15. - CollectIteratorNormalTime: 0ns
  16. - CompressedBytesRead: 29.76 MB
  17. - DecompressorTimer: 427.713ms
  18. - ExprFilterEvalTime: 3s930ms
  19. - FirstReadSeekCount: 392.921K (392921)
  20. - FirstReadSeekTime: 528.287ms
  21. - FirstReadTime: 1s134ms
  22. - IOTimer: 51.286ms
  23. - InvertedIndexFilterTime: 49.457us
  24. - InvertedIndexQueryBitmapCopyTime: 0ns
  25. - InvertedIndexQueryBitmapOpTime: 0ns
  26. - InvertedIndexQueryCacheHit: 0
  27. - InvertedIndexQueryCacheMiss: 0
  28. - InvertedIndexQueryTime: 0ns
  29. - InvertedIndexSearcherOpenTime: 0ns
  30. - InvertedIndexSearcherSearchTime: 0ns
  31. - LazyReadSeekCount: 0
  32. - LazyReadSeekTime: 0ns
  33. - LazyReadTime: 106.952us
  34. - NumSegmentFiltered: 0
  35. - NumSegmentTotal: 50
  36. - OutputColumnTime: 61.987ms
  37. - OutputIndexResultColumnTimer: 12.345ms
  38. - RawRowsRead: 3.929151M (3929151)
  39. - RowsBitmapIndexFiltered: 0
  40. - RowsBloomFilterFiltered: 0
  41. - RowsConditionsFiltered: 6.38976M (6389760)
  42. - RowsDictFiltered: 0
  43. - RowsInvertedIndexFiltered: 0
  44. - RowsKeyRangeFiltered: 0
  45. - RowsShortCircuitPredFiltered: 0
  46. - RowsShortCircuitPredInput: 0
  47. - RowsStatsFiltered: 6.38976M (6389760)
  48. - RowsVectorPredFiltered: 0
  49. - RowsVectorPredInput: 0
  50. - RowsZonemapRuntimePredicateFiltered: 6.38976M (6389760)
  51. - SecondReadTime: 0ns
  52. - ShortPredEvalTime: 0ns
  53. - TotalPagesNum: 2.301K (2301)
  54. - UncompressedBytesRead: 137.99 MB
  55. - VectorPredEvalTime: 0ns