Read Optimized
Overview
A Primary Key Table uses a 'MergeOnRead' design. When reading data, multiple layers of LSM data are merged, and read parallelism is limited by the number of buckets. Although Paimon's merge is efficient, it still cannot match the read performance of an ordinary AppendOnly table.
We recommend that you use Deletion Vectors mode.
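As a minimal Flink SQL sketch, Deletion Vectors mode can be turned on with the 'deletion-vectors.enabled' table option. The table name `orders` and its columns are hypothetical and only serve as an illustration; adapt them to your own schema.

```sql
-- Hypothetical primary key table; only the WITH option matters here.
CREATE TABLE orders (
    order_id BIGINT,
    amount   DECIMAL(10, 2),
    PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
    -- Enable Deletion Vectors mode for this table.
    'deletion-vectors.enabled' = 'true'
);
```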
If you don't want to use Deletion Vectors mode but still need queries to be fast enough in certain scenarios, and you can accept reading only older data, you can also:
- Configure 'compaction.optimization-interval' when writing data. For streaming jobs, optimized compaction is then performed periodically; for batch jobs, optimized compaction is carried out when the job ends. (Alternatively, configure 'full-compaction.delta-commits'; its disadvantage is that it can only perform compaction synchronously, which affects write efficiency.)
- Query from the read-optimized system table. Reading from the results of optimized files avoids merging records with the same key, which improves read performance.
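The two steps above can be sketched in Flink SQL as follows. The table name `orders`, the column `order_id`, and the interval value '30 min' are assumptions chosen for illustration; the read-optimized system table is addressed here with the `$ro` suffix.

```sql
-- Step 1 (writer side): ask for optimized compaction at a fixed interval.
-- '30 min' is an example value, not a recommendation.
ALTER TABLE orders SET ('compaction.optimization-interval' = '30 min');

-- Step 2 (reader side): query the read-optimized system table.
-- Only already-optimized files are read, so results may lag behind
-- the latest writes by up to the configured interval.
SELECT * FROM orders$ro WHERE order_id = 1;
```

If the table is referenced with a fully qualified name, the system table part may need back quotes to avoid parsing conflicts, for example my_catalog.my_db.`orders$ro`.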
You can flexibly balance query performance and data latency when reading.