Deletion Vectors
Overview
The Deletion Vectors mode is designed to take both read and write efficiency into account.
In this mode, writing incurs additional overhead (looking up the LSM tree and generating the corresponding deletion file), but during reading, data can be retrieved directly by applying the deletion vectors to the data files, which avoids the cost of merging different files.
Furthermore, data reading concurrency is no longer limited, and non-primary-key columns can also be used for filter pushdown. Generally speaking, this mode greatly improves read performance without losing too much write performance.
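The read path described above can be illustrated with a toy model. This is only a sketch of the general idea, not the actual storage format or implementation: the row shape, the use of a Python set as the deletion vector, and the function name `read_with_deletion_vector` are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

Row = Tuple[int, str]  # (primary key, value); hypothetical row shape


def read_with_deletion_vector(rows: List[Row], deleted_positions: Set[int]) -> List[Row]:
    """Return the rows of one data file whose positions are not marked as deleted."""
    return [row for pos, row in enumerate(rows) if pos not in deleted_positions]


# An older file holds three rows; a later write updates key 2 in a new file.
old_file = [(1, "a"), (2, "b"), (3, "c")]
new_file = [(2, "b2")]

# At write time, the old position of key 2 (index 1) is looked up and recorded
# as deleted. This lookup is the extra write overhead.
old_file_deletions = {1}

# At read time, each file is filtered independently; no merge by key is needed.
result = (
    read_with_deletion_vector(old_file, old_file_deletions)
    + read_with_deletion_vector(new_file, set())
)
print(sorted(result))
```

Because each file is filtered on its own, files can be read in parallel, and a filter on a non-key column can be applied to a file without risking that a stale row version slips through.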
Usage
By specifying 'deletion-vectors.enabled' = 'true', the Deletion Vectors mode can be enabled.
Limitation
- 'changelog-producer' needs to be 'none' or 'lookup'.
- 'changelog-producer.lookup-wait' can't be 'false'.
- 'merge-engine' can't be 'first-row', because reading a first-row table already requires no merging, so deletion vectors are not needed.
- This mode will filter the data in level-0, so there will be a data delay when using time travel to read an APPEND snapshot.
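The option constraints listed above could be checked before creating a table. The following is a minimal sketch, assuming the table options are available as a plain dictionary of strings. The function `check_deletion_vector_options` is hypothetical and not part of any library, and treating an unset option as acceptable is also an assumption.

```python
from typing import Dict, List


def check_deletion_vector_options(options: Dict[str, str]) -> List[str]:
    """Return the limitations violated by the given table options (hypothetical helper)."""
    problems: List[str] = []

    # The limitations only apply when Deletion Vectors mode is enabled.
    if options.get("deletion-vectors.enabled", "false").lower() != "true":
        return problems

    # Assumption: an option that is not set is treated as acceptable.
    producer = options.get("changelog-producer")
    if producer is not None and producer.lower() not in ("none", "lookup"):
        problems.append("changelog-producer needs to be none or lookup")

    if options.get("changelog-producer.lookup-wait", "").lower() == "false":
        problems.append("changelog-producer.lookup-wait can't be false")

    if options.get("merge-engine", "").lower() == "first-row":
        problems.append("merge-engine can't be first-row")

    return problems


valid = {"deletion-vectors.enabled": "true", "changelog-producer": "lookup"}
invalid = {"deletion-vectors.enabled": "true", "merge-engine": "first-row"}
print(check_deletion_vector_options(valid))
print(check_deletion_vector_options(invalid))
```

The level-0 filtering behavior is a runtime characteristic rather than an option conflict, so this check does not cover it.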