Append Table
If a table does not have a primary key defined, it is an append table by default.
In streaming mode, you can only insert complete records into an append table. This type of table is suitable for use cases that do not require streaming updates, such as log data synchronization.
Flink
CREATE TABLE my_table (
    product_id BIGINT,
    price DOUBLE,
    sales BIGINT
);
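As a minimal sketch of what append-only writing means, the statement below inserts rows into the my_table defined above. The values are made up for illustration. Because the table has no primary key, two rows with the same product_id are both kept rather than merged.

```sql
-- Illustrative values only; every row is appended as a complete record.
INSERT INTO my_table VALUES
    (1, CAST(9.99 AS DOUBLE), 100),
    (1, CAST(9.99 AS DOUBLE), 250),  -- same product_id, still stored as a separate row
    (2, CAST(19.50 AS DOUBLE), 42);
```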
Data Distribution
By default, an append table has no concept of buckets and behaves much like a Hive table. Data files are placed under partitions, where they can be reorganized and reordered to speed up queries.
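The sketch below shows one way a partitioned append table might be declared, assuming a hypothetical table name my_log_table and a hypothetical partition column dt. No bucket option is set, so the default (bucket-less) layout described above applies and data files land under each dt partition.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE my_log_table (
    product_id BIGINT,
    price DOUBLE,
    sales BIGINT,
    dt STRING
) PARTITIONED BY (dt);
```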
Automatic small file merging
In a streaming write job on a table without a bucket definition, the writer performs no compaction. Instead, a Compact Coordinator scans for small files and passes compaction tasks to a Compact Worker. In streaming mode, if you run an INSERT statement in Flink, the job topology includes these two operators in addition to the writer.
Do not worry about backpressure: compaction never causes backpressure.
If you set write-only to true, the Compact Coordinator and Compact Worker are removed from the topology.
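As a sketch of how write-only could be set, the table option can be supplied in the WITH clause at creation time. The table name my_write_only_table is hypothetical; the option key write-only is the one named in the text.

```sql
-- Hypothetical table name; 'write-only' is the option described above.
CREATE TABLE my_write_only_table (
    product_id BIGINT,
    price DOUBLE,
    sales BIGINT
) WITH (
    'write-only' = 'true'  -- writers skip compaction; no Compact Coordinator / Worker
);
```

With this setting, small files accumulate until some other job compacts them, so it is usually paired with a dedicated compaction job.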
Auto compaction is only supported in Flink engine streaming mode. You can also start a dedicated compaction job in Flink using the Paimon Flink action, and disable all other compaction by setting write-only to true.
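Besides the Flink action, a compaction job can also be triggered from Flink SQL through Paimon's compact procedure. The call below is a sketch that assumes a Paimon and Flink version in which the sys.compact procedure is available, and that the table lives in a database named default; check the procedure reference for your version before relying on it.

```sql
-- Assumes the table is default.my_table and that sys.compact exists in your Paimon version.
CALL sys.compact('default.my_table');
```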