Append Table

If a table does not have a primary key defined, it is an append table by default.

In streaming mode, you can only insert complete records into the table. This type of table is suitable for use cases that do not require streaming updates, such as log data synchronization.

Flink

  CREATE TABLE my_table (
      product_id BIGINT,
      price DOUBLE,
      sales BIGINT
  );
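A minimal sketch of a streaming write into this table follows; `raw_events` is a hypothetical upstream source table used only for illustration:

  -- Run the job in streaming mode (Flink SQL client setting).
  SET 'execution.runtime-mode' = 'streaming';

  -- Append complete records; `raw_events` is an assumed upstream table.
  INSERT INTO my_table
  SELECT product_id, price, sales FROM raw_events;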

Data Distribution

By default, an append table has no concept of buckets; it acts much like a Hive table. Data files are placed under partitions, where they can be reorganized and reordered to speed up queries.
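For example, an append table is often partitioned so that its data files are grouped under partition directories; a minimal sketch in Flink SQL, where the table and column names are illustrative assumptions:

  -- Partitioned append table (no primary key): data files land under dt=... partitions.
  CREATE TABLE my_log_table (
      log_time TIMESTAMP(3),
      level STRING,
      message STRING,
      dt STRING
  ) PARTITIONED BY (dt);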

Automatic small file merging

In a streaming write job without a bucket definition, the writer does no compaction itself. Instead, a Compact Coordinator scans for small files and passes compaction tasks to a Compact Worker. In streaming mode, if you run an INSERT SQL statement in Flink, the topology looks like this:

[Append Table - Figure 1: streaming write topology]

Do not worry about backpressure: compaction never causes backpressure.

If you set write-only to true, the Compact Coordinator and Compact Worker will be removed from the topology.
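As a sketch, write-only can be set as a table option on the table defined above (it can also be supplied when the table is created):

  -- Disable compaction inside the write job; a dedicated compaction job
  -- (see below) is then expected to handle small files.
  ALTER TABLE my_table SET ('write-only' = 'true');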

Automatic compaction is only supported in Flink streaming mode. You can also start a dedicated compaction job in Flink through Paimon's Flink action, and disable all other compaction by setting write-only, as sketched below.
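A rough sketch of submitting such a dedicated compaction job with the Paimon Flink action jar; the jar path, version, and placeholder values depend on your deployment:

  # Submit a dedicated compaction job for one table; replace the placeholders.
  <FLINK_HOME>/bin/flink run \
      /path/to/paimon-flink-action-<version>.jar \
      compact \
      --warehouse <warehouse-path> \
      --database <database-name> \
      --table <table-name>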