Expiring Partitions

You can set partition.expiration-time when creating a partitioned table. Paimon periodically checks the partitions and deletes those that have expired.

To decide whether a partition has expired, Paimon compares the time extracted from the partition value with the current time. The partition is expired once its age exceeds partition.expiration-time.
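The comparison described above can be sketched in plain Java. This is only an illustration of the rule, not Paimon's implementation: the class PartitionExpirySketch and the method isExpired are hypothetical names, and the sketch assumes a date-only partition value that is interpreted as midnight of that day.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class PartitionExpirySketch {

    // Hypothetical helper (not a Paimon API): returns true when the partition's
    // age, measured from the time parsed out of its value, exceeds expirationTime.
    static boolean isExpired(String partitionValue, String formatterPattern,
                             Duration expirationTime, LocalDateTime now) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(formatterPattern);
        // Assumption: a date-only value such as '20240101' means midnight of that day.
        LocalDateTime partitionTime = LocalDate.parse(partitionValue, formatter).atStartOfDay();
        return now.isAfter(partitionTime.plus(expirationTime));
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.of(2024, 1, 10, 12, 0);
        Duration sevenDays = Duration.ofDays(7); // mirrors 'partition.expiration-time' = '7 d'
        System.out.println(isExpired("20240101", "yyyyMMdd", sevenDays, now)); // true
        System.out.println(isExpired("20240105", "yyyyMMdd", sevenDays, now)); // false
    }
}
```

With a 7-day expiration time and a current time of 2024-01-10 12:00, the partition 20240101 is more than 7 days old and counts as expired, while 20240105 does not.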

Note: An expired partition is deleted logically, so the latest snapshot can no longer query its data. The files in the file system are not physically deleted right away; they are removed when the corresponding snapshot expires. See Expire Snapshots.

An example with a single partition field:

  CREATE TABLE t (...) PARTITIONED BY (dt) WITH (
      'partition.expiration-time' = '7 d',
      'partition.expiration-check-interval' = '1 d',
      'partition.timestamp-formatter' = 'yyyyMMdd'
  );

An example with multiple partition fields:

  CREATE TABLE t (...) PARTITIONED BY (other_key, dt) WITH (
      'partition.expiration-time' = '7 d',
      'partition.expiration-check-interval' = '1 d',
      'partition.timestamp-formatter' = 'yyyyMMdd',
      'partition.timestamp-pattern' = '$dt'
  );

More options:

Option: partition.expiration-check-interval
Default: 1 h
Type: Duration
Description: The check interval of partition expiration.

Option: partition.expiration-time
Default: (none)
Type: Duration
Description: The expiration interval of a partition. A partition expires once its lifetime exceeds this value. Partition time is extracted from the partition value.

Option: partition.timestamp-formatter
Default: (none)
Type: String
Description: The formatter used to parse the timestamp from a string. It can be used with 'partition.timestamp-pattern' to create a formatter using the specified value.
  • The default formatters are 'yyyy-MM-dd HH:mm:ss' and 'yyyy-MM-dd'.
  • Supports multiple partition fields like '$year-$month-$day $hour:00:00'.
  • The timestamp-formatter is compatible with Java's DateTimeFormatter.

Option: partition.timestamp-pattern
Default: (none)
Type: String
Description: You can specify a pattern to get a timestamp from partitions. The formatter pattern is defined by 'partition.timestamp-formatter'.
  • By default, the timestamp is read from the first field.
  • If the timestamp in the partition is a single field called 'dt', you can use '$dt'.
  • If it is spread across multiple fields for year, month, day, and hour, you can use '$year-$month-$day $hour:00:00'.
  • If the timestamp is in fields dt and hour, you can use '$dt $hour:00:00'.
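
To make the interplay of partition.timestamp-pattern and partition.timestamp-formatter more concrete, here is a minimal Java sketch. It is an assumption about how such a substitution could work, not Paimon's actual code: the class TimestampPatternSketch and the method applyPattern are hypothetical, and the partition values are made up for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TimestampPatternSketch {

    // Hypothetical helper (not a Paimon API): replaces every $field placeholder
    // in the pattern with the value of that partition field.
    static String applyPattern(String pattern, Map<String, String> partition) {
        Matcher m = Pattern.compile("\\$(\\w+)").matcher(pattern);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = partition.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("Unknown partition field: " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Illustrative partition with separate year, month, day, and hour fields.
        Map<String, String> partition = new LinkedHashMap<>();
        partition.put("year", "2024");
        partition.put("month", "01");
        partition.put("day", "09");
        partition.put("hour", "08");

        // The pattern assembles a timestamp string from the fields ...
        String text = applyPattern("$year-$month-$day $hour:00:00", partition);
        // ... and the formatter parses that string into a timestamp.
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
        LocalDateTime timestamp = LocalDateTime.parse(text, formatter);
        System.out.println(text + " -> " + timestamp);
    }
}
```

The pattern only assembles a string from the partition fields; the formatter must then match that assembled string, which is why the two options are usually configured together.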