Dynamic Partition

Dynamic partition is a feature introduced in Doris version 0.12. It is designed to manage the Time-to-Live (TTL) of partitions, reducing the maintenance burden on users.

For the original design, implementation, and effect, see ISSUE 2262.

Currently, only adding partitions dynamically is implemented; the next version will support removing partitions dynamically.

Terminology

  • FE: Frontend, the front-end node of Doris. Responsible for metadata management and request access.
  • BE: Backend, Doris’s back-end node. Responsible for query execution and data storage.

Principle

In some scenarios, the user partitions a table by day and runs routine tasks against it every day. In this case the user has to manage the partitions manually; otherwise a data import may fail because someone forgot to create the partition, which adds maintenance cost.

In the current design, FE runs a background thread. Whether the thread starts, and how often it is scheduled, is determined by the parameters dynamic_partition_enable and dynamic_partition_check_interval_seconds in fe.conf.

When an OLAP table is created, its dynamic_partition properties are assigned. FE first parses the dynamic_partition properties and checks that the input parameters are valid, then persists the properties to FE metadata and, at the same time, registers the table in the dynamic partition list. The daemon thread scans the dynamic partition list periodically according to the configuration parameters, reads each table's dynamic partition properties, and adds the required partitions. The scheduling information of each run is kept in FE memory. You can check whether a scheduling task succeeded through SHOW DYNAMIC PARTITION TABLES.
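The scan described above can be sketched in a few lines of Python. This is only an illustration of the behaviour, not FE's actual (Java) implementation: the DynamicTable class and run_schedule_pass function are hypothetical names, and only the DAY time unit is modelled.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class DynamicTable:
    """Hypothetical stand-in for a table registered in the dynamic partition list."""
    name: str
    enable: bool
    end: int          # dynamic_partition.end
    prefix: str       # dynamic_partition.prefix
    partitions: set = field(default_factory=set)  # names of existing partitions


def run_schedule_pass(tables, today):
    """Simulate one scan of the list; return the partitions added per table."""
    added = {}
    for table in tables:
        if not table.enable:
            continue  # tables with dynamic_partition.enable = false are skipped
        wanted = [
            f"{table.prefix}{today + timedelta(days=offset):%Y%m%d}"
            for offset in range(table.end + 1)
        ]
        # Partitions that already exist are left alone.
        created = [name for name in wanted if name not in table.partitions]
        table.partitions.update(created)
        added[table.name] = created
    return added
```

Running the pass twice for the same day adds nothing the second time, which mirrors the "ignore if the partition already exists" behaviour.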

Usage

Create Tables

When creating a table, you can specify the attribute dynamic_partition in PROPERTIES, which means that the table is a dynamic partition table.

Examples:

  CREATE TABLE example_db.dynamic_partition
  (
      k1 DATE,
      k2 INT,
      k3 SMALLINT,
      v1 VARCHAR(2048),
      v2 DATETIME DEFAULT "2014-02-04 15:36:00"
  )
  ENGINE=olap
  DUPLICATE KEY(k1, k2, k3)
  PARTITION BY RANGE (k1)
  (
      PARTITION p1 VALUES LESS THAN ("2014-01-01"),
      PARTITION p2 VALUES LESS THAN ("2014-06-01"),
      PARTITION p3 VALUES LESS THAN ("2014-12-01")
  )
  DISTRIBUTED BY HASH(k2) BUCKETS 32
  PROPERTIES(
      "storage_medium" = "SSD",
      -- dynamic partition properties
      "dynamic_partition.enable" = "true",
      "dynamic_partition.time_unit" = "DAY",
      "dynamic_partition.end" = "3",
      "dynamic_partition.prefix" = "p",
      "dynamic_partition.buckets" = "32"
  );

This statement creates a table with the dynamic partition feature enabled. Taking 2020-01-08 as today, each scheduling run creates, in advance, the partitions for today and the following 3 days, four partitions in total (a partition that already exists is skipped). The partitions are named with the specified prefix: p20200108, p20200109, p20200110, and p20200111. Each partition has 32 buckets, and the partition ranges are as follows:

  [types: [DATE]; keys: [2020-01-08]; types: [DATE]; keys: [2020-01-09]; )
  [types: [DATE]; keys: [2020-01-09]; types: [DATE]; keys: [2020-01-10]; )
  [types: [DATE]; keys: [2020-01-10]; types: [DATE]; keys: [2020-01-11]; )
  [types: [DATE]; keys: [2020-01-11]; types: [DATE]; keys: [2020-01-12]; )
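The names and ranges in this example follow directly from the date, the end value, and the prefix. A minimal Python sketch of that arithmetic, assuming the DAY time unit and the yyyyMMdd naming shown above (daily_partitions is a hypothetical helper, not a Doris API):

```python
from datetime import date, timedelta


def daily_partitions(today, end, prefix):
    """Return (name, lower_bound, upper_bound) for today and the next `end` days.

    Each range is half-open: [lower_bound, upper_bound).
    """
    result = []
    for offset in range(end + 1):
        lower = today + timedelta(days=offset)
        upper = lower + timedelta(days=1)
        result.append((f"{prefix}{lower:%Y%m%d}", lower.isoformat(), upper.isoformat()))
    return result


for name, lower, upper in daily_partitions(date(2020, 1, 8), 3, "p"):
    print(f"{name}: [{lower}, {upper})")
```

Because "dynamic_partition.end" counts the days after today, an end value of 3 yields end + 1 = 4 partitions.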

Enable Dynamic Partition Feature

  1. First, set dynamic_partition_enable=true in fe.conf. This can be done by editing the configuration file before the cluster starts, or by changing the value dynamically through the HTTP interface at run time.

  2. To add dynamic partition properties to a table created prior to version 0.12, modify the properties of the table with the following command:

  ALTER TABLE dynamic_partition SET ("dynamic_partition.enable" = "true", "dynamic_partition.time_unit" = "DAY", "dynamic_partition.end" = "3", "dynamic_partition.prefix" = "p", "dynamic_partition.buckets" = "32");

Disable Dynamic Partition Feature

To stop dynamic partitioning for all dynamic partition tables in the cluster, set dynamic_partition_enable=false in fe.conf.

To stop dynamic partitioning for a specific table, modify the properties of the table with the following command:

  ALTER TABLE dynamic_partition SET ("dynamic_partition.enable" = "false");

Modify Dynamic Partition Properties

You can modify the dynamic partition properties of a table with the following command, where "key" is one of the dynamic_partition.* properties:

  ALTER TABLE dynamic_partition SET ("key" = "value");
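When several properties change at once, the statement can be generated rather than typed by hand. The following is a small Python sketch; alter_dynamic_partition_sql is a hypothetical helper that only formats the statement shown above. It does not escape quotes or validate values, so it is only suitable for trusted input.

```python
def alter_dynamic_partition_sql(table, properties):
    """Render an ALTER TABLE ... SET (...) statement for dynamic partition properties."""
    for key in properties:
        if not key.startswith("dynamic_partition."):
            raise ValueError(f"not a dynamic partition property: {key}")
    pairs = ", ".join(f'"{key}" = "{value}"' for key, value in properties.items())
    return f"ALTER TABLE {table} SET ({pairs});"


# Example: keep partitions created 7 days ahead instead of 3.
print(alter_dynamic_partition_sql("dynamic_partition", {"dynamic_partition.end": "7"}))
```

The generated text can then be run through any MySQL-compatible client connected to FE.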

Check Dynamic Partition Table Scheduling Status

You can further view the scheduling of dynamic partitioned tables by using the following command:

  SHOW DYNAMIC PARTITION TABLES;
  +-------------------+--------+----------+------+--------+---------+---------------------+---------------------+--------+------+
  | TableName         | Enable | TimeUnit | End  | Prefix | Buckets | LastUpdateTime      | LastSchedulerTime   | State  | Msg  |
  +-------------------+--------+----------+------+--------+---------+---------------------+---------------------+--------+------+
  | dynamic_partition | true   | DAY      | 3    | p      | 32      | 2020-01-08 20:19:09 | 2020-01-08 20:19:34 | NORMAL | N/A  |
  +-------------------+--------+----------+------+--------+---------+---------------------+---------------------+--------+------+
  1 row in set (0.00 sec)

  • LastUpdateTime: The last time of modifying dynamic partition properties
  • LastSchedulerTime: The last time of performing dynamic partition scheduling
  • State: The state of the last execution of dynamic partition scheduling
  • Msg: Error message for the last time dynamic partition scheduling was performed
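For routine monitoring, this output can be checked by a script. The sketch below parses the text table as printed by the MySQL client and lists tables whose last run did not end in the NORMAL state. Both function names are hypothetical, and treating every state other than NORMAL as a failure is an assumption, since only NORMAL appears in the example above.

```python
SAMPLE_OUTPUT = """
+-------------------+--------+----------+------+--------+---------+---------------------+---------------------+--------+------+
| TableName         | Enable | TimeUnit | End  | Prefix | Buckets | LastUpdateTime      | LastSchedulerTime   | State  | Msg  |
+-------------------+--------+----------+------+--------+---------+---------------------+---------------------+--------+------+
| dynamic_partition | true   | DAY      | 3    | p      | 32      | 2020-01-08 20:19:09 | 2020-01-08 20:19:34 | NORMAL | N/A  |
+-------------------+--------+----------+------+--------+---------+---------------------+---------------------+--------+------+
"""


def parse_table_output(text):
    """Turn MySQL-client style table text into a list of {column: value} dicts."""
    rows = [line.strip() for line in text.splitlines() if line.strip().startswith("|")]
    cells = [[cell.strip() for cell in row.strip("|").split("|")] for row in rows]
    if not cells:
        return []
    header, body = cells[0], cells[1:]
    return [dict(zip(header, row)) for row in body]


def tables_needing_attention(records):
    """Return (TableName, Msg) for rows whose State is not NORMAL (assumed failure)."""
    return [(r["TableName"], r["Msg"]) for r in records if r["State"] != "NORMAL"]


records = parse_table_output(SAMPLE_OUTPUT)
print(records[0]["LastSchedulerTime"], tables_needing_attention(records))
```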

Advanced Operation

FE Configuration Item

  • dynamic_partition_enable

    Whether to enable Doris’s dynamic partition feature. The default value is false, which is off. This parameter only affects the partitioning operation of dynamic partition tables, not normal tables.

  • dynamic_partition_check_interval_seconds

    The interval, in seconds, at which the dynamic partition thread runs. The default is 3600 (1 hour), which means scheduling runs once every hour.

HTTP Restful API

Doris provides an HTTP Restful API for modifying dynamic partition configuration parameters at run time.

The API is implemented in FE; users can access it at fe_host:fe_http_port. The operation requires admin privilege.

  1. Set dynamic_partition_enable to true or false

    • Set to true

      GET /api/_set_config?dynamic_partition_enable=true
      For example: curl --location-trusted -u username:password -XGET "http://fe_host:fe_http_port/api/_set_config?dynamic_partition_enable=true"
      Return Code: 200
    • Set to false

      GET /api/_set_config?dynamic_partition_enable=false
      For example: curl --location-trusted -u username:password -XGET "http://fe_host:fe_http_port/api/_set_config?dynamic_partition_enable=false"
      Return Code: 200
  2. Set the scheduling frequency for dynamic partition

    • Set schedule frequency to 12 hours.

      GET /api/_set_config?dynamic_partition_check_interval_seconds=43200
      For example: curl --location-trusted -u username:password -XGET "http://fe_host:fe_http_port/api/_set_config?dynamic_partition_check_interval_seconds=43200"
      Return Code: 200
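
The interval is given in seconds, so it is easy to get the conversion wrong (12 hours is 12 × 3600 = 43200 seconds). The Python sketch below builds the request URL from a value in hours and sends it with the standard library. The function names are hypothetical; the host, port, and credentials are placeholders; and, unlike curl --location-trusted, this sketch does not resend credentials after a redirect.

```python
import base64
import urllib.parse
import urllib.request


def hours_to_seconds(hours):
    """Convert hours to whole seconds for dynamic_partition_check_interval_seconds."""
    return int(hours * 3600)


def set_config_url(fe_host, fe_http_port, key, value):
    """Build the /api/_set_config URL for a single configuration key."""
    query = urllib.parse.urlencode({key: value})
    return f"http://{fe_host}:{fe_http_port}/api/_set_config?{query}"


def set_config(fe_host, fe_http_port, key, value, username, password):
    """Send the GET request with HTTP basic auth and return the HTTP status code."""
    request = urllib.request.Request(set_config_url(fe_host, fe_http_port, key, value), method="GET")
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request) as response:
        return response.status


# Prints the URL only; call set_config(...) against a real FE to apply the change.
print(set_config_url("fe_host", "fe_http_port",
                     "dynamic_partition_check_interval_seconds", hours_to_seconds(12)))
```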