Min Load Replica Num

By default, a data import succeeds only if more than half of the replicas are written successfully. This rule is not always flexible enough and can be inconvenient in some scenarios.

For example, with two replicas, an import succeeds only if both replicas are written successfully. No replica may be unavailable during the import, which greatly reduces the availability of the cluster.

To address this, Doris lets users set the minimum number of replicas that must be written. An import task succeeds when the number of replicas it writes successfully is greater than or equal to this minimum.
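As a rough illustration of this rule, the following Python sketch compares the default majority requirement with a user-set minimum. The function names `majority` and `load_succeeds` are hypothetical and are not part of Doris; they only model the success condition described above.

```python
def majority(replication_num: int) -> int:
    """Replicas required by the default rule (more than half)."""
    return replication_num // 2 + 1


def load_succeeds(written_replicas: int, required_replicas: int) -> bool:
    """A load succeeds when enough replicas were written successfully."""
    return written_replicas >= required_replicas


if __name__ == "__main__":
    # Two replicas, one of them unavailable during the load.
    print(load_succeeds(1, majority(2)))  # default rule requires both replicas
    print(load_succeeds(1, 1))            # minimum lowered to 1 by the user
```

With two replicas the default requirement equals the replica count, which is why lowering the minimum to 1 is what allows the load to tolerate one unavailable replica.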

Usage

Min load replica num for single table

You can set the table property min_load_replica_num for a single OLAP table. A valid value must be greater than 0 and must not exceed replication_num (the number of replicas of the table). The default value is -1, which means the property is not enabled.
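The range described above can be written as a small check. This is only a sketch: `enables_min_load_replica_num` is a hypothetical helper, not a Doris function, and it simply restates the documented bounds.

```python
def enables_min_load_replica_num(value: int, replication_num: int) -> bool:
    """True when value enables the property: 0 < value <= replication_num.

    The default of -1 (property not enabled) therefore returns False.
    """
    return 0 < value <= replication_num


if __name__ == "__main__":
    for candidate in (-1, 0, 1, 2, 3):
        # Assumes a table created with replication_num = 2.
        print(candidate, enables_min_load_replica_num(candidate, 2))
```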

The min_load_replica_num of the table can be set when creating the table.

    CREATE TABLE test_table1
    (
        k1 INT,
        k2 INT
    )
    DUPLICATE KEY(k1)
    DISTRIBUTED BY HASH(k1) BUCKETS 5
    PROPERTIES
    (
        'replication_num' = '2',
        'min_load_replica_num' = '1'
    );

For an existing table, you can use ALTER TABLE to modify its min_load_replica_num.

    ALTER TABLE test_table1
    SET ( 'min_load_replica_num' = '1' );

You can use SHOW CREATE TABLE to view the table property min_load_replica_num.

    SHOW CREATE TABLE test_table1;

The PROPERTIES section of the output contains min_load_replica_num. For example:

    Create Table: CREATE TABLE `test_table1` (
      `k1` int(11) NULL,
      `k2` int(11) NULL
    ) ENGINE=OLAP
    DUPLICATE KEY(`k1`)
    COMMENT 'OLAP'
    DISTRIBUTED BY HASH(`k1`) BUCKETS 5
    PROPERTIES (
    "replication_allocation" = "tag.location.default: 2",
    "min_load_replica_num" = "1",
    "storage_format" = "V2",
    "light_schema_change" = "true",
    "disable_auto_compaction" = "false",
    "enable_single_replica_compaction" = "false"
    );

Global min load replica num for all tables

You can set the FE configuration item min_load_replica_num to apply to all OLAP tables. A valid value must be greater than 0. The default value is -1, which means the global minimum number of load replicas is not enabled.

For a given table, if the table property min_load_replica_num is valid (> 0), the table ignores the global configuration min_load_replica_num. Otherwise, if the global configuration min_load_replica_num is valid (> 0), the minimum number of load replicas for the table is min(FE.conf.min_load_replica_num, table.replication_num/2 + 1). The global setting can therefore lower the requirement below a majority, but it cannot raise it above one.

To view and modify FE configuration items, refer to the FE configuration documentation.

Other cases

If neither the table property min_load_replica_num nor the global configuration min_load_replica_num is enabled (both <= 0), a data import must still be written successfully to a majority of the replicas. In this case, the minimum number of write replicas for the table is table.replication_num/2 + 1, using integer division (for example, 2 when replication_num is 3).
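Taken together, the three cases (table property, global configuration, neither) can be sketched in Python as follows. The function `required_load_replicas` and its parameter names are hypothetical; the sketch mirrors the rules stated in this document rather than the actual Doris source, and it assumes the table property has already been checked to be no larger than replication_num.

```python
def required_load_replicas(table_min: int, fe_min: int, replication_num: int) -> int:
    """Minimum number of replicas a load must write, per the documented rules.

    table_min: table property min_load_replica_num (-1 when not enabled)
    fe_min:    FE configuration min_load_replica_num (-1 when not enabled)
    """
    majority = replication_num // 2 + 1  # integer division
    if table_min > 0:
        # A valid table property takes precedence over the global setting.
        return table_min
    if fe_min > 0:
        # The global setting is capped at the majority.
        return min(fe_min, majority)
    # Neither is enabled: fall back to the majority rule.
    return majority


if __name__ == "__main__":
    print(required_load_replicas(1, -1, 3))   # table property wins
    print(required_load_replicas(-1, 3, 3))   # global value capped at majority
    print(required_load_replicas(-1, -1, 3))  # default majority
```

Checking the table property first keeps per-table overrides independent of whatever value is configured globally.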