- FE Configuration
- View configuration items
- Set configuration items
- Examples
- Configurations
agent_task_resend_wait_time_ms
alter_table_timeout_second
async_load_task_pool_size
audit_log_delete_age
audit_log_dir
audit_log_modules
audit_log_roll_interval
audit_log_roll_mode
audit_log_roll_num
auth_token
autocommit
auto_increment_increment
backup_job_default_timeout_ms
backup_plugin_path
balance_load_score_threshold
batch_size
bdbje_heartbeat_timeout_second
bdbje_lock_timeout_second
broker_load_default_timeout_second
brpc_idle_wait_max_time
brpc_number_of_concurrent_requests_processed
capacity_used_percent_high_water
catalog_trash_expire_second
catalog_try_lock_timeout_ms
character_set_client
character_set_connection
character_set_results
character_set_server
check_consistency_default_timeout_second
check_java_version
clone_capacity_balance_threshold
clone_checker_interval_second
clone_distribution_balance_threshold
clone_high_priority_delay_second
clone_job_timeout_second
clone_low_priority_delay_second
clone_max_job_num
clone_normal_priority_delay_second
cluster_id
cluster_name
codegen_level
collation_connection
collation_database
collation_server
consistency_check_end_time
consistency_check_start_time
db_used_data_quota_update_interval_secs
default_rowset_type
default_storage_medium
delete_thread_num
desired_max_waiting_jobs
disable_balance
disable_cluster_feature
disable_colocate_balance
disable_colocate_join
disable_colocate_join
disable_colocate_relocate
disable_hadoop_load
disable_load_job
disable_streaming_preaggregations
div_precision_increment
dpp_bytes_per_reduce
dpp_config_str
dpp_default_cluster
dpp_default_config_str
dpp_hadoop_client_path
drop_backend_after_decommission
dynamic_partition_check_interval_seconds
dynamic_partition_enable
edit_log_port
edit_log_roll_num
edit_log_type
enable_auth_check
enable_batch_delete_by_default
enable_deploy_manager
enable_insert_strict
enable_local_replica_selection
enable_materialized_view
enable_metric_calculator
enable_spilling
enable_token_check
es_state_sync_interval_second
event_scheduler
exec_mem_limit
export_checker_interval_second
export_running_job_num_limit
export_tablet_num_per_task
export_task_default_timeout_second
expr_children_limit
expr_depth_limit
force_do_metadata_checkpoint
forward_to_master
frontend_address
hadoop_load_default_timeout_second
heartbeat_mgr_blocking_queue_size
heartbeat_mgr_threads_num
history_job_keep_max_second
http_backlog_num
http_port
http_max_line_length
http_max_header_size
ignore_meta_check
init_connect
insert_load_default_timeout_second
interactive_timeout
is_report_success
label_clean_interval_second
label_keep_max_second
language
license
load_checker_interval_second
load_etl_thread_num_high_priority
load_etl_thread_num_normal_priority
load_input_size_limit_gb
load_mem_limit
load_pending_thread_num_high_priority
load_pending_thread_num_normal_priority
load_running_job_num_limit
load_straggler_wait_second
locale
log_roll_size_mb
lower_case_table_names
master_sync_policy
max_agent_task_threads_num
max_allowed_in_element_num_of_delete
max_allowed_packet
max_backend_down_time_second
max_balancing_tablets
max_bdbje_clock_delta_ms
max_broker_concurrency
max_bytes_per_broker_scanner
max_clone_task_timeout_sec
max_connection_scheduler_threads_num
max_conn_per_user
max_create_table_timeout_second
max_distribution_pruner_recursion_depth
max_layout_length_per_row
max_load_timeout_second
max_mysql_service_task_threads_num
max_query_retry_time
max_routine_load_job_num
max_routine_load_task_concurrent_num
max_routine_load_task_num_per_be
max_running_rollup_job_num_per_table
max_running_txn_num_per_db
max_scheduling_tablets
max_small_file_number
max_small_file_size_bytes
max_stream_load_timeout_second
max_tolerable_backend_down_num
max_unfinished_load_job
metadata_checkopoint_memory_threshold
metadata_failure_recovery
meta_delay_toleration_second
meta_dir
meta_publish_timeout_ms
min_bytes_per_broker_scanner
min_clone_task_timeout_sec
mini_load_default_timeout_second
min_load_timeout_second
mysql_service_io_threads_num
mysql_service_nio_enabled
net_buffer_length
net_read_timeout
net_write_timeout
parallel_exchange_instance_num
parallel_fragment_exec_instance_num
period_of_auto_resume_min
plugin_dir
plugin_enable
priority_networks
proxy_auth_enable
proxy_auth_magic_prefix
publish_version_interval_ms
publish_version_timeout_second
qe_max_connection
qe_slow_log_ms
query_cache_size
query_cache_type
query_colocate_join_memory_limit_penalty_factor
query_port
query_timeout
remote_fragment_exec_timeout_ms
replica_ack_policy
replica_delay_recovery_second
replica_sync_policy
report_queue_size
resource_group
rewrite_count_distinct_to_bitmap_hll
rpc_port
schedule_slot_num_per_path
small_file_dir
SQL_AUTO_IS_NULL
sql_mode
sql_safe_updates
sql_select_limit
storage_cooldown_second
storage_engine
storage_flood_stage_left_capacity_bytes
storage_flood_stage_usage_percent
storage_high_watermark_usage_percent
storage_min_left_capacity_bytes
stream_load_default_timeout_second
sys_log_delete_age
sys_log_dir
sys_log_level
sys_log_roll_interval
sys_log_roll_mode
sys_log_roll_num
sys_log_verbose_modules
system_time_zone
tablet_create_timeout_second
tablet_delete_timeout_second
tablet_repair_delay_factor_second
tablet_stat_update_interval_second
test_materialized_view
thrift_backlog_num
thrift_client_timeout_ms
thrift_server_max_worker_threads
time_zone
tmp_dir
transaction_clean_interval_second
tx_isolation
txn_rollback_limit
use_new_tablet_scheduler
use_v2_rollup
using_old_load_usage_pattern
Variable Info
version
version_comment
wait_timeout
with_k8s_certs
enable_strict_storage_medium_check
thrift_server_type
cache_enable_sql_mode
cache_enable_partition_mode
cache_last_version_interval_second
cache_result_max_row_count
recover_with_empty_tablet
enable_odbc_table
FE Configuration
This document mainly introduces the relevant configuration items of FE.
View configuration items
There are two ways to view the configuration items of FE:
FE web page
Open the FE web page
http://fe_host:fe_http_port/variable
in the browser. You can see the currently effective FE configuration items inConfigure Info
.View by command
After the FE is started, you can view the configuration items of the FE in the MySQL client with the following command:
ADMIN SHOW FRONTEND CONFIG;
The meanings of the columns in the results are as follows:
- Key: the name of the configuration item.
- Value: The value of the current configuration item.
- Type: The configuration item value type, such as integer or string.
- IsMutable: whether it can be dynamically configured. If true, the configuration item can be dynamically configured at runtime. If false, it means that the configuration item can only be configured in
fe.conf
and takes effect after restarting FE. - MasterOnly: Whether it is a unique configuration item of Master FE node. If it is true, it means that the configuration item is meaningful only at the Master FE node, and is meaningless to other types of FE nodes. If false, it means that the configuration item is meaningful in all types of FE nodes.
- Comment: The description of the configuration item.
Set configuration items
There are two ways to configure FE configuration items:
Static configuration
Add and set configuration items in the
conf/fe.conf
file. The configuration items infe.conf
will be read when the FE process starts. Configuration items not infe.conf
will use default values.Dynamic configuration
After the FE starts, you can set the configuration items dynamically through the following commands. This command requires administrator priviledge.
ADMIN SET FRONTEND CONFIG (" fe_config_name "=" fe_config_value ");
Not all configuration items support dynamic configuration. You can check whether the dynamic configuration is supported by the
IsMutable
column in theADMIN SHOW FRONTEND CONFIG;
command result.If the configuration item of
MasterOnly
is modified, the command will be directly forwarded to the Master FE and only the corresponding configuration item in the Master FE will be modified.Configuration items modified in this way will become invalid after the FE process restarts.
For more help on this command, you can view it through the
HELP ADMIN SET CONFIG;
command.
Examples
Modify
async_load_task_pool_size
Through
ADMIN SHOW FRONTEND CONFIG;
you can see that this configuration item cannot be dynamically configured (IsMutable
is false). You need to add infe.conf
:async_load_task_pool_size = 20
Then restart the FE process to take effect the configuration.
Modify
dynamic_partition_enable
Through
ADMIN SHOW FRONTEND CONFIG;
you can see that the configuration item can be dynamically configured (IsMutable
is true). And it is the unique configuration of Master FE. Then first we can connect to any FE and execute the following command to modify the configuration:ADMIN SET FRONTEND CONFIG ("dynamic_partition_enable" = "true"); `
Afterwards, you can view the modified value with the following command:
set forward_to_master = true;
ADMIN SHOW FRONTEND CONFIG;
After modification in the above manner, if the Master FE restarts or a Master election is performed, the configuration will be invalid. You can add the configuration item directly in fe.conf
and restart the FE to make the configuration item permanent.
Modify
max_distribution_pruner_recursion_depth
Through
ADMIN SHOW FRONTEND CONFIG;
you can see that the configuration item can be dynamically configured (IsMutable
is true). It is not unique to Master FE.Similarly, we can modify the configuration by dynamically modifying the configuration command. Because this configuration is not unique to the Master FE, user need to connect to different FEs separately to modify the configuration dynamically, so that all FEs use the modified configuration values.
Configurations
agent_task_resend_wait_time_ms
This configuration will decide whether to resend agent task when create_time for agent_task is set, only when current_time - create_time > agent_task_resend_wait_time_ms can ReportHandler do resend agent task.
This configuration is currently mainly used to solve the problem of repeated sending of PUBLISH_VERSION
agent tasks. The current default value of this configuration is 5000, which is an experimental value.
Because there is a certain time delay between submitting agent tasks to AgentTaskQueue and submitting to be, Increasing the value of this configuration can effectively solve the problem of repeated sending of agent tasks,
But at the same time, it will cause the submission of failed or failed execution of the agent task to be executed again for an extended period of time.
alter_table_timeout_second
async_load_task_pool_size
audit_log_delete_age
audit_log_dir
audit_log_modules
audit_log_roll_interval
audit_log_roll_mode
audit_log_roll_num
auth_token
autocommit
auto_increment_increment
backup_job_default_timeout_ms
backup_plugin_path
balance_load_score_threshold
batch_size
bdbje_heartbeat_timeout_second
bdbje_lock_timeout_second
broker_load_default_timeout_second
brpc_idle_wait_max_time
brpc_number_of_concurrent_requests_processed
capacity_used_percent_high_water
catalog_trash_expire_second
catalog_try_lock_timeout_ms
character_set_client
character_set_connection
character_set_results
character_set_server
check_consistency_default_timeout_second
check_java_version
clone_capacity_balance_threshold
clone_checker_interval_second
clone_distribution_balance_threshold
clone_high_priority_delay_second
clone_job_timeout_second
clone_low_priority_delay_second
clone_max_job_num
clone_normal_priority_delay_second
cluster_id
cluster_name
codegen_level
collation_connection
collation_database
collation_server
consistency_check_end_time
consistency_check_start_time
db_used_data_quota_update_interval_secs
For better data load performance, in the check of whether the amount of data used by the database before data load exceeds the quota, we do not calculate the amount of data already used by the database in real time, but obtain the periodically updated value of the daemon thread.
This configuration is used to set the time interval for updating the value of the amount of data used by the database.
default_rowset_type
default_storage_medium
delete_thread_num
desired_max_waiting_jobs
disable_balance
disable_cluster_feature
disable_colocate_balance
disable_colocate_join
disable_colocate_join
disable_colocate_relocate
disable_hadoop_load
disable_load_job
disable_streaming_preaggregations
div_precision_increment
dpp_bytes_per_reduce
dpp_config_str
dpp_default_cluster
dpp_default_config_str
dpp_hadoop_client_path
drop_backend_after_decommission
This configuration is used to control whether the system drops the BE after successfully decommissioning the BE. If true, the BE node will be deleted after the BE is successfully offline. If false, after the BE successfully goes offline, the BE will remain in the DECOMMISSION state, but will not be dropped.
This configuration can play a role in certain scenarios. Assume that the initial state of a Doris cluster is one disk per BE node. After running for a period of time, the system has been vertically expanded, that is, each BE node adds 2 new disks. Because Doris currently does not support data balancing among the disks within the BE, the data volume of the initial disk may always be much higher than the data volume of the newly added disk. At this time, we can perform manual inter-disk balancing by the following operations:
- Set the configuration item to false.
- Perform a decommission operation on a certain BE node. This operation will migrate all data on the BE to other nodes.
- After the decommission operation is completed, the BE will not be dropped. At this time, cancel the decommission status of the BE. Then the data will start to balance from other BE nodes back to this node. At this time, the data will be evenly distributed to all disks of the BE.
- Perform steps 2 and 3 for all BE nodes in sequence, and finally achieve the purpose of disk balancing for all nodes.
dynamic_partition_check_interval_seconds
dynamic_partition_enable
edit_log_port
edit_log_roll_num
edit_log_type
enable_auth_check
enable_batch_delete_by_default
Whether to add a delete sign column when create unique table
enable_deploy_manager
enable_insert_strict
enable_local_replica_selection
enable_materialized_view
This configuration is used to turn on and off the creation of materialized views. If set to true, the function to create a materialized view is enabled. The user can create a materialized view through the CREATE MATERIALIZED VIEW
command. If set to false, materialized views cannot be created.
If you get an error The materialized view is coming soon
or The materialized view is disabled
when creating the materialized view, it means that the configuration is set to false and the function of creating the materialized view is turned off. You can start to create a materialized view by modifying the configuration to true.
This variable is a dynamic configuration, and users can modify the configuration through commands after the FE process starts. You can also modify the FE configuration file and restart the FE to take effect.
enable_metric_calculator
enable_spilling
enable_token_check
es_state_sync_interval_second
event_scheduler
exec_mem_limit
export_checker_interval_second
export_running_job_num_limit
export_tablet_num_per_task
export_task_default_timeout_second
expr_children_limit
expr_depth_limit
force_do_metadata_checkpoint
forward_to_master
frontend_address
hadoop_load_default_timeout_second
heartbeat_mgr_blocking_queue_size
heartbeat_mgr_threads_num
history_job_keep_max_second
http_backlog_num
http_port
HTTP bind port. Defaults to 8030.
http_max_line_length
The max length of an HTTP URL. The unit of this configuration is BYTE. Defaults to 4096.
http_max_header_size
The max size of allowed HTTP headers. The unit of this configuration is BYTE. Defaults to 8192.
ignore_meta_check
init_connect
insert_load_default_timeout_second
interactive_timeout
is_report_success
label_clean_interval_second
label_keep_max_second
language
license
load_checker_interval_second
load_etl_thread_num_high_priority
load_etl_thread_num_normal_priority
load_input_size_limit_gb
load_mem_limit
load_pending_thread_num_high_priority
load_pending_thread_num_normal_priority
load_running_job_num_limit
load_straggler_wait_second
locale
log_roll_size_mb
lower_case_table_names
master_sync_policy
max_agent_task_threads_num
max_allowed_in_element_num_of_delete
This configuration is used to limit element num of InPredicate in delete statement. The default value is 1024.
max_allowed_packet
max_backend_down_time_second
max_balancing_tablets
max_bdbje_clock_delta_ms
max_broker_concurrency
max_bytes_per_broker_scanner
max_clone_task_timeout_sec
Type: long Description: Used to control the maximum timeout of a clone task. The unit is second. Default value: 7200 Dynamic modification: yes
Can cooperate with mix_clone_task_timeout_sec
to control the maximum and minimum timeout of a clone task. Under normal circumstances, the timeout of a clone task is estimated by the amount of data and the minimum transfer rate (5MB/s). In some special cases, these two configurations can be used to set the upper and lower bounds of the clone task timeout to ensure that the clone task can be completed successfully.
max_connection_scheduler_threads_num
max_conn_per_user
max_create_table_timeout_second
max_distribution_pruner_recursion_depth
max_layout_length_per_row
max_load_timeout_second
max_mysql_service_task_threads_num
max_query_retry_time
max_routine_load_job_num
max_routine_load_task_concurrent_num
max_routine_load_task_num_per_be
max_running_rollup_job_num_per_table
max_running_txn_num_per_db
This configuration is mainly used to control the number of concurrent load jobs of the same database.
When there are too many load jobs running in the cluster, the newly submitted load jobs may report errors:
current running txns on db xxx is xx, larger than limit xx
When this error is encountered, it means that the load jobs currently running in the cluster exceeds the configuration value. At this time, it is recommended to wait on the business side and retry the load jobs.
Generally it is not recommended to increase this configuration value. An excessively high number of concurrency may cause excessive system load.
max_scheduling_tablets
max_small_file_number
max_small_file_size_bytes
max_stream_load_timeout_second
This configuration is specifically used to limit timeout setting for stream load. It is to prevent that failed stream load transactions cannot be canceled within a short time because of the user’s large timeout setting.
max_tolerable_backend_down_num
max_unfinished_load_job
metadata_checkopoint_memory_threshold
metadata_failure_recovery
meta_delay_toleration_second
meta_dir
meta_publish_timeout_ms
min_bytes_per_broker_scanner
min_clone_task_timeout_sec
Type: long Description: Used to control the minimum timeout of a clone task. The unit is second. Default value: 120 Dynamic modification: yes
See the description of max_clone_task_timeout_sec
.
mini_load_default_timeout_second
min_load_timeout_second
mysql_service_io_threads_num
mysql_service_nio_enabled
net_buffer_length
net_read_timeout
net_write_timeout
parallel_exchange_instance_num
parallel_fragment_exec_instance_num
period_of_auto_resume_min
plugin_dir
plugin_enable
priority_networks
proxy_auth_enable
proxy_auth_magic_prefix
publish_version_interval_ms
publish_version_timeout_second
qe_max_connection
qe_slow_log_ms
query_cache_size
query_cache_type
query_colocate_join_memory_limit_penalty_factor
query_port
query_timeout
remote_fragment_exec_timeout_ms
replica_ack_policy
replica_delay_recovery_second
replica_sync_policy
report_queue_size
resource_group
rewrite_count_distinct_to_bitmap_hll
rpc_port
schedule_slot_num_per_path
small_file_dir
SQL_AUTO_IS_NULL
sql_mode
sql_safe_updates
sql_select_limit
storage_cooldown_second
storage_engine
storage_flood_stage_left_capacity_bytes
storage_flood_stage_usage_percent
storage_high_watermark_usage_percent
storage_min_left_capacity_bytes
stream_load_default_timeout_second
sys_log_delete_age
sys_log_dir
sys_log_level
sys_log_roll_interval
sys_log_roll_mode
sys_log_roll_num
sys_log_verbose_modules
system_time_zone
tablet_create_timeout_second
tablet_delete_timeout_second
tablet_repair_delay_factor_second
tablet_stat_update_interval_second
test_materialized_view
thrift_backlog_num
thrift_client_timeout_ms
The connection timeout and socket timeout config for thrift server.
The value for thrift_client_timeout_ms is set to be larger than zero to prevent some hang up problems in java.net.SocketInputStream.socketRead0.
thrift_server_max_worker_threads
time_zone
tmp_dir
transaction_clean_interval_second
tx_isolation
txn_rollback_limit
use_new_tablet_scheduler
use_v2_rollup
using_old_load_usage_pattern
Variable Info
version
version_comment
wait_timeout
with_k8s_certs
enable_strict_storage_medium_check
This configuration indicates that when the table is being built, it checks for the presence of the appropriate storage medium in the cluster. For example, when the user specifies that the storage medium is’ SSD ‘when the table is built, but only’ HDD ‘disks exist in the cluster,
If this parameter is’ True ‘, the error ‘Failed to find enough host in all Backends with storage medium with storage medium is SSD, need 3’.
If this parameter is’ False ‘, no error is reported when the table is built. Instead, the table is built on a disk with ‘HDD’ as the storage medium.
thrift_server_type
This configuration represents the service model used by The Thrift Service of FE, is of type String and is case-insensitive.
If this parameter is ‘SIMPLE’, then the ‘TSimpleServer’ model is used, which is generally not suitable for production and is limited to test use.
If the parameter is ‘THREADED’, then the ‘TThreadedSelectorServer’ model is used, which is a non-blocking I/O model, namely the master-slave Reactor model, which can timely respond to a large number of concurrent connection requests and performs well in most scenarios.
If this parameter is THREAD_POOL
, then the TThreadPoolServer
model is used, the model for blocking I/O model, use the thread pool to handle user connections, the number of simultaneous connections are limited by the number of thread pool, if we can estimate the number of concurrent requests in advance, and tolerant enough thread resources cost, this model will have a better performance, the service model is used by default.
cache_enable_sql_mode
If this switch is turned on, the SQL query result set will be cached. If the interval between the last visit version time in all partitions of all tables in the query is greater than cache_last_version_interval_second, and the result set is less than cache_result_max_row_count, the result set will be cached, and the next same SQL will hit the cache.
cache_enable_partition_mode
When this switch is turned on, the query result set will be cached according to the partition. If the interval between the query table partition time and the query time is less than cache_last_version_interval_second, the result set will be cached according to the partition.
Part of the data will be obtained from the cache and some data from the disk when querying, and the data will be merged and returned to the client.
cache_last_version_interval_second
The time interval of the latest partitioned version of the table refers to the time interval between the data update and the current version. It is generally set to 900 seconds, which distinguishes offline and real-time import.
cache_result_max_row_count
In order to avoid occupying too much memory, the maximum number of rows that can be cached is 2000 by default. If this threshold is exceeded, the cache cannot be set.
recover_with_empty_tablet
In some very special circumstances, such as code bugs, or human misoperation, etc., all replicas of some tablets may be lost. In this case, the data has been substantially lost. However, in some scenarios, the business still hopes to ensure that the query will not report errors even if there is data loss, and reduce the perception of the user layer. At this point, we can use the blank Tablet to fill the missing replica to ensure that the query can be executed normally.
Set to true so that Doris will automatically use blank replicas to fill tablets which all replicas have been damaged or missing.
Default is false.
enable_odbc_table
If this parameter is set to true, Doris can support ODBC external table creation and query. For specific usage of ODBC table, please refer to the use document of ODBC table
The function is still in the experimental stage, so the default value is false.