Creating a Doris cluster in the compute-storage decoupled mode means creating the entire distributed system that contains both FE and BE nodes. Within such a cluster, users can then create compute clusters. Each compute cluster is a group of computing resources consisting of one or more BE nodes.

A single FoundationDB + Meta Service + Recycler infrastructure can support multiple compute-storage decoupled clusters, where each compute-storage decoupled cluster is considered a data warehouse instance.

In the compute-storage decoupled mode, the registration and changes of nodes in a warehouse are managed by Meta Service. FE, BE, and Meta Service interact with each other for service discovery and authentication.

Creating a Doris cluster in the compute-storage decoupled mode entails interaction with Meta Service. Meta Service provides standard HTTP APIs for resource management operations. For more information, refer to Meta Service API.

The compute-storage decoupled mode of Doris adopts a service discovery mechanism. The steps to create a compute-storage decoupled cluster can be summarized as follows:

  1. Register and specify the data warehouse instance and its storage backend.
  2. Register and specify the FE and BE nodes that make up the data warehouse instance, including the specific machines and how they form the cluster.
  3. Configure and start all the FE and BE nodes.

Info

  • 127.0.0.1:5000 in the examples of this page refers to the address of Meta Service. Please replace it with the actual IP address and bRPC listening port for your own use case.
  • Please modify the configuration items in the following examples as needed.

Create cluster & storage vault

The first step is to register a data warehouse instance in Meta Service. A single Meta Service can support multiple data warehouse instances (i.e., multiple sets of FE-BE). Specifically, this process includes describing the required storage vault (i.e., the shared storage layer described in Overview) for that data warehouse instance. The options for the storage vault include HDFS and S3 (or object storage that supports the S3 protocol, such as AWS S3, GCS, Azure Blob, MinIO, Ceph, and Alibaba Cloud OSS). A storage vault is the remote shared storage used by Doris in the compute-storage decoupled mode. Users can configure multiple storage vaults for one data warehouse instance and store different tables on different storage vaults.

This step involves calling the create_instance API of Meta Service. The key parameters include:

  • instance_id: The ID of the data warehouse instance. It is typically a UUID string, such as 6ADDF03D-4C71-4F43-9D84-5FC89B3514F8. For simplicity in this guide, a regular string is used.
  • name: The name of the data warehouse instance, which should be filled in according to actual needs. It should follow the format of [a-zA-Z][0-9a-zA-Z_]+.
  • user_id: The ID of the user who creates the data warehouse instance. It should follow the format of [a-zA-Z][0-9a-zA-Z_]+.
  • vault: The storage vault information, such as HDFS properties and S3 Bucket details. Different storage vaults entail different parameters.

For more information, refer to “create_instance” in Meta Service API.

Multiple compute-storage decoupled clusters (i.e., data warehouse instances) can be created by calling the Meta Service create_instance interface multiple times.

Create cluster using HDFS as storage vault

To create a Doris cluster in the compute-storage decoupled mode using HDFS as the storage vault, configure the following items accurately and ensure that all nodes (including FE/BE nodes, Meta Service, and Recycler) have the necessary permissions to access the specified HDFS. This includes completing the Kerberos authorization configuration and connectivity checks for the machines (which can be tested using the Hadoop Client on the respective nodes).
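For example, a quick pre-flight check from one of these nodes with a Hadoop client installed might look like the following sketch; the fs_name, keytab, and principal are the placeholder values from the example below and should be replaced with your own:

    # Authenticate with Kerberos using the keytab that will be configured for the vault (skip if Kerberos is not used).
    kinit -kt /etc/emr.keytab hadoop/172.30.0.178@EMR-XXXYYY
    # Verify that the HDFS namespace is reachable and writable under the intended prefix.
    hadoop fs -ls hdfs://172.21.0.44:4007/
    hadoop fs -mkdir -p hdfs://172.21.0.44:4007/sample_prefix
    hadoop fs -touchz hdfs://172.21.0.44:4007/sample_prefix/.doris_connectivity_check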

| Parameter | Description | Required/Optional | Notes |
| --- | --- | --- | --- |
| instance_id | ID of the data warehouse instance | Required | Globally and historically unique, normally a UUID string |
| name | Instance name. It should conform to the format of [a-zA-Z][0-9a-zA-Z_]+ | Optional | |
| user_id | ID of the user who creates the instance. It should conform to the format of [a-zA-Z][0-9a-zA-Z_]+ | Required | |
| vault | Storage vault | Required | |
| vault.hdfs_info | Information of the HDFS storage vault | Required | |
| vault.build_conf | Build configuration of the HDFS storage vault | Required | |
| vault.build_conf.fs_name | HDFS name, normally the connection address | Required | |
| vault.build_conf.user | User to connect to HDFS | Required | |
| vault.build_conf.hdfs_kerberos_keytab | Kerberos Keytab path | Optional | Required when using Kerberos authentication |
| vault.build_conf.hdfs_kerberos_principal | Kerberos Principal | Optional | Required when using Kerberos authentication |
| vault.build_conf.hdfs_confs | Other configurations of HDFS | Optional | Can be filled in as needed |
| vault.prefix | Prefix for data storage; used for data isolation | Required | Normally named after the specific business, such as big_data |

Example

    curl -s "127.0.0.1:5000/MetaService/http/create_instance?token=greedisgood9999" -d \
    '{
        "instance_id": "sample_instance_id",
        "name": "sample_instance_name",
        "user_id": "sample_user_id",
        "vault": {
            "hdfs_info" : {
                "build_conf": {
                    "fs_name": "hdfs://172.21.0.44:4007",
                    "user": "hadoop",
                    "hdfs_kerberos_keytab": "/etc/emr.keytab",
                    "hdfs_kerberos_principal": "hadoop/172.30.0.178@EMR-XXXYYY",
                    "hdfs_confs" : [
                        {
                            "key": "hadoop.security.authentication",
                            "value": "kerberos"
                        }
                    ]
                },
                "prefix": "sample_prefix"
            }
        }
    }'

Create cluster using S3 as storage vault

All object storage attributes are required in the creation statement. Specifically:

  • When using object storage systems that support the S3 protocol, such as MinIO, make sure to test the connectivity and the correctness of the Access Key (AK) and Secret Access Key (SK). You can refer to AWS CLI with MinIO Server for further guidance; a quick pre-flight check is sketched after this list.
  • The value of the bucket field should be the name of the bucket, which does NOT include a scheme prefix such as s3://.
  • The external_endpoint should be kept the same as the endpoint value.
  • If you are using a non-cloud-provider object storage, you can fill in any values for the region and provider fields.
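For example, a minimal AK/SK and connectivity check with the AWS CLI might look like the following sketch; the endpoint below assumes a local MinIO server, and the bucket and credentials are the placeholders used elsewhere on this page:

    # Export the credentials that will be configured for the vault.
    export AWS_ACCESS_KEY_ID=ak_xxxxxxxxxxx
    export AWS_SECRET_ACCESS_KEY=sk_xxxxxxxxxxx
    # List the bucket through the same endpoint that the vault will use.
    aws --endpoint-url http://127.0.0.1:9000 s3 ls s3://sample_bucket_name
    # Optionally verify write permission with a small object.
    echo ok > /tmp/doris_check && aws --endpoint-url http://127.0.0.1:9000 s3 cp /tmp/doris_check s3://sample_bucket_name/doris_check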
| Parameter | Description | Required/Optional | Notes |
| --- | --- | --- | --- |
| instance_id | ID of the data warehouse instance in the compute-storage decoupled mode, normally a UUID string. It should conform to the format of [0-9a-zA-Z-]+. | Required | Example: 6ADDF03D-4C71-4F43-9D84-5FC89B3514F8 |
| name | Instance name. It should conform to the format of [a-zA-Z][0-9a-zA-Z_]+ | Optional | |
| user_id | ID of the user who creates the instance. It should conform to the format of [a-zA-Z][0-9a-zA-Z_]+ | Required | |
| vault.obj_info | Object storage configuration | Required | |
| vault.obj_info.ak | Object storage Access Key | Required | |
| vault.obj_info.sk | Object storage Secret Key | Required | |
| vault.obj_info.bucket | Object storage bucket name | Required | |
| vault.obj_info.prefix | Prefix for data storage on object storage | Optional | If this parameter is empty, the default storage location is the root directory of the bucket. Example: big_data |
| vault.obj_info.endpoint | Object storage endpoint | Required | The domain or ip:port, not including a scheme prefix such as http:// |
| vault.obj_info.region | Object storage region | Required | If using MinIO, this parameter can be filled in with any value. |
| vault.obj_info.external_endpoint | Object storage external endpoint | Required | Normally the same as endpoint. Kept for OSS compatibility; note the difference between OSS external and internal endpoints. |
| vault.obj_info.provider | Object storage provider; options include OSS, S3, COS, OBS, BOS, GCP, and AZURE | Required | If using MinIO, simply fill in S3. |

Example (AWS S3)

    curl -s "127.0.0.1:5000/MetaService/http/create_instance?token=greedisgood9999" -d \
    '{
        "instance_id": "sample_instance_id",
        "name": "sample_instance_name",
        "user_id": "sample_user_id",
        "vault": {
            "obj_info": {
                "ak": "ak_xxxxxxxxxxx",
                "sk": "sk_xxxxxxxxxxx",
                "bucket": "sample_bucket_name",
                "prefix": "sample_prefix",
                "endpoint": "s3.amazonaws.com",
                "external_endpoint": "s3.amazonaws.com",
                "region": "us-east-1",
                "provider": "S3"
            }
        }
    }'

Example (Tencent Cloud Object Storage)

    curl -s "127.0.0.1:5000/MetaService/http/create_instance?token=greedisgood9999" -d \
    '{
        "instance_id": "sample_instance_id",
        "name": "sample_instance_name",
        "user_id": "sample_user_id",
        "vault": {
            "obj_info": {
                "ak": "ak_xxxxxxxxxxx",
                "sk": "sk_xxxxxxxxxxx",
                "bucket": "sample_bucket_name",
                "prefix": "sample_prefix",
                "endpoint": "cos.ap-beijing.myqcloud.com",
                "external_endpoint": "cos.ap-beijing.myqcloud.com",
                "region": "ap-beijing",
                "provider": "COS"
            }
        }
    }'

Manage storage vault

A warehouse can be configured with one or more storage vaults. Different tables can be stored on different storage vaults.

Concepts

  • vault name: The name of each storage vault is globally unique within the data warehouse instance, except for the built-in vault. The vault name is specified by the user when creating the storage vault.
  • built-in vault: This is the remote shared storage that stores Doris system tables. It must be configured when creating the data warehouse instance. The name of it is fixed as built_in_storage_vault. Only after configuring the built-in vault can the data warehouse (FE) be started.
  • default vault: This is the default storage vault at the data warehouse instance level. Users can specify a storage vault as the default storage vault, including the built-in vault. In the compute-storage decoupled mode, data must be stored on a remote shared storage. If the user does not specify the vault_name in the PROPERTIES section of the table creation statement, data of that table will be stored in the default vault. The default vault can be reset, but the storage vault used by tables that have already been created will not change accordingly.

After configuring the built-in vault, you can create additional storage vaults as needed. After the FE startup is successful, you can perform storage vault operations through SQL statements, including creating storage vaults, viewing storage vaults, and specifying storage vaults during table creation.

Create storage vault

Syntax

    CREATE STORAGE VAULT [IF NOT EXISTS] <vault_name>
    PROPERTIES
    ("key" = "value", ...)

<vault_name> is the user-defined name for the storage vault. It serves as the identifier for storage vault access.

Example

Create HDFS storage vault

    CREATE STORAGE VAULT IF NOT EXISTS ssb_hdfs_vault
    PROPERTIES (
        "type"="hdfs",                                          -- required
        "fs.defaultFS"="hdfs://127.0.0.1:8020",                 -- required
        "path_prefix"="big/data",                               -- optional, normally named after the specific business
        "hadoop.username"="user",                               -- optional
        "hadoop.security.authentication"="kerberos",            -- optional
        "hadoop.kerberos.principal"="hadoop/127.0.0.1@XXX",     -- optional
        "hadoop.kerberos.keytab"="/etc/emr.keytab"              -- optional
    );

Create S3 storage vault

    CREATE STORAGE VAULT IF NOT EXISTS ssb_s3_vault
    PROPERTIES (
        "type"="S3",                                            -- required
        "s3.endpoint" = "oss-cn-beijing.aliyuncs.com",          -- required
        "s3.external_endpoint" = "oss-cn-beijing.aliyuncs.com", -- required
        "s3.bucket" = "sample_bucket_name",                     -- required
        "s3.region" = "bj",                                     -- required
        "s3.root.path" = "big/data/prefix",                     -- required
        "s3.access_key" = "ak",                                 -- required
        "s3.secret_key" = "sk",                                 -- required
        "provider" = "cos"                                      -- required
    );

Info

Newly created storage vaults may NOT be immediately visible to the BE nodes. If you try to import data into tables that use a newly created storage vault, you might see errors in the short term (< 1 minute) until the storage vault has fully propagated to the BE nodes.

Properties

| Parameter | Description | Required/Optional | Example |
| --- | --- | --- | --- |
| type | S3 and HDFS are currently supported | Required | s3 or hdfs |
| fs.defaultFS | HDFS vault parameter | Required | hdfs://127.0.0.1:8020 |
| path_prefix | HDFS vault parameter; path prefix for data storage, normally configured based on the specific business | Optional | big/data/dir |
| hadoop.username | HDFS vault parameter | Optional | hadoop |
| hadoop.security.authentication | HDFS vault parameter | Optional | kerberos |
| hadoop.kerberos.principal | HDFS vault parameter | Optional | hadoop/127.0.0.1@XXX |
| hadoop.kerberos.keytab | HDFS vault parameter | Optional | /etc/emr.keytab |
| dfs.client.socket-timeout | HDFS vault parameter, measured in milliseconds | Optional | 60000 |
| s3.endpoint | S3 vault parameter | Required | oss-cn-beijing.aliyuncs.com |
| s3.external_endpoint | S3 vault parameter | Required | oss-cn-beijing.aliyuncs.com |
| s3.bucket | S3 vault parameter | Required | sample_bucket_name |
| s3.region | S3 vault parameter | Required | bj |
| s3.root.path | S3 vault parameter; path prefix for the actual data storage | Required | /big/data/prefix |
| s3.access_key | S3 vault parameter | Required | |
| s3.secret_key | S3 vault parameter | Required | |
| provider | S3 vault parameter. Options include OSS, AWS S3, COS, OBS, BOS, GCP, and Microsoft Azure. If using MinIO, simply fill in S3. | Required | cos |

View storage vault

Syntax

    SHOW STORAGE VAULT

The returned result contains 4 columns, which are the name of the storage vault, the ID of the storage vault, the properties of the storage vault, and whether it is the default storage vault.

Example

    mysql> show storage vault;
    +------------------------+----------------+-------------------------------------------------------------------------------------------------------------+-----------+
    | StorageVaultName | StorageVaultId | Propeties | IsDefault |
    +------------------------+----------------+-------------------------------------------------------------------------------------------------------------+-----------+
    | built_in_storage_vault | 1 | build_conf { fs_name: "hdfs://127.0.0.1:8020" } prefix: "_1CF80628-16CF-0A46-54EE-2C4A54AB1519" | false |
    | hdfs_vault | 2 | build_conf { fs_name: "hdfs://127.0.0.1:8020" } prefix: "big/data/dir_0717D76E-FF5E-27C8-D9E3-6162BC913D97" | false |
    +------------------------+----------------+-------------------------------------------------------------------------------------------------------------+-----------+

Set default storage vault

Syntax

    SET <vault_name> AS DEFAULT STORAGE VAULT
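
Example

For instance, to make the HDFS storage vault created earlier on this page the instance-level default (assuming ssb_hdfs_vault exists):

    SET ssb_hdfs_vault AS DEFAULT STORAGE VAULT;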

Specify storage vault for table

In the table creation statement, if you specify storage_vault_name in the PROPERTIES, the data will be stored in the storage vault corresponding to the specified vault name. After the table is successfully created, storage_vault_name cannot be modified, which means that the storage vault of the table cannot be changed.

Example

    CREATE TABLE IF NOT EXISTS supplier (
        s_suppkey int(11) NOT NULL COMMENT "",
        s_name varchar(26) NOT NULL COMMENT "",
        s_address varchar(26) NOT NULL COMMENT "",
        s_city varchar(11) NOT NULL COMMENT "",
        s_nation varchar(16) NOT NULL COMMENT "",
        s_region varchar(13) NOT NULL COMMENT "",
        s_phone varchar(16) NOT NULL COMMENT ""
    )
    UNIQUE KEY (s_suppkey)
    DISTRIBUTED BY HASH(s_suppkey) BUCKETS 1
    PROPERTIES (
        "replication_num" = "1",
        "storage_vault_name" = "ssb_hdfs_vault"
    );

Built-in storage vault

When creating an instance, users can choose Vault Mode or Non-Vault Mode. In Vault Mode, the passed-in Vault will be set as the built-in storage vault, which is used to save internal table information (such as statistics tables). If the built-in storage vault is not created, the FE will not be able to start normally.

Users can also choose to store their new tables in the built-in storage vault. This can be done by setting the built-in storage vault as the default storage vault, or by setting the storage_vault_name of the table to built_in_storage_vault in the table creation statement.
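
As a minimal sketch, either of the following approaches stores new tables in the built-in vault; the table name t_in_built_in_vault is only an illustration:

    -- Option 1: make the built-in vault the instance-level default.
    SET built_in_storage_vault AS DEFAULT STORAGE VAULT;

    -- Option 2: reference the built-in vault explicitly when creating a table.
    CREATE TABLE IF NOT EXISTS t_in_built_in_vault (
        id int(11) NOT NULL
    )
    DISTRIBUTED BY HASH(id) BUCKETS 1
    PROPERTIES (
        "replication_num" = "1",
        "storage_vault_name" = "built_in_storage_vault"
    );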

Modify storage vault

Some of the storage vault configurations are modifiable.

Coming soon

Delete storage vault

Only non-default storage vaults that are not referenced by any tables can be deleted.

Coming soon

Storage vault privilege

You can grant privileges of a specific storage vault to a designated MySQL user, so that the user can configure that storage vault for a newly created table or view that storage vault.

Syntax

    GRANT
    USAGE_PRIV
    ON STORAGE VAULT <vault_name>
    TO { ROLE | USER } {<role> | <user>}

Only the Admin user has the privilege to execute the GRANT statement, which is used to grant the privileges for a specified storage vault to a User/Role.

Users/Roles with the USAGE_PRIV privilege for a specific storage vault can perform the following operations:

  • View the information of the storage vault using the SHOW STORAGE VAULT statement.
  • Specify that storage vault in the PROPERTIES when creating a table.

Example

    grant usage_priv on storage vault my_storage_vault to user1

Revoke storage vault privileges from a MySQL user.

Syntax

    REVOKE
    USAGE_PRIV
    ON STORAGE VAULT <vault_name>
    FROM { ROLE | USER } {<role> | <user>}

Only the Admin user has the privilege to execute the REVOKE statement, which is used to revoke the privileges that a User/Role has on a specific storage vault.

Example

    revoke usage_priv on storage vault my_storage_vault from user1

Add FE

In the compute-storage decoupled mode, the node management interfaces for FE and BE are the same, with only the parameter configurations differing.

The initial FE and BE nodes can be added through the Meta Service add_cluster interface.

The parameter list for the add_cluster interface is as follows:

| Parameter | Description | Required/Optional | Notes |
| --- | --- | --- | --- |
| instance_id | ID of the data warehouse instance in the compute-storage decoupled mode, normally a UUID string. It should conform to the format of [0-9a-zA-Z-]+. | Required | Use the instance_id specified when the data warehouse instance was created, e.g., sample_instance_id. |
| cluster | Cluster object | Required | |
| cluster.cluster_name | Cluster name. It should conform to the format of [a-zA-Z][0-9a-zA-Z_]+. | Required | The FE cluster name is special; its default value is RESERVED_CLUSTER_NAME_FOR_SQL_SERVER. This can be modified by configuring cloud_observer_cluster_name in the fe.conf file. |
| cluster.cluster_id | Cluster ID | Required | The FE cluster ID is special; its default value is RESERVED_CLUSTER_ID_FOR_SQL_SERVER. This can be modified by configuring cloud_observer_cluster_id in the fe.conf file. |
| cluster.type | Cluster node type | Required | Two types are supported: SQL and COMPUTE. SQL represents the SQL service, corresponding to FE; COMPUTE represents compute nodes, corresponding to BE. |
| cluster.nodes | Nodes in the cluster | Required | Array |
| cluster.nodes.cloud_unique_id | cloud_unique_id of the node. It should conform to the format of 1:<instance_id>:<string>, in which the string should conform to the format of [0-9a-zA-Z-]+. The value for each node should be different. | Required | cloud_unique_id in fe.conf and be.conf |
| cluster.nodes.ip | Node IP | Required | When deploying FE/BE in FQDN mode, this field should be the domain name. |
| cluster.nodes.host | Node domain name | Optional | This field is required when deploying FE/BE in FQDN mode. |
| cluster.nodes.heartbeat_port | Heartbeat port of BE | Required for BE | heartbeat_service_port in be.conf |
| cluster.nodes.edit_log_port | Edit log port of FE | Required for FE | edit_log_port in fe.conf |
| cluster.nodes.node_type | FE node type | Required | This field is required when the cluster type is SQL. It can be either FE_MASTER or FE_OBSERVER. FE_MASTER indicates that the node is the Master, and FE_OBSERVER indicates that the node is an Observer. Note that in a SQL-type cluster, the nodes array can only have one FE_MASTER node, but it can include multiple FE_OBSERVER nodes. |

This is an example of adding one FE:

    # Add FE
    curl '127.0.0.1:5000/MetaService/http/add_cluster?token=greedisgood9999' -d '{
        "instance_id":"sample_instance_id",
        "cluster":{
            "type":"SQL",
            "cluster_name":"RESERVED_CLUSTER_NAME_FOR_SQL_SERVER",
            "cluster_id":"RESERVED_CLUSTER_ID_FOR_SQL_SERVER",
            "nodes":[
                {
                    "cloud_unique_id":"1:sample_instance_id:cloud_unique_id_sql_server00",
                    "ip":"172.21.16.21",
                    "edit_log_port":12103,
                    "node_type":"FE_MASTER"
                }
            ]
        }
    }'
    # Confirm successful creation based on the returned result of the get_cluster command.
    curl '127.0.0.1:5000/MetaService/http/get_cluster?token=greedisgood9999' -d '{
        "instance_id":"sample_instance_id",
        "cloud_unique_id":"1:sample_instance_id:cloud_unique_id_sql_server00",
        "cluster_name":"RESERVED_CLUSTER_NAME_FOR_SQL_SERVER",
        "cluster_id":"RESERVED_CLUSTER_ID_FOR_SQL_SERVER"
    }'

If you need to add 2 FE nodes during the initial operation using the interface mentioned above, you can add configurations for the additional node in the nodes array.

This is an example of adding an observer node:

    ...
    "nodes":[
        {
            "cloud_unique_id":"1:sample_instance_id:cloud_unique_id_sql_server00",
            "ip":"172.21.16.21",
            "edit_log_port":12103,
            "node_type":"FE_MASTER"
        },
        {
            "cloud_unique_id":"1:sample_instance_id:cloud_unique_id_sql_server00",
            "ip":"172.21.16.22",
            "edit_log_port":12103,
            "node_type":"FE_OBSERVER"
        }
    ]
    ...

If you need to add or drop FE nodes, you may refer to the “Manage compute cluster” section on this page.

Create compute cluster

Users can create one or more compute clusters, and a compute cluster can consist of any number of BE nodes. This is also performed via the Meta Service add_cluster interface.

See the “Add FE” section above for more information about the interface.

Users can adjust the number of compute clusters and the number of nodes within each cluster based on their needs. Each compute cluster should have a unique cluster_name and cluster_id.

This is an example of adding a compute cluster that consists of 1 BE node.

    # 172.19.0.11
    # Add BE
    curl '127.0.0.1:5000/MetaService/http/add_cluster?token=greedisgood9999' -d '{
        "instance_id":"sample_instance_id",
        "cluster":{
            "type":"COMPUTE",
            "cluster_name":"cluster_name0",
            "cluster_id":"cluster_id0",
            "nodes":[
                {
                    "cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node0",
                    "ip":"172.21.16.21",
                    "heartbeat_port":9455
                }
            ]
        }
    }'
    # Confirm successful creation using get_cluster
    curl '127.0.0.1:5000/MetaService/http/get_cluster?token=greedisgood9999' -d '{
        "instance_id":"sample_instance_id",
        "cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node0",
        "cluster_name":"cluster_name0",
        "cluster_id":"cluster_id0"
    }'

If you need to add 2 BE nodes during the initial operation using the interface mentioned above, you can add the configurations for the additional node in the nodes array.

This is an example of specifying a compute cluster with 2 BE nodes:

    ...
    "nodes":[
        {
            "cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node0",
            "ip":"172.21.16.21",
            "heartbeat_port":9455
        },
        {
            "cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node1",
            "ip":"172.21.16.22",
            "heartbeat_port":9455
        }
    ]
    ...

For instructions on adding or dropping BE nodes, refer to the “Manage compute cluster” section on this page.

If you need to continue adding more compute clusters, you can simply repeat the operations described in this section.
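
For example, a second compute cluster could be registered with a new cluster_name and cluster_id and its own BE node; the names and IP below are illustrative:

    curl '127.0.0.1:5000/MetaService/http/add_cluster?token=greedisgood9999' -d '{
        "instance_id":"sample_instance_id",
        "cluster":{
            "type":"COMPUTE",
            "cluster_name":"cluster_name1",
            "cluster_id":"cluster_id1",
            "nodes":[
                {
                    "cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node10",
                    "ip":"172.21.16.23",
                    "heartbeat_port":9455
                }
            ]
        }
    }'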

FE/BE configuration

Compared to the compute-storage coupled mode, the compute-storage decoupled mode requires additional configurations for the FE and BE:

  • meta_service_endpoint: The address of Meta Service, which needs to be filled in both the FE and BE.
  • cloud_unique_id: This should be filled with the corresponding value from the add_cluster request sent to Meta Service when creating the cluster. Doris determines whether it is operating in the compute-storage decoupled mode based on this configuration.

fe.conf

    meta_service_endpoint = 127.0.0.1:5000
    cloud_unique_id = 1:sample_instance_id:cloud_unique_id_sql_server00

be.conf

In the following example, meta_service_use_load_balancer and enable_file_cache can be copied for your use case. However, you might need to modify the other configuration items.

The file_cache_path is a JSON array (configured according to the actual number of cache disks), and the definition of each field is as follows:

  • path: The path to store the cached data, similar to the storage_root_path in the compute-storage coupled mode.
  • total_size: The expected upper limit of the cache space to be used.
  • query_limit: The maximum amount of cache data that can be evicted when a single query misses the cache (to prevent large queries from evicting all the cache).

Since the cache needs to store data, it is best to use high-performance disks such as SSDs as the cache storage medium.

    meta_service_endpoint = 127.0.0.1:5000
    cloud_unique_id = 1:sample_instance_id:cloud_unique_id_compute_node0
    meta_service_use_load_balancer = false
    enable_file_cache = true
    file_cache_path = [{"path":"/mnt/disk1/doris_cloud/file_cache","total_size":104857600000,"query_limit":10485760000}, {"path":"/mnt/disk2/doris_cloud/file_cache","total_size":104857600000,"query_limit":10485760000}]

Start/stop FE/BE

In the compute-storage decoupled mode of Doris, the startup and shutdown processes for the FE/BE are the same as those in the compute-storage coupled mode.

In the compute-storage decoupled mode, which follows a service discovery model, there is no need to use commands like alter system add/drop frontend/backend to manage the nodes.

    bin/start_be.sh --daemon
    bin/stop_be.sh
    bin/start_fe.sh --daemon
    bin/stop_fe.sh

After startup, if the logs show that the above configuration items have all taken effect correctly, the system has started to function normally, and you can connect to the FE through a MySQL client for access.
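
For example, assuming the FE uses the default query_port 9030 and the default root user with an empty password, a quick verification might look like:

    # Connect to the FE with a MySQL client.
    mysql -h 127.0.0.1 -P 9030 -u root
    # Inside the session, check that the nodes have registered correctly:
    #   SHOW FRONTENDS;
    #   SHOW BACKENDS;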

Manage compute cluster

Add/drop FE/BE node

These steps are similar to those in creating a compute cluster. Specify the new nodes in Meta Service, and then start the corresponding nodes (ensure correct configuration of the new nodes). There is no need to use the alter system add/drop statements for additional operations.

In the compute-storage decoupled mode, you can add or remove multiple nodes at a time. However, it is recommended to add or drop nodes one at a time.

Example

Add two BE nodes to compute cluster cluster_name0.

    curl '127.0.0.1:5000/MetaService/http/add_node?token=greedisgood9999' -d '{
        "instance_id":"sample_instance_id",
        "cluster":{
            "type":"COMPUTE",
            "cluster_name":"cluster_name0",
            "cluster_id":"cluster_id0",
            "nodes":[
                {
                    "cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node1",
                    "ip":"172.21.16.22",
                    "heartbeat_port":9455
                },
                {
                    "cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node2",
                    "ip":"172.21.16.23",
                    "heartbeat_port":9455
                }
            ]
        }
    }'

Remove two BE nodes from compute cluster cluster_name0.

    curl '127.0.0.1:5000/MetaService/http/drop_node?token=greedisgood9999' -d '{
        "instance_id":"sample_instance_id",
        "cluster":{
            "type":"COMPUTE",
            "cluster_name":"cluster_name0",
            "cluster_id":"cluster_id0",
            "nodes":[
                {
                    "cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node1",
                    "ip":"172.21.16.22",
                    "heartbeat_port":9455
                },
                {
                    "cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node2",
                    "ip":"172.21.16.23",
                    "heartbeat_port":9455
                }
            ]
        }
    }'

Add an FE Observer node. Currently, Doris does not support adding FE Followers in the compute-storage decoupled mode, so in the following example, node_type is set to FE_OBSERVER.

    curl '127.0.0.1:5000/MetaService/http/add_node?token=greedisgood9999' -d '{
        "instance_id":"sample_instance_id",
        "cluster":{
            "type":"SQL",
            "cluster_name":"RESERVED_CLUSTER_NAME_FOR_SQL_SERVER",
            "cluster_id":"RESERVED_CLUSTER_ID_FOR_SQL_SERVER",
            "nodes":[
                {
                    "cloud_unique_id":"1:sample_instance_id:cloud_unique_id_sql_server00",
                    "ip":"172.21.16.22",
                    "edit_log_port":12103,
                    "node_type":"FE_OBSERVER"
                }
            ]
        }
    }'

Remove an FE node.

    curl '127.0.0.1:5000/MetaService/http/drop_node?token=greedisgood9999' -d '{
        "instance_id":"sample_instance_id",
        "cluster":{
            "type":"SQL",
            "cluster_name":"RESERVED_CLUSTER_NAME_FOR_SQL_SERVER",
            "cluster_id":"RESERVED_CLUSTER_ID_FOR_SQL_SERVER",
            "nodes":[
                {
                    "cloud_unique_id":"1:sample_instance_id:cloud_unique_id_sql_server00",
                    "ip":"172.21.16.22",
                    "edit_log_port":12103,
                    "node_type":"FE_MASTER"
                }
            ]
        }
    }'

Add/drop compute cluster

To add a new compute cluster, you can refer to the “Create compute cluster” section on this page.

To drop a compute cluster, you can call the Meta Service API and shut down the corresponding nodes.

Example

Drop the compute cluster cluster_name0. (All parameters below are required.)

    curl '127.0.0.1:5000/MetaService/http/drop_cluster?token=greedisgood9999' -d '{
        "instance_id":"sample_instance_id",
        "cluster":{
            "type":"COMPUTE",
            "cluster_name":"cluster_name0",
            "cluster_id":"cluster_id0"
        }
    }'