Configuration

GreptimeDB supports layered configuration and uses the following precedence order (each item takes precedence over the items below it):

  • Command-line options
  • Configuration file
  • Environment variables
  • Default values

This page describes how to configure GreptimeDB server settings. Settings can be specified in a TOML file.

Any parameter missing from the configuration file is assigned its default value.
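
Because missing parameters fall back to their defaults, a configuration file only needs to list the settings you want to change. A minimal sketch, assuming a frontend or standalone node whose HTTP address is the only override (the address value is purely illustrative):

```toml
# Only the HTTP listen address is set here; every other
# parameter is omitted and therefore keeps its default value.
[http]
addr = "0.0.0.0:4000"
```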

All sample configuration files are in the project’s config folder.

Command-line options

See Command lines to learn how to use the greptime command line.

Global options

  • -h/--help: Print help information;
  • -V/--version: Print version information;
  • --log-dir <LOG_DIR>: The logging directory;
  • --log-level <LOG_LEVEL>: The logging level;

Datanode subcommand options

You can list all the options from the following command:

  greptime datanode start --help

  • -c/--config-file: The configuration file for datanode;
  • --data-home: Database storage root directory;
  • --env-prefix <ENV_PREFIX>: The prefix of environment variables, default is GREPTIMEDB_DATANODE;
  • --http-addr <HTTP_ADDR>: HTTP server address;
  • --http-timeout <HTTP_TIMEOUT>: HTTP request timeout in seconds;
  • --metasrv-addr <METASRV_ADDR>: Metasrv address list;
  • --node-id <NODE_ID>: The datanode ID;
  • --rpc-addr <RPC_ADDR>: The datanode RPC addr;
  • --rpc-hostname <RPC_HOSTNAME>: The datanode hostname;
  • --wal-dir <WAL_DIR>: The directory of WAL;

All the addr options are in the form of ip:port.

Metasrv subcommand options

You can list all the options from the following command:

  greptime metasrv start --help

  • -c/--config-file: The configuration file for metasrv;
  • --enable-region-failover: Whether to enable region failover, default is false;
  • --env-prefix <ENV_PREFIX>: The prefix of environment variables, default is GREPTIMEDB_METASRV;
  • --bind-addr <BIND_ADDR>: The bind address of metasrv;
  • --http-addr <HTTP_ADDR>: HTTP server address;
  • --http-timeout <HTTP_TIMEOUT>: HTTP request timeout in seconds;
  • --selector <SELECTOR>: The datanode selector type, see selector-type;
  • --server-addr <SERVER_ADDR>: The communication server address for frontend and datanode to connect to metasrv;
  • --store-addr <STORE_ADDR>: Comma-separated etcd server addresses to store metadata;
  • --use-memory-store: Use memory store instead of etcd, for test purposes only;

Frontend subcommand options

You can list all the options from the following command:

  greptime frontend start --help

  • -c/--config-file: The configuration file for frontend;
  • --disable-dashboard: Disable dashboard HTTP service, default is false;
  • --env-prefix <ENV_PREFIX>: The prefix of environment variables, default is GREPTIMEDB_FRONTEND;
  • --rpc-addr <RPC_ADDR>: gRPC server address;
  • --http-addr <HTTP_ADDR>: HTTP server address;
  • --http-timeout <HTTP_TIMEOUT>: HTTP request timeout in seconds;
  • --influxdb-enable: Whether to enable InfluxDB protocol in HTTP API;
  • --metasrv-addr <METASRV_ADDR>: Metasrv address list;
  • --mysql-addr <MYSQL_ADDR>: MySQL server address;
  • --opentsdb-addr <OPENTSDB_ADDR>: OpenTSDB server address;
  • --postgres-addr <POSTGRES_ADDR>: Postgres server address;
  • --tls-cert-path <TLS_CERT_PATH>: The TLS public key file path;
  • --tls-key-path <TLS_KEY_PATH>: The TLS private key file path;
  • --tls-mode <TLS_MODE>: TLS Mode;
  • --user-provider <USER_PROVIDER>: The user provider, see authentication;

Standalone subcommand options

You can list all the options from the following command:

  greptime standalone start --help

  • -c/--config-file: The configuration file for standalone;
  • --env-prefix <ENV_PREFIX>: The prefix of environment variables, default is GREPTIMEDB_STANDALONE;
  • --http-addr <HTTP_ADDR>: HTTP server address;
  • --influxdb-enable: Whether to enable InfluxDB protocol in HTTP API;
  • --mysql-addr <MYSQL_ADDR>: MySQL server address;
  • --opentsdb-addr <OPENTSDB_ADDR>: OpenTSDB server address;
  • --postgres-addr <POSTGRES_ADDR>: Postgres server address;
  • --rpc-addr <RPC_ADDR>: gRPC server address;

Configuration File

Specify configuration file

You can specify the configuration file by using the command line arg -c [file_path], for example:

sh

  greptime [standalone | frontend | datanode | metasrv] start -c config/standalone.example.toml

Common configurations

Common protocol configurations for the frontend and standalone subcommands:

toml

  [http]
  addr = "127.0.0.1:4000"
  timeout = "30s"
  body_limit = "64MB"
  [grpc]
  addr = "127.0.0.1:4001"
  runtime_size = 8
  [mysql]
  enable = true
  addr = "127.0.0.1:4002"
  runtime_size = 2
  [mysql.tls]
  mode = "disable"
  cert_path = ""
  key_path = ""
  [postgres]
  enable = true
  addr = "127.0.0.1:4003"
  runtime_size = 2
  [postgres.tls]
  mode = "disable"
  cert_path = ""
  key_path = ""
  [opentsdb]
  enable = true
  addr = "127.0.0.1:4242"
  runtime_size = 2
  [influxdb]
  enable = true
  [prom_store]
  enable = true

All of these protocols except HTTP and gRPC are optional; their default values are listed above. To disable a protocol, such as OpenTSDB protocol support, set its enable option to false.
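
For instance, turning off OpenTSDB support while leaving the other protocols at their defaults might look like the fragment below. This is a sketch that uses only the keys from the sample above:

```toml
# Disable the OpenTSDB protocol; the other protocol sections are
# omitted, so they keep the defaults listed above.
[opentsdb]
enable = false
```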

Protocol options

| Option | Key | Type | Description |
| --- | --- | --- | --- |
| http | | | HTTP server options |
| | addr | String | Server address, "127.0.0.1:4000" by default |
| | timeout | String | HTTP request timeout, "30s" by default |
| | body_limit | String | HTTP max body size, "64MB" by default |
| grpc | | | gRPC server options |
| | addr | String | Server address, "127.0.0.1:4001" by default |
| | runtime_size | Integer | The number of server worker threads, 8 by default |
| mysql | | | MySQL server options |
| | enable | Boolean | Whether to enable MySQL protocol, true by default |
| | addr | String | Server address, "127.0.0.1:4002" by default |
| | runtime_size | Integer | The number of server worker threads, 2 by default |
| influxdb | | | InfluxDB protocol options |
| | enable | Boolean | Whether to enable InfluxDB protocol in HTTP API, true by default |
| opentsdb | | | OpenTSDB protocol options |
| | enable | Boolean | Whether to enable OpenTSDB protocol, true by default |
| | addr | String | OpenTSDB telnet API server address, "127.0.0.1:4242" by default |
| | runtime_size | Integer | The number of server worker threads, 2 by default |
| prom_store | | | Prometheus remote storage options |
| | enable | Boolean | Whether to enable Prometheus remote write and read in HTTP API, true by default |
| postgres | | | PostgreSQL server options |
| | enable | Boolean | Whether to enable PostgreSQL protocol, true by default |
| | addr | String | Server address, "127.0.0.1:4003" by default |
| | runtime_size | Integer | The number of server worker threads, 2 by default |

Node options

There are also some node options in common:

| Option | Key | Type | Description |
| --- | --- | --- | --- |
| mode | | String | Node running mode, either "standalone" or "distributed" |

Storage options

The storage options are valid in datanode and standalone mode. They specify the database data directory and other storage-related settings.

GreptimeDB supports storing data in the local file system, AWS S3 and compatible services (including MinIO, DigitalOcean Spaces, Tencent Cloud Object Storage (COS), Baidu Object Storage (BOS), and so on), Azure Blob Storage, Google Cloud Storage, and Aliyun OSS.

| Option | Key | Type | Description |
| --- | --- | --- | --- |
| storage | | | Storage options |
| | type | String | Storage type, supports "File", "S3" and "Oss" etc. |
| File | | | Local file storage options, valid when type="File" |
| | data_home | String | Database storage root directory, "/tmp/greptimedb" by default |
| S3 | | | AWS S3 storage options, valid when type="S3" |
| | bucket | String | The S3 bucket name |
| | root | String | The root path in S3 bucket |
| | endpoint | String | The API endpoint of S3 |
| | region | String | The S3 region |
| | access_key_id | String | The S3 access key id |
| | secret_access_key | String | The S3 secret access key |
| Oss | | | Aliyun OSS storage options, valid when type="Oss" |
| | bucket | String | The OSS bucket name |
| | root | String | The root path in OSS bucket |
| | endpoint | String | The API endpoint of OSS |
| | access_key_id | String | The OSS access key id |
| | secret_access_key | String | The OSS secret access key |
| Azblob | | | Azure Blob Storage options, valid when type="Azblob" |
| | container | String | The container name |
| | root | String | The root path in container |
| | endpoint | String | The API endpoint of Azure Blob Storage |
| | account_name | String | The account name of Azure Blob Storage |
| | account_key | String | The access key |
| | sas_token | String | The shared access signature |
| Gcs | | | Google Cloud Storage options, valid when type="Gcs" |
| | root | String | The root path in Gcs bucket |
| | bucket | String | The Gcs bucket name |
| | scope | String | The Gcs service scope |
| | credential_path | String | The Gcs credentials path |
| | endpoint | String | The API endpoint of Gcs |

A file storage sample configuration:

toml

  [storage]
  type = "File"
  data_home = "/tmp/greptimedb/"

An S3 storage sample configuration:

toml

  [storage]
  type = "S3"
  bucket = "test_greptimedb"
  root = "/greptimedb"
  access_key_id = "<access key id>"
  secret_access_key = "<secret access key>"
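
For an S3-compatible service, the endpoint and region keys from the table above would presumably be set as well. The fragment below is a sketch under that assumption; all angle-bracket values are placeholders, and whether region is required depends on the service:

```toml
[storage]
type = "S3"
bucket = "test_greptimedb"
root = "/greptimedb"
# Placeholders: the API endpoint and region of the S3-compatible service.
endpoint = "<endpoint>"
region = "<region>"
access_key_id = "<access key id>"
secret_access_key = "<secret access key>"
```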

Custom multiple storage engines

[[storage.providers]] sets up the table storage engine providers. Based on these providers, you can create a table with a specified storage; see create table:

toml

  # Allows using multiple storages
  [[storage.providers]]
  type = "S3"
  bucket = "test_greptimedb"
  root = "/greptimedb"
  access_key_id = "<access key id>"
  secret_access_key = "<secret access key>"
  [[storage.providers]]
  type = "Gcs"
  bucket = "test_greptimedb"
  root = "/greptimedb"
  credential_path = "<gcs credential path>"

All configured providers can be used as the storage option when creating tables.

Object storage cache

When using S3, OSS or Azure Blob Storage, it is recommended to enable object storage caching to speed up data queries:

toml

  [storage]
  type = "S3"
  bucket = "test_greptimedb"
  root = "/greptimedb"
  access_key_id = "<access key id>"
  secret_access_key = "<secret access key>"
  ## Enable object storage caching
  cache_path = "/var/data/s3_local_cache"
  cache_capacity = "256MiB"

The cache_path is the local file directory that keeps cache files, and the cache_capacity is the maximum total file size in the cache directory.

WAL options

The [wal] section in the datanode or standalone configuration file configures the write-ahead log (WAL):

toml

  [wal]
  file_size = "256MB"
  purge_threshold = "4GB"
  purge_interval = "10m"
  read_batch_size = 128
  sync_write = false

  • dir: the directory where WAL files are written. With File storage, it is {data_home}/wal by default; it must be configured explicitly when using other storage types such as S3.
  • file_size: the maximum size of a WAL file, 256MB by default.
  • purge_threshold and purge_interval: control the purging of WAL files; purge_threshold is 4GB by default.
  • sync_write: whether to call fsync after writing every log entry.
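
Since dir has no default outside File storage, a datanode backed by S3 would need to set it explicitly. A sketch of that combination, where the WAL path is a hypothetical local directory chosen for illustration:

```toml
[storage]
type = "S3"
bucket = "test_greptimedb"
root = "/greptimedb"
access_key_id = "<access key id>"
secret_access_key = "<secret access key>"

[wal]
# Hypothetical local path; set explicitly because the storage type is not "File".
dir = "/var/data/greptimedb/wal"
file_size = "256MB"
purge_threshold = "4GB"
```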

Storage engine options

The parameters corresponding to different storage engines can be configured for datanode and standalone in the [region_engine] section. Currently, there is only one storage engine available, which is mito.

toml

  [[region_engine]]
  [region_engine.mito]
  num_workers = 1
  manifest_checkpoint_distance = 10
  max_background_jobs = 4
  global_write_buffer_size = "1GB"
  global_write_buffer_reject_size = "2GB"

  • num_workers: Number of write threads
  • manifest_checkpoint_distance: Create a checkpoint each time manifest_checkpoint_distance manifest files have been written
  • max_background_jobs: Number of background threads
  • global_write_buffer_size: Size of the write buffer, default is 1GB
  • global_write_buffer_reject_size: Reject write requests when the size of data in the write buffer exceeds global_write_buffer_reject_size. It must be larger than global_write_buffer_size; default is 2GB
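
When shrinking the write buffer, for example on a memory-constrained host, the reject threshold has to stay above the buffer size. The values below are illustrative assumptions, not recommendations:

```toml
[[region_engine]]
[region_engine.mito]
# Illustrative values: a smaller write buffer than the 1GB default.
global_write_buffer_size = "512MB"
# Must remain larger than global_write_buffer_size.
global_write_buffer_reject_size = "1GB"
```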

Standalone

A sample standalone configuration can be found at standalone.example.toml.

Start the standalone mode as below:

  greptime standalone start -c standalone.example.toml

Frontend in distributed mode

Configure frontend in distributed mode:

toml

  mode = "distributed"
  [meta_client]
  metasrv_addrs = ["127.0.0.1:3002"]
  timeout = "3s"
  connect_timeout = "1s"
  ddl_timeout = "10s"
  tcp_nodelay = true

Set the running mode to "distributed".

The meta_client section configures the Metasrv client, including:

  • metasrv_addrs: the Metasrv address list.
  • timeout: operation timeout, 3s by default.
  • connect_timeout: server connection timeout, 1s by default.
  • ddl_timeout: DDL execution timeout, 10s by default.
  • tcp_nodelay: TCP_NODELAY option for accepted connections, true by default.

A sample frontend configuration for distributed mode can be found at frontend.example.toml.

Datanode in distributed mode

Configure datanode in distributed mode:

toml

  node_id = 42
  mode = "distributed"
  rpc_hostname = "127.0.0.1"
  rpc_addr = "127.0.0.1:3001"
  rpc_runtime_size = 8
  [meta_client]
  metasrv_addrs = ["127.0.0.1:3002"]
  timeout = "3s"
  connect_timeout = "1s"
  tcp_nodelay = false

Each datanode in distributed mode should be assigned a different node_id.
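
As an illustration, a second datanode could reuse the sample above with its own node_id and RPC address. The ID and port below are hypothetical, and other per-node settings (such as storage directories) would likely need to differ as well:

```toml
# Hypothetical second datanode running on the same host.
node_id = 43                        # differs from the first node's 42
mode = "distributed"
rpc_hostname = "127.0.0.1"
rpc_addr = "127.0.0.1:3011"         # hypothetical port, distinct from 3001
rpc_runtime_size = 8

[meta_client]
metasrv_addrs = ["127.0.0.1:3002"]  # same Metasrv as the first node
timeout = "3s"
connect_timeout = "1s"
tcp_nodelay = false
```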

A sample datanode configuration for distributed mode can be found at datanode.example.toml.

Metasrv configuration

A sample configuration can be found at metasrv.example.toml.

toml

  # The working home directory.
  data_home = "/tmp/metasrv/"
  # The bind address of metasrv, "127.0.0.1:3002" by default.
  bind_addr = "127.0.0.1:3002"
  # The communication server address for frontend and datanode to connect to metasrv, "127.0.0.1:3002" by default for localhost.
  server_addr = "127.0.0.1:3002"
  # Etcd server addresses, "127.0.0.1:2379" by default.
  store_addr = "127.0.0.1:2379"
  # Datanode selector type.
  # - "lease_based" (default value).
  # - "load_based"
  # For details, please see "https://docs.greptime.com/contributor-guide/meta/selector".
  selector = "LeaseBased"
  # Store data in memory, false by default.
  use_memory_store = false

| Key | Type | Description |
| --- | --- | --- |
| data_home | String | The working home of Metasrv, "/tmp/metasrv/" by default |
| bind_addr | String | The bind address of Metasrv, "127.0.0.1:3002" by default. |
| server_addr | String | The communication server address for frontend and datanode to connect to Metasrv, "127.0.0.1:3002" by default for localhost |
| store_addr | String | etcd server addresses, "127.0.0.1:2379" by default, server addresses separated by commas, in the format of "ip1:port1,ip2:port2,…". |
| selector | String | Load balance strategy to choose datanode when creating new tables, see Selector |
| use_memory_store | Boolean | Only used for testing when you don't have an etcd cluster, store data in memory, false by default. |
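
Following the store_addr format described above, pointing Metasrv at a three-member etcd cluster might look like this sketch (the etcd addresses are hypothetical):

```toml
bind_addr = "127.0.0.1:3002"
server_addr = "127.0.0.1:3002"
# Hypothetical etcd members, comma-separated in a single string.
store_addr = "10.0.0.1:2379,10.0.0.2:2379,10.0.0.3:2379"
# Keep metadata in etcd rather than in memory.
use_memory_store = false
```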

Logging options

frontend, metasrv, datanode and standalone can all configure log and tracing related parameters in the [logging] section:

toml

  [logging]
  dir = "/tmp/greptimedb/logs"
  level = "info"
  enable_otlp_tracing = false
  otlp_endpoint = "localhost:4317"
  tracing_sample_ratio = 1.0
  append_stdout = true

  • dir: the log output directory.
  • level: the output log level; available levels are info, debug, error, and warn, and the default is info.
  • enable_otlp_tracing: whether to turn on distributed tracing; off by default.
  • otlp_endpoint: the target endpoint to which traces are exported using the gRPC-based OTLP protocol; the default value is localhost:4317.
  • tracing_sample_ratio: the fraction of traces to sample, in the range [0, 1]; the default value is 1, which samples all traces.
  • append_stdout: whether to append logs to stdout; defaults to true.
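
Putting these options together, a node that exports a tenth of its traces to a local OTLP collector could be configured roughly as follows. This is a sketch: the sample ratio is an arbitrary illustration, and a collector is assumed to be listening on the default endpoint:

```toml
[logging]
dir = "/tmp/greptimedb/logs"
level = "debug"
# Turn on distributed tracing (off by default).
enable_otlp_tracing = true
otlp_endpoint = "localhost:4317"
# Sample 10% of traces instead of the default 1.0 (all traces).
tracing_sample_ratio = 0.1
append_stdout = true
```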

To learn how to use distributed tracing, see Tracing.

Environment variable

Every item in the configuration file can be mapped to an environment variable. For example, to set the data_home item in the [storage] section of the datanode through an environment variable:

toml

  # ...
  [storage]
  data_home = "/data/greptimedb"
  # ...

Use a shell command in the following format to set the environment variable:

  export GREPTIMEDB_DATANODE__STORAGE__DATA_HOME=/data/greptimedb

Environment Variable Rules

  • Every environment variable should have the component prefix, for example:

    • GREPTIMEDB_FRONTEND
    • GREPTIMEDB_METASRV
    • GREPTIMEDB_DATANODE
    • GREPTIMEDB_STANDALONE
  • We use double underscore __ as a separator. For example, the above data structure storage.data_home will be transformed to STORAGE__DATA_HOME.

Environment variables also accept lists whose items are separated by commas (,), for example:

  GREPTIMEDB_METASRV__META_CLIENT__METASRV_ADDRS=127.0.0.1:3001,127.0.0.1:3002,127.0.0.1:3003
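
Read through the rules above, that variable should correspond to the configuration-file entry below. This is a sketch, assuming the comma-separated value is mapped to a TOML array of strings:

```toml
# Assumed configuration-file equivalent of
# GREPTIMEDB_METASRV__META_CLIENT__METASRV_ADDRS
[meta_client]
metasrv_addrs = ["127.0.0.1:3001", "127.0.0.1:3002", "127.0.0.1:3003"]
```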