Config - Config Options - 《图数据库 HugeGraph v0.12 使用手册》

HugeGraph 配置项

HugeGraph 配置项

Gremlin Server 配置项

对应配置文件gremlin-server.yaml

config option	default value	descrition
host	127.0.0.1	The host or ip of Gremlin Server.
port	8182	The listening port of Gremlin Server.
graphs	hugegraph: conf/hugegraph.properties	The map of graphs with name and config file path.
scriptEvaluationTimeout	30000	The timeout for gremlin script execution(millisecond).
channelizer	org.apache.tinkerpop.gremlin.server.channel.HttpChannelizer	Indicates the protocol which the Gremlin Server provides service.
authentication	authenticator: com.baidu.hugegraph.auth.StandardAuthenticator, config: {tokens: conf/rest-server.properties}	The authenticator and config(contains tokens path) of authentication mechanism.

Rest Server & API 配置项

对应配置文件rest-server.properties

config option	default value	descrition
graphs	[hugegraph:conf/hugegraph.properties]	The map of graphs’ name and config file.
server.id	server-1	The id of rest server, used for license verification.
server.role	master	The role of nodes in the cluster, available types are [master, worker, computer]
restserver.url	http://127.0.0.1:8080	The url for listening of rest server.
ssl.keystore_file	server.keystore	The path of server keystore file used when https protocol is enabled.
ssl.keystore_password		The password of the path of the server keystore file used when the https protocol is enabled.
restserver.max_worker_threads	2 * CPUs	The maximum worker threads of rest server.
restserver.min_free_memory	64	The minimum free memory(MB) of rest server, requests will be rejected when the available memory of system is lower than this value.
restserver.request_timeout	30	The time in seconds within which a request must complete, -1 means no timeout.
restserver.connection_idle_timeout	30	The time in seconds to keep an inactive connection alive, -1 means no timeout.
restserver.connection_max_requests	256	The max number of HTTP requests allowed to be processed on one keep-alive connection, -1 means unlimited.
gremlinserver.url	http://127.0.0.1:8182	The url of gremlin server.
gremlinserver.max_route	8	The max route number for gremlin server.
gremlinserver.timeout	30	The timeout in seconds of waiting for gremlin server.
batch.max_edges_per_batch	500	The maximum number of edges submitted per batch.
batch.max_vertices_per_batch	500	The maximum number of vertices submitted per batch.
batch.max_write_ratio	50	The maximum thread ratio for batch writing, only take effect if the batch.max_write_threads is 0.
batch.max_write_threads	0	The maximum threads for batch writing, if the value is 0, the actual value will be set to batch.max_write_ratio * restserver.max_worker_threads.
auth.authenticator		The class path of authenticator implemention. e.g., com.baidu.hugegraph.auth.StandardAuthenticator, or com.baidu.hugegraph.auth.ConfigAuthenticator.
auth.admin_token	162f7848-0b6d-4faf-b557-3a0797869c55	Token for administrator operations, only for com.baidu.hugegraph.auth.ConfigAuthenticator.
auth.graph_store	hugegraph	The name of graph used to store authentication information, like users, only for com.baidu.hugegraph.auth.StandardAuthenticator.
auth.user_tokens	[hugegraph:9fd95c9c-711b-415b-b85f-d4df46ba5c31]	The map of user tokens with name and password, only for com.baidu.hugegraph.auth.ConfigAuthenticator.
auth.audit_log_rate	1000.0	The max rate of audit log output per user, default value is 1000 records per second.
auth.cache_capacity	10240	The max cache capacity of each auth cache item.
auth.cache_expire	600	The expiration time in seconds of vertex cache.
auth.remote_url		If the address is empty, it provide auth service, otherwise it is auth client and also provide auth service through rpc forwarding. The remote url can be set to multiple addresses, which are concat by ‘,’.
auth.token_expire	86400	The expiration time in seconds after token created
auth.token_secret	FXQXbJtbCLxODc6tGci732pkH1cyf8Qg	Secret key of HS256 algorithm.
exception.allow_trace	false	Whether to allow exception trace stack.

基本配置项

基本配置项及后端配置项对应配置文件：{graph-name}.properties，如hugegraph.properties

config option	default value	descrition
gremlin.graph	com.baidu.hugegraph.HugeFactory	Gremlin entrance to create graph.
backend	rocksdb	The data store type, available values are [memory, rocksdb, cassandra, scylladb, hbase, mysql].
serializer	binary	The serializer for backend store, available values are [text, binary, cassandra, hbase, mysql].
store	hugegraph	The database name like Cassandra Keyspace.
store.connection_detect_interval	600	The interval in seconds for detecting connections, if the idle time of a connection exceeds this value, detect it and reconnect if needed before using, value 0 means detecting every time.
store.graph	g	The graph table name, which store vertex, edge and property.
store.schema	m	The schema table name, which store meta data.
store.system	s	The system table name, which store system data.
schema.illegal_name_regex	.\s+$\|~.	The regex specified the illegal format for schema name.
schema.cache_capacity	10000	The max cache size(items) of schema cache.
vertex.cache_type	l2	The type of vertex cache, allowed values are [l1, l2].
vertex.cache_capacity	10000000	The max cache size(items) of vertex cache.
vertex.cache_expire	600	The expire time in seconds of vertex cache.
vertex.check_customized_id_exist	false	Whether to check the vertices exist for those using customized id strategy.
vertex.default_label	vertex	The default vertex label.
vertex.tx_capacity	10000	The max size(items) of vertices(uncommitted) in transaction.
vertex.check_adjacent_vertex_exist	false	Whether to check the adjacent vertices of edges exist.
vertex.lazy_load_adjacent_vertex	true	Whether to lazy load adjacent vertices of edges.
vertex.part_edge_commit_size	5000	Whether to enable the mode to commit part of edges of vertex, enabled if commit size > 0, 0 means disabled.
vertex.encode_primary_key_number	true	Whether to encode number value of primary key in vertex id.
vertex.remove_left_index_at_overwrite	false	Whether remove left index at overwrite.
edge.cache_type	l2	The type of edge cache, allowed values are [l1, l2].
edge.cache_capacity	1000000	The max cache size(items) of edge cache.
edge.cache_expire	600	The expiration time in seconds of edge cache.
edge.tx_capacity	10000	The max size(items) of edges(uncommitted) in transaction.
query.page_size	500	The size of each page when querying by paging.
query.batch_size	1000	The size of each batch when querying by batch.
query.ignore_invalid_data	true	Whether to ignore invalid data of vertex or edge.
query.index_intersect_threshold	1000	The maximum number of intermediate results to intersect indexes when querying by multiple single index properties.
query.ramtable_edges_capacity	20000000	The maximum number of edges in ramtable, include OUT and IN edges.
query.ramtable_enable	false	Whether to enable ramtable for query of adjacent edges.
query.ramtable_vertices_capacity	10000000	The maximum number of vertices in ramtable, generally the largest vertex id is used as capacity.
query.optimize_aggregate_by_index	false	Whether to optimize aggregate query(like count) by index.
oltp.concurrent_depth	10	The min depth to enable concurrent oltp algorithm.
oltp.concurrent_threads	10	Thread number to concurrently execute oltp algorithm.
oltp.collection_type	EC	The implementation type of collections used in oltp algorithm.
rate_limit.read	0	The max rate(times/s) to execute query of vertices/edges.
rate_limit.write	0	The max rate(items/s) to add/update/delete vertices/edges.
task.wait_timeout	10	Timeout in seconds for waiting for the task to complete,such as when truncating or clearing the backend.
task.input_size_limit	16777216	The job input size limit in bytes.
task.result_size_limit	16777216	The job result size limit in bytes.
task.sync_deletion	false	Whether to delete schema or expired data synchronously.
task.ttl_delete_batch	1	The batch size used to delete expired data.
computer.config	/conf/computer.yaml	The config file path of computer job.
search.text_analyzer	ikanalyzer	Choose a text analyzer for searching the vertex/edge properties, available type are [word, ansj, hanlp, smartcn, jieba, jcseg, mmseg4j, ikanalyzer].
search.text_analyzer_mode	smart	Specify the mode for the text analyzer, the available mode of analyzer are {word: [MaximumMatching, ReverseMaximumMatching, MinimumMatching, ReverseMinimumMatching, BidirectionalMaximumMatching, BidirectionalMinimumMatching, BidirectionalMaximumMinimumMatching, FullSegmentation, MinimalWordCount, MaxNgramScore, PureEnglish], ansj: [BaseAnalysis, IndexAnalysis, ToAnalysis, NlpAnalysis], hanlp: [standard, nlp, index, nShort, shortest, speed], smartcn: [], jieba: [SEARCH, INDEX], jcseg: [Simple, Complex], mmseg4j: [Simple, Complex, MaxWord], ikanalyzer: [smart, max_word]}.
snowflake.datecenter_id	0	The datacenter id of snowflake id generator.
snowflake.force_string	false	Whether to force the snowflake long id to be a string.
snowflake.worker_id	0	The worker id of snowflake id generator.
raft.mode	false	Whether the backend storage works in raft mode.
raft.safe_read	false	Whether to use linearly consistent read.
raft.use_snapshot	false	Whether to use snapshot.
raft.endpoint	127.0.0.1:8281	The peerid of current raft node.
raft.group_peers	127.0.0.1:8281,127.0.0.1:8282,127.0.0.1:8283	The peers of current raft group.
raft.path	./raft-log	The log path of current raft node.
raft.use_replicator_pipeline	true	Whether to use replicator line, when turned on it multiple logs can be sent in parallel, and the next log doesn’t have to wait for the ack message of the current log to be sent.
raft.election_timeout	10000	Timeout in milliseconds to launch a round of election.
raft.snapshot_interval	3600	The interval in seconds to trigger snapshot save.
raft.backend_threads	current CPU vcores	The thread number used to apply task to bakcend.
raft.read_index_threads	8	The thread number used to execute reading index.
raft.apply_batch	1	The apply batch size to trigger disruptor event handler.
raft.queue_size	16384	The disruptor buffers size for jraft RaftNode, StateMachine and LogManager.
raft.queue_publish_timeout	60	The timeout in second when publish event into disruptor.
raft.rpc_threads	80	The rpc threads for jraft RPC layer.
raft.rpc_connect_timeout	5000	The rpc connect timeout for jraft rpc.
raft.rpc_timeout	60000	The rpc timeout for jraft rpc.
raft.rpc_buf_low_water_mark	10485760	The ChannelOutboundBuffer’s low water mark of netty, when buffer size less than this size, the method ChannelOutboundBuffer.isWritable() will return true, it means that low downstream pressure or good network.
raft.rpc_buf_high_water_mark	20971520	The ChannelOutboundBuffer’s high water mark of netty, only when buffer size exceed this size, the method ChannelOutboundBuffer.isWritable() will return false, it means that the downstream pressure is too great to process the request or network is very congestion, upstream needs to limit rate at this time.
raft.read_strategy	ReadOnlyLeaseBased	The linearizability of read strategy.

RPC server 配置

config option	default value	descrition
rpc.client_connect_timeout	20	The timeout(in seconds) of rpc client connect to rpc server.
rpc.client_load_balancer	consistentHash	The rpc client uses a load-balancing algorithm to access multiple rpc servers in one cluster. Default value is ‘consistentHash’, means forwording by request parameters.
rpc.client_read_timeout	40	The timeout(in seconds) of rpc client read from rpc server.
rpc.client_reconnect_period	10	The period(in seconds) of rpc client reconnect to rpc server.
rpc.client_retries	3	Failed retry number of rpc client calls to rpc server.
rpc.config_order	999	Sofa rpc configuration file loading order, the larger the more later loading.
rpc.logger_impl	com.alipay.sofa.rpc.log.SLF4JLoggerImpl	Sofa rpc log implementation class.
rpc.protocol	bolt	Rpc communication protocol, client and server need to be specified the same value.
rpc.remote_url		The remote urls of rpc peers, it can be set to multiple addresses, which are concat by ‘,’, empty value means not enabled.
rpc.server_adaptive_port	false	Whether the bound port is adaptive, if it’s enabled, when the port is in use, automatically +1 to detect the next available port. Note that this process is not atomic, so there may still be port conflicts.
rpc.server_host		The hosts/ips bound by rpc server to provide services, empty value means not enabled.
rpc.server_port	8090	The port bound by rpc server to provide services.
rpc.server_timeout	30	The timeout(in seconds) of rpc server execution.

Cassandra 后端配置项

config option	default value	descrition
backend		Must be set to `cassandra`.
serializer		Must be set to `cassandra`.
cassandra.host	localhost	The seeds hostname or ip address of cassandra cluster.
cassandra.port	9042	The seeds port address of cassandra cluster.
cassandra.connect_timeout	5	The cassandra driver connect server timeout(seconds).
cassandra.read_timeout	20	The cassandra driver read from server timeout(seconds).
cassandra.keyspace.strategy	SimpleStrategy	The replication strategy of keyspace, valid value is SimpleStrategy or NetworkTopologyStrategy.
cassandra.keyspace.replication	[3]	The keyspace replication factor of SimpleStrategy, like ‘[3]’.Or replicas in each datacenter of NetworkTopologyStrategy, like ‘[dc1:2,dc2:1]’.
cassandra.username		The username to use to login to cassandra cluster.
cassandra.password		The password corresponding to cassandra.username.
cassandra.compression_type	none	The compression algorithm of cassandra transport: none/snappy/lz4.
cassandra.jmx_port=7199	7199	The port of JMX API service for cassandra.
cassandra.aggregation_timeout	43200	The timeout in seconds of waiting for aggregation.

ScyllaDB 后端配置项

config option	default value	descrition
backend		Must be set to `scylladb`.
serializer		Must be set to `scylladb`.

其它与 Cassandra 后端一致。

RocksDB 后端配置项

config option	default value	descrition
backend		Must be set to `rocksdb`.
serializer		Must be set to `binary`.
rocksdb.data_disks	[]	The optimized disks for storing data of RocksDB. The format of each element: `STORE/TABLE: /path/disk`.Allowed keys are [g/vertex, g/edge_out, g/edge_in, g/vertex_label_index, g/edge_label_index, g/range_int_index, g/range_float_index, g/range_long_index, g/range_double_index, g/secondary_index, g/search_index, g/shard_index, g/unique_index, g/olap]
rocksdb.data_path	rocksdb-data	The path for storing data of RocksDB.
rocksdb.wal_path	rocksdb-data	The path for storing WAL of RocksDB.
rocksdb.allow_mmap_reads	false	Allow the OS to mmap file for reading sst tables.
rocksdb.allow_mmap_writes	false	Allow the OS to mmap file for writing.
rocksdb.block_cache_capacity	8388608	The amount of block cache in bytes that will be used by RocksDB, 0 means no block cache.
rocksdb.bloom_filter_bits_per_key	-1	The bits per key in bloom filter, a good value is 10, which yields a filter with ~ 1% false positive rate, -1 means no bloom filter.
rocksdb.bloom_filter_block_based_mode	false	Use block based filter rather than full filter.
rocksdb.bloom_filter_whole_key_filtering	true	True if place whole keys in the bloom filter, else place the prefix of keys.
rocksdb.bottommost_compression	NO_COMPRESSION	The compression algorithm for the bottommost level of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd.
rocksdb.bulkload_mode	false	Switch to the mode to bulk load data into RocksDB.
rocksdb.cache_index_and_filter_blocks	false	Indicating if we’d put index/filter blocks to the block cache.
rocksdb.compaction_style	LEVEL	Set compaction style for RocksDB: LEVEL/UNIVERSAL/FIFO.
rocksdb.compression	SNAPPY_COMPRESSION	The compression algorithm for compressing blocks of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd.
rocksdb.compression_per_level	[NO_COMPRESSION, NO_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION]	The compression algorithms for different levels of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd.
rocksdb.delayed_write_rate	16777216	The rate limit in bytes/s of user write requests when need to slow down if the compaction gets behind.
rocksdb.log_level	INFO	The info log level of RocksDB.
rocksdb.max_background_jobs	8	Maximum number of concurrent background jobs, including flushes and compactions.
rocksdb.level_compaction_dynamic_level_bytes	false	Whether to enable level_compaction_dynamic_level_bytes, if it’s enabled we give max_bytes_for_level_multiplier a priority against max_bytes_for_level_base, the bytes of base level is dynamic for a more predictable LSM tree, it is useful to limit worse case space amplification. Turning this feature on/off for an existing DB can cause unexpected LSM tree structure so it’s not recommended.
rocksdb.max_bytes_for_level_base	536870912	The upper-bound of the total size of level-1 files in bytes.
rocksdb.max_bytes_for_level_multiplier	10.0	The ratio between the total size of level (L+1) files and the total size of level L files for all L.
rocksdb.max_open_files	-1	The maximum number of open files that can be cached by RocksDB, -1 means no limit.
rocksdb.max_subcompactions	4	The value represents the maximum number of threads per compaction job.
rocksdb.max_write_buffer_number	6	The maximum number of write buffers that are built up in memory.
rocksdb.max_write_buffer_number_to_maintain	0	The total maximum number of write buffers to maintain in memory.
rocksdb.min_write_buffer_number_to_merge	2	The minimum number of write buffers that will be merged together.
rocksdb.num_levels	7	Set the number of levels for this database.
rocksdb.optimize_filters_for_hits	false	This flag allows us to not store filters for the last level.
rocksdb.optimize_mode	true	Optimize for heavy workloads and big datasets.
rocksdb.pin_l0_filter_and_index_blocks_in_cache	false	Indicating if we’d put index/filter blocks to the block cache.
rocksdb.sst_path		The path for ingesting SST file into RocksDB.
rocksdb.target_file_size_base	67108864	The target file size for compaction in bytes.
rocksdb.target_file_size_multiplier	1	The size ratio between a level L file and a level (L+1) file.
rocksdb.use_direct_io_for_flush_and_compaction	false	Enable the OS to use direct read/writes in flush and compaction.
rocksdb.use_direct_reads	false	Enable the OS to use direct I/O for reading sst tables.
rocksdb.write_buffer_size	134217728	Amount of data in bytes to build up in memory.
rocksdb.max_manifest_file_size	104857600	The max size of manifest file in bytes.
rocksdb.skip_stats_update_on_db_open	false	Whether to skip statistics update when opening the database, setting this flag true allows us to not update statistics.
rocksdb.max_file_opening_threads	16	The max number of threads used to open files.
rocksdb.max_total_wal_size	0	Total size of WAL files in bytes. Once WALs exceed this size, we will start forcing the flush of column families related, 0 means no limit.
rocksdb.db_write_buffer_size	0	Total size of write buffers in bytes across all column families, 0 means no limit.
rocksdb.delete_obsolete_files_period	21600	The periodicity in seconds when obsolete files get deleted, 0 means always do full purge.
rocksdb.hard_pending_compaction_bytes_limit	274877906944	The hard limit to impose on pending compaction in bytes.
rocksdb.level0_file_num_compaction_trigger	2	Number of files to trigger level-0 compaction.
rocksdb.level0_slowdown_writes_trigger	20	Soft limit on number of level-0 files for slowing down writes.
rocksdb.level0_stop_writes_trigger	36	Hard limit on number of level-0 files for stopping writes.
rocksdb.soft_pending_compaction_bytes_limit	68719476736	The soft limit to impose on pending compaction in bytes.

HBase 后端配置项

config option	default value	descrition
backend		Must be set to `hbase`.
serializer		Must be set to `hbase`.
hbase.hosts	localhost	The hostnames or ip addresses of HBase zookeeper, separated with commas.
hbase.port	2181	The port address of HBase zookeeper.
hbase.threads_max	64	The max threads num of hbase connections.
hbase.znode_parent	/hbase	The znode parent path of HBase zookeeper.
hbase.zk_retry	3	The recovery retry times of HBase zookeeper.
hbase.aggregation_timeout	43200	The timeout in seconds of waiting for aggregation.
hbase.kerberos_enable	false	Is Kerberos authentication enabled for HBase.
hbase.kerberos_keytab		The HBase’s key tab file for kerberos authentication.
hbase.kerberos_principal		The HBase’s principal for kerberos authentication.
hbase.krb5_conf	etc/krb5.conf	Kerberos configuration file, including KDC IP, default realm, etc.
hbase.hbase_site	/etc/hbase/conf/hbase-site.xml	The HBase’s configuration file

MySQL & PostgreSQL 后端配置项

config option	default value	descrition
backend		Must be set to `mysql`.
serializer		Must be set to `mysql`.
jdbc.driver	com.mysql.jdbc.Driver	The JDBC driver class to connect database.
jdbc.url	jdbc:mysql://127.0.0.1:3306	The url of database in JDBC format.
jdbc.username	root	The username to login database.
jdbc.password	**	The password corresponding to jdbc.username.
jdbc.ssl_mode	false	The SSL mode of connections with database.
jdbc.reconnect_interval	3	The interval(seconds) between reconnections when the database connection fails.
jdbc.reconnect_max_times	3	The reconnect times when the database connection fails.
jdbc.storage_engine	InnoDB	The storage engine of backend store database, like InnoDB/MyISAM/RocksDB for MySQL.
jdbc.postgresql.connect_database	template1	The database used to connect when init store, drop store or check store exist.

PostgreSQL 后端配置项

config option	default value	descrition
backend		Must be set to `postgresql`.
serializer		Must be set to `postgresql`.

其它与 MySQL 后端一致。

PostgreSQL 后端的 driver 和 url 应该设置为:

jdbc.driver=org.postgresql.Driver
jdbc.url=jdbc:postgresql://localhost:5432/