HugeGraph Configuration

1 Overview

The configuration directory is hugegraph-release/conf; all configuration for the service and for the graphs themselves is located in this directory.

The main configuration files are gremlin-server.yaml, rest-server.properties and hugegraph.properties.

HugeGraphServer integrates a GremlinServer and a RestServer internally; gremlin-server.yaml and rest-server.properties are used to configure these two servers.

  • GremlinServer: accepts the user's Gremlin statements, parses them and then calls the Core code.
  • RestServer: provides the RESTful API and calls the corresponding Core API according to the HTTP request; if the request body is a Gremlin statement, it is forwarded to GremlinServer to operate on the graph data (a hedged curl sketch follows this list).
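For example, once HugeGraphServer is running, a Gremlin statement can be submitted through the RestServer and is forwarded to GremlinServer internally. A minimal sketch with curl, assuming the default address http://127.0.0.1:8080, a graph named hugegraph, and that the synchronous Gremlin API is exposed at the path /gremlin (check the RESTful API documentation of your version):

  # send a Gremlin query via the RestServer (forwarded to GremlinServer internally);
  # the /gremlin path and the graph name 'hugegraph' are assumptions of this sketch
  curl "http://127.0.0.1:8080/gremlin?gremlin=hugegraph.traversal().V().limit(3)"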

The three configuration files are introduced one by one below.

2 gremlin-server.yaml

The default content of the gremlin-server.yaml file is as follows:

  # host and port of gremlin server, need to be consistent with host and port in rest-server.properties
  #host: 127.0.0.1
  #port: 8182
  # timeout in ms of gremlin query
  evaluationTimeout: 30000
  channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer
  # don't set the graph here; this will be handled after dynamically adding graphs is supported
  graphs: {
  }
  scriptEngines: {
    gremlin-groovy: {
      staticImports: [
        org.opencypher.gremlin.process.traversal.CustomPredicates.*',
        org.opencypher.gremlin.traversal.CustomFunctions.*
      ],
      plugins: {
        org.apache.hugegraph.plugin.HugeGraphGremlinPlugin: {},
        org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
        org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {
          classImports: [
            java.lang.Math,
            org.apache.hugegraph.backend.id.IdGenerator,
            org.apache.hugegraph.type.define.Directions,
            org.apache.hugegraph.type.define.NodeRole,
            org.apache.hugegraph.traversal.algorithm.CollectionPathsTraverser,
            org.apache.hugegraph.traversal.algorithm.CountTraverser,
            org.apache.hugegraph.traversal.algorithm.CustomizedCrosspointsTraverser,
            org.apache.hugegraph.traversal.algorithm.CustomizePathsTraverser,
            org.apache.hugegraph.traversal.algorithm.FusiformSimilarityTraverser,
            org.apache.hugegraph.traversal.algorithm.HugeTraverser,
            org.apache.hugegraph.traversal.algorithm.JaccardSimilarTraverser,
            org.apache.hugegraph.traversal.algorithm.KneighborTraverser,
            org.apache.hugegraph.traversal.algorithm.KoutTraverser,
            org.apache.hugegraph.traversal.algorithm.MultiNodeShortestPathTraverser,
            org.apache.hugegraph.traversal.algorithm.NeighborRankTraverser,
            org.apache.hugegraph.traversal.algorithm.PathsTraverser,
            org.apache.hugegraph.traversal.algorithm.PersonalRankTraverser,
            org.apache.hugegraph.traversal.algorithm.SameNeighborTraverser,
            org.apache.hugegraph.traversal.algorithm.ShortestPathTraverser,
            org.apache.hugegraph.traversal.algorithm.SingleSourceShortestPathTraverser,
            org.apache.hugegraph.traversal.algorithm.SubGraphTraverser,
            org.apache.hugegraph.traversal.algorithm.TemplatePathsTraverser,
            org.apache.hugegraph.traversal.algorithm.steps.EdgeStep,
            org.apache.hugegraph.traversal.algorithm.steps.RepeatEdgeStep,
            org.apache.hugegraph.traversal.algorithm.steps.WeightedEdgeStep,
            org.apache.hugegraph.traversal.optimize.ConditionP,
            org.apache.hugegraph.traversal.optimize.Text,
            org.apache.hugegraph.traversal.optimize.TraversalUtil,
            org.apache.hugegraph.util.DateUtil,
            org.opencypher.gremlin.traversal.CustomFunctions,
            org.opencypher.gremlin.traversal.CustomPredicate
          ],
          methodImports: [
            java.lang.Math#*,
            org.opencypher.gremlin.traversal.CustomPredicate#*,
            org.opencypher.gremlin.traversal.CustomFunctions#*
          ]
        },
        org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {
          files: [scripts/empty-sample.groovy]
        }
      }
    }
  }
  serializers:
    - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1,
        config: {
          serializeResultToString: false,
          ioRegistries: [org.apache.hugegraph.io.HugeGraphIoRegistry]
        }
    }
    - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0,
        config: {
          serializeResultToString: false,
          ioRegistries: [org.apache.hugegraph.io.HugeGraphIoRegistry]
        }
    }
    - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV2d0,
        config: {
          serializeResultToString: false,
          ioRegistries: [org.apache.hugegraph.io.HugeGraphIoRegistry]
        }
    }
    - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0,
        config: {
          serializeResultToString: false,
          ioRegistries: [org.apache.hugegraph.io.HugeGraphIoRegistry]
        }
    }
  metrics: {
    consoleReporter: {enabled: false, interval: 180000},
    csvReporter: {enabled: false, interval: 180000, fileName: ./metrics/gremlin-server-metrics.csv},
    jmxReporter: {enabled: false},
    slf4jReporter: {enabled: false, interval: 180000},
    gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
    graphiteReporter: {enabled: false, interval: 180000}
  }
  maxInitialLineLength: 4096
  maxHeaderSize: 8192
  maxChunkSize: 8192
  maxContentLength: 65536
  maxAccumulationBufferComponents: 1024
  resultIterationBatchSize: 64
  writeBufferLowWaterMark: 32768
  writeBufferHighWaterMark: 65536
  ssl: {
    enabled: false
  }

There are many configuration items above, but for now only the following two need attention: channelizer and graphs.

  • graphs: the graphs that GremlinServer needs to open at startup. This item is a map; the key is the graph name and the value is the path of that graph's configuration file;
  • channelizer: GremlinServer can communicate with clients in two ways, WebSocket and HTTP (the default). With WebSocket, users can quickly try out HugeGraph's features through Gremlin-Console, but it does not support large-scale data import. HTTP is recommended, since HugeGraph's peripheral components are all built on HTTP (see the sketch after this list);
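As a sketch of the channelizer choices: the values below are standard TinkerPop channelizer classes. The shipped default, WsAndHttpChannelizer, serves both protocols, while the commented alternatives restrict GremlinServer to one of them (only one channelizer line should be active at a time):

  # default in the shipped file: serve both WebSocket and HTTP
  channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer
  # WebSocket only, e.g. for Gremlin-Console:
  #channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
  # HTTP only, recommended for HugeGraph's peripheral components:
  #channelizer: org.apache.tinkerpop.gremlin.server.channel.HttpChannelizer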

By default GremlinServer serves on localhost:8182; to change this, configure host and port:

  • host: the hostname or IP of the machine on which GremlinServer is deployed. HugeGraphServer currently does not support distributed deployment, and GremlinServer is not exposed directly to users;
  • port: the port of the machine on which GremlinServer is deployed;

At the same time, the corresponding option gremlinserver.url=http://host:port needs to be added to rest-server.properties.
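As a sketch, assuming GremlinServer should listen on 192.168.1.10:8182 (a hypothetical address), the two files are kept consistent like this:

  # gremlin-server.yaml
  host: 192.168.1.10
  port: 8182

  # rest-server.properties
  gremlinserver.url=http://192.168.1.10:8182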

3 rest-server.properties

The default content of the rest-server.properties file is as follows:

  # bind url
  restserver.url=http://127.0.0.1:8080
  # gremlin server url, need to be consistent with host and port in gremlin-server.yaml
  #gremlinserver.url=http://127.0.0.1:8182
  # graphs list with pair NAME:CONF_PATH
  graphs=[hugegraph:conf/hugegraph.properties]
  # authentication
  #auth.authenticator=
  #auth.admin_token=
  #auth.user_tokens=[]
  server.id=server-1
  server.role=master

  • restserver.url: the URL on which the RestServer provides its service; adjust it to the actual environment;
  • graphs: the RestServer also needs to open graphs at startup. This item is a map; the key is the graph name and the value is the path of that graph's configuration file;

Note: both gremlin-server.yaml and rest-server.properties contain a graphs option; the init-store command initializes the graphs configured under the graphs option of rest-server.properties (the graphs map in gremlin-server.yaml is intentionally left empty).

The option gremlinserver.url is the URL on which GremlinServer provides its service to the RestServer. It defaults to http://localhost:8182; if it is changed, it must match the host and port in gremlin-server.yaml.
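For example, to make the RestServer reachable from other machines, restserver.url can be bound to the host's own address (192.168.1.10 below is a hypothetical address), while gremlinserver.url keeps pointing at the GremlinServer address configured above:

  # rest-server.properties (sketch)
  restserver.url=http://192.168.1.10:8080
  # must match host and port in gremlin-server.yaml
  gremlinserver.url=http://127.0.0.1:8182
  graphs=[hugegraph:conf/hugegraph.properties]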

4 hugegraph.properties

hugegraph.properties is a family of files: if the system holds multiple graphs, there will be multiple similar files. It configures the parameters related to graph storage and queries. The default content of the file is as follows:

  # gremlin entrence to create graph
  gremlin.graph=org.apache.hugegraph.HugeFactory
  # cache config
  #schema.cache_capacity=100000
  # vertex-cache default is 1000w, 10min expired
  #vertex.cache_capacity=10000000
  #vertex.cache_expire=600
  # edge-cache default is 100w, 10min expired
  #edge.cache_capacity=1000000
  #edge.cache_expire=600
  # schema illegal name template
  #schema.illegal_name_regex=\s+|~.*
  #vertex.default_label=vertex
  backend=rocksdb
  serializer=binary
  store=hugegraph
  raft.mode=false
  raft.safe_read=false
  raft.use_snapshot=false
  raft.endpoint=127.0.0.1:8281
  raft.group_peers=127.0.0.1:8281,127.0.0.1:8282,127.0.0.1:8283
  raft.path=./raft-log
  raft.use_replicator_pipeline=true
  raft.election_timeout=10000
  raft.snapshot_interval=3600
  raft.backend_threads=48
  raft.read_index_threads=8
  raft.queue_size=16384
  raft.queue_publish_timeout=60
  raft.apply_batch=1
  raft.rpc_threads=80
  raft.rpc_connect_timeout=5000
  raft.rpc_timeout=60000
  # if use 'ikanalyzer', need download jar from 'https://github.com/apache/hugegraph-doc/raw/ik_binary/dist/server/ikanalyzer-2012_u6.jar' to lib directory
  search.text_analyzer=jieba
  search.text_analyzer_mode=INDEX
  # rocksdb backend config
  #rocksdb.data_path=/path/to/disk
  #rocksdb.wal_path=/path/to/disk
  # cassandra backend config
  cassandra.host=localhost
  cassandra.port=9042
  cassandra.username=
  cassandra.password=
  #cassandra.connect_timeout=5
  #cassandra.read_timeout=20
  #cassandra.keyspace.strategy=SimpleStrategy
  #cassandra.keyspace.replication=3
  # hbase backend config
  #hbase.hosts=localhost
  #hbase.port=2181
  #hbase.znode_parent=/hbase
  #hbase.threads_max=64
  # mysql backend config
  #jdbc.driver=com.mysql.jdbc.Driver
  #jdbc.url=jdbc:mysql://127.0.0.1:3306
  #jdbc.username=root
  #jdbc.password=
  #jdbc.reconnect_max_times=3
  #jdbc.reconnect_interval=3
  #jdbc.ssl_mode=false
  # postgresql & cockroachdb backend config
  #jdbc.driver=org.postgresql.Driver
  #jdbc.url=jdbc:postgresql://localhost:5432/
  #jdbc.username=postgres
  #jdbc.password=
  # palo backend config
  #palo.host=127.0.0.1
  #palo.poll_interval=10
  #palo.temp_dir=./palo-data
  #palo.file_limit_size=32

Focus on the uncommented items:

  • gremlin.graph: the startup entry point of GremlinServer; users should not modify this item;
  • backend: the backend store to use; possible values are memory, cassandra, scylladb, mysql, hbase, postgresql and rocksdb;
  • serializer: mainly for internal use, serializing the schema, vertices and edges to the backend; possible values are text, cassandra, scylladb and binary. (Note: the rocksdb backend requires binary; for the other backends, serializer must match the backend value, e.g. hbase for the hbase backend; see the sketch after this list.)
  • store: the database name used when storing the graph in the backend; for cassandra and scylladb it is the keyspace name. This value is unrelated to the graph names in GremlinServer and RestServer, but for clarity it is recommended to use the same name;
  • cassandra.host: only meaningful when backend is cassandra or scylladb; the seeds of the cassandra/scylladb cluster;
  • cassandra.port: only meaningful when backend is cassandra or scylladb; the native port of the cassandra/scylladb cluster;
  • rocksdb.data_path: only meaningful when backend is rocksdb; the data directory of rocksdb;
  • rocksdb.wal_path: only meaningful when backend is rocksdb; the log directory of rocksdb;
  • admin.token: a token used to retrieve the server's configuration information, e.g. http://localhost:8080/graphs/hugegraph/conf?token=162f7848-0b6d-4faf-b557-3a0797869c55
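For instance, pairing backend and serializer as described above, an HBase-backed graph would look roughly like the sketch below. It only rearranges the commented hbase options from the default file; the host and port values are placeholders to be replaced with your cluster's addresses:

  # hugegraph.properties for an HBase backend (sketch)
  backend=hbase
  serializer=hbase
  store=hugegraph
  hbase.hosts=localhost
  hbase.port=2181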

5 Multi-Graph Configuration

The system can hold multiple graphs, and each graph can use a different backend. For example, the graphs hugegraph_rocksdb and hugegraph_mysql use RocksDB and MySQL as their backends respectively.

The configuration is also simple:

[Optional]: modify rest-server.properties

Set the directory of graph configuration files via the graphs option in rest-server.properties. The default is graphs=./conf/graphs; to use another directory, adjust this option, e.g. to graphs=/etc/hugegraph/graphs. Example:

  graphs=./conf/graphs

Under the conf/graphs directory, create hugegraph_mysql_backend.properties and hugegraph_rocksdb_backend.properties based on hugegraph.properties.

The modified part of hugegraph_mysql_backend.properties is as follows:

  backend=mysql
  serializer=mysql
  store=hugegraph_mysql
  # mysql backend config
  jdbc.driver=com.mysql.cj.jdbc.Driver
  jdbc.url=jdbc:mysql://127.0.0.1:3306
  jdbc.username=root
  jdbc.password=123456
  jdbc.reconnect_max_times=3
  jdbc.reconnect_interval=3
  jdbc.ssl_mode=false

The modified part of hugegraph_rocksdb_backend.properties is as follows:

  backend=rocksdb
  serializer=binary
  store=hugegraph_rocksdb

Stop the server, run init-store.sh (to create the databases for the new graphs), and then restart the server.

  $ ./bin/stop-hugegraph.sh

  $ ./bin/init-store.sh
  Initializing HugeGraph Store...
  2023-06-11 14:16:14 [main] [INFO] o.a.h.u.ConfigUtil - Scanning option 'graphs' directory './conf/graphs'
  2023-06-11 14:16:14 [main] [INFO] o.a.h.c.InitStore - Init graph with config file: ./conf/graphs/hugegraph_rocksdb_backend.properties
  ...
  2023-06-11 14:16:15 [main] [INFO] o.a.h.StandardHugeGraph - Graph 'hugegraph_rocksdb' has been initialized
  2023-06-11 14:16:15 [main] [INFO] o.a.h.c.InitStore - Init graph with config file: ./conf/graphs/hugegraph_mysql_backend.properties
  ...
  2023-06-11 14:16:16 [main] [INFO] o.a.h.StandardHugeGraph - Graph 'hugegraph_mysql' has been initialized
  2023-06-11 14:16:16 [main] [INFO] o.a.h.StandardHugeGraph - Close graph standardhugegraph[hugegraph_rocksdb]
  ...
  2023-06-11 14:16:16 [main] [INFO] o.a.h.HugeFactory - HugeFactory shutdown
  2023-06-11 14:16:16 [hugegraph-shutdown] [INFO] o.a.h.HugeFactory - HugeGraph is shutting down
  Initialization finished.

  $ ./bin/start-hugegraph.sh
  Starting HugeGraphServer...
  Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)...OK
  Started [pid 21614]

Check the created graphs:

  curl http://127.0.0.1:8080/graphs/
  {"graphs":["hugegraph_rocksdb","hugegraph_mysql"]}

Check the information of a specific graph:

  curl http://127.0.0.1:8080/graphs/hugegraph_mysql_backend
  {"name":"hugegraph_mysql","backend":"mysql"}

  curl http://127.0.0.1:8080/graphs/hugegraph_rocksdb_backend
  {"name":"hugegraph_rocksdb","backend":"rocksdb"}
