# TiSpark Deployment Topology
This document describes the topology for deploying TiSpark, and how to deploy TiSpark on top of the minimal topology. TiSpark is a product developed by PingCAP to address the complex OLAP needs of its users. It runs on the Spark platform and combines the strengths of the distributed TiKV cluster, working together with TiDB to provide a one-stop solution for HTAP (Hybrid Transactional/Analytical Processing) workloads.

For an introduction to the TiSpark architecture and how to use it, see the TiSpark User Guide.
> **Warning:**
>
> TiSpark support in TiUP Cluster is deprecated. It is not recommended to use it.
## Topology information

| Instance | Count | Physical machine configuration | IP | Configuration |
| :-- | :-- | :-- | :-- | :-- |
| TiDB | 3 | 16 VCore 32GB * 1 | 10.0.1.1<br/>10.0.1.2<br/>10.0.1.3 | Default port<br/>Global directory configuration |
| PD | 3 | 4 VCore 8GB * 1 | 10.0.1.4<br/>10.0.1.5<br/>10.0.1.6 | Default port<br/>Global directory configuration |
| TiKV | 3 | 16 VCore 32GB 2TB (NVMe SSD) * 1 | 10.0.1.7<br/>10.0.1.8<br/>10.0.1.9 | Default port<br/>Global directory configuration |
| TiSpark | 3 | 8 VCore 16GB * 1 | 10.0.1.21 (master)<br/>10.0.1.22 (worker)<br/>10.0.1.23 (worker) | Default port<br/>Global directory configuration |
| Monitoring & Grafana | 1 | 4 VCore 8GB * 1, 500GB (SSD) | 10.0.1.11 | Default port<br/>Global directory configuration |
## Topology templates

### Simple TiSpark configuration template

```yaml
# # Global variables are applied to all deployments and used as the default value of
# # the deployments if a specific deployment value is missing.
global:
user: "tidb"
ssh_port: 22
deploy_dir: "/tidb-deploy"
data_dir: "/tidb-data"
pd_servers:
- host: 10.0.1.4
- host: 10.0.1.5
- host: 10.0.1.6
tidb_servers:
- host: 10.0.1.1
- host: 10.0.1.2
- host: 10.0.1.3
tikv_servers:
- host: 10.0.1.7
- host: 10.0.1.8
- host: 10.0.1.9
# NOTE: TiSpark support is an experimental feature. It is not recommended to use it in
# production at present.
# To use TiSpark, you need to manually install Java Runtime Environment (JRE) 8 on the
# host, see the OpenJDK doc for a reference: https://openjdk.java.net/install/
# NOTE: Only 1 master node is supported for now
tispark_masters:
- host: 10.0.1.21
# NOTE: multiple worker nodes on the same host is not supported by Spark
tispark_workers:
- host: 10.0.1.22
- host: 10.0.1.23
monitoring_servers:
- host: 10.0.1.10
grafana_servers:
- host: 10.0.1.10
alertmanager_servers:
- host: 10.0.1.10
```

### Detailed TiSpark configuration template

```yaml
# Global variables are applied to all deployments and used as the default value of
# the deployments if a specific deployment value is missing.
global:
  user: "tidb"
  ssh_port: 22
  deploy_dir: "/tidb-deploy"
  data_dir: "/tidb-data"

# Monitored variables are applied to all the machines.
monitored:
  node_exporter_port: 9100
  blackbox_exporter_port: 9115
  deploy_dir: "/tidb-deploy/monitored-9100"
  data_dir: "/tidb-data/monitored-9100"
  log_dir: "/tidb-deploy/monitored-9100/log"

# Server configs are used to specify the runtime configuration of TiDB components.
# All configuration items can be found in TiDB docs:
# - TiDB: https://docs.pingcap.com/zh/tidb/stable/tidb-configuration-file
# - TiKV: https://docs.pingcap.com/zh/tidb/stable/tikv-configuration-file
# - PD: https://docs.pingcap.com/zh/tidb/stable/pd-configuration-file
# All configuration items use points to represent the hierarchy, e.g:
#   readpool.storage.use-unified-pool
#
# You can overwrite this configuration via the instance-level `config` field.
server_configs:
  tidb:
    log.slow-threshold: 300
  tikv:
    # server.grpc-concurrency: 4
    # raftstore.apply-pool-size: 2
    # raftstore.store-pool-size: 2
    # rocksdb.max-sub-compactions: 1
    # storage.block-cache.capacity: "16GB"
    # readpool.unified.max-thread-count: 12
    readpool.storage.use-unified-pool: false
    readpool.coprocessor.use-unified-pool: true
  pd:
    schedule.leader-schedule-limit: 4
    schedule.region-schedule-limit: 2048
    schedule.replica-schedule-limit: 64

pd_servers:
  - host: 10.0.1.4
    ssh_port: 22
    name: "pd-1"
    client_port: 2379
    peer_port: 2380
    deploy_dir: "/tidb-deploy/pd-2379"
    data_dir: "/tidb-data/pd-2379"
    log_dir: "/tidb-deploy/pd-2379/log"
    numa_node: "0,1"
    # The following configs are used to overwrite the `server_configs.pd` values.
    config:
      schedule.max-merge-region-size: 20
      schedule.max-merge-region-keys: 200000
  - host: 10.0.1.5
  - host: 10.0.1.6

tidb_servers:
  - host: 10.0.1.1
    ssh_port: 22
    port: 4000
    status_port: 10080
    deploy_dir: "/tidb-deploy/tidb-4000"
    log_dir: "/tidb-deploy/tidb-4000/log"
    numa_node: "0,1"
    # The following configs are used to overwrite the `server_configs.tidb` values.
    config:
      log.slow-query-file: tidb-slow-overwrited.log
  - host: 10.0.1.2
  - host: 10.0.1.3

tikv_servers:
  - host: 10.0.1.7
    ssh_port: 22
    port: 20160
    status_port: 20180
    deploy_dir: "/tidb-deploy/tikv-20160"
    data_dir: "/tidb-data/tikv-20160"
    log_dir: "/tidb-deploy/tikv-20160/log"
    numa_node: "0,1"
    # The following configs are used to overwrite the `server_configs.tikv` values.
    config:
      server.grpc-concurrency: 4
      server.labels: { zone: "zone1", dc: "dc1", host: "host1" }
  - host: 10.0.1.8
  - host: 10.0.1.9

# NOTE: TiSpark support is an experimental feature. It is not recommended to use it in
# production at present.
# To use TiSpark, you need to manually install Java Runtime Environment (JRE) 8 on the
# host, see the OpenJDK doc for a reference: https://openjdk.java.net/install/
# If you have already installed JRE 1.8 at a location other than the default of system's
# package management system, you may use the "java_home" field to set the JAVA_HOME variable.
# NOTE: Only 1 master node is supported for now
tispark_masters:
  - host: 10.0.1.21
    ssh_port: 22
    port: 7077
    web_port: 8080
    deploy_dir: "/tidb-deploy/tispark-master-7077"
    java_home: "/usr/local/bin/java-1.8.0"
    spark_config:
      spark.driver.memory: "2g"
      spark.eventLog.enabled: "False"
      spark.tispark.grpc.framesize: 268435456
      spark.tispark.grpc.timeout_in_sec: 100
      spark.tispark.meta.reload_period_in_sec: 60
      spark.tispark.request.command.priority: "Low"
      spark.tispark.table.scan_concurrency: 256
    spark_env:
      SPARK_EXECUTOR_CORES: 5
      SPARK_EXECUTOR_MEMORY: "10g"
      SPARK_WORKER_CORES: 5
      SPARK_WORKER_MEMORY: "10g"

# NOTE: multiple worker nodes on the same host is not supported by Spark
tispark_workers:
  - host: 10.0.1.22
    ssh_port: 22
    port: 7078
    web_port: 8081
    deploy_dir: "/tidb-deploy/tispark-worker-7078"
    java_home: "/usr/local/bin/java-1.8.0"
  - host: 10.0.1.23

monitoring_servers:
  - host: 10.0.1.10
    ssh_port: 22
    port: 9090
    deploy_dir: "/tidb-deploy/prometheus-8249"
    data_dir: "/tidb-data/prometheus-8249"
    log_dir: "/tidb-deploy/prometheus-8249/log"

grafana_servers:
  - host: 10.0.1.10
    port: 3000
    deploy_dir: /tidb-deploy/grafana-3000

alertmanager_servers:
  - host: 10.0.1.10
    ssh_port: 22
    web_port: 9093
    cluster_port: 9094
    deploy_dir: "/tidb-deploy/alertmanager-9093"
    data_dir: "/tidb-data/alertmanager-9093"
    log_dir: "/tidb-deploy/alertmanager-9093/log"
```
For detailed descriptions of the configuration items in the TiDB cluster topology files above, see Topology file configurations for deploying TiDB using TiUP.
> **Note:**
>
> - You do not need to manually create the `tidb` user specified in the configuration file. The TiUP cluster component automatically creates that user on the target machines. You can customize the user, or keep it consistent with the user on the control machine.
> - If you configure the deployment directory as a relative path, the cluster is deployed under that user's home directory.
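As a sketch of the relative-path behavior described in the note above (assuming the `tidb` user's home directory is the default `/home/tidb`):

```yaml
global:
  user: "tidb"
  # Relative paths are resolved against the deployment user's home
  # directory, so this deploys to /home/tidb/tidb-deploy on each host.
  deploy_dir: "tidb-deploy"
  data_dir: "tidb-data"
```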
## Environment requirements
Because TiSpark is based on the Apache Spark cluster, before you start a TiDB cluster that contains TiSpark components, you must install Java Runtime Environment (JRE) 8 on the servers where the TiSpark components are deployed. Otherwise, those components cannot be started.

TiUP does not support installing JRE automatically; you must install it yourself. For JRE 8 installation instructions, see the OpenJDK documentation.
If JRE 8 is already installed on the deployment server but not in the default path of the system's package management tool, you can set the `java_home` parameter in the topology configuration to specify the path of the JRE environment to use. This parameter corresponds to the `JAVA_HOME` system environment variable.
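For example, reusing the illustrative path from the template above, `java_home` can be set per TiSpark instance (replace the path with the actual location of your JRE 8 installation):

```yaml
tispark_masters:
  - host: 10.0.1.21
    # Corresponds to the JAVA_HOME environment variable for this instance.
    java_home: "/usr/local/bin/java-1.8.0"
```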