Console

The console is the management platform for maintaining underlying resources. It consists of three modules: queue management, resource management, and multi-cluster management.

Multi-cluster Management

Multi-cluster management configures the underlying components that task instances depend on at runtime, such as the resource scheduling component YARN, the storage component HDFS, and compute components such as Flink.

Once a tenant is bound to a cluster, tasks submitted by that tenant use the components configured under that cluster.

tip

In real data development work, resources can be partitioned sensibly: give different clusters different resources, then bind tenants from different environments to different clusters. For example, bind test-environment tenants to the test cluster and pre-production tenants to the pre-production cluster, thereby isolating resource usage between environments.

info

Multi-cluster management is configured in the following order: common components -> resource scheduling components -> storage components -> compute components.

Common Components

Common components are used to configure the SFTP component; uploading and downloading of resources and configuration files is performed through the configured SFTP component.

tip

The SFTP component stores the configuration files of the YARN component as well as Kerberos files. Any machine will do, as long as it is reachable over the network from both the compute nodes and Taier.

SFTP Configuration
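
Before filling in the SFTP form, it is worth verifying that the SFTP server is reachable from both the Taier host and the compute nodes. A minimal sketch of such a check follows; the host, port, user, and remote path are placeholders, not values from this document:

```bash
# Run this on the Taier host and on each compute node.
# sftp.example.com, port 22, user "deploy", and /data/sftp/taier are placeholder values.
sftp -P 22 deploy@sftp.example.com <<'EOF'
ls /data/sftp/taier
EOF
```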

Resource Scheduling Components

Resource scheduling components are mainly used to configure the YARN component. Some compute components depend on a resource scheduler (for example, the Flink compute component depends on YARN), so the resource scheduling component must be configured first.

YARN Configuration

tip

Hadoop clusters from different vendors require different parameters when submitting tasks; when adapting to a given Hadoop cluster, these can be adjusted dynamically through custom parameters.

Apache Hadoop 2 is used as the default example here. If your Hadoop version is not listed, you can select Apache Hadoop 2 and submit tasks by adapting to the cluster through custom parameters.
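
Before configuring YARN in the console, a quick sanity check from a machine with the Hadoop client installed can confirm that the cluster is reachable and that the client configuration files to be uploaded are in place. A rough sketch, assuming HADOOP_CONF_DIR points at your cluster's client configuration:

```bash
# Confirm the cluster responds and list its NodeManagers.
yarn node -list

# These are the client-side configuration files the console expects for the YARN/HDFS components.
ls "$HADOOP_CONF_DIR"/core-site.xml "$HADOOP_CONF_DIR"/hdfs-site.xml "$HADOOP_CONF_DIR"/yarn-site.xml
```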

Storage Components

Storage components are mainly used to configure the HDFS component. Some compute components depend on a storage component (for example, the Flink compute component depends on the HDFS storage component), so the storage component must be configured first.

HDFS Configuration

tip

To keep the YARN and HDFS component versions consistent under the Hadoop engine, when the YARN component version is switched and saved, the version of the HDFS storage component is changed accordingly.
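
A similar check for HDFS confirms the component is healthy before it is configured. A small sketch; ns1 is the example nameservice used throughout this document, so substitute your own:

```bash
# Confirm HDFS is reachable through the configured nameservice.
hdfs dfs -ls hdfs://ns1/

# Optional: print a summary of overall filesystem health.
hdfs dfsadmin -report | head -n 20
```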

Compute Components

Compute components are used to configure compute engines such as Flink and Spark.

Flink Configuration

Two deployment modes are supported: perjob and session.

Common parameters

| Parameter | Default | Description | Required |
| --- | --- | --- | --- |
| clusterMode | perjob for perjob mode; session for session mode | Task execution mode: perjob or session | |
| flinkJarPath | /data/insight_plugin/flink110_lib | Local path of the Flink lib directory | |
| remoteFlinkJarPath | /data/insight_plugin/flink110_lib | Remote path of the Flink lib directory | |
| flinkPluginRoot | /data/insight_plugin | Local parent directory of the flinkStreamSql and flinkx plugins | |
| remotePluginRootDir | /data/insight_plugin | Remote parent directory of the flinkStreamSql and flinkx plugins | |
| pluginLoadMode | shipfile | Plugin load mode | |
| monitorAcceptedApp | false | Whether to monitor tasks in YARN ACCEPTED state | |
| yarnAccepterTaskNumber | 3 | Maximum number of tasks allowed in YARN ACCEPTED state; once reached, no further tasks can be submitted | |
| prometheusHost | | Prometheus address, used to fetch data sync metrics | |
| prometheusPort | 9090 | Prometheus port, used to fetch data sync metrics | |
| classloader.dtstack-cache | true | Whether to cache the classloader | |
Session parameters

| Parameter | Default | Description | Required |
| --- | --- | --- | --- |
| checkSubmitJobGraphInterval | 60 | Session check interval (60 * 10s) | |
| flinkSessionSlotCount | 10 | Maximum number of slots allowed for the Flink session | |
| sessionRetryNum | 5 | Number of session retries; once reached, the retry frequency is slowed down | |
| sessionStartAuto | true | Whether Taier is allowed to start the Flink session | |
| flinkSessionName | flink_session | Name of the Flink session job | |
| jobmanager.heap.mb | 2048 | JobManager memory size | |
| taskmanager.heap.mb | 1024 | TaskManager memory size | |
Flink parameters

| Parameter | Default | Description | Required |
| --- | --- | --- | --- |
| env.java.opts | -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing -XX:MaxMetaspaceSize=300m -Dfile.encoding=UTF-8 | JVM options | |
| classloader.resolve-order | child-first by default for perjob, parent-first by default for session | Class loading mode | |
| high-availability | ZOOKEEPER | Flink HA type | |
| high-availability.zookeeper.quorum | | ZooKeeper address; required when the HA type is ZooKeeper | |
| high-availability.zookeeper.path.root | /flink110 | HA node path | |
| high-availability.storageDir | hdfs://ns1/dtInsight/flink110/ha | Storage path for HA metadata | |
| jobmanager.archive.fs.dir | hdfs://ns1/dtInsight/flink110/completed-jobs | Storage path for job information after a job finishes | Yes |
| metrics.reporter.promgateway.class | org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter | Class used to push metrics | |
| metrics.reporter.promgateway.host | | PushGateway address | |
| metrics.reporter.promgateway.port | 9091 | PushGateway port | |
| metrics.reporter.promgateway.deleteOnShutdown | true | Whether to delete metrics after the job finishes | Yes |
| metrics.reporter.promgateway.jobName | 110job | Metrics job name | |
| metrics.reporter.promgateway.randomJobNameSuffix | true | Whether to append a random suffix to the job name | |
| state.backend | RocksDB | State backend | |
| state.backend.incremental | true | Whether incremental checkpoints are enabled | |
| state.checkpoints.dir | hdfs://ns1/dtInsight/flink110/checkpoints | Checkpoint path | |
| state.checkpoints.num-retained | 11 | Number of checkpoints to retain | |
| state.savepoints.dir | hdfs://ns1/dtInsight/flink110/savepoints | Savepoint path | |
| yarn.application-attempts | 3 | Number of application attempts | |
| yarn.application-attempt-failures-validity-interval | 3600000 | Size of the retry time window | |
| akka.ask.timeout | 60 s | | |
| akka.tcp.timeout | 60 s | | |
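
The default values above reference several directories under hdfs://ns1/dtInsight/flink110. If you keep these defaults, the paths can be created up front; a sketch, assuming the ns1 nameservice from this document (adjust the paths to your cluster):

```bash
# Pre-create the HDFS paths referenced by the default Flink configuration above.
hdfs dfs -mkdir -p \
  hdfs://ns1/dtInsight/flink110/ha \
  hdfs://ns1/dtInsight/flink110/completed-jobs \
  hdfs://ns1/dtInsight/flink110/checkpoints \
  hdfs://ns1/dtInsight/flink110/savepoints
```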

For more Flink parameters, see the official documentation.

Both perjob and session modes depend on the chunjun plugin packages and the Flink lib jars.

flinkJarPath points to the Flink lib directory.
For example: flinkJarPath = /opt/dtstack/flink110_lib
The directory structure of /opt/dtstack/flink110_lib is:

```
├── flink-dist_2.11-1.10.0.jar
├── flink-metrics-prometheus-1.10.0.jar
├── flink-shaded-hadoop-2-uber-2.7.5-10.0.jar
├── flink-streaming-java_2.11-1.10.0.jar
├── flink-table_2.11-1.10.0.jar
├── flink-table-blink_2.11-1.10.0.jar
└── log4j-1.2.17.jar
```
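
One way to assemble such a lib directory is to copy the jars shipped with a Flink 1.10.0 distribution and add the remaining jars separately. A rough sketch, assuming FLINK_HOME points at an unpacked Flink 1.10.0 distribution and that the shaded Hadoop uber jar has already been downloaded (its local path below is a placeholder):

```bash
# Target directory must match flinkJarPath configured in the console.
mkdir -p /opt/dtstack/flink110_lib
cp "$FLINK_HOME"/lib/*.jar /opt/dtstack/flink110_lib/

# The Prometheus metrics reporter is typically found under opt/ in the Flink 1.10 distribution;
# the shaded Hadoop uber jar is a separate download.
cp "$FLINK_HOME"/opt/flink-metrics-prometheus-1.10.0.jar /opt/dtstack/flink110_lib/
cp /path/to/flink-shaded-hadoop-2-uber-2.7.5-10.0.jar /opt/dtstack/flink110_lib/
```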

flinkPluginRoot points to the chunjun plugin directory.
For example: flinkPluginRoot = /opt/dtstack/110_flinkplugin
The directory structure of /opt/dtstack/110_flinkplugin is:

```
└── syncplugin
    ├── adbpostgresqlreader
    │   └── flinkx-adbpostgresql-reader-feat_1.10_4.3.x_metadata.jar
    ├── adbpostgresqlwriter
    │   └── flinkx-adbpostgresql-writer-feat_1.10_4.3.x_metadata.jar
    ├── binlogreader
    │   └── flinkx-binlog-reader-feat_1.10_4.3.x_metadata.jar
    ├── carbondatareader
    │   └── flinkx-carbondata-reader.jar
    ├── carbondatawriter
    │   └── flinkx-carbondata-writer.jar
    ├── cassandrareader
    │   └── flinkx-cassandra-reader-feat_1.10_4.3.x_metadata.jar
    ├── cassandrawriter
    │   └── flinkx-cassandra-writer-feat_1.10_4.3.x_metadata.jar
    ├── clickhousereader
    │   └── flinkx-clickhouse-reader-feat_1.10_4.3.x_metadata.jar
    ├── clickhousewriter
    │   └── flinkx-clickhouse-writer-feat_1.10_4.3.x_metadata.jar
    ├── common
    │   ├── flinkx-rdb-core-feat_1.10_4.3.x_metadata.jar
    │   ├── flinkx-rdb-reader-feat_1.10_4.3.x_metadata.jar
    │   └── flinkx-rdb-writer-feat_1.10_4.3.x_metadata.jar
    ├── db2reader
    │   └── flinkx-db2-reader-feat_1.10_4.3.x_metadata.jar
    ├── db2writer
    │   └── flinkx-db2-writer-feat_1.10_4.3.x_metadata.jar
    ├── dmreader
    │   └── flinkx-dm-reader-feat_1.10_4.3.x_metadata.jar
    ├── dmwriter
    │   └── flinkx-dm-writer-feat_1.10_4.3.x_metadata.jar
    ├── doriswriter
    │   └── flinkx-doris-writer-feat_1.10_4.3.x_metadata.jar
    ├── emqxreader
    │   └── flinkx-emqx-reader-feat_1.10_4.3.x_metadata.jar
    ├── emqxwriter
    │   └── flinkx-emqx-writer-feat_1.10_4.3.x_metadata.jar
    ├── esreader
    │   └── flinkx-es-reader-feat_1.10_4.3.x_metadata.jar
    ├── eswriter
    │   └── flinkx-es-writer-feat_1.10_4.3.x_metadata.jar
    ├── ftpreader
    │   └── flinkx-ftp-reader-feat_1.10_4.3.x_metadata.jar
    ├── ftpwriter
    │   └── flinkx-ftp-writer-feat_1.10_4.3.x_metadata.jar
    ├── gbasereader
    │   └── flinkx-gbase-reader-feat_1.10_4.3.x_metadata.jar
    ├── gbasewriter
    │   └── flinkx-gbase-writer-feat_1.10_4.3.x_metadata.jar
    ├── greenplumreader
    │   └── flinkx-greenplum-reader-feat_1.10_4.3.x_metadata.jar
    ├── greenplumwriter
    │   └── flinkx-greenplum-writer-feat_1.10_4.3.x_metadata.jar
    ├── hbasereader
    │   └── flinkx-hbase-reader-feat_1.10_4.3.x_metadata.jar
    ├── hbasewriter
    │   └── flinkx-hbase-writer-feat_1.10_4.3.x_metadata.jar
    ├── hdfsreader
    │   └── flinkx-hdfs-reader-feat_1.10_4.3.x_metadata.jar
    ├── hdfswriter
    │   └── flinkx-hdfs-writer-feat_1.10_4.3.x_metadata.jar
    ├── hivewriter
    │   └── flinkx-hive-writer-feat_1.10_4.3.x_metadata.jar
    ├── inceptorreader
    │   └── flinkx-inceptor-reader-feat_1.10_4.3.x_metadata.jar
    ├── inceptorwriter
    │   └── flinkx-inceptor-writer-feat_1.10_4.3.x_metadata.jar
    ├── influxdbreader
    │   └── flinkx-influxdb-reader-feat_1.10_4.3.x_metadata.jar
    ├── kafka09reader
    │   └── flinkx-kafka09-reader-feat_1.10_4.3.x_metadata.jar
    ├── kafka09writer
    │   └── flinkx-kafka09-writer-feat_1.10_4.3.x_metadata.jar
    ├── kafka10reader
    │   └── flinkx-kafka10-reader-feat_1.10_4.3.x_metadata.jar
    ├── kafka10writer
    │   └── flinkx-kafka10-writer-feat_1.10_4.3.x_metadata.jar
    ├── kafka11reader
    │   └── flinkx-kafka11-reader-feat_1.10_4.3.x_metadata.jar
    ├── kafka11writer
    │   └── flinkx-kafka11-writer-feat_1.10_4.3.x_metadata.jar
    ├── kafkareader
    │   └── flinkx-kafka-reader-feat_1.10_4.3.x_metadata.jar
    ├── kafkawriter
    │   └── flinkx-kafka-writer-feat_1.10_4.3.x_metadata.jar
    ├── kingbasereader
    │   └── flinkx-kingbase-reader-feat_1.10_4.3.x_metadata.jar
    ├── kingbasewriter
    │   └── flinkx-kingbase-writer-feat_1.10_4.3.x_metadata.jar
    ├── kudureader
    │   └── flinkx-kudu-reader-feat_1.10_4.3.x_metadata.jar
    ├── kuduwriter
    │   └── flinkx-kudu-writer-feat_1.10_4.3.x_metadata.jar
    ├── metadatahbasereader
    │   └── flinkx-metadata-hbase-reader-feat_1.10_4.3.x_metadata.jar
    ├── metadatahive1reader
    │   └── flinkx-metadata-hive1-reader-feat_1.10_4.3.x_metadata.jar
    ├── metadatahive2reader
    │   └── flinkx-metadata-hive2-reader-feat_1.10_4.3.x_metadata.jar
    ├── metadatahivecdcreader
    │   └── flinkx-metadata-hivecdc-reader-feat_1.10_4.3.x_metadata.jar
    ├── metadatakafkareader
    │   └── flinkx-metadata-kafka-reader-feat_1.10_4.3.x_metadata.jar
    ├── metadatamysqlreader
    │   └── flinkx-metadata-mysql-reader-feat_1.10_4.3.x_metadata.jar
    ├── metadataoraclereader
    │   └── flinkx-metadata-oracle-reader-feat_1.10_4.3.x_metadata.jar
    ├── metadataphoenix5reader
    │   └── flinkx-metadata-phoenix5-reader-feat_1.10_4.3.x_metadata.jar
    ├── metadatasparkthriftreader
    │   └── flinkx-metadata-sparkthrift-reader-feat_1.10_4.3.x_metadata.jar
    ├── metadatasqlserverreader
    │   └── flinkx-metadata-sqlserver-reader-feat_1.10_4.3.x_metadata.jar
    ├── metadatatidbreader
    │   └── flinkx-metadata-tidb-reader-feat_1.10_4.3.x_metadata.jar
    ├── metadataverticareader
    │   └── flinkx-metadata-vertica-reader-feat_1.10_4.3.x_metadata.jar
    ├── mongodbreader
    │   └── flinkx-mongodb-reader-feat_1.10_4.3.x_metadata.jar
    ├── mongodbwriter
    │   └── flinkx-mongodb-writer-feat_1.10_4.3.x_metadata.jar
    ├── mysqlcdcreader
    │   └── flinkx-cdc-reader-feat_1.10_4.3.x_metadata.jar
    ├── mysqldreader
    │   └── flinkx-mysql-dreader-feat_1.10_4.3.x_metadata.jar
    ├── mysqlreader
    │   └── flinkx-mysql-reader-feat_1.10_4.3.x_metadata.jar
    ├── mysqlwriter
    │   └── flinkx-mysql-writer-feat_1.10_4.3.x_metadata.jar
    ├── odpsreader
    │   └── flinkx-odps-reader-feat_1.10_4.3.x_metadata.jar
    ├── odpswriter
    │   └── flinkx-odps-writer-feat_1.10_4.3.x_metadata.jar
    ├── opentsdbreader
    │   └── flinkx-opentsdb-reader-feat_1.10_4.3.x_metadata.jar
    ├── oracle9reader
    │   ├── flinkx-oracle9-reader-feat_1.10_4.3.x_metadata.jar
    │   └── flinkx-oracle9reader.zip
    ├── oracle9writer
    │   ├── flinkx-oracle9-writer-feat_1.10_4.3.x_metadata.jar
    │   └── flinkx-oracle9writer.zip
    ├── oraclelogminerreader
    │   └── flinkx-oraclelogminer-reader-feat_1.10_4.3.x_metadata.jar
    ├── oraclereader
    │   └── flinkx-oracle-reader-feat_1.10_4.3.x_metadata.jar
    ├── oraclewriter
    │   └── flinkx-oracle-writer-feat_1.10_4.3.x_metadata.jar
    ├── pgwalreader
    │   └── flinkx-pgwal-reader-feat_1.10_4.3.x_metadata.jar
    ├── phoenix5reader
    │   └── flinkx-phoenix5-reader-feat_1.10_4.3.x_metadata.jar
    ├── phoenix5writer
    │   └── flinkx-phoenix5-writer-feat_1.10_4.3.x_metadata.jar
    ├── polardbdreader
    │   └── flinkx-polardb-dreader-feat_1.10_4.3.x_metadata.jar
    ├── polardbreader
    │   └── flinkx-polardb-reader-feat_1.10_4.3.x_metadata.jar
    ├── polardbwriter
    │   └── flinkx-polardb-writer-feat_1.10_4.3.x_metadata.jar
    ├── postgresqlreader
    │   └── flinkx-postgresql-reader-feat_1.10_4.3.x_metadata.jar
    ├── postgresqlwriter
    │   └── flinkx-postgresql-writer-feat_1.10_4.3.x_metadata.jar
    ├── rediswriter
    │   └── flinkx-redis-writer-feat_1.10_4.3.x_metadata.jar
    ├── restapireader
    │   └── flinkx-restapi-reader-feat_1.10_4.3.x_metadata.jar
    ├── restapiwriter
    │   └── flinkx-restapi-writer-feat_1.10_4.3.x_metadata.jar
    ├── s3reader
    │   └── flinkx-s3-reader-feat_1.10_4.3.x_metadata.jar
    ├── s3writer
    │   └── flinkx-s3-writer-feat_1.10_4.3.x_metadata.jar
    ├── saphanareader
    │   └── flinkx-saphana-reader-feat_1.10_4.3.x_metadata.jar
    ├── saphanawriter
    │   └── flinkx-saphana-writer-feat_1.10_4.3.x_metadata.jar
    ├── socketreader
    │   └── flinkx-socket-reader-feat_1.10_4.3.x_metadata.jar
    ├── solrreader
    │   └── flinkx-solr-reader-feat_1.10_4.3.x_metadata.jar
    ├── solrwriter
    │   └── flinkx-solr-writer-feat_1.10_4.3.x_metadata.jar
    ├── sqlservercdcreader
    │   └── flinkx-sqlservercdc-reader-feat_1.10_4.3.x_metadata.jar
    ├── sqlserverreader
    │   └── flinkx-sqlserver-reader-feat_1.10_4.3.x_metadata.jar
    ├── sqlserverwriter
    │   └── flinkx-sqlserver-writer-feat_1.10_4.3.x_metadata.jar
    ├── streamreader
    │   └── flinkx-stream-reader-feat_1.10_4.3.x_metadata.jar
    ├── streamwriter
    │   └── flinkx-stream-writer-feat_1.10_4.3.x_metadata.jar
    └── websocketreader
        └── flinkx-websocket-reader-feat_1.10_4.3.x_metadata.jar
```
tip

After configuring a data sync task, if it keeps showing "waiting to run" after submission, check monitor.log for the corresponding messages and confirm that the directory configured as flinkPluginRoot contains the syncplugin plugin directory.
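
A quick way to check both conditions from the tip above; the plugin path below uses the example flinkPluginRoot from this document, and logs/monitor.log under the Taier installation directory is an assumption about your deployment:

```bash
# Confirm the syncplugin directory exists under the configured flinkPluginRoot.
ls /opt/dtstack/110_flinkplugin/syncplugin | head

# Follow the monitor log while the data sync task is waiting to run.
tail -f logs/monitor.log
```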

Spark

Spark Configuration

Spark parameters

| Parameter | Default | Description | Required |
| --- | --- | --- | --- |
| spark.driver.extraJavaOptions | -Dfile.encoding=UTF-8 | Driver JVM options | |
| spark.executor.extraJavaOptions | -Dfile.encoding=UTF-8 | Executor JVM options | |
| spark.eventLog.compress | false | Whether to compress event logs | |
| spark.eventLog.dir | hdfs://ns1/tmp/logs | Storage path for Spark event logs | |
| spark.eventLog.enabled | true | Whether Spark event logging is enabled | |
| spark.executor.cores | 1 | Number of cores per executor | |
| spark.executor.heartbeatInterval | 10s | Interval between each executor's heartbeats to the driver | |
| spark.executor.instances | 1 | Number of executor instances to launch | |
| spark.executor.memory | 1g | Amount of memory used by each executor process | |
| spark.network.timeout | 600s | Default timeout for all network interactions | |
| spark.rpc.askTimeout | 600s | How long an RPC ask operation waits before timing out | |
| spark.submit.deployMode | cluster | Spark task deploy mode | |
| spark.yarn.appMasterEnv.PYSPARK_PYTHON | /data/anaconda3/bin/python3 | Path of the Python environment | |
| spark.yarn.maxAppAttempts | 4 | Maximum number of application submission attempts | |

For more Spark parameters, see the official documentation.

Custom parameters

| Parameter | Default | Description | Required |
| --- | --- | --- | --- |
| sparkPythonExtLibPath | hdfs://ns1/dtInsight/spark240/pythons/pyspark.zip,hdfs://ns1/dtInsight/spark240/pythons/py4j-0.10.7-src.zip | Paths of pyspark.zip and py4j-0.10.7-src.zip | |
| sparkSqlProxyPath | hdfs://ns1/dtInsight/spark240/client/spark-sql-proxy.jar | Path of spark-sql-proxy.jar, used to run Spark SQL | |
| sparkYarnArchive | hdfs://ns1/dtInsight/spark240/jars | Path of the Spark jars | |
| yarnAccepterTaskNumber | 3 | Maximum number of tasks allowed in ACCEPTED state | |

sparkSqlProxyPath is the jar used to run Spark SQL tasks; pluginLibs/yarn2-hdfs2-spark210/spark-sql-proxy.jar must be uploaded manually to the corresponding directory. sparkYarnArchive is the set of jars loaded when Spark SQL programs run; upload the jars under the spark directory directly to the corresponding path.
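
A sketch of the two uploads described above, using the default HDFS paths from the custom-parameter table; SPARK_HOME is an assumption about your environment, and the paths should be adjusted to your own values:

```bash
# spark-sql-proxy.jar: copied from Taier's pluginLibs to the path configured as sparkSqlProxyPath.
hdfs dfs -mkdir -p hdfs://ns1/dtInsight/spark240/client hdfs://ns1/dtInsight/spark240/jars
hdfs dfs -put pluginLibs/yarn2-hdfs2-spark210/spark-sql-proxy.jar hdfs://ns1/dtInsight/spark240/client/

# Spark runtime jars: the contents of the Spark distribution's jars directory,
# uploaded to the path configured as sparkYarnArchive.
hdfs dfs -put "$SPARK_HOME"/jars/*.jar hdfs://ns1/dtInsight/spark240/jars/
```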

tip

Custom parameters can be added for both Flink and Spark; add official Flink or Spark parameters there to adjust how tasks are submitted.