Linkis1.0 Configurations

Linkis1.0 simplifies the configuration of Linkis0.x. A shared configuration file, linkis.properties, is provided in the conf directory so that common parameters no longer have to be configured in several microservices at once. This document lists the Linkis1.0 parameters by module.

  1. Please note: This article lists only the Linkis configuration parameters that affect runtime performance or depend on the environment. Many parameters that users do not need to care about have been omitted. Interested users can browse the source code.

1. General configuration

  1. The general configuration can be set in the global linkis.properties. A parameter set there once takes effect in every microservice.

1.1 Global configurations

| Parameter name | Default value | Description |
| --- | --- | --- |
| wds.linkis.encoding | utf-8 | Linkis default encoding format |
| wds.linkis.date.pattern | yyyy-MM-dd'T'HH:mm:ssZ | Default date format |
| wds.linkis.test.mode | false | Whether to enable debugging mode. If set to true, all microservices support password-free login and all EngineConn instances open remote debugging ports |
| wds.linkis.test.user | None | The default login user for password-free login when wds.linkis.test.mode=true |
| wds.linkis.home | /appcom/Install/LinkisInstall | Linkis installation directory. If it does not exist, the value of LINKIS_HOME is used automatically |
| wds.linkis.httpclient.default.connect.timeOut | 50000 | Linkis HttpClient default connection timeout |
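
Because the global parameters live in one shared linkis.properties file, a quick way to check which values are in effect is to parse that file. The Python sketch below is illustrative only: the helper name `parse_properties` and the sample content are assumptions, and it handles only simple `key=value` lines and `#` comments rather than the full Java properties syntax.

```python
def parse_properties(text):
    """Parse simple key=value lines; blank lines and '#' comments are skipped."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '='
            props[key.strip()] = value.strip()
    return props


# Hypothetical excerpt of conf/linkis.properties
SAMPLE = """
# global settings shared by all microservices
wds.linkis.encoding=utf-8
wds.linkis.test.mode=false
wds.linkis.httpclient.default.connect.timeOut=50000
"""

props = parse_properties(SAMPLE)
print(props["wds.linkis.encoding"])
```

All values are returned as strings, so a caller would still need to convert numbers and booleans itself.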

1.2 LDAP configurations

| Parameter name | Default value | Description |
| --- | --- | --- |
| wds.linkis.ldap.proxy.url | None | LDAP URL address |
| wds.linkis.ldap.proxy.baseDN | None | LDAP baseDN address |
| wds.linkis.ldap.proxy.userNameFormat | None | |

1.3 Hadoop configuration parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| wds.linkis.hadoop.root.user | hadoop | HDFS super user |
| wds.linkis.filesystem.hdfs.root.path | None | User's HDFS default root path |
| wds.linkis.keytab.enable | false | Whether to enable Kerberos |
| wds.linkis.keytab.file | /appcom/keytab | Kerberos keytab path, effective only when wds.linkis.keytab.enable=true |
| wds.linkis.keytab.host.enabled | false | |
| wds.linkis.keytab.host | 127.0.0.1 | |
| hadoop.config.dir | None | If not configured, it is read from the environment variable HADOOP_CONF_DIR |
| wds.linkis.hadoop.external.conf.dir.prefix | /appcom/config/external-conf/hadoop | Hadoop additional configuration |
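
The fallback described for hadoop.config.dir can be sketched as follows. This is a minimal Python illustration, not the Linkis implementation: the function name `resolve_hadoop_conf_dir` and the example paths are assumptions.

```python
import os


def resolve_hadoop_conf_dir(props, environ=None):
    """Return hadoop.config.dir if set, otherwise HADOOP_CONF_DIR from the environment."""
    environ = os.environ if environ is None else environ
    configured = props.get("hadoop.config.dir")
    if configured:
        return configured
    return environ.get("HADOOP_CONF_DIR")


# An explicit setting wins; otherwise the environment variable is used.
print(resolve_hadoop_conf_dir({"hadoop.config.dir": "/etc/hadoop/conf"}, {}))
print(resolve_hadoop_conf_dir({}, {"HADOOP_CONF_DIR": "/opt/hadoop/etc/hadoop"}))
```

If neither source provides a value, the sketch returns `None` and leaves the error handling to the caller.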

1.4 Linkis RPC configuration parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| wds.linkis.rpc.broadcast.thread.num | 10 | Number of Linkis RPC broadcast threads (the default value is recommended) |
| wds.linkis.ms.rpc.sync.timeout | 60000 | Default processing timeout of the Linkis RPC Receiver |
| wds.linkis.rpc.eureka.client.refresh.interval | 1s | Refresh interval of the Eureka client's microservice list (the default value is recommended) |
| wds.linkis.rpc.eureka.client.refresh.wait.time.max | 1m | Maximum waiting time for a refresh (the default value is recommended) |
| wds.linkis.rpc.receiver.asyn.consumer.thread.max | 10 | Maximum number of Receiver Consumer threads (if there are many online users, it is recommended to increase this parameter appropriately) |
| wds.linkis.rpc.receiver.asyn.consumer.freeTime.max | 2m | Maximum idle time of the Receiver Consumer |
| wds.linkis.rpc.receiver.asyn.queue.size.max | 1000 | Maximum buffer size of the Receiver consumption queue (if there are many online users, it is recommended to increase this parameter appropriately) |
| wds.linkis.rpc.sender.asyn.consumer.thread.max | 5 | Maximum number of Sender Consumer threads |
| wds.linkis.rpc.sender.asyn.consumer.freeTime.max | 2m | Maximum idle time of the Sender Consumer |
| wds.linkis.rpc.sender.asyn.queue.size.max | 300 | Maximum buffer size of the Sender consumption queue |
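
Several defaults in this document are durations with a unit suffix (for example 1s, 1m, 2m), while others are bare numbers such as 60000. The Python sketch below shows one way to normalize such values for comparison. It is an illustration under stated assumptions: the suffixes are taken to mean seconds, minutes and hours, a bare number is taken to be milliseconds, and `duration_to_millis` is a hypothetical helper rather than a Linkis API.

```python
import re

# Assumed unit suffixes: s = seconds, m = minutes, h = hours.
_UNIT_MS = {"s": 1_000, "m": 60_000, "h": 3_600_000}


def duration_to_millis(value):
    """Convert values such as '1s', '2m' or '60000' to milliseconds."""
    match = re.fullmatch(r"\s*(\d+)\s*([smh]?)\s*", value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    number, unit = match.groups()
    # A bare number is treated as milliseconds (an assumption).
    return int(number) * _UNIT_MS.get(unit, 1)


print(duration_to_millis("2m"))
```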

2. Computation governance configuration parameters

2.1 Entrance configuration parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| wds.linkis.spark.engine.version | 2.4.3 | The default Spark version used when the user submits a script without specifying a version |
| wds.linkis.hive.engine.version | 1.2.1 | The default Hive version used when the user submits a script without specifying a version |
| wds.linkis.python.engine.version | python2 | The default Python version used when the user submits a script without specifying a version |
| wds.linkis.jdbc.engine.version | 4 | The default JDBC version used when the user submits a script without specifying a version |
| wds.linkis.shell.engine.version | 1 | The default shell version used when the user submits a script without specifying a version |
| wds.linkis.appconn.engine.version | v1 | The default AppConn version used when the user submits a script without specifying a version |
| wds.linkis.entrance.scheduler.maxParallelismUsers | 1000 | Maximum number of concurrent users supported by Entrance |
| wds.linkis.entrance.job.persist.wait.max | 5m | Maximum time Entrance waits for JobHistory to persist a job |
| wds.linkis.entrance.config.log.path | None | If not configured, the value of wds.linkis.filesystem.hdfs.root.path is used by default |
| wds.linkis.default.requestApplication.name | IDE | The default submission system when none is specified |
| wds.linkis.default.runType | sql | The default script type when none is specified |
| wds.linkis.warn.log.exclude | org.apache,hive.ql,hive.metastore,com.netflix,com.webank.wedatasphere | Real-time WARN-level logs that are not output to the client by default |
| wds.linkis.log.exclude | org.apache, hive.ql, hive.metastore, com.netflix, com.webank.wedatasphere, com.webank | Real-time INFO-level logs that are not output to the client by default |
| wds.linkis.instance | 3 | Default number of concurrent jobs per engine for each user |
| wds.linkis.max.ask.executor.time | 5m | Maximum time to wait when requesting an available EngineConn from LinkisManager |
| wds.linkis.hive.special.log.include | org.apache.hadoop.hive.ql.exec.Task | Hive logs that are not filtered out by default when logs are pushed to the client |
| wds.linkis.spark.special.log.include | org.apache.linkis.engine.spark.utils.JobProgressUtil | Spark logs that are not filtered out by default when logs are pushed to the client |
| wds.linkis.entrance.shell.danger.check.enabled | false | Whether to check for and block dangerous shell syntax |
| wds.linkis.shell.danger.usage | rm,sh,find,kill,python,for,source,hdfs,hadoop,spark-sql,spark-submit,pyspark,spark-shell,hive,yarn | Shell syntax considered dangerous by default |
| wds.linkis.shell.white.usage | cd,ls | Shell whitelist syntax |
| wds.linkis.sql.default.limit | 5000 | Default maximum number of rows in a SQL result set |

2.2 EngineConn configuration parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| wds.linkis.engineconn.resultSet.default.store.path | hdfs:///tmp | Default storage path of job result sets |
| wds.linkis.engine.resultSet.cache.max | 0k | Result sets smaller than this size are returned by EngineConn directly to Entrance without being written to disk |
| wds.linkis.engine.default.limit | 5000 | |
| wds.linkis.engine.lock.expire.time | 120000 | Maximum idle time of the engine lock, that is, how long the lock is held after Entrance acquires it without submitting code to EngineConn before it is released |
| wds.linkis.engineconn.ignore.words | org.apache.spark.deploy.yarn.Client | Logs that are ignored by default when the Engine pushes logs to Entrance |
| wds.linkis.engineconn.pass.words | org.apache.hadoop.hive.ql.exec.Task | Logs that must be pushed by default when the Engine pushes logs to Entrance |
| wds.linkis.engineconn.heartbeat.time | 3m | Default heartbeat interval from EngineConn to LinkisManager |
| wds.linkis.engineconn.max.free.time | 1h | Maximum idle time of an EngineConn |

2.3 EngineConnManager configuration parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| wds.linkis.ecm.memory.max | 80g | Maximum memory that ECM can use to start EngineConn instances |
| wds.linkis.ecm.cores.max | 50 | Maximum number of CPUs that ECM can use to start EngineConn instances |
| wds.linkis.ecm.engineconn.instances.max | 50 | Maximum number of EngineConn instances that can be started. It is generally recommended to set this to the same value as wds.linkis.ecm.cores.max |
| wds.linkis.ecm.protected.memory | 4g | ECM protected memory, that is, the memory used by ECM to start EngineConn instances cannot exceed wds.linkis.ecm.memory.max - wds.linkis.ecm.protected.memory |
| wds.linkis.ecm.protected.cores.max | 2 | Number of protected CPUs of ECM. The meaning is analogous to wds.linkis.ecm.protected.memory |
| wds.linkis.ecm.protected.engine.instances | 2 | Number of protected instances of ECM |
| wds.linkis.engineconn.wait.callback.pid | 3s | Waiting time for EngineConn to return its pid |

2.4 LinkisManager configuration parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| wds.linkis.manager.am.engine.start.max.time | 10m | Maximum time for LinkisManager to start a new EngineConn |
| wds.linkis.manager.am.engine.reuse.max.time | 5m | Maximum selection time when LinkisManager reuses an existing EngineConn |
| wds.linkis.manager.am.engine.reuse.count.limit | 10 | Maximum number of polling attempts when LinkisManager reuses an existing EngineConn |
| wds.linkis.multi.user.engine.types | jdbc,es,presto | Engine types for which the user is not used as a reuse criterion when LinkisManager reuses an existing EngineConn |
| wds.linkis.rm.instance | 10 | Default maximum number of instances per user per engine |
| wds.linkis.rm.yarnqueue.cores.max | 150 | Maximum number of cores per user in the queue used by each engine |
| wds.linkis.rm.yarnqueue.memory.max | 450g | Maximum amount of memory per user in the queue used by each engine |
| wds.linkis.rm.yarnqueue.instance.max | 30 | Maximum number of applications launched by each user in the queue used by each engine |

3. Engine configuration parameters

3.1 JDBC engine configuration parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| wds.linkis.jdbc.default.limit | 5000 | Default maximum number of rows in a returned result set |
| wds.linkis.jdbc.support.dbs | mysql=>com.mysql.jdbc.Driver,postgresql=>org.postgresql.Driver,oracle=>oracle.jdbc.driver.OracleDriver,hive2=>org.apache.hive.jdbc.HiveDriver,presto=>com.facebook.presto.jdbc.PrestoDriver | Drivers supported by the JDBC engine |
| wds.linkis.engineconn.jdbc.concurrent.limit | 100 | Maximum number of concurrent SQL executions |
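
The value of wds.linkis.jdbc.support.dbs is a comma-separated list of `name=>driver` pairs. The Python sketch below shows how such a value can be split into a lookup table, for example to check a customized list before deploying it. The function name `parse_support_dbs` is hypothetical, and the sketch does not claim to mirror how the JDBC engine parses the value internally.

```python
def parse_support_dbs(value):
    """Split 'name=>driver,name=>driver' into a {name: driver} dictionary."""
    drivers = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, sep, driver = entry.partition("=>")
        if not sep:
            raise ValueError(f"malformed entry: {entry!r}")
        drivers[name.strip()] = driver.strip()
    return drivers


# Default value of wds.linkis.jdbc.support.dbs as listed in the table above
DEFAULT = (
    "mysql=>com.mysql.jdbc.Driver,"
    "postgresql=>org.postgresql.Driver,"
    "oracle=>oracle.jdbc.driver.OracleDriver,"
    "hive2=>org.apache.hive.jdbc.HiveDriver,"
    "presto=>com.facebook.presto.jdbc.PrestoDriver"
)

drivers = parse_support_dbs(DEFAULT)
print(drivers["postgresql"])
```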

3.2 Python engine configuration parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| pythonVersion | /appcom/Install/anaconda3/bin/python | Python command path |
| python.path | None | Additional path for Python. Only shared storage paths are accepted |

3.3 Spark engine configuration parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| wds.linkis.engine.spark.language-repl.init.time | 30s | Maximum initialization time for the Scala and Python command interpreters |
| PYSPARK_DRIVER_PYTHON | python | Python command path |
| wds.linkis.server.spark-submit | spark-submit | spark-submit command path |

4. PublicEnhancements configuration parameters

4.1 BML configuration parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| wds.linkis.bml.dws.version | v1 | Version number requested by Linkis Restful |
| wds.linkis.bml.auth.token.key | Validation-Code | Password-free token key for BML requests |
| wds.linkis.bml.auth.token.value | BML-AUTH | Password-free token value for BML requests |
| wds.linkis.bml.hdfs.prefix | /tmp/linkis | Path prefix under which BML files are stored on HDFS |

4.2 Metadata configuration parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| hadoop.config.dir | /appcom/config/hadoop-config | If it does not exist, the value of the environment variable HADOOP_CONF_DIR is used by default |
| hive.config.dir | /appcom/config/hive-config | If it does not exist, the value of the environment variable HIVE_CONF_DIR is used by default |
| hive.meta.url | None | The URL of the HiveMetaStore database. If hive.config.dir is not configured, this value must be configured |
| hive.meta.user | None | User of the HiveMetaStore database |
| hive.meta.password | None | Password of the HiveMetaStore database |

4.3 JobHistory configuration parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| wds.linkis.jobhistory.admin | None | Default admin accounts, that is, the users who can view everyone's execution history |

4.4 FileSystem configuration parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| wds.linkis.filesystem.root.path | file:///tmp/linkis/ | User's Linux local root directory |
| wds.linkis.filesystem.hdfs.root.path | hdfs:///tmp/ | User's HDFS root directory |
| wds.linkis.workspace.filesystem.hdfsuserrootpath.suffix | /linkis/ | The first-level prefix after the user's HDFS root directory. The user's actual root directory is: ${hdfs.root.path}\${user}\${hdfsuserrootpath.suffix} |
| wds.linkis.workspace.resultset.download.is.limit | true | Whether to limit the number of rows downloaded when the Client downloads a result set |
| wds.linkis.workspace.resultset.download.maxsize.csv | 5000 | Maximum number of rows downloaded when the result set is downloaded as a CSV file |
| wds.linkis.workspace.resultset.download.maxsize.excel | 5000 | Maximum number of rows downloaded when the result set is downloaded as an Excel file |
| wds.linkis.workspace.filesystem.get.timeout | 2000L | Maximum timeout for requests to the underlying file system (if the performance of your HDFS or Linux machine is low, it is recommended to increase this value appropriately) |
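
The user's actual HDFS root directory is composed of the HDFS root path, the user name and the configured suffix. The Python sketch below shows that composition with the default values. It assumes the three parts are joined as path segments with a single slash between them. The function name `user_hdfs_root` and the user `alice` are hypothetical.

```python
def user_hdfs_root(hdfs_root_path, user, suffix):
    """Join root path, user name and suffix with exactly one '/' between parts."""
    root = hdfs_root_path.rstrip("/")
    parts = [p.strip("/") for p in (user, suffix)]
    return "/".join([root] + [p for p in parts if p]) + "/"


# Defaults: wds.linkis.filesystem.hdfs.root.path=hdfs:///tmp/ and suffix=/linkis/
print(user_hdfs_root("hdfs:///tmp/", "alice", "/linkis/"))
```

For the defaults this yields `hdfs:///tmp/alice/linkis/`. A root path that consists only of the scheme and slashes (for example `hdfs:///`) is not handled by this sketch.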

4.5 UDF configuration parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| wds.linkis.udf.share.path | /mnt/bdap/udf | Storage path of shared UDFs. It is recommended to set it to an HDFS path |

5. MicroService configuration parameters

5.1 Gateway configuration parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| wds.linkis.gateway.conf.enable.proxy.user | false | Whether to enable proxy user mode. If enabled, the login user's requests are proxied to the proxy user for execution |
| wds.linkis.gateway.conf.proxy.user.config | proxy.properties | Storage file of proxy rules |
| wds.linkis.gateway.conf.proxy.user.scan.interval | 600000 | Proxy file refresh interval |
| wds.linkis.gateway.conf.enable.token.auth | false | Whether to enable Token login mode. If enabled, Linkis can be accessed with tokens |
| wds.linkis.gateway.conf.token.auth.config | token.properties | Storage file of Token rules |
| wds.linkis.gateway.conf.token.auth.scan.interval | 600000 | Token file refresh interval |
| wds.linkis.gateway.conf.url.pass.auth | /dws/ | Requests that are let through by default without login verification |
| wds.linkis.gateway.conf.enable.sso | false | Whether to enable SSO user login mode |
| wds.linkis.gateway.conf.sso.interceptor | None | If SSO login mode is enabled, the user needs to implement SSOInterceptor to jump to the SSO login page |
| wds.linkis.admin.user | hadoop | Administrator user list |
| wds.linkis.login_encrypt.enable | false | Whether the password is transmitted with RSA encryption when the user logs in |
| wds.linkis.enable.gateway.auth | false | Whether to enable the Gateway IP whitelist mechanism |
| wds.linkis.gateway.auth.file | auth.txt | IP whitelist storage file |

6. DataSource and Metadata Service configuration parameters

6.1 MetaData Service configuration parameters

| From Version | Parameter name | Default value | Description |
| --- | --- | --- | --- |
| v1.1.0 | wds.linkis.server.mdm.service.lib.dir | /lib/linkis-pulicxxxx-/linkis-metdata-manager/service | Relative path of the service to be loaded |
| v1.1.0 | wds.linkis.server.mdm.service.instance.expire-in-seconds | 60 | Service loading timeout. If it exceeds the specified time, the service is not loaded |
| v1.1.0 | wds.linkis.server.dsm.app.name | linkis-ps-data-source-manager | Service used to get the data source |
| v1.1.0 | wds.linkis.server.mdm.service.kerberos.principle | hadoop/HOST@EXAMPLE.COM | Kerberos principal for the linkis-metadata Hive service |
| v1.1.0 | wds.linkis.server.mdm.service.user | hadoop | User for the linkis-metadata Hive service |
| v1.1.0 | wds.linkis.server.mdm.service.kerberos.krb5.path | "" | Kerberos krb5 path for the linkis-metadata Hive service |
| v1.1.0 | wds.linkis.server.mdm.service.temp.location | classpath:/tmp | Temporary location for the linkis-metadata Hive and Kafka services |
| v1.1.0 | wds.linkis.server.mdm.service.sql.driver | com.mysql.jdbc.Driver | Driver for the hive-metadata MySQL service |
| v1.1.0 | wds.linkis.server.mdm.service.sql.url | jdbc:mysql://%s:%s/%s | URL format for the hive-metadata MySQL service |
| v1.1.0 | wds.linkis.server.mdm.service.sql.connect.timeout | 3000 | MySQL connection timeout for the hive-metadata MySQL service |
| v1.1.0 | wds.linkis.server.mdm.service.sql.socket.timeout | 6000 | Socket open timeout for the hive-metadata MySQL service |
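
The default of wds.linkis.server.mdm.service.sql.url is a format string with three `%s` placeholders. The Python sketch below shows how such a template expands into a concrete JDBC URL. It assumes the placeholders are filled with host, port and database name, in that order, as in a standard MySQL JDBC URL. The function name and the sample values are hypothetical.

```python
# Default value of wds.linkis.server.mdm.service.sql.url
URL_FORMAT = "jdbc:mysql://%s:%s/%s"


def build_mysql_url(host, port, database, url_format=URL_FORMAT):
    """Fill the three %s placeholders, assumed to be host, port and database."""
    return url_format % (host, port, database)


print(build_mysql_url("127.0.0.1", 3306, "hive_meta"))
```

A customized format string should keep the same number of placeholders, otherwise the substitution fails.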