Impala

This article mainly introduces the installation, usage and configuration of the Impala engine plugin in Linkis.

If you want to use the Impala engine on your server, you need to prepare the Impala service and provide connection information, such as the connection address of the Impala cluster, SASL user name and password, etc.

Run the impala-shell command. Output like the following indicates that the Impala service is available.

  [root@8f43473645b1 /]# impala-shell
  Starting Impala Shell without Kerberos authentication
  Connected to 8f43473645b1:21000
  Server version: impalad version 2.12.0-cdh5.15.0 RELEASE (build 23f574543323301846b41fa5433690df32efe085)
  **************************************************************************************
  Welcome to the Impala shell.
  (Impala Shell v2.12.0-cdh5.15.0 (23f5745) built on Thu May 24 04:07:31 PDT 2018)
  When pretty-printing is disabled, you can use the '--output_delimiter' flag to set
  the delimiter for fields in the same row. The default is ','.
  **************************************************************************************
  [8f43473645b1:21000] >
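
For scripted checks, the same test can be run non-interactively. The following is a minimal sketch, not part of Linkis: the function name check_impala and the IMPALA_SHELL_BIN variable are hypothetical, and the only impala-shell features assumed are the -i (impalad address) and -q (query) options.

```shell
# Hypothetical helper: succeeds when a trivial query runs against the given impalad.
# IMPALA_SHELL_BIN is an assumed override for pointing at a non-default binary.
check_impala() {
  impalad="$1"                                  # host:port, e.g. 127.0.0.1:21000
  shell_bin="${IMPALA_SHELL_BIN:-impala-shell}"
  if "$shell_bin" -i "$impalad" -q 'select 1' >/dev/null 2>&1; then
    echo "impala service available at $impalad"
  else
    echo "impala service NOT reachable at $impalad" >&2
    return 1
  fi
}
```

A call such as check_impala 127.0.0.1:21000 returns a non-zero status when the query cannot be executed, so it can gate later installation steps.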

The default installation and deployment package released by Linkis does not include this engine plug-in. If you compile the Impala engine yourself, the Linkis project must be fully compiled first.

Method 1: Download the engine plug-in package directly

Linkis Engine Plugin Download

Method 2: Compile the engine plug-in separately (requires maven environment)

  # compile
  cd ${linkis_code_dir}/linkis-engineconn-plugins/impala/
  mvn clean install
  # The compiled engine plug-in package is located in the following directory
  ${linkis_code_dir}/linkis-engineconn-plugins/impala/target/out/

EngineConnPlugin Engine Plugin Installation

Upload the engine plug-in package obtained above to the engine directory on the server

  ${LINKIS_HOME}/lib/linkis-engineconn-plugins

The directory structure after uploading is as follows

  linkis-engineconn-plugins/
  ├── impala
  │   ├── dist
  │   │   └── 3.4.0
  │   │       ├── conf
  │   │       └── lib
  │   └── plugin
  │       └── 3.4.0
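
Before restarting the service, it can help to confirm that the uploaded package has the expected layout. The sketch below is a hypothetical helper (check_plugin_layout is not a Linkis command); it only tests that the directories shown above exist under the plug-in root, and assumes version 3.4.0 unless another is passed.

```shell
# Hypothetical helper: verifies the impala plug-in directories exist.
# $1: plug-in root, e.g. ${LINKIS_HOME}/lib/linkis-engineconn-plugins
# $2: engine version (assumed default: 3.4.0)
check_plugin_layout() {
  plugin_root="$1"
  version="${2:-3.4.0}"
  missing=0
  for sub in "dist/$version/conf" "dist/$version/lib" "plugin/$version"; do
    if [ ! -d "$plugin_root/impala/$sub" ]; then
      echo "missing: impala/$sub" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, check_plugin_layout "${LINKIS_HOME}/lib/linkis-engineconn-plugins" prints each missing directory to stderr and returns non-zero if any is absent.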

Refresh the engine by restarting the linkis-cg-linkismanager service

  cd ${LINKIS_HOME}/sbin
  sh linkis-daemon.sh restart cg-linkismanager

To verify the refresh, check whether last_update_time in the linkis_cg_engine_conn_plugin_bml_resources table matches the time the refresh was triggered.

  # log in to the `linkis` database
  select * from linkis_cg_engine_conn_plugin_bml_resources;

Submit a test task with linkis-cli:

  sh ./bin/linkis-cli -submitUser impala \
  -engineType impala-3.4.0 -code 'show databases;' \
  -runtimeMap linkis.es.http.method=GET \
  -runtimeMap linkis.impala.servers=127.0.0.1:21050

For more Linkis-Cli command parameters, see: Linkis-Cli usage
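
When the same submission is repeated with different statements or servers, a thin wrapper keeps the flags in one place. This is only a sketch under assumptions: run_impala_sql and the LINKIS_CLI override are hypothetical names, the script location defaults to ${LINKIS_HOME}/bin/linkis-cli, and only the -submitUser, -engineType, -code and -runtimeMap options shown above are used.

```shell
# Hypothetical wrapper around linkis-cli for Impala statements.
# $1: SQL statement; $2: Impala server list (assumed default: 127.0.0.1:21050)
run_impala_sql() {
  sql="$1"
  servers="${2:-127.0.0.1:21050}"
  # LINKIS_CLI is an assumed override; otherwise fall back to LINKIS_HOME.
  cli="${LINKIS_CLI:-${LINKIS_HOME:-.}/bin/linkis-cli}"
  sh "$cli" -submitUser impala \
    -engineType impala-3.4.0 -code "$sql" \
    -runtimeMap "linkis.impala.servers=$servers"
}
```

For example, run_impala_sql 'show databases;' submits the statement to the default server, and a second argument selects a different server list.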

| Configuration | Default | Required | Description |
| --- | --- | --- | --- |
| linkis.impala.default.limit | 5000 | Yes | The limit on the number of returned items in the query result set |
| linkis.impala.engine.user | ${HDFS_ROOT_USER} | Yes | Default engine startup user |
| linkis.impala.user.isolation.mode | false | Yes | Start the engine in multi-user mode |
| linkis.impala.servers | 127.0.0.1:21050 | Yes | Impala server address, separated by ',' |
| linkis.impala.maxConnections | 10 | Yes | Maximum number of connections to each Impala server |
| linkis.impala.ssl.enable | false | Yes | Whether to enable SSL connection |
| linkis.impala.ssl.keystore.type | JKS | No | SSL Keystore type |
| linkis.impala.ssl.keystore | null | No | SSL Keystore path |
| linkis.impala.ssl.keystore.password | null | No | SSL Keystore password |
| linkis.impala.ssl.truststore.type | JKS | No | SSL Truststore type |
| linkis.impala.ssl.truststore | null | No | SSL Truststore path |
| linkis.impala.ssl.truststore.password | null | No | SSL Truststore password |
| linkis.impala.sasl.enable | false | Yes | Whether to enable SASL authentication |
| linkis.impala.sasl.mechanism | PLAIN | | SASL Mechanism |
| linkis.impala.sasl.authorizationId | null | | SASL AuthorizationId |
| linkis.impala.sasl.protocol | LDAP | | SASL Protocol |
| linkis.impala.sasl.properties | null | No | SASL Properties: key1=value1,key2=value2 |
| linkis.impala.sasl.username | ${impala.engine.user} | | SASL Username |
| linkis.impala.sasl.password | null | No | SASL Password |
| linkis.impala.sasl.password.cmd | null | No | SASL Password get command |
| linkis.impala.heartbeat.seconds | 1 | Yes | Task status update interval |
| linkis.impala.query.timeout.seconds | 0 | No | Task execution timeout |
| linkis.impala.query.batchSize | 1000 | Yes | Result set fetch batch size |
| linkis.impala.query.options | null | No | Query submission parameters: key1=value1,key2=value2 |
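
Because linkis.impala.servers takes a comma-separated list of addresses, a quick pre-flight check can catch malformed entries before a task is submitted. The sketch below is a hypothetical helper (validate_impala_servers is not part of Linkis) and assumes each entry has the form host:port with a numeric port; it does not trim whitespace or handle IPv6 literals, and it says nothing about how the engine itself parses the value.

```shell
# Hypothetical pre-flight check for a linkis.impala.servers value.
# Returns 0 when every comma-separated entry looks like host:port.
validate_impala_servers() {
  servers="$1"
  rc=0
  if [ -z "$servers" ]; then
    echo "empty server list" >&2
    return 1
  fi
  old_ifs="$IFS"
  IFS=','
  set -f                       # no filename expansion while splitting
  for entry in $servers; do
    case "$entry" in
      *:*) ;;
      *) echo "not host:port: $entry" >&2; rc=1; continue ;;
    esac
    host="${entry%:*}"
    port="${entry##*:}"
    case "$port" in
      ''|*[!0-9]*) echo "bad port in: $entry" >&2; rc=1 ;;
    esac
    if [ -z "$host" ]; then
      echo "missing host in: $entry" >&2
      rc=1
    fi
  done
  set +f
  IFS="$old_ifs"
  return "$rc"
}
```

For example, validate_impala_servers '127.0.0.1:21050,10.0.0.2:21050' succeeds, while a value such as '127.0.0.1' is rejected because it has no port.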

If the default parameters do not meet your needs, some basic parameters can be configured in the following ways.

When submitting a task through the task submission interface, set the parameters in params.configuration.runtime

  Example of HTTP request parameters
  {
      "executionContent": {"code": "show databases;", "runType": "sql"},
      "params": {
          "variable": {},
          "configuration": {
              "runtime": {
                  "linkis.impala.servers": "127.0.0.1:21050"
              }
          }
      },
      "labels": {
          "engineType": "impala-3.4.0",
          "userCreator": "hadoop-IDE"
      }
  }
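
The request body above can also be generated from a script so that the server list is not hard-coded. This is a minimal sketch under assumptions: build_impala_payload is a hypothetical function name, the arguments are inserted without JSON escaping (so they must not contain double quotes or backslashes), and the labels are copied from the example above.

```shell
# Hypothetical helper: prints a task submission body for the Impala engine.
# $1: Impala server list (assumed default: 127.0.0.1:21050)
# $2: SQL statement (assumed default: show databases;)
# Note: values are inserted as-is, with no JSON escaping.
build_impala_payload() {
  servers="${1:-127.0.0.1:21050}"
  code="${2:-show databases;}"
  cat <<EOF
{
    "executionContent": {"code": "$code", "runType": "sql"},
    "params": {
        "variable": {},
        "configuration": {
            "runtime": {
                "linkis.impala.servers": "$servers"
            }
        }
    },
    "labels": {
        "engineType": "impala-3.4.0",
        "userCreator": "hadoop-IDE"
    }
}
EOF
}
```

The output can be saved to a file and posted to the task submission interface of your Linkis gateway with an HTTP client; the gateway address and authentication depend on the deployment and are not shown here.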

Linkis manages engines through engine labels. The database tables involved are as follows.

  linkis_ps_configuration_config_key: the keys and default values of the engine's configuration parameters
  linkis_cg_manager_label: engine labels, such as impala-3.4.0
  linkis_ps_configuration_category: the directory (category) association of the engine
  linkis_ps_configuration_config_value: the configuration that the engine needs to display
  linkis_ps_configuration_key_engine_relation: the relationship between configuration items and engines

The initial data related to the engine in the table is as follows

  -- set variable
  SET @ENGINE_LABEL="impala-3.4.0";
  SET @ENGINE_IDE=CONCAT('*-IDE,',@ENGINE_LABEL);
  SET @ENGINE_ALL=CONCAT('*-*,',@ENGINE_LABEL);
  SET @ENGINE_NAME="impala";

  -- add impala engine to IDE
  insert into `linkis_cg_manager_label` (`label_key`, `label_value`, `label_feature`, `label_value_size`, `update_time`, `create_time`) VALUES ('combined_userCreator_engineType', @ENGINE_ALL, 'OPTIONAL', 2, now(), now());
  insert into `linkis_cg_manager_label` (`label_key`, `label_value`, `label_feature`, `label_value_size`, `update_time`, `create_time`) VALUES ('combined_userCreator_engineType', @ENGINE_IDE, 'OPTIONAL', 2, now(), now());
  select @label_id := id from `linkis_cg_manager_label` where label_value = @ENGINE_IDE;
  insert into `linkis_ps_configuration_category` (`label_id`, `level`) VALUES (@label_id, 2);

  -- insert configuration key
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.default.limit', 'result set limit of query', 'result set limit', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.engine.user', 'Default engine startup user', 'Default startup user', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.user.isolation.mode', 'Start engine in multi-user mode', 'Multi-user mode', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.servers', 'Impala server address', 'Service address', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.maxConnections', 'The maximum number of connections to each Impala server', 'Maximum number of connections', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.ssl.enable', 'Enable SSL connection', 'Enable SSL', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.ssl.keystore.type', 'SSL Keystore type', 'SSL Keystore type', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.ssl.keystore', 'SSL Keystore path', 'SSL Keystore path', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.ssl.keystore.password', 'SSL Keystore password', 'SSL Keystore password', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.ssl.truststore.type', 'SSL Truststore type', 'SSL Truststore type', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.ssl.truststore', 'SSL Truststore path', 'SSL Truststore path', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.ssl.truststore.password', 'SSL Truststore password', 'SSL Truststore password', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.sasl.enable', 'Whether to enable SASL authentication', 'Enable SASL', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.sasl.mechanism', 'SASL Mechanism', 'SASL Mechanism', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.sasl.authorizationId', 'SASL AuthorizationId', 'SASL AuthorizationId', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.sasl.protocol', 'SASL Protocol', 'SASL Protocol', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.sasl.properties', 'SASL Properties: key1=value1,key2=value2', 'SASL Properties', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.sasl.username', 'SASL Username', 'SASL Username', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.sasl.password', 'SASL Password', 'SASL Password', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.sasl.password.cmd', 'SASL Password get command', 'SASL Password get command', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.heartbeat.seconds', 'Task status update interval', 'Task status update interval', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.query.timeout.seconds', 'Task execution timeout', 'Task execution timeout', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.query.batchSize', 'Result set fetch batch size', 'Result set fetch batch size', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.impala.query.options', 'Query submission parameters: key1=value1,key2=value2', 'Query submission parameters', 'null', 'None', '', @ENGINE_NAME, 0, 0, 1, 'Data Source Configuration');

  -- impala engine -*
  insert into `linkis_ps_configuration_key_engine_relation` (`config_key_id`, `engine_type_label_id`)
  (select config.id as config_key_id, label.id AS engine_type_label_id FROM `linkis_ps_configuration_config_key` config
  INNER JOIN `linkis_cg_manager_label` label ON config.engine_conn_type = @ENGINE_NAME and label_value = @ENGINE_ALL);

  -- impala engine default configuration
  insert into `linkis_ps_configuration_config_value` (`config_key_id`, `config_value`, `config_label_id`)
  (select relation.config_key_id AS config_key_id, '' AS config_value, relation.engine_type_label_id AS config_label_id FROM `linkis_ps_configuration_key_engine_relation` relation
  INNER JOIN `linkis_cg_manager_label` label ON relation.engine_type_label_id = label.id AND label.label_value = @ENGINE_ALL);