ElasticSearch Engine

This article describes how to install, use, and configure the ElasticSearch engine plugin in Linkis.

If you want to use the ElasticSearch engine on your Linkis service, you need to install the ElasticSearch service and make sure the service is available.

Use the following command to verify that the ElasticSearch service is available. If the service has user authentication enabled, add --user username:password

  curl [--user username:password] http://ip:port/_cluster/health?pretty

Output like the following means that the ElasticSearch service is available. Note that the cluster status is green.

  {
    "cluster_name" : "docker-cluster",
    "status" : "green",
    "timed_out" : false,
    "number_of_nodes" : 1,
    "number_of_data_nodes" : 1,
    "active_primary_shards" : 7,
    "active_shards" : 7,
    "relocating_shards" : 0,
    "initializing_shards" : 0,
    "unassigned_shards" : 0,
    "delayed_unassigned_shards" : 0,
    "number_of_pending_tasks" : 0,
    "number_of_in_flight_fetch" : 0,
    "task_max_waiting_in_queue_millis" : 0,
    "active_shards_percent_as_number" : 100.0
  }
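
The availability check can also be scripted. The following is a minimal sketch, not part of Linkis: the function name is_cluster_available is hypothetical, and it only inspects the JSON text returned by the health request (for example, the output captured from curl). It does not contact any cluster itself.

```python
import json


def is_cluster_available(health_json, accepted_statuses=("green",)):
    """Return True if a _cluster/health response reports an accepted status.

    health_json is the raw JSON text of the response. The helper name and
    the accepted_statuses parameter are illustrative assumptions.
    """
    health = json.loads(health_json)
    if health.get("timed_out", False):
        return False
    return health.get("status") in accepted_statuses


# Abbreviated sample response, using fields from the output shown above.
sample = '{"cluster_name": "docker-cluster", "status": "green", "timed_out": false}'
print(is_cluster_available(sample))
```

Whether a yellow status is acceptable depends on the deployment; pass accepted_statuses=("green", "yellow") if it is.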

Method 1: Download the engine plug-in package directly

Linkis Engine Plugin Download

Method 2: Compile the engine plug-in separately (maven environment is required)

  # compile
  cd ${linkis_code_dir}/linkis-engineconn-plugins/elasticsearch/
  mvn clean install
  # The compiled engine plug-in package is located in the following directory
  ${linkis_code_dir}/linkis-engineconn-plugins/elasticsearch/target/out/

EngineConnPlugin Engine Plugin Installation

Upload the engine plug-in package obtained above (downloaded or compiled) to the engine plugin directory on the server

  ${LINKIS_HOME}/lib/linkis-engineconn-plugins

The directory structure after uploading is as follows

  linkis-engineconn-plugins/
  ├── elasticsearch
  │   ├── dist
  │   │   └── 7.6.2
  │   │       ├── conf
  │   │       └── lib
  │   └── plugin
  │       └── 7.6.2
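
A quick way to confirm the upload is to check that the expected subdirectories exist. The sketch below is only an illustration: missing_plugin_dirs is a hypothetical helper, and the list of expected directories is taken from the structure shown above.

```python
from pathlib import Path


def missing_plugin_dirs(plugins_root, engine="elasticsearch", version="7.6.2"):
    """List the expected plugin subdirectories that do not exist.

    plugins_root is the linkis-engineconn-plugins directory. The expected
    layout mirrors the directory structure shown in this article.
    """
    expected = [
        f"dist/{version}/conf",
        f"dist/{version}/lib",
        f"plugin/{version}",
    ]
    engine_dir = Path(plugins_root) / engine
    return [rel for rel in expected if not (engine_dir / rel).is_dir()]


# An empty list means every expected directory is present.
print(missing_plugin_dirs("/path/that/does/not/exist"))
```

In practice the argument would be the expanded value of ${LINKIS_HOME}/lib/linkis-engineconn-plugins.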

Refresh the engine by restarting the linkis-cg-linkismanager service

  cd ${LINKIS_HOME}/sbin
  sh linkis-daemon.sh restart cg-linkismanager

To confirm that the refresh succeeded, check whether last_update_time in the linkis_cg_engine_conn_plugin_bml_resources table in the database matches the time the refresh was triggered.

  -- Log in to the linkis database
  select * from linkis_cg_engine_conn_plugin_bml_resources;

-codeType parameter description

  • essql: Execute ElasticSearch engine tasks through SQL scripts
  • esjson: Execute ElasticSearch engine tasks through JSON scripts

essql method example

Note: To use this method, the SQL plug-in must be installed on the ElasticSearch service. For installation instructions, see: https://github.com/NLPchina/elasticsearch-sql#elasticsearch-762

  sh ./bin/linkis-cli -submitUser hadoop \
  -engineType elasticsearch-7.6.2 -codeType essql \
  -code '{"sql": "select * from kibana_sample_data_ecommerce limit 10"}' \
  -runtimeMap linkis.es.http.method=GET \
  -runtimeMap linkis.es.http.endpoint=/_sql \
  -runtimeMap linkis.es.datasource=hadoop \
  -runtimeMap linkis.es.cluster=127.0.0.1:9200

esjson method example

  sh ./bin/linkis-cli -submitUser hadoop \
  -engineType elasticsearch-7.6.2 -codeType esjson \
  -code '{"query": {"match": {"order_id": "584677"}}}' \
  -runtimeMap linkis.es.http.method=GET \
  -runtimeMap linkis.es.http.endpoint=/kibana_sample_data_ecommerce/_search \
  -runtimeMap linkis.es.datasource=hadoop \
  -runtimeMap linkis.es.cluster=127.0.0.1:9200
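
The two examples differ only in -codeType, -code, and the endpoint, so the command line can be assembled programmatically. The following is a rough sketch under stated assumptions: build_cli_command is a hypothetical helper, the flags and runtime keys are the ones used in the examples above, and the JSON validation step is an added safety check rather than something linkis-cli itself performs.

```python
import json
import shlex

CODE_TYPES = ("essql", "esjson")


def build_cli_command(code_type, code, endpoint, cluster="127.0.0.1:9200",
                      datasource="hadoop", submit_user="hadoop",
                      engine_type="elasticsearch-7.6.2", method="GET"):
    """Assemble a linkis-cli command line as a shell-quoted string."""
    if code_type not in CODE_TYPES:
        raise ValueError(f"unsupported codeType: {code_type}")
    if code_type == "esjson":
        # Catch malformed JSON (for example a missing closing brace) early.
        json.loads(code)
    args = ["sh", "./bin/linkis-cli", "-submitUser", submit_user,
            "-engineType", engine_type, "-codeType", code_type, "-code", code]
    runtime = {
        "linkis.es.http.method": method,
        "linkis.es.http.endpoint": endpoint,
        "linkis.es.datasource": datasource,
        "linkis.es.cluster": cluster,
    }
    for key, value in runtime.items():
        args += ["-runtimeMap", f"{key}={value}"]
    # shlex.join quotes each argument so the JSON survives the shell.
    return shlex.join(args)


cmd = build_cli_command(
    "esjson",
    '{"query": {"match": {"order_id": "584677"}}}',
    "/kibana_sample_data_ecommerce/_search",
)
print(cmd)
```

Quoting through shlex.join avoids hand-written quote mistakes in the -code argument, which are easy to make when the script contains nested double quotes.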

For more Linkis-Cli command parameters, see: Linkis-Cli usage

| Configuration | Default | Required | Description |
| --- | --- | --- | --- |
| linkis.es.cluster | 127.0.0.1:9200 | yes | ElasticSearch cluster, multiple nodes separated by commas |
| linkis.es.datasource | hadoop | | ElasticSearch datasource |
| linkis.es.username | none | no | ElasticSearch cluster username |
| linkis.es.password | none | no | ElasticSearch cluster password |
| linkis.es.auth.cache | false | no | Whether the client caches authentication |
| linkis.es.sniffer.enable | false | no | Whether the client enables sniffer |
| linkis.es.http.method | GET | no | Call method |
| linkis.es.http.endpoint | /_search | no | Endpoint called by JSON script |
| linkis.es.sql.endpoint | /_sql | no | Endpoint called by SQL script |
| linkis.es.sql.format | {"query":"%s"} | no | Template called by SQL script; %s is replaced with the SQL, and the result is sent as the request body to the ES cluster |
| linkis.es.headers.* | none | no | Client headers configuration |
| linkis.engineconn.concurrent.limit | 100 | no | Maximum engine concurrency |
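
Two of these parameters describe value formats: linkis.es.cluster is a comma-separated node list, and linkis.es.sql.format is a template in which %s is replaced by the SQL. The sketch below illustrates both. It is not the engine's actual implementation: the helper names are hypothetical, nodes are assumed to be written as host:port, and the JSON escaping of the SQL is an assumption added here so that the result stays valid JSON.

```python
import json


def parse_cluster(value):
    """Split a linkis.es.cluster value into (host, port) pairs.

    Assumes every node is written as host:port.
    """
    nodes = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        host, _, port = item.rpartition(":")
        nodes.append((host, int(port)))
    return nodes


def render_sql_body(sql, template='{"query":"%s"}'):
    """Substitute the SQL into the template to build a request body."""
    # json.dumps adds surrounding quotes; strip them and keep the escapes.
    escaped = json.dumps(sql)[1:-1]
    return template.replace("%s", escaped)


print(parse_cluster("127.0.0.1:9200, 10.0.0.2:9201"))
print(render_sql_body("select * from kibana_sample_data_ecommerce limit 10"))
```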

If the default parameters do not meet your needs, you can configure the basic parameters in the following ways.

ElasticSearch Engine - Figure 1

Note: After modifying the configuration under the IDE tag, you must specify -creator IDE for it to take effect (other tags work the same way). For example:

  sh ./bin/linkis-cli -creator IDE -submitUser hadoop \
  -engineType elasticsearch-7.6.2 -codeType esjson \
  -code '{"query": {"match": {"order_id": "584677"}}}' \
  -runtimeMap linkis.es.http.method=GET \
  -runtimeMap linkis.es.http.endpoint=/kibana_sample_data_ecommerce/_search

When submitting a task through the task submission interface, configure the engine through the params.configuration.runtime parameter.

Example of HTTP request parameters:

  {
    "executionContent": {"code": "select * from kibana_sample_data_ecommerce limit 10;", "runType": "essql"},
    "params": {
      "variable": {},
      "configuration": {
        "runtime": {
          "linkis.es.cluster":"http://127.0.0.1:9200",
          "linkis.es.datasource":"hadoop",
          "linkis.es.username":"",
          "linkis.es.password":""
        }
      }
    },
    "labels": {
      "engineType": "elasticsearch-7.6.2",
      "userCreator": "hadoop-IDE"
    }
  }
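
The request body can be built in code before it is sent. The following is a minimal sketch that only constructs and prints the JSON. build_submit_payload is a hypothetical helper, and the field names are copied from the example above. Sending the request is omitted because this article does not show the endpoint or the authentication details.

```python
import json


def build_submit_payload(code, run_type, runtime, user="hadoop", creator="IDE",
                         engine_type="elasticsearch-7.6.2"):
    """Build the task submission body with engine settings under
    params.configuration.runtime."""
    return {
        "executionContent": {"code": code, "runType": run_type},
        "params": {
            "variable": {},
            "configuration": {"runtime": dict(runtime)},
        },
        "labels": {
            "engineType": engine_type,
            # The userCreator label joins the user and the creator tag.
            "userCreator": f"{user}-{creator}",
        },
    }


payload = build_submit_payload(
    "select * from kibana_sample_data_ecommerce limit 10;",
    "essql",
    {
        "linkis.es.cluster": "http://127.0.0.1:9200",
        "linkis.es.datasource": "hadoop",
    },
)
print(json.dumps(payload, indent=2))
```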

Configure by modifying the linkis-engineconn.properties file in the directory ${LINKIS_HOME}/lib/linkis-engineconn-plugins/elasticsearch/dist/7.6.2/conf/, as shown below:

ElasticSearch Engine - Figure 2
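
To review which ElasticSearch settings a linkis-engineconn.properties file defines, the file text can be scanned for keys that start with linkis.es. The sketch below is an illustration only: read_es_properties is a hypothetical helper, the sample content is made up for the example, and the parser handles only simple key=value lines rather than the full Java properties syntax.

```python
def read_es_properties(text):
    """Collect linkis.es.* entries from properties-style text.

    Only plain key=value lines are handled; blank lines and comment
    lines starting with # or ! are skipped.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key.startswith("linkis.es."):
            props[key] = value.strip()
    return props


# Made-up sample content, not the actual file shipped with the plugin.
sample_properties = """
# ElasticSearch settings
linkis.es.cluster=127.0.0.1:9200
linkis.es.datasource = hadoop
some.other.key=value
"""
print(read_es_properties(sample_properties))
```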

Linkis manages engines through engine tags. The data tables involved are listed below.

  linkis_ps_configuration_config_key: keys and default values of the engine's configuration parameters
  linkis_cg_manager_label: engine labels, such as elasticsearch-7.6.2
  linkis_ps_configuration_category: the category (directory) association of the engine
  linkis_ps_configuration_config_value: the configuration values that the engine needs to display
  linkis_ps_configuration_key_engine_relation: the relationship between configuration items and the engine

The initial engine-related data in these tables is as follows

  -- set variable
  SET @ENGINE_LABEL="elasticsearch-7.6.2";
  SET @ENGINE_ALL=CONCAT('*-*,',@ENGINE_LABEL);
  SET @ENGINE_IDE=CONCAT('*-IDE,',@ENGINE_LABEL);
  SET @ENGINE_NAME="elasticsearch";

  -- engine label
  insert into `linkis_cg_manager_label` (`label_key`, `label_value`, `label_feature`, `label_value_size`, `update_time`, `create_time`) VALUES ('combined_userCreator_engineType', @ENGINE_ALL, 'OPTIONAL', 2, now(), now());
  insert into `linkis_cg_manager_label` (`label_key`, `label_value`, `label_feature`, `label_value_size`, `update_time`, `create_time`) VALUES ('combined_userCreator_engineType', @ENGINE_IDE, 'OPTIONAL', 2, now(), now());
  select @label_id := id from `linkis_cg_manager_label` where label_value = @ENGINE_IDE;
  insert into `linkis_ps_configuration_category` (`label_id`, `level`) VALUES (@label_id, 2);

  -- configuration key
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.es.cluster', 'eg: http://127.0.0.1:9200', 'connection address', 'http://127.0.0.1:9200', 'None', '', @ENGINE_NAME, 0, 0, 1, 'data source conf');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.es.datasource', 'Connection Alias', 'Connection Alias', 'hadoop', 'None', '', @ENGINE_NAME, 0, 0, 1, 'data source conf');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.es.username', 'username', 'ES cluster username', 'None', 'None', '', @ENGINE_NAME, 0, 0, 1, 'data source conf');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.es.password', 'password', 'ES cluster password', 'None', 'None', '', @ENGINE_NAME, 0, 0, 1, 'data source conf');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.es.auth.cache', 'Does the client cache authentication', 'Does the client cache authentication', 'false', 'None', '', @ENGINE_NAME, 0, 0, 1, 'data source conf');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.es.sniffer.enable', 'Whether the client enables sniffer', 'Whether the client enables sniffer', 'false', 'None', '', @ENGINE_NAME, 0, 0, 1, 'data source conf');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.es.http.method', 'call method', 'HTTP request method', 'GET', 'None', '', @ENGINE_NAME, 0, 0, 1, 'data source conf');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.es.http.endpoint', '/_search', 'JSON script Endpoint', '/_search', 'None', '', @ENGINE_NAME, 0, 0, 1, 'data source conf');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.es.sql.endpoint', '/_sql', 'SQL script Endpoint', '/_sql', 'None', '', @ENGINE_NAME, 0, 0, 1, 'data source conf');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.es.sql.format', 'The template called by the SQL script, replace %s with SQL as the request body to request the Es cluster', 'request body', '{"query":"%s"}', 'None', '', @ENGINE_NAME, 0, 0, 1, 'data source conf');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.es.headers.*', 'Client Headers Configuration', 'Client Headers Configuration', 'None', 'None', '', @ENGINE_NAME, 0, 0, 1, 'data source conf');
  INSERT INTO `linkis_ps_configuration_config_key` (`key`, `description`, `name`, `default_value`, `validate_type`, `validate_range`, `engine_conn_type`, `is_hidden`, `is_advanced`, `level`, `treeName`) VALUES ('linkis.engineconn.concurrent.limit', 'engine max concurrency', 'engine max concurrency', '100', 'None', '', @ENGINE_NAME, 0, 0, 1, 'data source conf');

  -- key engine relation
  insert into `linkis_ps_configuration_key_engine_relation` (`config_key_id`, `engine_type_label_id`)
    (select config.id as config_key_id, label.id AS engine_type_label_id FROM `linkis_ps_configuration_config_key` config
     INNER JOIN `linkis_cg_manager_label` label ON config.engine_conn_type = @ENGINE_NAME and label_value = @ENGINE_ALL);

  -- engine default configuration
  insert into `linkis_ps_configuration_config_value` (`config_key_id`, `config_value`, `config_label_id`)
    (select relation.config_key_id AS config_key_id, '' AS config_value, relation.engine_type_label_id AS config_label_id FROM `linkis_ps_configuration_key_engine_relation` relation
     INNER JOIN `linkis_cg_manager_label` label ON relation.engine_type_label_id = label.id AND label.label_value = @ENGINE_ALL);