ML Commons API



The ML Commons API lets you train machine learning (ML) models synchronously or asynchronously, make predictions with a trained model, and train and predict against the same dataset in a single request.

To train a model through the API, provide three inputs:

  • Algorithm name: Must be one of the FunctionName values. This determines which algorithm the ML engine runs. To add a new function, see How To Add a New Function.
  • Model hyperparameters: Adjust these parameters to improve model accuracy.
  • Input data: The data used to train the ML model or to make predictions with it. You can supply input data in two ways: by querying an index or by passing a data frame.
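
As a rough illustration of how the three inputs fit together, the following Python sketch assembles the path and body of a train request that reads its data from an index. The helper name build_train_request and its arguments are hypothetical; the field names and values mirror the k-means training request shown in this document. Sending the request (authentication, TLS, HTTP client) is left out.

```python
import json


def build_train_request(algorithm, parameters, index, source_fields,
                        size=10000, run_async=False):
    """Return (path, body) for a train call that queries an index.

    Hypothetical helper: it only assembles the three inputs
    (algorithm name, hyperparameters, input data) into one request.
    """
    path = f"/_plugins/_ml/_train/{algorithm}"
    if run_async:
        path += "?async=true"
    body = {
        "parameters": parameters,                                 # hyperparameters
        "input_query": {"_source": source_fields, "size": size},  # what to read
        "input_index": [index],                                   # where to read it
    }
    return path, body


path, body = build_train_request(
    "kmeans",
    {"centroids": 3, "iterations": 10, "distance_type": "COSINE"},
    "iris_data",
    ["petal_length_in_cm", "petal_width_in_cm"],
)
print("POST", path)
print(json.dumps(body, indent=2))
```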

Model access control considerations

For clusters with model access control enabled, users can perform API operations on models in model groups with specified access levels as follows:

  • public model group: Any user.
  • restricted model group: Only the model owner or users who share at least one backend role with the model group.
  • private model group: Only the model owner.

For clusters with model access control disabled, any user can perform API operations on models in any model group.

Admin users can perform API operations for models in any model group.
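
The rules above can be summarized as a small decision function. This is only a sketch of the rules as stated in this section, not the ML Commons implementation; the function name, argument names, and the lowercase access mode strings are assumptions made for illustration.

```python
def can_access_model(access_mode, user, owner, user_roles, group_roles,
                     is_admin=False, access_control_enabled=True):
    """Illustrative check that mirrors the access rules described above."""
    # Admin users, and all users on clusters with access control disabled,
    # can operate on models in any model group.
    if is_admin or not access_control_enabled:
        return True
    if access_mode == "public":
        return True
    if access_mode == "restricted":
        # Owner, or any user sharing at least one backend role with the group.
        return user == owner or bool(set(user_roles) & set(group_roles))
    if access_mode == "private":
        return user == owner
    raise ValueError(f"unknown access mode: {access_mode!r}")


print(can_access_model("restricted", "bob", "alice", ["ml-team"], ["ml-team", "ops"]))
print(can_access_model("private", "bob", "alice", ["ml-team"], ["ml-team"]))
```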

For more information, see Model access control.

Training the model

The train API operation trains a model based on a selected algorithm. Training can run either synchronously or asynchronously.

Request

The following examples train a k-means model on indexed data.

Train with k-means synchronously

  POST /_plugins/_ml/_train/kmeans
  {
    "parameters": {
      "centroids": 3,
      "iterations": 10,
      "distance_type": "COSINE"
    },
    "input_query": {
      "_source": ["petal_length_in_cm", "petal_width_in_cm"],
      "size": 10000
    },
    "input_index": [
      "iris_data"
    ]
  }

Train with k-means asynchronously

  POST /_plugins/_ml/_train/kmeans?async=true
  {
    "parameters": {
      "centroids": 3,
      "iterations": 10,
      "distance_type": "COSINE"
    },
    "input_query": {
      "_source": ["petal_length_in_cm", "petal_width_in_cm"],
      "size": 10000
    },
    "input_index": [
      "iris_data"
    ]
  }

Response

Synchronous

For synchronous responses, the API returns the model_id, which can be used to get or delete a model.

  {
    "model_id" : "lblVmX8BO5w8y8RaYYvN",
    "status" : "COMPLETED"
  }

Asynchronous

For asynchronous responses, the API returns the task_id, which can be used to get or delete a task.

  {
    "task_id" : "lrlamX8BO5w8y8Ra2otd",
    "status" : "CREATED"
  }

Getting model information

You can retrieve model information using the model_id.

For information about user access for this API, see Model access control considerations.

Path and HTTP methods

  GET /_plugins/_ml/models/<model_id>

The response contains the following model information:

  {
    "name" : "all-MiniLM-L6-v2_onnx",
    "algorithm" : "TEXT_EMBEDDING",
    "version" : "1",
    "model_format" : "TORCH_SCRIPT",
    "model_state" : "LOADED",
    "model_content_size_in_bytes" : 83408741,
    "model_content_hash_value" : "9376c2ebd7c83f99ec2526323786c348d2382e6d86576f750c89ea544d6bbb14",
    "model_config" : {
      "model_type" : "bert",
      "embedding_dimension" : 384,
      "framework_type" : "SENTENCE_TRANSFORMERS",
      "all_config" : """{"_name_or_path":"nreimers/MiniLM-L6-H384-uncased","architectures":["BertModel"],"attention_probs_dropout_prob":0.1,"gradient_checkpointing":false,"hidden_act":"gelu","hidden_dropout_prob":0.1,"hidden_size":384,"initializer_range":0.02,"intermediate_size":1536,"layer_norm_eps":1e-12,"max_position_embeddings":512,"model_type":"bert","num_attention_heads":12,"num_hidden_layers":6,"pad_token_id":0,"position_embedding_type":"absolute","transformers_version":"4.8.2","type_vocab_size":2,"use_cache":true,"vocab_size":30522}"""
    },
    "created_time" : 1665961344044,
    "last_uploaded_time" : 1665961373000,
    "last_loaded_time" : 1665961815959,
    "total_chunks" : 9
  }

Registering a model

All versions of a particular model are held in a model group. You can either register a model group before registering a model to the group or register a first version of a model, thereby creating the group. Each model group name in the cluster must be globally unique.

If you are registering the first version of a model without first registering the model group, a new model group is created automatically with the following name and access level:

  • Name: The new model group will have the same name as the model. Because the model group name must be unique, ensure that your model name does not match the name of any existing model group in the cluster.
  • Access level: The access level for the new model group is determined using the access_mode, backend_roles, and add_all_backend_roles parameters that you pass in the request. If you provide none of the three parameters, the new model group will be private if model access control is enabled on your cluster and public if model access control is disabled. The newly registered model is the first model version assigned to that model group.

Once a model group is created, provide its model_group_id to register a new model version to the model group. In this case, the model name does not need to be unique.

If you’re using pretrained models provided by OpenSearch, we recommend that you first register a model group with a unique name for these models. Then register the pretrained models as versions to that model group. This ensures that every model group has a globally unique model group name.

For information about user access for this API, see Model access control considerations.

If the model is more than 10 MB in size, ML Commons splits it into smaller chunks and saves those chunks in the model’s index.
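
To make the chunking concrete, the sketch below splits a byte string into fixed-size pieces and computes how many chunks a model of a given size would need. The chunk size of 10,000,000 bytes is an assumption: the text only says "10 MB", but that value is consistent with the model information example in this document (83,408,741 bytes stored as 9 chunks). The function names are hypothetical and this is not the ML Commons code.

```python
import math

# Assumption: "10 MB" means 10,000,000 bytes, not 10 * 1024 * 1024.
CHUNK_SIZE_BYTES = 10_000_000


def chunk_count(model_size_in_bytes, chunk_size=CHUNK_SIZE_BYTES):
    """Number of chunks needed to store a model of the given size."""
    return max(1, math.ceil(model_size_in_bytes / chunk_size))


def split_into_chunks(data, chunk_size=CHUNK_SIZE_BYTES):
    """Split raw model bytes into consecutive chunks of at most chunk_size."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


# Size taken from the model information example in this document.
print(chunk_count(83408741))
```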

Path and HTTP methods

  POST /_plugins/_ml/models/_register

Request fields

All request fields are required.

Field | Data type | Description
----- | --------- | -----------
name | String | The model’s name.
version | Integer | The model’s version number.
model_format | String | The portable format of the model file. Currently only supports TORCH_SCRIPT.
model_group_id | String | The model group ID of the model group to register this model to.
model_content_hash_value | String | The model content hash generated using the SHA-256 hashing algorithm.
model_config | JSON object | The model’s configuration, including the model_type, embedding_dimension, and framework_type. all_config is an optional JSON string that contains all model configurations.
url | String | The URL that contains the model.
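
One way to produce a value for model_content_hash_value is to hash the model archive with SHA-256 from the Python standard library, as sketched below. The helper names and the file name in the comment are hypothetical. The sketch assumes that the hash is computed over the exact bytes of the file served at url, which the field description implies but does not state outright.

```python
import hashlib
import io


def sha256_of_stream(stream, block_size=1024 * 1024):
    """Return the hex SHA-256 digest of a binary stream, read in blocks."""
    digest = hashlib.sha256()
    for block in iter(lambda: stream.read(block_size), b""):
        digest.update(block)
    return digest.hexdigest()


def sha256_of_file(path):
    """Hash a file on disk, for example a downloaded model zip archive."""
    with open(path, "rb") as handle:
        return sha256_of_stream(handle)


# Demo on in-memory bytes; for a real model use sha256_of_file("model.zip").
print(sha256_of_stream(io.BytesIO(b"abc")))
```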

Example

The following example request registers version 1.0.0 of an NLP sentence transformation model named all-MiniLM-L6-v2.

  POST /_plugins/_ml/models/_register
  {
    "name": "all-MiniLM-L6-v2",
    "version": "1.0.0",
    "description": "test model",
    "model_format": "TORCH_SCRIPT",
    "model_group_id": "FTNlQ4gBYW0Qyy5ZoxfR",
    "model_content_hash_value": "c15f0d2e62d872be5b5bc6c84d2e0f4921541e29fefbef51d59cc10a8ae30e0f",
    "model_config": {
      "model_type": "bert",
      "embedding_dimension": 384,
      "framework_type": "sentence_transformers",
      "all_config": "{\"_name_or_path\":\"nreimers/MiniLM-L6-H384-uncased\",\"architectures\":[\"BertModel\"],\"attention_probs_dropout_prob\":0.1,\"gradient_checkpointing\":false,\"hidden_act\":\"gelu\",\"hidden_dropout_prob\":0.1,\"hidden_size\":384,\"initializer_range\":0.02,\"intermediate_size\":1536,\"layer_norm_eps\":1e-12,\"max_position_embeddings\":512,\"model_type\":\"bert\",\"num_attention_heads\":12,\"num_hidden_layers\":6,\"pad_token_id\":0,\"position_embedding_type\":\"absolute\",\"transformers_version\":\"4.8.2\",\"type_vocab_size\":2,\"use_cache\":true,\"vocab_size\":30522}"
    },
    "url": "https://artifacts.opensearch.org/models/ml-models/huggingface/sentence-transformers/all-MiniLM-L6-v2/1.0.1/torch_script/sentence-transformers_all-MiniLM-L6-v2-1.0.1-torch_script.zip"
  }

Response

OpenSearch responds with the task_id and task status.

  {
    "task_id" : "ew8I44MBhyWuIwnfvDIH",
    "status" : "CREATED"
  }

To see the status of your model registration and retrieve the model ID created for the new model version, pass the task_id as a path parameter to the Tasks API:

  GET /_plugins/_ml/tasks/<task_id>

The response contains the model ID of the model version:

  {
    "model_id": "Qr1YbogBYOqeeqR7sI9L",
    "task_type": "DEPLOY_MODEL",
    "function_name": "TEXT_EMBEDDING",
    "state": "COMPLETED",
    "worker_node": [
      "N77RInqjTSq_UaLh1k0BUg"
    ],
    "create_time": 1685478486057,
    "last_update_time": 1685478491090,
    "is_async": true
  }
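
Because registration is asynchronous, a client typically polls the Tasks API until the task finishes and then reads model_id from the response. The sketch below shows one possible polling loop. The function name and parameters are hypothetical, the HTTP call is injected as get_task so that the loop stays independent of any client library, and the FAILED state is an assumption because this document only shows CREATED and COMPLETED.

```python
import time


def wait_for_model_id(get_task, task_id, poll_seconds=1.0, max_polls=60,
                      sleep=time.sleep):
    """Poll a task until it completes and return its model_id.

    get_task(task_id) must return the parsed JSON body of
    GET /_plugins/_ml/tasks/<task_id> as a dict.
    """
    for _ in range(max_polls):
        task = get_task(task_id)
        state = task.get("state")
        if state == "COMPLETED":
            return task["model_id"]
        if state == "FAILED":  # assumed failure state, not shown in this document
            raise RuntimeError(f"task {task_id} failed: {task}")
        sleep(poll_seconds)
    raise TimeoutError(f"task {task_id} did not complete after {max_polls} polls")


# Stubbed demo: the first poll is still in progress, the second has completed.
_responses = iter([
    {"state": "CREATED"},
    {"state": "COMPLETED", "model_id": "Qr1YbogBYOqeeqR7sI9L"},
])
print(wait_for_model_id(lambda _task_id: next(_responses),
                        "ew8I44MBhyWuIwnfvDIH", sleep=lambda _s: None))
```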

Deploying a model

The deploy model operation reads the model’s chunks from the model index and then creates an instance of the model to cache into memory. This operation requires the model_id.

For information about user access for this API, see Model access control considerations.

Path and HTTP methods

  POST /_plugins/_ml/models/<model_id>/_deploy

Example: Deploying to all available ML nodes

In this example request, OpenSearch deploys the model to any available OpenSearch ML node:

  POST /_plugins/_ml/models/WWQI44MBbzI2oUKAvNUt/_deploy

Example: Deploying to a specific node

To keep memory free on the other ML nodes in your cluster, you can deploy the model to one or more specific nodes by specifying node_ids in the request body:

  POST /_plugins/_ml/models/WWQI44MBbzI2oUKAvNUt/_deploy
  {
    "node_ids": ["4PLK7KJWReyX0oWKnBA8nA"]
  }

Response

  {
    "task_id" : "hA8P44MBhyWuIwnfvTKP",
    "status" : "DEPLOYING"
  }

Undeploying a model

To undeploy a model from memory, use the undeploy operation.

For information about user access for this API, see Model access control considerations.

Path and HTTP methods

  POST /_plugins/_ml/models/<model_id>/_undeploy

Example: Undeploying a model from all ML nodes

  POST /_plugins/_ml/models/MGqJhYMBbbh0ushjm8p_/_undeploy

Response: Undeploying a model from all ML nodes

  {
    "s5JwjZRqTY6nOT0EvFwVdA": {
      "stats": {
        "MGqJhYMBbbh0ushjm8p_": "UNDEPLOYED"
      }
    }
  }

Example: Undeploying specific models from specific nodes

  POST /_plugins/_ml/models/_undeploy
  {
    "node_ids": ["sv7-3CbwQW-4PiIsDOfLxQ"],
    "model_ids": ["KDo2ZYQB-v9VEDwdjkZ4"]
  }

Response: Undeploying specific models from specific nodes

  {
    "sv7-3CbwQW-4PiIsDOfLxQ" : {
      "stats" : {
        "KDo2ZYQB-v9VEDwdjkZ4" : "UNDEPLOYED"
      }
    }
  }

Response: Undeploying all models from specific nodes

  {
    "sv7-3CbwQW-4PiIsDOfLxQ" : {
      "stats" : {
        "KDo2ZYQB-v9VEDwdjkZ4" : "UNDEPLOYED",
        "-8o8ZYQBvrLMaN0vtwzN" : "UNDEPLOYED"
      }
    }
  }

Example: Undeploying specific models from all nodes

  POST /_plugins/_ml/models/_undeploy
  {
    "model_ids": ["KDo2ZYQB-v9VEDwdjkZ4"]
  }

Response: Undeploying specific models from all nodes

  {
    "sv7-3CbwQW-4PiIsDOfLxQ" : {
      "stats" : {
        "KDo2ZYQB-v9VEDwdjkZ4" : "UNDEPLOYED"
      }
    }
  }

Searching for a model

Use this command to search for models you’ve already created.

The response will contain only those model versions to which you have access. For example, if you send a match all query, model versions for the following model group types will be returned:

  • All public model groups in the index.
  • Private model groups for which you are the model owner.
  • Model groups that share at least one backend role with your backend roles.

For more information, see Model access control.

Path and HTTP methods

  POST /_plugins/_ml/models/_search
  {query}

Example: Searching for all models

  POST /_plugins/_ml/models/_search
  {
    "query": {
      "match_all": {}
    },
    "size": 1000
  }

Example: Searching for models with algorithm “FIT_RCF”

  POST /_plugins/_ml/models/_search
  {
    "query": {
      "term": {
        "algorithm": {
          "value": "FIT_RCF"
        }
      }
    }
  }

Response

  {
    "took" : 8,
    "timed_out" : false,
    "_shards" : {
      "total" : 1,
      "successful" : 1,
      "skipped" : 0,
      "failed" : 0
    },
    "hits" : {
      "total" : {
        "value" : 2,
        "relation" : "eq"
      },
      "max_score" : 2.4159138,
      "hits" : [
        {
          "_index" : ".plugins-ml-model",
          "_id" : "-QkKJX8BvytMh9aUeuLD",
          "_version" : 1,
          "_seq_no" : 12,
          "_primary_term" : 15,
          "_score" : 2.4159138,
          "_source" : {
            "name" : "FIT_RCF",
            "version" : 1,
            "content" : "xxx",
            "algorithm" : "FIT_RCF"
          }
        },
        {
          "_index" : ".plugins-ml-model",
          "_id" : "OxkvHn8BNJ65KnIpck8x",
          "_version" : 1,
          "_seq_no" : 2,
          "_primary_term" : 8,
          "_score" : 2.4159138,
          "_source" : {
            "name" : "FIT_RCF",
            "version" : 1,
            "content" : "xxx",
            "algorithm" : "FIT_RCF"
          }
        }
      ]
    }
  }

Deleting a model

Deletes a model based on the model_id.

When you delete the last model version in a model group, that model group is automatically deleted from the index.

For information about user access for this API, see Model access control considerations.

Path and HTTP methods

  DELETE /_plugins/_ml/models/<model_id>

The API returns the following:

  {
    "_index" : ".plugins-ml-model",
    "_id" : "MzcIJX8BA7mbufL6DOwl",
    "_version" : 2,
    "result" : "deleted",
    "_shards" : {
      "total" : 2,
      "successful" : 2,
      "failed" : 0
    },
    "_seq_no" : 27,
    "_primary_term" : 18
  }

Profile

The profile operation returns runtime information on ML tasks and models. The profile operation can help debug issues with models at runtime.

  GET /_plugins/_ml/profile
  GET /_plugins/_ml/profile/models
  GET /_plugins/_ml/profile/tasks

Path parameters

Parameter | Data type | Description
--------- | --------- | -----------
model_id | String | Returns runtime data for a specific model. You can string together multiple model_ids to return multiple model profiles.
task_id | String | Returns runtime data for a specific task. You can string together multiple task_ids to return multiple task profiles.

Request fields

All profile body request fields are optional.

Field | Data type | Description
----- | --------- | -----------
node_ids | String | Returns all tasks and profiles from a specific node.
model_ids | String | Returns runtime data for a specific model. You can string together multiple model IDs to return multiple model profiles.
task_ids | String | Returns runtime data for a specific task. You can string together multiple task IDs to return multiple task profiles.
return_all_tasks | Boolean | Determines whether or not a request returns all tasks. When set to false, task profiles are left out of the response.
return_all_models | Boolean | Determines whether or not a profile request returns all models. When set to false, model profiles are left out of the response.

Example: Returning all tasks and models on a specific node

  GET /_plugins/_ml/profile
  {
    "node_ids": ["KzONM8c8T4Od-NoUANQNGg"],
    "return_all_tasks": true,
    "return_all_models": true
  }

Response: Returning all tasks and models on a specific node

  {
    "nodes" : {
      "qTduw0FJTrmGrqMrxH0dcA" : { # node id
        "models" : {
          "WWQI44MBbzI2oUKAvNUt" : { # model id
            "worker_nodes" : [ # routing table
              "KzONM8c8T4Od-NoUANQNGg"
            ]
          }
        }
      },
      ...
      "KzONM8c8T4Od-NoUANQNGg" : { # node id
        "models" : {
          "WWQI44MBbzI2oUKAvNUt" : { # model id
            "model_state" : "DEPLOYED", # model status
            "predictor" : "org.opensearch.ml.engine.algorithms.text_embedding.TextEmbeddingModel@592814c9",
            "worker_nodes" : [ # routing table
              "KzONM8c8T4Od-NoUANQNGg"
            ],
            "predict_request_stats" : { # predict request stats on this node
              "count" : 2, # total predict requests on this node
              "max" : 89.978681, # max latency in milliseconds
              "min" : 5.402,
              "average" : 47.6903405,
              "p50" : 47.6903405,
              "p90" : 81.5210129,
              "p99" : 89.13291418999998
            }
          }
        }
      },
      ...
    }
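
When debugging latency, it can help to pull only the predict_request_stats entries out of a profile response. The following sketch walks the nodes → models → predict_request_stats structure shown in the response above. The function name is hypothetical, and the input is assumed to be the parsed JSON body (a real response does not contain the # annotations used in the listing).

```python
def predict_stats_by_node(profile, model_id):
    """Map node ID to predict_request_stats for one model.

    Nodes that only hold routing information for the model (no
    predict_request_stats entry) are skipped.
    """
    stats_by_node = {}
    for node_id, node in profile.get("nodes", {}).items():
        model = node.get("models", {}).get(model_id, {})
        stats = model.get("predict_request_stats")
        if stats is not None:
            stats_by_node[node_id] = stats
    return stats_by_node


# Trimmed-down sample based on the response above.
sample = {
    "nodes": {
        "qTduw0FJTrmGrqMrxH0dcA": {
            "models": {"WWQI44MBbzI2oUKAvNUt": {"worker_nodes": ["KzONM8c8T4Od-NoUANQNGg"]}}
        },
        "KzONM8c8T4Od-NoUANQNGg": {
            "models": {
                "WWQI44MBbzI2oUKAvNUt": {
                    "model_state": "DEPLOYED",
                    "predict_request_stats": {"count": 2, "max": 89.978681, "min": 5.402},
                }
            }
        },
    }
}
for node_id, stats in predict_stats_by_node(sample, "WWQI44MBbzI2oUKAvNUt").items():
    print(node_id, stats["count"], stats["max"])
```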

Predict

ML Commons can predict new data with your trained model either from indexed data or a data frame. To use the Predict API, the model_id is required.

For information about user access for this API, see Model access control considerations.

Path and HTTP methods

  POST /_plugins/_ml/_predict/<algorithm_name>/<model_id>

Request

  POST /_plugins/_ml/_predict/kmeans/<model_id>
  {
    "input_query": {
      "_source": ["petal_length_in_cm", "petal_width_in_cm"],
      "size": 10000
    },
    "input_index": [
      "iris_data"
    ]
  }

Response

  {
    "status" : "COMPLETED",
    "prediction_result" : {
      "column_metas" : [
        {
          "name" : "ClusterID",
          "column_type" : "INTEGER"
        }
      ],
      "rows" : [
        {
          "values" : [
            {
              "column_type" : "INTEGER",
              "value" : 1
            }
          ]
        },
        {
          "values" : [
            {
              "column_type" : "INTEGER",
              "value" : 1
            }
          ]
        },
        {
          "values" : [
            {
              "column_type" : "INTEGER",
              "value" : 0
            }
          ]
        },
        {
          "values" : [
            {
              "column_type" : "INTEGER",
              "value" : 0
            }
          ]
        },
        {
          "values" : [
            {
              "column_type" : "INTEGER",
              "value" : 0
            }
          ]
        },
        {
          "values" : [
            {
              "column_type" : "INTEGER",
              "value" : 0
            }
          ]
        }
      ]
    }
  }

Train and predict

Use the train and predict operation to train a model and then immediately predict against the same training dataset. This operation can only be used with unsupervised learning models and supports the following algorithms:

  • BATCH_RCF
  • FIT_RCF
  • k-means

Example: Train and predict with indexed data

  POST /_plugins/_ml/_train_predict/kmeans
  {
    "parameters": {
      "centroids": 2,
      "iterations": 10,
      "distance_type": "COSINE"
    },
    "input_query": {
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "k1": {
                  "gte": 0
                }
              }
            }
          ]
        }
      },
      "size": 10
    },
    "input_index": [
      "test_data"
    ]
  }

Example: Train and predict with data directly

  POST /_plugins/_ml/_train_predict/kmeans
  {
    "parameters": {
      "centroids": 2,
      "iterations": 1,
      "distance_type": "EUCLIDEAN"
    },
    "input_data": {
      "column_metas": [
        {
          "name": "k1",
          "column_type": "DOUBLE"
        },
        {
          "name": "k2",
          "column_type": "DOUBLE"
        }
      ],
      "rows": [
        {
          "values": [
            {
              "column_type": "DOUBLE",
              "value": 1.00
            },
            {
              "column_type": "DOUBLE",
              "value": 2.00
            }
          ]
        },
        {
          "values": [
            {
              "column_type": "DOUBLE",
              "value": 1.00
            },
            {
              "column_type": "DOUBLE",
              "value": 4.00
            }
          ]
        },
        {
          "values": [
            {
              "column_type": "DOUBLE",
              "value": 1.00
            },
            {
              "column_type": "DOUBLE",
              "value": 0.00
            }
          ]
        },
        {
          "values": [
            {
              "column_type": "DOUBLE",
              "value": 10.00
            },
            {
              "column_type": "DOUBLE",
              "value": 2.00
            }
          ]
        },
        {
          "values": [
            {
              "column_type": "DOUBLE",
              "value": 10.00
            },
            {
              "column_type": "DOUBLE",
              "value": 4.00
            }
          ]
        },
        {
          "values": [
            {
              "column_type": "DOUBLE",
              "value": 10.00
            },
            {
              "column_type": "DOUBLE",
              "value": 0.00
            }
          ]
        }
      ]
    }
  }
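
Writing the input_data structure by hand is verbose, so a client may prefer to generate it from ordinary rows of numbers. The sketch below builds the column_metas and rows layout used in the request above. The helper name to_data_frame is hypothetical, and for simplicity it assumes that every column has the same type.

```python
import json


def to_data_frame(column_names, rows, column_type="DOUBLE"):
    """Build an input_data object from column names and rows of values."""
    column_metas = [{"name": name, "column_type": column_type}
                    for name in column_names]
    frame_rows = []
    for row in rows:
        if len(row) != len(column_names):
            raise ValueError("each row needs one value per column")
        frame_rows.append({
            "values": [{"column_type": column_type, "value": value}
                       for value in row]
        })
    return {"column_metas": column_metas, "rows": frame_rows}


# The six points from the request above.
points = [[1.0, 2.0], [1.0, 4.0], [1.0, 0.0], [10.0, 2.0], [10.0, 4.0], [10.0, 0.0]]
body = {
    "parameters": {"centroids": 2, "iterations": 1, "distance_type": "EUCLIDEAN"},
    "input_data": to_data_frame(["k1", "k2"], points),
}
print(json.dumps(body, indent=2))
```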

Response

  {
    "status" : "COMPLETED",
    "prediction_result" : {
      "column_metas" : [
        {
          "name" : "ClusterID",
          "column_type" : "INTEGER"
        }
      ],
      "rows" : [
        {
          "values" : [
            {
              "column_type" : "INTEGER",
              "value" : 1
            }
          ]
        },
        {
          "values" : [
            {
              "column_type" : "INTEGER",
              "value" : 1
            }
          ]
        },
        {
          "values" : [
            {
              "column_type" : "INTEGER",
              "value" : 1
            }
          ]
        },
        {
          "values" : [
            {
              "column_type" : "INTEGER",
              "value" : 0
            }
          ]
        },
        {
          "values" : [
            {
              "column_type" : "INTEGER",
              "value" : 0
            }
          ]
        },
        {
          "values" : [
            {
              "column_type" : "INTEGER",
              "value" : 0
            }
          ]
        }
      ]
    }
  }
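
The prediction result is itself a data frame, with one row per input record. A small helper such as the following (a sketch with a hypothetical name) flattens it into a plain list of cluster IDs. It assumes that rows are returned in the same order as the input rows, which the example suggests but this document does not state.

```python
def cluster_ids(response):
    """Extract the ClusterID column from a k-means prediction response."""
    result = response["prediction_result"]
    names = [meta["name"] for meta in result["column_metas"]]
    column = names.index("ClusterID")
    return [row["values"][column]["value"] for row in result["rows"]]


def _row(value):
    # Helper for building the sample below; mirrors one row of the response.
    return {"values": [{"column_type": "INTEGER", "value": value}]}


# Parsed form of the response above.
response = {
    "status": "COMPLETED",
    "prediction_result": {
        "column_metas": [{"name": "ClusterID", "column_type": "INTEGER"}],
        "rows": [_row(1), _row(1), _row(1), _row(0), _row(0), _row(0)],
    },
}
print(cluster_ids(response))
```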

Getting task information

You can retrieve information about a task using the task_id.

  GET /_plugins/_ml/tasks/<task_id>

The response includes information about the task.

  {
    "model_id" : "l7lamX8BO5w8y8Ra2oty",
    "task_type" : "TRAINING",
    "function_name" : "KMEANS",
    "state" : "COMPLETED",
    "input_type" : "SEARCH_QUERY",
    "worker_node" : "54xOe0w8Qjyze00UuLDfdA",
    "create_time" : 1647545342556,
    "last_update_time" : 1647545342587,
    "is_async" : true
  }

Searching for a task

Search tasks based on parameters indicated in the request body.

  GET /_plugins/_ml/tasks/_search
  {query body}

Example: Searching for tasks whose function_name is KMEANS

  GET /_plugins/_ml/tasks/_search
  {
    "query": {
      "bool": {
        "filter": [
          {
            "term": {
              "function_name": "KMEANS"
            }
          }
        ]
      }
    }
  }

Response

  {
    "took" : 12,
    "timed_out" : false,
    "_shards" : {
      "total" : 1,
      "successful" : 1,
      "skipped" : 0,
      "failed" : 0
    },
    "hits" : {
      "total" : {
        "value" : 2,
        "relation" : "eq"
      },
      "max_score" : 0.0,
      "hits" : [
        {
          "_index" : ".plugins-ml-task",
          "_id" : "_wnLJ38BvytMh9aUi-Ia",
          "_version" : 4,
          "_seq_no" : 29,
          "_primary_term" : 4,
          "_score" : 0.0,
          "_source" : {
            "last_update_time" : 1645640125267,
            "create_time" : 1645640125209,
            "is_async" : true,
            "function_name" : "KMEANS",
            "input_type" : "SEARCH_QUERY",
            "worker_node" : "jjqFrlW7QWmni1tRnb_7Dg",
            "state" : "COMPLETED",
            "model_id" : "AAnLJ38BvytMh9aUi-M2",
            "task_type" : "TRAINING"
          }
        },
        {
          "_index" : ".plugins-ml-task",
          "_id" : "wwRRLX8BydmmU1x6I-AI",
          "_version" : 3,
          "_seq_no" : 38,
          "_primary_term" : 7,
          "_score" : 0.0,
          "_source" : {
            "last_update_time" : 1645732766656,
            "create_time" : 1645732766472,
            "is_async" : true,
            "function_name" : "KMEANS",
            "input_type" : "SEARCH_QUERY",
            "worker_node" : "A_IiqoloTDK01uZvCjREaA",
            "state" : "COMPLETED",
            "model_id" : "xARRLX8BydmmU1x6I-CG",
            "task_type" : "TRAINING"
          }
        }
      ]
    }
  }

Deleting a task

Delete a task based on the task_id.

ML Commons does not check the task status when running the Delete request. There is a risk that a currently running task could be deleted before the task completes. To check the status of a task, run GET /_plugins/_ml/tasks/<task_id> before task deletion.

  DELETE /_plugins/_ml/tasks/<task_id>

The API returns the following:

  {
    "_index" : ".plugins-ml-task",
    "_id" : "xQRYLX8BydmmU1x6nuD3",
    "_version" : 4,
    "result" : "deleted",
    "_shards" : {
      "total" : 2,
      "successful" : 2,
      "failed" : 0
    },
    "_seq_no" : 42,
    "_primary_term" : 7
  }
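
Because ML Commons does not check the task status before deleting, a client can perform that check itself. The sketch below deletes a task only if it has already finished. The function name is hypothetical, the two HTTP calls are injected as callables so that no particular client library is assumed, and treating COMPLETED and FAILED as the finished states is an assumption (this document only shows CREATED and COMPLETED).

```python
# Assumption: states after which a task no longer changes.
FINISHED_STATES = {"COMPLETED", "FAILED"}


def delete_task_if_finished(get_task, delete_task, task_id):
    """Delete a task only when its state shows that it has finished.

    get_task(task_id) returns the parsed body of GET /_plugins/_ml/tasks/<task_id>;
    delete_task(task_id) issues DELETE /_plugins/_ml/tasks/<task_id>.
    Returns True if the delete was issued, False if the task was left alone.
    """
    state = get_task(task_id).get("state")
    if state not in FINISHED_STATES:
        return False
    delete_task(task_id)
    return True


# Stubbed demo with an in-memory "task store" instead of HTTP calls.
tasks = {"done": {"state": "COMPLETED"}, "busy": {"state": "CREATED"}}
for task_id in ("done", "busy"):
    deleted = delete_task_if_finished(tasks.__getitem__, tasks.pop, task_id)
    print(task_id, "deleted" if deleted else "still running, skipped")
```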

Stats

Get statistics related to the number of tasks.

To receive all stats, use:

  GET /_plugins/_ml/stats

To receive stats for a specific node, use:

  GET /_plugins/_ml/<nodeId>/stats/

To receive stats for a specific node and return a specified stat, use:

  GET /_plugins/_ml/<nodeId>/stats/<stat>

To receive information on a specific stat from all nodes, use:

  GET /_plugins/_ml/stats/<stat>

Example: Get all stats

  GET /_plugins/_ml/stats

Response

  {
    "zbduvgCCSOeu6cfbQhTpnQ" : {
      "ml_executing_task_count" : 0
    },
    "54xOe0w8Qjyze00UuLDfdA" : {
      "ml_executing_task_count" : 0
    },
    "UJiykI7bTKiCpR-rqLYHyw" : {
      "ml_executing_task_count" : 0
    },
    "zj2_NgIbTP-StNlGZJlxdg" : {
      "ml_executing_task_count" : 0
    },
    "jjqFrlW7QWmni1tRnb_7Dg" : {
      "ml_executing_task_count" : 0
    },
    "3pSSjl5PSVqzv5-hBdFqyA" : {
      "ml_executing_task_count" : 0
    },
    "A_IiqoloTDK01uZvCjREaA" : {
      "ml_executing_task_count" : 0
    }
  }
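
The four stats path forms differ only in whether a node ID and a stat name are present, so they can be generated by one helper. The sketch below also sums ml_executing_task_count across the nodes of a response shaped like the one above. Both function names are hypothetical, and the node ID used in the demo is taken from the example response.

```python
def stats_path(node_id=None, stat=None):
    """Build one of the four stats paths described in this section."""
    parts = ["/_plugins/_ml"]
    if node_id:
        parts.append(node_id)
    parts.append("stats")
    if stat:
        parts.append(stat)
    return "/".join(parts)


def total_executing_tasks(stats_response):
    """Sum ml_executing_task_count over all nodes in a stats response."""
    return sum(node.get("ml_executing_task_count", 0)
               for node in stats_response.values())


print(stats_path())
print(stats_path(node_id="54xOe0w8Qjyze00UuLDfdA"))
print(stats_path(node_id="54xOe0w8Qjyze00UuLDfdA", stat="ml_executing_task_count"))
print(stats_path(stat="ml_executing_task_count"))
```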

Execute

Some algorithms, such as Localization, don’t require trained models. You can run these algorithms using the execute API.

  POST /_plugins/_ml/_execute/<algorithm_name>

Example: Execute localization

The following example uses the Localization algorithm to find subset-level information for aggregate data (for example, aggregated over time) that demonstrates the activity of interest, such as spikes, drops, changes, or anomalies.

  POST /_plugins/_ml/_execute/anomaly_localization
  {
    "index_name": "rca-index",
    "attribute_field_names": [
      "attribute"
    ],
    "aggregations": [
      {
        "sum": {
          "sum": {
            "field": "value"
          }
        }
      }
    ],
    "time_field_name": "timestamp",
    "start_time": 1620630000000,
    "end_time": 1621234800000,
    "min_time_interval": 86400000,
    "num_outputs": 10
  }
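
The time values in this request appear to be epoch milliseconds: min_time_interval is 86400000, the number of milliseconds in one day, and start_time and end_time are exactly seven days apart. The sketch below shows one way to derive such values from Python datetime objects. The helper name is hypothetical, and expressing the boundaries in UTC is an illustration choice rather than something the request requires.

```python
from datetime import datetime, timedelta, timezone


def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time surprises")
    return int(moment.timestamp() * 1000)


start = datetime(2021, 5, 10, 7, 0, tzinfo=timezone.utc)
end = start + timedelta(days=7)
one_day_ms = timedelta(days=1) // timedelta(milliseconds=1)

print(to_epoch_ms(start))   # value used for start_time in the request
print(to_epoch_ms(end))     # value used for end_time in the request
print(one_day_ms)           # value used for min_time_interval in the request
```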

Upon execution, the API returns the following:

  {
    "results" : [
      {
        "name" : "sum",
        "result" : {
          "buckets" : [
            {
              "start_time" : 1620630000000,
              "end_time" : 1620716400000,
              "overall_aggregate_value" : 65.0
            },
            {
              "start_time" : 1620716400000,
              "end_time" : 1620802800000,
              "overall_aggregate_value" : 75.0,
              "entities" : [
                {
                  "key" : [
                    "attr0"
                  ],
                  "contribution_value" : 1.0,
                  "base_value" : 2.0,
                  "new_value" : 3.0
                },
                {
                  "key" : [
                    "attr1"
                  ],
                  "contribution_value" : 1.0,
                  "base_value" : 3.0,
                  "new_value" : 4.0
                },
                {
                  "key" : [
                    "attr2"
                  ],
                  "contribution_value" : 1.0,
                  "base_value" : 4.0,
                  "new_value" : 5.0
                },
                {
                  "key" : [
                    "attr3"
                  ],
                  "contribution_value" : 1.0,
                  "base_value" : 5.0,
                  "new_value" : 6.0
                },
                {
                  "key" : [
                    "attr4"
                  ],
                  "contribution_value" : 1.0,
                  "base_value" : 6.0,
                  "new_value" : 7.0
                },
                {
                  "key" : [
                    "attr5"
                  ],
                  "contribution_value" : 1.0,
                  "base_value" : 7.0,
                  "new_value" : 8.0
                },
                {
                  "key" : [
                    "attr6"
                  ],
                  "contribution_value" : 1.0,
                  "base_value" : 8.0,
                  "new_value" : 9.0
                },
                {
                  "key" : [
                    "attr7"
                  ],
                  "contribution_value" : 1.0,
                  "base_value" : 9.0,
                  "new_value" : 10.0
                },
                {
                  "key" : [
                    "attr8"
                  ],
                  "contribution_value" : 1.0,
                  "base_value" : 10.0,
                  "new_value" : 11.0
                },
                {
                  "key" : [
                    "attr9"
                  ],
                  "contribution_value" : 1.0,
                  "base_value" : 11.0,
                  "new_value" : 12.0
                }
              ]
            },
            ...
          ]
        }
      }
    ]
  }
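
To read a localization response programmatically, a client can flatten the results → buckets → entities nesting into rows and rank them by contribution. The sketch below is one way to do that. The function name is hypothetical, the sample is a shortened version of the response above, and ranking by the absolute value of contribution_value is an illustration choice.

```python
def top_contributors(response, limit=3):
    """Return (aggregation, bucket start_time, key, contribution) rows,
    ordered by the absolute size of the contribution."""
    rows = []
    for aggregation in response.get("results", []):
        for bucket in aggregation["result"]["buckets"]:
            # Buckets without localized entities have no "entities" field.
            for entity in bucket.get("entities", []):
                rows.append((
                    aggregation["name"],
                    bucket["start_time"],
                    tuple(entity["key"]),
                    entity["contribution_value"],
                ))
    rows.sort(key=lambda row: abs(row[3]), reverse=True)
    return rows[:limit]


# Shortened sample in the shape of the response above.
sample = {
    "results": [{
        "name": "sum",
        "result": {"buckets": [
            {"start_time": 1620630000000, "end_time": 1620716400000,
             "overall_aggregate_value": 65.0},
            {"start_time": 1620716400000, "end_time": 1620802800000,
             "overall_aggregate_value": 75.0,
             "entities": [
                 {"key": ["attr0"], "contribution_value": 1.0,
                  "base_value": 2.0, "new_value": 3.0},
                 {"key": ["attr1"], "contribution_value": 1.0,
                  "base_value": 3.0, "new_value": 4.0},
             ]},
        ]},
    }]
}
for row in top_contributors(sample):
    print(row)
```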