Get machine learning memory stats API

New API reference

For the most up-to-date API details, refer to Machine learning APIs.

Returns information on how machine learning is using memory.

Request

GET _ml/memory/_stats
GET _ml/memory/<node_id>/_stats

Prerequisites

Requires the monitor_ml cluster privilege. This privilege is included in the machine_learning_user built-in role.

Description

Returns information about how machine learning jobs and trained models use memory on each node, both within the JVM heap and natively, outside of the JVM.

Path parameters

<node_id>

(Optional, string) The names of particular nodes in the cluster to target. For example, nodeId1,nodeId2 or ml:true. For node selection options, see Node specification.
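As a minimal sketch of how the two request forms relate, the helper below builds the request path with or without a node filter. The function name memory_stats_path is hypothetical and is not part of any Elasticsearch client; it only joins the values described above.

```python
from typing import Iterable, Optional, Union


def memory_stats_path(node_ids: Optional[Union[str, Iterable[str]]] = None) -> str:
    """Build the request path for the memory stats API.

    Hypothetical helper for illustration only. `node_ids` may be a single
    node filter such as "ml:true" or several node names.
    """
    if not node_ids:
        # No filter: the request targets the default form of the endpoint.
        return "_ml/memory/_stats"
    if isinstance(node_ids, str):
        node_ids = [node_ids]
    # Multiple node names are passed as a comma-separated list.
    return "_ml/memory/" + ",".join(node_ids) + "/_stats"


print(memory_stats_path())
print(memory_stats_path(["nodeId1", "nodeId2"]))
print(memory_stats_path("ml:true"))
```
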

Query parameters

human

(Optional, Boolean) Specify this query parameter to include the human-readable fields with units, such as total, in the response. Otherwise, only the _in_bytes sizes are returned.

master_timeout

(Optional, time units) Period to wait for the master node. If the master node is not available before the timeout expires, the request fails and returns an error. Defaults to 30s. Can also be set to -1 to indicate that the request should never time out.

timeout

(Optional, time units) Period to wait for a response from all relevant nodes in the cluster after updating the cluster metadata. If no response is received before the timeout expires, the cluster metadata update still applies but the response will indicate that it was not completely acknowledged. Defaults to 30s. Can also be set to -1 to indicate that the request should never time out.
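The query parameters above can be combined in a single request URL. The sketch below assembles one with the Python standard library; the function name memory_stats_url and the base URL http://localhost:9200 are assumptions for illustration, not values taken from this page.

```python
from typing import Optional
from urllib.parse import urlencode


def memory_stats_url(
    base_url: str,
    human: bool = False,
    master_timeout: Optional[str] = None,
    timeout: Optional[str] = None,
) -> str:
    """Assemble a memory stats URL with optional query parameters.

    Hypothetical helper: parameters left unset are omitted so that the
    server-side defaults (30s for both timeouts) apply.
    """
    params = {}
    if human:
        params["human"] = "true"
    if master_timeout is not None:
        params["master_timeout"] = master_timeout
    if timeout is not None:
        params["timeout"] = timeout
    url = base_url.rstrip("/") + "/_ml/memory/_stats"
    return url + ("?" + urlencode(params) if params else "")


# Assumed local address; -1 asks the request never to time out.
print(memory_stats_url("http://localhost:9200", human=True, master_timeout="-1"))
```
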

Response body

_nodes

(object) Contains statistics about the number of nodes selected by the request.

Properties of _nodes

  • failed

    (integer) Number of nodes that rejected the request or failed to respond. If this value is not 0, a reason for the rejection or failure is included in the response.

  • successful

    (integer) Number of nodes that responded successfully to the request.

  • total

    (integer) Total number of nodes selected by the request.

cluster_name

(string) Name of the cluster. Based on the cluster.name setting.

nodes

(object) Contains statistics for the nodes selected by the request.

Properties of nodes

  • <node_id>

    (object) Contains statistics for the node.

    Properties of <node_id>

    • attributes

      (object) Lists node attributes such as ml.machine_memory or ml.max_open_jobs settings.

    • ephemeral_id

      (string) The ephemeral ID of the node.

    • jvm

      (object) Contains Java Virtual Machine (JVM) statistics for the node.

      Properties of jvm

      • heap_max

        (byte value) Maximum amount of memory available for use by the heap.

      • heap_max_in_bytes

        (integer) Maximum amount of memory, in bytes, available for use by the heap.

      • java_inference

        (byte value) Amount of Java heap currently being used for caching inference models.

      • java_inference_in_bytes

        (integer) Amount of Java heap, in bytes, currently being used for caching inference models.

      • java_inference_max

        (byte value) Maximum amount of Java heap to be used for caching inference models.

      • java_inference_max_in_bytes

        (integer) Maximum amount of Java heap, in bytes, to be used for caching inference models.

    • mem

      (object) Contains statistics about memory usage for the node.

      Properties of mem

      • adjusted_total

        (byte value) If the amount of physical memory has been overridden using the es.total_memory_bytes system property then this reports the overridden value. Otherwise it reports the same value as total.

      • adjusted_total_in_bytes

        (integer) If the amount of physical memory has been overridden using the es.total_memory_bytes system property then this reports the overridden value in bytes. Otherwise it reports the same value as total_in_bytes.

      • ml

        (object) Contains statistics about machine learning use of native memory on the node.

        Properties of ml

        • anomaly_detectors

          (byte value) Amount of native memory set aside for anomaly detection jobs.

        • anomaly_detectors_in_bytes

          (integer) Amount of native memory, in bytes, set aside for anomaly detection jobs.

        • data_frame_analytics

          (byte value) Amount of native memory set aside for data frame analytics jobs.

        • data_frame_analytics_in_bytes

          (integer) Amount of native memory, in bytes, set aside for data frame analytics jobs.

        • max

          (byte value) Maximum amount of native memory (separate to the JVM heap) that may be used by machine learning native processes.

        • max_in_bytes

          (integer) Maximum amount of native memory (separate to the JVM heap), in bytes, that may be used by machine learning native processes.

        • native_code_overhead

          (byte value) Amount of native memory set aside for loading machine learning native code shared libraries.

        • native_code_overhead_in_bytes

          (integer) Amount of native memory, in bytes, set aside for loading machine learning native code shared libraries.

        • native_inference

          (byte value) Amount of native memory set aside for trained models that have a PyTorch model_type.

        • native_inference_in_bytes

          (integer) Amount of native memory, in bytes, set aside for trained models that have a PyTorch model_type.

      • total

        (byte value) Total amount of physical memory.

      • total_in_bytes

        (integer) Total amount of physical memory in bytes.

    • name

      (string) Human-readable identifier for the node. Based on the node.name setting.

    • roles

      (array of strings) Roles assigned to the node. See Node settings.

    • transport_address

      (string) The host and port where transport connections are accepted.
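To illustrate how the response fields fit together, the following sketch walks a parsed response and reports, for each node, how much native memory is set aside for machine learning and how much remains under the max limit. The function name summarize_ml_memory, the idea of summing the four "set aside" fields, and the non-zero byte values in the sample are assumptions made for illustration; the API itself does not return such a total.

```python
# The four mem.ml fields described above as memory "set aside".
RESERVED_KEYS = (
    "native_code_overhead",
    "anomaly_detectors",
    "data_frame_analytics",
    "native_inference",
)


def summarize_ml_memory(stats: dict) -> dict:
    """Summarize native ML memory per node (hypothetical helper)."""
    per_node = {}
    for node_id, node in stats["nodes"].items():
        ml = node["mem"]["ml"]
        # Assumption: treat the sum of the "set aside" fields as reserved memory.
        reserved = sum(ml[key + "_in_bytes"] for key in RESERVED_KEYS)
        per_node[node_id] = {
            "name": node["name"],
            "reserved_bytes": reserved,
            "max_bytes": ml["max_in_bytes"],
            "headroom_bytes": ml["max_in_bytes"] - reserved,
        }
    # A non-zero failed count means some selected nodes are missing from "nodes".
    return {"failed_nodes": stats["_nodes"]["failed"], "nodes": per_node}


# Trimmed response with made-up, non-zero usage values.
sample = {
    "_nodes": {"total": 1, "successful": 1, "failed": 0},
    "nodes": {
        "pQHNt5rXTTWNvUgOrdynKg": {
            "name": "node-0",
            "mem": {
                "ml": {
                    "max_in_bytes": 20615843020,
                    "native_code_overhead_in_bytes": 31457280,
                    "anomaly_detectors_in_bytes": 104857600,
                    "data_frame_analytics_in_bytes": 0,
                    "native_inference_in_bytes": 0,
                }
            },
        }
    },
}

summary = summarize_ml_memory(sample)
print(summary)
```

Reading only the _in_bytes fields keeps the helper independent of whether the human query parameter was set.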

Examples

Python:

  resp = client.ml.get_memory_stats(
      human=True,
  )
  print(resp)

Ruby:

  response = client.ml.get_memory_stats(
    human: true
  )
  puts response

JavaScript:

  const response = await client.ml.getMemoryStats({
    human: "true",
  });
  console.log(response);

Console:

  GET _ml/memory/_stats?human

This is a possible response:

  {
    "_nodes": {
      "total": 1,
      "successful": 1,
      "failed": 0
    },
    "cluster_name": "my_cluster",
    "nodes": {
      "pQHNt5rXTTWNvUgOrdynKg": {
        "name": "node-0",
        "ephemeral_id": "ITZ6WGZnSqqeT_unfit2SQ",
        "transport_address": "127.0.0.1:9300",
        "attributes": {
          "ml.machine_memory": "68719476736",
          "ml.max_jvm_size": "536870912"
        },
        "roles": [
          "data",
          "data_cold",
          "data_content",
          "data_frozen",
          "data_hot",
          "data_warm",
          "ingest",
          "master",
          "ml",
          "remote_cluster_client",
          "transform"
        ],
        "mem": {
          "total": "64gb",
          "total_in_bytes": 68719476736,
          "adjusted_total": "64gb",
          "adjusted_total_in_bytes": 68719476736,
          "ml": {
            "max": "19.1gb",
            "max_in_bytes": 20615843020,
            "native_code_overhead": "0b",
            "native_code_overhead_in_bytes": 0,
            "anomaly_detectors": "0b",
            "anomaly_detectors_in_bytes": 0,
            "data_frame_analytics": "0b",
            "data_frame_analytics_in_bytes": 0,
            "native_inference": "0b",
            "native_inference_in_bytes": 0
          }
        },
        "jvm": {
          "heap_max": "512mb",
          "heap_max_in_bytes": 536870912,
          "java_inference_max": "204.7mb",
          "java_inference_max_in_bytes": 214748364,
          "java_inference": "0b",
          "java_inference_in_bytes": 0
        }
      }
    }
  }