Analyze index disk usage API

This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

New API reference

For the most up-to-date API details, refer to Index APIs.

Analyzes the disk usage of each field of an index or data stream. This API might not support indices created in previous Elasticsearch versions. Results for a small index can be inaccurate because some parts of an index might not be analyzed by the API.


Request

POST /<target>/_disk_usage

Prerequisites

  • If the Elasticsearch security features are enabled, you must have the manage index privilege for the target index, data stream, or alias.

Path parameters

<target>

(Required, string) Comma-separated list of data streams, indices, and aliases used to limit the request. It's recommended to run this API against a single index (or the latest backing index of a data stream), because the API consumes significant resources.
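Since it's best to analyze a single index, one practical approach for a data stream is to pick its newest backing index first. A minimal sketch, assuming the response shape of the Get data stream API, where each data stream lists its backing indices in generation order with the current write index last:

```python
def latest_backing_index(data_stream_response, stream_name):
    """Return the name of the newest backing index of a data stream.

    Assumes the Get data stream API response shape, with backing
    indices listed in generation order (write index last).
    """
    for stream in data_stream_response["data_streams"]:
        if stream["name"] == stream_name:
            return stream["indices"][-1]["index_name"]
    raise KeyError(f"data stream {stream_name!r} not found")

# Example with a hypothetical response snippet:
sample = {
    "data_streams": [
        {
            "name": "my-data-stream",
            "indices": [
                {"index_name": ".ds-my-data-stream-2099.01.01-000001"},
                {"index_name": ".ds-my-data-stream-2099.01.02-000002"},
            ],
        }
    ]
}
print(latest_backing_index(sample, "my-data-stream"))
```

The returned name can then be used as the `<target>` of the `_disk_usage` request.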

Query parameters

allow_no_indices

(Optional, Boolean) If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

Defaults to true.
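The foo*,bar* behavior above can be sketched with a small resolver that mimics allow_no_indices=false. This is an illustrative stand-in for Elasticsearch's wildcard resolution, using shell-style pattern matching:

```python
from fnmatch import fnmatch

def resolve(expressions, open_indices, allow_no_indices=True):
    """Mimic wildcard resolution: with allow_no_indices=False, an
    expression that matches nothing raises an error, even if other
    expressions in the same request do match."""
    resolved = []
    for expr in expressions:
        matches = [name for name in open_indices if fnmatch(name, expr)]
        if not matches and not allow_no_indices:
            raise LookupError(f"no such index: {expr}")
        resolved.extend(matches)
    return resolved

indices = ["foo-1", "foo-2"]  # indices start with foo, none with bar
print(resolve(["foo*", "bar*"], indices))  # lenient default: ['foo-1', 'foo-2']
# resolve(["foo*", "bar*"], indices, allow_no_indices=False) would raise
```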

expand_wildcards

(Optional, string) Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are:

  • all

    Match any data stream or index, including hidden ones.

  • open

    Match open, non-hidden indices. Also matches any non-hidden data stream.

  • closed

    Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.

  • hidden

    Match hidden data streams and hidden indices. Must be combined with open, closed, or both.

  • none

    Wildcard patterns are not accepted.

Defaults to open.
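The rule that hidden must be combined with open, closed, or both can be captured in a small validator. This is an illustrative sketch, not Elasticsearch's actual parameter parsing:

```python
VALID = {"all", "open", "closed", "hidden", "none"}

def validate_expand_wildcards(value):
    """Validate a comma-separated expand_wildcards value, enforcing the
    documented rule that 'hidden' must be combined with 'open',
    'closed', or both."""
    parts = [p.strip() for p in value.split(",")]
    for part in parts:
        if part not in VALID:
            raise ValueError(f"invalid expand_wildcards value: {part!r}")
    if "hidden" in parts and not ({"open", "closed"} & set(parts)):
        raise ValueError("'hidden' must be combined with 'open' and/or 'closed'")
    return parts

print(validate_expand_wildcards("open,hidden"))  # ['open', 'hidden']
```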

flush

(Optional, Boolean) If true, the API performs a flush before analysis. If false, the response may not include uncommitted data. Defaults to true.

ignore_unavailable

(Optional, Boolean) If false, the request returns an error if it targets a missing or closed index. Defaults to false.

run_expensive_tasks

(Required, Boolean) Analyzing field disk usage is resource-intensive. To use the API, this parameter must be set to true. Defaults to false.

wait_for_active_shards

(Optional, string) The number of copies of each shard that must be active before proceeding with the operation. Set to all or any non-negative integer up to the total number of copies of each shard in the index (number_of_replicas+1). Defaults to 1, meaning to wait just for each primary shard to be active.

See Active shards.
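The valid range follows from each shard having number_of_replicas + 1 copies in total. A small sketch of that arithmetic (illustrative only, not the server's validation code):

```python
def is_valid_wait_for_active_shards(value, number_of_replicas):
    """Check a wait_for_active_shards value against the documented rule:
    'all' or a non-negative integer up to number_of_replicas + 1 (the
    total number of copies of each shard)."""
    if value == "all":
        return True
    try:
        n = int(value)
    except ValueError:
        return False
    return 0 <= n <= number_of_replicas + 1

# With 1 replica there are 2 copies of each shard:
print(is_valid_wait_for_active_shards("2", number_of_replicas=1))  # True
print(is_valid_wait_for_active_shards("3", number_of_replicas=1))  # False
```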

Examples

```python
resp = client.indices.disk_usage(
    index="my-index-000001",
    run_expensive_tasks=True,
)
print(resp)
```

```ruby
response = client.indices.disk_usage(
  index: 'my-index-000001',
  run_expensive_tasks: true
)
puts response
```

```js
const response = await client.indices.diskUsage({
  index: "my-index-000001",
  run_expensive_tasks: "true",
});
console.log(response);
```

```console
POST /my-index-000001/_disk_usage?run_expensive_tasks=true
```

The API returns:

```json
{
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "my-index-000001": {
    "store_size": "929mb",
    "store_size_in_bytes": 974192723,
    "all_fields": {
      "total": "928.9mb",
      "total_in_bytes": 973977084,
      "inverted_index": {
        "total": "107.8mb",
        "total_in_bytes": 113128526
      },
      "stored_fields": "623.5mb",
      "stored_fields_in_bytes": 653819143,
      "doc_values": "125.7mb",
      "doc_values_in_bytes": 131885142,
      "points": "59.9mb",
      "points_in_bytes": 62885773,
      "norms": "2.3kb",
      "norms_in_bytes": 2356,
      "term_vectors": "2.2kb",
      "term_vectors_in_bytes": 2310,
      "knn_vectors": "0b",
      "knn_vectors_in_bytes": 0
    },
    "fields": {
      "_id": {
        "total": "49.3mb",
        "total_in_bytes": 51709993,
        "inverted_index": {
          "total": "29.7mb",
          "total_in_bytes": 31172745
        },
        "stored_fields": "19.5mb",
        "stored_fields_in_bytes": 20537248,
        "doc_values": "0b",
        "doc_values_in_bytes": 0,
        "points": "0b",
        "points_in_bytes": 0,
        "norms": "0b",
        "norms_in_bytes": 0,
        "term_vectors": "0b",
        "term_vectors_in_bytes": 0,
        "knn_vectors": "0b",
        "knn_vectors_in_bytes": 0
      },
      "_primary_term": {...},
      "_seq_no": {...},
      "_version": {...},
      "_source": {
        "total": "603.9mb",
        "total_in_bytes": 633281895,
        "inverted_index": {...},
        "stored_fields": "603.9mb",
        "stored_fields_in_bytes": 633281895,
        "doc_values": "0b",
        "doc_values_in_bytes": 0,
        "points": "0b",
        "points_in_bytes": 0,
        "norms": "0b",
        "norms_in_bytes": 0,
        "term_vectors": "0b",
        "term_vectors_in_bytes": 0,
        "knn_vectors": "0b",
        "knn_vectors_in_bytes": 0
      },
      "context": {
        "total": "28.6mb",
        "total_in_bytes": 30060405,
        "inverted_index": {
          "total": "22mb",
          "total_in_bytes": 23090908
        },
        "stored_fields": "0b",
        "stored_fields_in_bytes": 0,
        "doc_values": "0b",
        "doc_values_in_bytes": 0,
        "points": "0b",
        "points_in_bytes": 0,
        "norms": "2.3kb",
        "norms_in_bytes": 2356,
        "term_vectors": "2.2kb",
        "term_vectors_in_bytes": 2310,
        "knn_vectors": "0b",
        "knn_vectors_in_bytes": 0
      },
      "context.keyword": {...},
      "message": {...},
      "message.keyword": {...}
    }
  }
}
```

Notes on the response:

  • store_size: The store size of only the analyzed shards of the index.

  • all_fields.total: The total size of the fields of the analyzed shards. This total is usually smaller than store_size because some small metadata files are ignored and some parts of data files might not be scanned by the API.

  • fields._id.stored_fields: The stored size of the _id field.

  • fields._source.stored_fields: The stored size of the _source field. Because stored fields are stored together in a compressed format, the sizes of stored fields are estimates and can be inaccurate. The stored size of the _id field is likely underestimated, while that of the _source field is likely overestimated.
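A common follow-up is to rank fields by their disk footprint. A minimal sketch over a trimmed copy of the per-field totals from the example response above (the client call is omitted; the byte counts come from the example):

```python
# Trimmed per-field totals taken from the example response above.
disk_usage = {
    "my-index-000001": {
        "fields": {
            "_id": {"total_in_bytes": 51709993},
            "_source": {"total_in_bytes": 633281895},
            "context": {"total_in_bytes": 30060405},
        }
    }
}

def largest_fields(response, index, top=3):
    """Return (field, total_in_bytes) pairs sorted largest first."""
    fields = response[index]["fields"]
    return sorted(
        ((name, stats["total_in_bytes"]) for name, stats in fields.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )[:top]

for name, size in largest_fields(disk_usage, "my-index-000001"):
    print(f"{name}: {size} bytes")
```

Here _source dominates, which is typical: it stores a compressed copy of every original document.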