Median absolute deviation aggregations

The median_absolute_deviation metric is a single-value metric aggregation that returns the median absolute deviation of a numeric field. Median absolute deviation is a statistical measure of data variability. Because it measures dispersion from the median rather than from the mean, it is a more robust measure of variability than mean-based measures such as standard deviation and is less affected by outliers in a dataset.

Median absolute deviation is calculated as follows:
median_absolute_deviation = median(|X_i - median(X)|)
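
The formula can be illustrated with a minimal Python sketch. This is an exact, in-memory calculation for intuition only; it is not how OpenSearch computes the value (the aggregation uses an approximate t-digest), and the function name and sample values are hypothetical.

```python
from statistics import median

def median_absolute_deviation(values):
    """Exact MAD: the median of the absolute deviations from the median."""
    center = median(values)
    return median(abs(x - center) for x in values)

# Hypothetical sample values; 100 is an outlier.
sample = [1, 2, 3, 4, 100]
print(median_absolute_deviation(sample))  # 1
```

Replacing the outlier 100 with 5 yields the same result of 1, which shows why the measure is considered robust to outliers.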

The following example calculates the median absolute deviation of the DistanceMiles field in the sample dataset opensearch_dashboards_sample_data_flights:

  GET opensearch_dashboards_sample_data_flights/_search
  {
    "size": 0,
    "aggs": {
      "median_absolute_deviation_DistanceMiles": {
        "median_absolute_deviation": {
          "field": "DistanceMiles"
        }
      }
    }
  }

Example response

  {
    "took": 35,
    "timed_out": false,
    "_shards": {
      "total": 1,
      "successful": 1,
      "skipped": 0,
      "failed": 0
    },
    "hits": {
      "total": {
        "value": 10000,
        "relation": "gte"
      },
      "max_score": null,
      "hits": []
    },
    "aggregations": {
      "median_absolute_deviation_DistanceMiles": {
        "value": 1829.8993624441966
      }
    }
  }

Missing

By default, if a field is missing or has a null value in a document, it is ignored during computation. However, you can specify a value to be used for those missing or null fields by using the missing parameter, as shown in the following request:

  GET opensearch_dashboards_sample_data_flights/_search
  {
    "size": 0,
    "aggs": {
      "median_absolute_deviation_distanceMiles": {
        "median_absolute_deviation": {
          "field": "DistanceMiles",
          "missing": 1000
        }
      }
    }
  }

Example response

  {
    "took": 7,
    "timed_out": false,
    "_shards": {
      "total": 1,
      "successful": 1,
      "skipped": 0,
      "failed": 0
    },
    "hits": {
      "total": {
        "value": 10000,
        "relation": "gte"
      },
      "max_score": null,
      "hits": []
    },
    "aggregations": {
      "median_absolute_deviation_distanceMiles": {
        "value": 1829.6443646143355
      }
    }
  }
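
The effect of the missing parameter can be sketched in plain Python. This is an illustration under stated assumptions, not the OpenSearch implementation: None stands in for a document in which the field is missing or null, the calculation is exact rather than t-digest based, and the function name and sample values are hypothetical.

```python
from statistics import median

def mad_with_missing(values, missing=None):
    """Compute an exact MAD over values, where None marks a missing field.

    If missing is None, documents without a value are ignored;
    otherwise each missing value is replaced by the given substitute.
    """
    if missing is None:
        present = [v for v in values if v is not None]
    else:
        present = [missing if v is None else v for v in values]
    center = median(present)
    return median(abs(v - center) for v in present)

# Hypothetical field values from five documents, two without the field.
docs = [10, None, 20, 30, None]
print(mad_with_missing(docs))       # 10 (missing documents ignored)
print(mad_with_missing(docs, 100))  # 20 (missing documents counted as 100)
```

A substitute value that lies far from the rest of the data shifts both the median and the deviations, so choose it with the field's typical range in mind.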

Compression

The median absolute deviation is calculated using the t-digest data structure, which trades performance against estimation accuracy through the compression parameter (default value: 1000). Lower compression values improve performance but may reduce estimation accuracy, while higher values improve accuracy at the cost of increased computational overhead. The following request sets compression to 10:

  GET opensearch_dashboards_sample_data_flights/_search
  {
    "size": 0,
    "aggs": {
      "median_absolute_deviation_DistanceMiles": {
        "median_absolute_deviation": {
          "field": "DistanceMiles",
          "compression": 10
        }
      }
    }
  }

Example response

  {
    "took": 1,
    "timed_out": false,
    "_shards": {
      "total": 1,
      "successful": 1,
      "skipped": 0,
      "failed": 0
    },
    "hits": {
      "total": {
        "value": 10000,
        "relation": "gte"
      },
      "max_score": null,
      "hits": []
    },
    "aggregations": {
      "median_absolute_deviation_DistanceMiles": {
        "value": 1836.265614211182
      }
    }
  }
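
To put the two estimates in perspective, the following short Python sketch compares the value returned with compression set to 10 against the value returned by the first example on this page, which used the default compression. Both numbers are copied from the example responses; the variable names are hypothetical, and the default-compression result is itself an estimate, so the difference is only a rough indicator of lost precision.

```python
# Values taken from the example responses on this page.
default_result = 1829.8993624441966    # default compression
low_compression_result = 1836.265614211182  # "compression": 10

# Relative difference between the two estimates.
relative_difference = abs(low_compression_result - default_result) / default_result
print(f"{relative_difference:.2%}")
```

For this dataset the two estimates differ by about 0.35%, while the reported took time in the example responses drops from 35 ms to 1 ms. The took values depend on caching and cluster load, so treat them as indicative only.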