Moving percentiles aggregation

Given an ordered series of percentiles, the moving percentiles aggregation slides a window across those percentiles and lets you compute the cumulative percentile.

This is conceptually very similar to the moving function pipeline aggregation, except that it works on the percentile sketches instead of the actual bucket values.

Syntax

A moving_percentiles aggregation looks like this in isolation:

    {
      "moving_percentiles": {
        "buckets_path": "the_percentile",
        "window": 10
      }
    }

Table 76. moving_percentiles Parameters

| Parameter Name | Description | Required | Default Value |
| --- | --- | --- | --- |
| buckets_path | Path to the percentile of interest (see buckets_path Syntax for more details) | Required | |
| window | The size of the window to “slide” across the histogram. | Required | |
| shift | Shift of the window position. | Optional | 0 |

moving_percentiles aggregations must be embedded inside of a histogram or date_histogram aggregation. They can be embedded like any other metric aggregation:

Python:

    # Assumes `client` is an already configured Elasticsearch client.
    resp = client.search(
        size=0,
        aggs={
            "my_date_histo": {
                "date_histogram": {
                    "field": "date",
                    "calendar_interval": "1M"
                },
                "aggs": {
                    "the_percentile": {
                        "percentiles": {
                            "field": "price",
                            "percents": [
                                1,
                                99
                            ]
                        }
                    },
                    "the_movperc": {
                        "moving_percentiles": {
                            "buckets_path": "the_percentile",
                            "window": 10
                        }
                    }
                }
            }
        },
    )
    print(resp)

Ruby:

    # Assumes `client` is an already configured Elasticsearch client.
    response = client.search(
      body: {
        size: 0,
        aggregations: {
          my_date_histo: {
            date_histogram: {
              field: 'date',
              calendar_interval: '1M'
            },
            aggregations: {
              the_percentile: {
                percentiles: {
                  field: 'price',
                  percents: [
                    1,
                    99
                  ]
                }
              },
              the_movperc: {
                moving_percentiles: {
                  buckets_path: 'the_percentile',
                  window: 10
                }
              }
            }
          }
        }
      }
    )
    puts response

JavaScript:

    // Assumes `client` is an already configured Elasticsearch client.
    const response = await client.search({
      size: 0,
      aggs: {
        my_date_histo: {
          date_histogram: {
            field: "date",
            calendar_interval: "1M",
          },
          aggs: {
            the_percentile: {
              percentiles: {
                field: "price",
                percents: [1, 99],
              },
            },
            the_movperc: {
              moving_percentiles: {
                buckets_path: "the_percentile",
                window: 10,
              },
            },
          },
        },
      },
    });
    console.log(response);

Console:

    POST /_search
    {
      "size": 0,
      "aggs": {
        "my_date_histo": {
          "date_histogram": {
            "field": "date",
            "calendar_interval": "1M"
          },
          "aggs": {
            "the_percentile": {
              "percentiles": {
                "field": "price",
                "percents": [ 1.0, 99.0 ]
              }
            },
            "the_movperc": {
              "moving_percentiles": {
                "buckets_path": "the_percentile",
                "window": 10
              }
            }
          }
        }
      }
    }

A date_histogram named “my_date_histo” is constructed on the “date” field, with one-month intervals.

A percentiles metric named “the_percentile” is used to calculate the 1st and 99th percentiles of the “price” field.

Finally, we specify a moving_percentiles aggregation named “the_movperc”, which uses the “the_percentile” sketch as its input.

Moving percentiles are built by first specifying a histogram or date_histogram over a field. You then add a percentiles metric inside that histogram. Finally, the moving_percentiles aggregation is embedded inside the same histogram, and its buckets_path parameter is used to “point” at the percentiles aggregation (see buckets_path Syntax for a description of the syntax for buckets_path).
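The three building steps described above can be sketched as a small Python helper that assembles the aggs body. The function name moving_percentiles_aggs and its parameters are hypothetical, introduced only for illustration; the aggregation names and fields mirror the example request.

```python
from typing import Any, Dict, List, Optional


def moving_percentiles_aggs(
    date_field: str,
    value_field: str,
    percents: List[float],
    window: int,
    calendar_interval: str = "1M",
    shift: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: date_histogram -> percentiles + moving_percentiles."""
    movperc: Dict[str, Any] = {
        # buckets_path points at the sibling percentiles aggregation by name.
        "buckets_path": "the_percentile",
        "window": window,
    }
    if shift is not None:
        # Optional parameter; when omitted, the documented default is 0.
        movperc["shift"] = shift
    return {
        "my_date_histo": {
            # Step 1: the enclosing histogram.
            "date_histogram": {
                "field": date_field,
                "calendar_interval": calendar_interval,
            },
            "aggs": {
                # Step 2: the percentiles metric inside the histogram.
                "the_percentile": {
                    "percentiles": {"field": value_field, "percents": percents}
                },
                # Step 3: the moving_percentiles aggregation next to it.
                "the_movperc": {"moving_percentiles": movperc},
            },
        }
    }


aggs = moving_percentiles_aggs("date", "price", [1, 99], window=10)
```

The resulting dictionary has the same shape as the aggs argument in the Python request above, so it could be passed as client.search(size=0, aggs=aggs).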

The response might look like this:

    {
      "took": 11,
      "timed_out": false,
      "_shards": ...,
      "hits": ...,
      "aggregations": {
        "my_date_histo": {
          "buckets": [
            {
              "key_as_string": "2015/01/01 00:00:00",
              "key": 1420070400000,
              "doc_count": 3,
              "the_percentile": {
                "values": {
                  "1.0": 151.0,
                  "99.0": 200.0
                }
              }
            },
            {
              "key_as_string": "2015/02/01 00:00:00",
              "key": 1422748800000,
              "doc_count": 2,
              "the_percentile": {
                "values": {
                  "1.0": 10.4,
                  "99.0": 49.6
                }
              },
              "the_movperc": {
                "values": {
                  "1.0": 151.0,
                  "99.0": 200.0
                }
              }
            },
            {
              "key_as_string": "2015/03/01 00:00:00",
              "key": 1425168000000,
              "doc_count": 2,
              "the_percentile": {
                "values": {
                  "1.0": 175.25,
                  "99.0": 199.75
                }
              },
              "the_movperc": {
                "values": {
                  "1.0": 11.6,
                  "99.0": 200.0
                }
              }
            }
          ]
        }
      }
    }

The output format of the moving_percentiles aggregation is inherited from the format of the referenced percentiles aggregation.
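As a minimal sketch of consuming such a response in Python, the snippet below walks the buckets and collects the moving percentile values. It assumes the response has been parsed into plain dictionaries; the moving_values function is a hypothetical name, and the data is a trimmed copy of the sample response. Note that the first bucket in the sample carries no the_movperc entry, so the code treats that key as optional.

```python
from typing import Any, Dict, List, Optional, Tuple

# Trimmed copy of the "aggregations" section of the sample response.
aggregations: Dict[str, Any] = {
    "my_date_histo": {
        "buckets": [
            {
                "key_as_string": "2015/01/01 00:00:00",
                "the_percentile": {"values": {"1.0": 151.0, "99.0": 200.0}},
            },
            {
                "key_as_string": "2015/02/01 00:00:00",
                "the_percentile": {"values": {"1.0": 10.4, "99.0": 49.6}},
                "the_movperc": {"values": {"1.0": 151.0, "99.0": 200.0}},
            },
            {
                "key_as_string": "2015/03/01 00:00:00",
                "the_percentile": {"values": {"1.0": 175.25, "99.0": 199.75}},
                "the_movperc": {"values": {"1.0": 11.6, "99.0": 200.0}},
            },
        ]
    }
}


def moving_values(
    aggs: Dict[str, Any],
    histo: str = "my_date_histo",
    name: str = "the_movperc",
) -> List[Tuple[str, Optional[Dict[str, float]]]]:
    """Return (bucket key, moving percentile values or None) per bucket."""
    rows = []
    for bucket in aggs[histo]["buckets"]:
        movperc = bucket.get(name)  # absent when the bucket has no result
        rows.append((bucket["key_as_string"], movperc["values"] if movperc else None))
    return rows


rows = moving_values(aggregations)
for key, values in rows:
    print(key, values)
```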

Moving percentiles pipeline aggregations always run with the skip gap policy.

shift parameter

By default (with shift = 0), the window offered for calculation is the last n values excluding the current bucket, where n is the window size. Increasing shift by 1 moves the starting window position 1 to the right.

  • To include the current bucket in the window, use shift = 1.
  • For center alignment (n / 2 values before and after the current bucket), use shift = window / 2.
  • For right alignment (n values after the current bucket), use shift = window.

If either edge of the window moves outside the borders of the data series, the window shrinks to include only the available values.
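The window placement rule can be illustrated with a short Python sketch. It is derived only from the wording above, not from the Elasticsearch source, and it assumes the window is a half-open range of bucket indices that starts at index - window + shift and is clamped to the data series. The window_bounds function and its parameter names are hypothetical.

```python
from typing import Tuple


def window_bounds(index: int, window: int, shift: int, n_buckets: int) -> Tuple[int, int]:
    """Return the half-open range [start, end) of bucket indices in the window.

    Assumption: with shift = 0 the window covers the `window` buckets that
    precede `index`, and each increment of shift moves both edges one bucket
    to the right.
    """

    def clamp(i: int) -> int:
        # Edges that fall outside the series are pulled back to its borders,
        # which shrinks the window to the available buckets.
        return max(0, min(i, n_buckets))

    start = clamp(index - window + shift)
    end = clamp(index + shift)
    return start, end


# Three buckets and window = 10, as in the sample request and response.
print(window_bounds(0, 10, 0, 3))  # (0, 0): empty window for the first bucket
print(window_bounds(2, 10, 0, 3))  # (0, 2): the two buckets before the third
print(window_bounds(2, 10, 1, 3))  # (0, 3): shift = 1 includes the current bucket
```

Under this assumption, the empty range for the first bucket is consistent with the sample response, where the first bucket has no the_movperc entry and the second bucket's moving values equal the first bucket's percentiles.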