Bucket sort aggregation
Bucket sort aggregation
A parent pipeline aggregation which sorts the buckets of its parent multi-bucket aggregation. Zero or more sort fields may be specified together with the corresponding sort order. Each bucket may be sorted based on its _key
, _count
or its sub-aggregations. In addition, parameters from
and size
may be set in order to truncate the result buckets.
The bucket_sort
aggregation, like all pipeline aggregations, is executed after all other non-pipeline aggregations. This means the sorting only applies to whatever buckets are already returned from the parent aggregation. For example, if the parent aggregation is terms
and its size
is set to 10
, the bucket_sort
will only sort over those 10 returned term buckets.
Syntax
A bucket_sort
aggregation looks like this in isolation:
{
"bucket_sort": {
"sort": [
{ "sort_field_1": { "order": "asc" } },
{ "sort_field_2": { "order": "desc" } },
"sort_field_3"
],
"from": 1,
"size": 3
}
}
Here, |
Table 51. bucket_sort
Parameters
Parameter Name | Description | Required | Default Value |
---|---|---|---|
| The list of fields to sort on. See sort for more details. | Optional | |
| Buckets in positions prior to the set value will be truncated. | Optional |
|
| The number of buckets to return. Defaults to all buckets of the parent aggregation. | Optional | |
| The policy to apply when gaps are found in the data (see Dealing with gaps in the data for more details) | Optional |
|
The following snippet returns the buckets corresponding to the 3 months with the highest total sales in descending order:
POST /sales/_search
{
"size": 0,
"aggs": {
"sales_per_month": {
"date_histogram": {
"field": "date",
"calendar_interval": "month"
},
"aggs": {
"total_sales": {
"sum": {
"field": "price"
}
},
"sales_bucket_sort": {
"bucket_sort": {
"sort": [
{ "total_sales": { "order": "desc" } }
],
"size": 3
}
}
}
}
}
}
| |
|
And the following may be the response:
{
"took": 82,
"timed_out": false,
"_shards": ...,
"hits": ...,
"aggregations": {
"sales_per_month": {
"buckets": [
{
"key_as_string": "2015/01/01 00:00:00",
"key": 1420070400000,
"doc_count": 3,
"total_sales": {
"value": 550.0
}
},
{
"key_as_string": "2015/03/01 00:00:00",
"key": 1425168000000,
"doc_count": 2,
"total_sales": {
"value": 375.0
}
},
{
"key_as_string": "2015/02/01 00:00:00",
"key": 1422748800000,
"doc_count": 2,
"total_sales": {
"value": 60.0
}
}
]
}
}
}
Truncating without sorting
It is also possible to use this aggregation in order to truncate the result buckets without doing any sorting. To do so, just use the from
and/or size
parameters without specifying sort
.
The following example simply truncates the result so that only the second bucket is returned:
POST /sales/_search
{
"size": 0,
"aggs": {
"sales_per_month": {
"date_histogram": {
"field": "date",
"calendar_interval": "month"
},
"aggs": {
"bucket_truncate": {
"bucket_sort": {
"from": 1,
"size": 1
}
}
}
}
}
}
Response:
{
"took": 11,
"timed_out": false,
"_shards": ...,
"hits": ...,
"aggregations": {
"sales_per_month": {
"buckets": [
{
"key_as_string": "2015/02/01 00:00:00",
"key": 1422748800000,
"doc_count": 2
}
]
}
}
}