Geo Centroid Aggregation
A metric aggregation that computes the weighted centroid from all coordinate values for geo fields.
Example:
PUT /museums
{
  "mappings": {
    "properties": {
      "location": {
        "type": "geo_point"
      }
    }
  }
}
POST /museums/_bulk?refresh
{"index":{"_id":1}}
{"location": "52.374081,4.912350", "city": "Amsterdam", "name": "NEMO Science Museum"}
{"index":{"_id":2}}
{"location": "52.369219,4.901618", "city": "Amsterdam", "name": "Museum Het Rembrandthuis"}
{"index":{"_id":3}}
{"location": "52.371667,4.914722", "city": "Amsterdam", "name": "Nederlands Scheepvaartmuseum"}
{"index":{"_id":4}}
{"location": "51.222900,4.405200", "city": "Antwerp", "name": "Letterenhuis"}
{"index":{"_id":5}}
{"location": "48.861111,2.336389", "city": "Paris", "name": "Musée du Louvre"}
{"index":{"_id":6}}
{"location": "48.860000,2.327000", "city": "Paris", "name": "Musée d'Orsay"}
POST /museums/_search?size=0
{
  "aggs": {
    "centroid": {
      "geo_centroid": {
        "field": "location"
      }
    }
  }
}
The above aggregation computes the centroid of the location field across all documents in the museums index.
The response for the above aggregation:
{
  ...
  "aggregations": {
    "centroid": {
      "location": {
        "lat": 51.00982965203002,
        "lon": 3.9662131341174245
      },
      "count": 6
    }
  }
}
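For point fields the centroid is the equally weighted average of the coordinates, so the response above can be sanity-checked offline. The following is a minimal sketch in plain Python, not Elasticsearch's implementation; the helper name `point_centroid` is hypothetical. Small differences in the trailing digits are to be expected, presumably because indexed geo_point values are stored at reduced precision.

```python
# (lat, lon) pairs copied from the bulk request above.
MUSEUMS = [
    (52.374081, 4.912350),  # NEMO Science Museum
    (52.369219, 4.901618),  # Museum Het Rembrandthuis
    (52.371667, 4.914722),  # Nederlands Scheepvaartmuseum
    (51.222900, 4.405200),  # Letterenhuis
    (48.861111, 2.336389),  # Musée du Louvre
    (48.860000, 2.327000),  # Musée d'Orsay
]


def point_centroid(points):
    """Return (lat, lon, count) as the equally weighted average of the points."""
    if not points:
        raise ValueError("no points to average")
    n = len(points)
    lat = sum(p[0] for p in points) / n
    lon = sum(p[1] for p in points) / n
    return lat, lon, n


lat, lon, count = point_centroid(MUSEUMS)
print(f"lat={lat:.6f} lon={lon:.6f} count={count}")
```

The printed values should agree with the `lat`, `lon` and `count` fields of the response to roughly six decimal places.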
The geo_centroid aggregation is more interesting when combined as a sub-aggregation to other bucket aggregations.
Example:
POST /museums/_search?size=0
{
  "aggs": {
    "cities": {
      "terms": { "field": "city.keyword" },
      "aggs": {
        "centroid": {
          "geo_centroid": { "field": "location" }
        }
      }
    }
  }
}
The above example uses geo_centroid as a sub-aggregation to a terms bucket aggregation for finding the central location for museums in each city.
The response for the above aggregation:
{
  ...
  "aggregations": {
    "cities": {
      "sum_other_doc_count": 0,
      "doc_count_error_upper_bound": 0,
      "buckets": [
        {
          "key": "Amsterdam",
          "doc_count": 3,
          "centroid": {
            "location": {
              "lat": 52.371655656024814,
              "lon": 4.909563297405839
            },
            "count": 3
          }
        },
        {
          "key": "Paris",
          "doc_count": 2,
          "centroid": {
            "location": {
              "lat": 48.86055548675358,
              "lon": 2.3316944623366
            },
            "count": 2
          }
        },
        {
          "key": "Antwerp",
          "doc_count": 1,
          "centroid": {
            "location": {
              "lat": 51.22289997059852,
              "lon": 4.40519998781383
            },
            "count": 1
          }
        }
      ]
    }
  }
}
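Conceptually, the terms bucket groups documents by city and the sub-aggregation then averages the points inside each group. The sketch below mimics that in plain Python to make the per-bucket numbers easier to follow; it is an illustration only, and the function name `centroids_by_city` is hypothetical.

```python
from collections import defaultdict

# (city, lat, lon) triples copied from the bulk request for the museums index.
DOCS = [
    ("Amsterdam", 52.374081, 4.912350),
    ("Amsterdam", 52.369219, 4.901618),
    ("Amsterdam", 52.371667, 4.914722),
    ("Antwerp", 51.222900, 4.405200),
    ("Paris", 48.861111, 2.336389),
    ("Paris", 48.860000, 2.327000),
]


def centroids_by_city(docs):
    """Group documents by city, then average lat and lon within each group."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # city -> [lat sum, lon sum, count]
    for city, lat, lon in docs:
        acc = sums[city]
        acc[0] += lat
        acc[1] += lon
        acc[2] += 1
    return {
        city: {"lat": s_lat / n, "lon": s_lon / n, "count": n}
        for city, (s_lat, s_lon, n) in sums.items()
    }


buckets = centroids_by_city(DOCS)
# List the largest buckets first, matching the order of the response above.
for city, centroid in sorted(buckets.items(), key=lambda kv: -kv[1]["count"]):
    print(city, centroid)
```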
Geo Centroid Aggregation on geo_shape fields
The centroid metric for geo-shapes is more nuanced than for points. The centroid of an aggregation bucket containing shapes is the centroid of the highest-dimensionality shape type in the bucket. For example, if a bucket contains both polygons and lines, the lines do not contribute to the centroid metric. The centroid is calculated differently for each type of shape. Envelopes and circles ingested via the Circle processor are treated as polygons.
Geometry Type | Centroid Calculation |
---|---|
[Multi]Point | Equally weighted average of all the coordinates |
[Multi]LineString | Weighted average of the centroids of all segments, where the weight of each segment is its length in degrees |
[Multi]Polygon | Weighted average of the centroids of all the triangles of a polygon, where the triangles are formed by every two consecutive vertices and the starting point. Holes have negative weights. Weights represent the area of the triangle in deg^2 |
GeometryCollection | The centroid of all the underlying geometries with the highest dimension. If the collection contains polygons together with lines and/or points, the lines and/or points are ignored. If it contains lines and points, the points are ignored |
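To make the line-string rule and the highest-dimension rule concrete, here is a minimal sketch in plain Python. It assumes planar arithmetic on (lon, lat) pairs in degrees, as the table describes, and it is not Elasticsearch's implementation; the names `point_centroid`, `line_centroid` and `bucket_centroid` are hypothetical.

```python
import math


def point_centroid(points):
    """Equally weighted average of (lon, lat) pairs."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def line_centroid(lines):
    """Average of segment midpoints, weighted by segment length in degrees."""
    sum_x = sum_y = total = 0.0
    for line in lines:
        for (x1, y1), (x2, y2) in zip(line, line[1:]):
            length = math.hypot(x2 - x1, y2 - y1)  # planar length in degrees
            sum_x += length * (x1 + x2) / 2
            sum_y += length * (y1 + y2) / 2
            total += length
    if total == 0:
        raise ValueError("lines have zero total length")
    return (sum_x / total, sum_y / total)


def bucket_centroid(points, lines):
    """Lines outrank points, so points are ignored when any line is present."""
    if lines:
        return line_centroid(lines)
    return point_centroid(points)


# An L-shaped line: 3 degrees east, then 1 degree north.
line = [(0.0, 0.0), (3.0, 0.0), (3.0, 1.0)]
print(bucket_centroid(points=[(10.0, 10.0)], lines=[line]))  # (1.875, 0.125)
```

The point at (10, 10) has no effect on the result because the bucket also contains a line, which has the higher dimension.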
Example:
PUT /places
{
  "mappings": {
    "properties": {
      "geometry": {
        "type": "geo_shape"
      }
    }
  }
}
POST /places/_bulk?refresh
{"index":{"_id":1}}
{"name": "NEMO Science Museum", "geometry": "POINT(4.912350 52.374081)" }
{"index":{"_id":2}}
{"name": "Sportpark De Weeren", "geometry": { "type": "Polygon", "coordinates": [ [ [ 4.965305328369141, 52.39347642069457 ], [ 4.966979026794433, 52.391721758934835 ], [ 4.969425201416015, 52.39238958618537 ], [ 4.967944622039794, 52.39420969150824 ], [ 4.965305328369141, 52.39347642069457 ] ] ] } }
POST /places/_search?size=0
{
  "aggs": {
    "centroid": {
      "geo_centroid": {
        "field": "geometry"
      }
    }
  }
}
The response for the above aggregation:
{
  ...
  "aggregations": {
    "centroid": {
      "location": {
        "lat": 52.39296147599816,
        "lon": 4.967404240742326
      },
      "count": 2
    }
  }
}
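The returned location is the centroid of the polygon alone: the NEMO point has a lower dimension and therefore does not contribute. The sketch below reproduces the value with the triangle-fan rule from the table, using planar arithmetic on (lon, lat) degrees. It is an illustrative approximation rather than Elasticsearch's code, and the function name `polygon_centroid` is hypothetical.

```python
# Outer ring of "Sportpark De Weeren" as (lon, lat) pairs, closed ring.
SPORTPARK = [
    (4.965305328369141, 52.39347642069457),
    (4.966979026794433, 52.391721758934835),
    (4.969425201416015, 52.39238958618537),
    (4.967944622039794, 52.39420969150824),
    (4.965305328369141, 52.39347642069457),
]


def polygon_centroid(rings):
    """Area-weighted centroid; rings[0] is the outer ring, the rest are holes."""
    sum_x = sum_y = total = 0.0
    for index, ring in enumerate(rings):
        x0, y0 = ring[0]
        ring_x = ring_y = ring_area = 0.0
        # Fan of triangles: the starting point plus every two consecutive vertices.
        for (x1, y1), (x2, y2) in zip(ring[1:], ring[2:]):
            area = ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2  # signed, deg^2
            ring_x += area * (x0 + x1 + x2) / 3
            ring_y += area * (y0 + y1 + y2) / 3
            ring_area += area
        # Make the ring's area positive regardless of winding, then negate holes.
        sign = 1.0 if ring_area >= 0 else -1.0
        if index > 0:
            sign = -sign
        sum_x += sign * ring_x
        sum_y += sign * ring_y
        total += sign * ring_area
    if total == 0:
        raise ValueError("polygon has zero area")
    return (sum_x / total, sum_y / total)


lon, lat = polygon_centroid([SPORTPARK])
print(f"lat={lat:.5f} lon={lon:.5f}")
```

The result should agree with the response above to about five decimal places.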
Using geo_centroid as a sub-aggregation of geohash_grid
The geohash_grid aggregation places documents, not individual geo-points, into buckets. If a document’s geo_point field contains multiple values, the document could be assigned to multiple buckets, even if one or more of its geo-points are outside the bucket boundaries.
If a geo_centroid sub-aggregation is also used, each centroid is calculated using all geo-points in a bucket, including those outside the bucket boundaries. This can result in centroids outside of bucket boundaries.
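The effect can be illustrated with a small simulation. The sketch below is plain Python and makes a loud simplification: it uses 1-degree grid cells as a stand-in for geohash cells, and the names `cell_of` and `grid_centroids` are hypothetical. A document lands in every cell that contains at least one of its points, and each cell's centroid is then averaged over all points of the documents in that cell.

```python
import math
from collections import defaultdict


def cell_of(lat, lon):
    """Simplified grid cell: the 1-degree square containing the point."""
    return (math.floor(lat), math.floor(lon))


def grid_centroids(docs):
    """Bucket documents by cell, then average ALL points of each bucketed document."""
    buckets = defaultdict(list)
    for doc_id, points in docs.items():
        # A document joins every cell that holds at least one of its points.
        for cell in {cell_of(lat, lon) for lat, lon in points}:
            buckets[cell].append(doc_id)
    result = {}
    for cell, doc_ids in buckets.items():
        pts = [p for doc_id in doc_ids for p in docs[doc_id]]
        lat = sum(p[0] for p in pts) / len(pts)
        lon = sum(p[1] for p in pts) / len(pts)
        result[cell] = (lat, lon)
    return result


# Document "a" has two (lat, lon) values that fall in different cells.
DOCS = {
    "a": [(52.5, 4.5), (48.5, 2.5)],
    "b": [(52.25, 4.75)],
}

for cell, centroid in sorted(grid_centroids(DOCS).items()):
    inside = cell_of(*centroid) == cell
    print(cell, centroid, "inside cell" if inside else "outside cell")
```

In this toy data both cells report a centroid that lies outside their own boundaries, because the second point of document "a" is counted in each of them.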