Geoshape field type
The geo_shape data type facilitates the indexing of and searching with arbitrary geoshapes such as rectangles and polygons. It should be used when either the data being indexed or the queries being executed contain shapes other than just points.
You can query documents that use this type with a geo_shape query.
Elasticsearch encodes geo_shape values as BKD trees by default. To use BKD encoding, do not specify the following mapping options:
- distance_error_pct
- points_only
- precision
- strategy
- tree_levels
- tree
If you specify one or more of these options, the field will use prefix tree encoding instead. Prefix tree encoding is deprecated.
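The rule above can be sketched as a small check over a field mapping. This is only an illustration: the helper names are hypothetical, and the set of options is copied from the list above rather than taken from any Elasticsearch client API.

```python
# Mapping options that, per the list above, switch a geo_shape field
# from BKD encoding to the deprecated prefix tree encoding.
PREFIX_TREE_OPTIONS = {
    "distance_error_pct",
    "points_only",
    "precision",
    "strategy",
    "tree_levels",
    "tree",
}


def prefix_tree_options_used(field_mapping):
    """Return the deprecated options present in a field mapping dict, sorted."""
    return sorted(PREFIX_TREE_OPTIONS & field_mapping.keys())


def uses_bkd_encoding(field_mapping):
    """True if a geo_shape field mapping keeps the default BKD encoding."""
    return (
        field_mapping.get("type") == "geo_shape"
        and not prefix_tree_options_used(field_mapping)
    )
```

A check like this could run before a mapping is sent to the cluster, so that a deprecated option is noticed before it changes the encoding of the field.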
Mapping Options
The geo_shape mapping maps GeoJSON geometry objects to the geo_shape type. To enable it, users must explicitly map fields to the geo_shape type.
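Option | Description | Default |
---|---|---|
tree | [6.6] Deprecated in 6.6. PrefixTrees no longer used. Name of the PrefixTree implementation to be used: geohash for GeohashPrefixTree and quadtree for QuadPrefixTree. | quadtree |
precision | [6.6] Deprecated in 6.6. PrefixTrees no longer used. This parameter may be used instead of tree_levels to set an appropriate value for the tree_levels parameter. | 50m |
tree_levels | [6.6] Deprecated in 6.6. PrefixTrees no longer used. Maximum number of layers to be used by the PrefixTree. This can be used to control the precision of shape representations and therefore how many terms are indexed. Defaults to the default value of the chosen PrefixTree implementation. Since this parameter requires a certain level of understanding of the underlying implementation, users may use the precision parameter instead. | various |
strategy | [6.6] Deprecated in 6.6. PrefixTrees no longer used. The strategy parameter defines the approach for how to represent shapes at indexing and search time. It also influences the capabilities available, so it is recommended to let Elasticsearch set this parameter automatically. There are two strategies available: recursive and term. | recursive |
distance_error_pct | [6.6] Deprecated in 6.6. PrefixTrees no longer used. Used as a hint to the PrefixTree about how precise it should be. Defaults to 0.025 (2.5%) with 0.5 as the maximum supported value. PERFORMANCE NOTE: This value will default to 0 if a precision or tree_levels definition is explicitly defined. | 0.025 |
orientation | Optional. Default orientation for the field’s WKT polygons. This parameter sets and returns only a RIGHT (counterclockwise) or LEFT (clockwise) value. To set RIGHT, use right, counterclockwise, or ccw. To set LEFT, use left, clockwise, or cw. | RIGHT |
points_only | [6.6] Deprecated in 6.6. PrefixTrees no longer used. Setting this option to true configures the geo_shape field type for point shapes only. | false |
ignore_malformed | If true, malformed GeoJSON or WKT shapes are ignored. If false (default), malformed GeoJSON and WKT shapes throw an exception and reject the entire document. | false |
ignore_z_value | If true (default), three-dimensional points are accepted but only latitude and longitude values are indexed; the third dimension is ignored. If false, points with more than two dimensions throw an exception and reject the entire document. | true |
coerce | If true, unclosed linear rings in polygons are automatically closed. | false |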
Indexing approach
Geoshape types are indexed by decomposing the shape into a triangular mesh and indexing each triangle as a 7-dimensional point in a BKD tree. This provides near-perfect spatial resolution (down to 1e-7 decimal degree precision) since all spatial relations are computed using an encoded vector representation of the original shape instead of a raster-grid representation as used by the prefix trees indexing approach. Performance of the tessellator primarily depends on the number of vertices that define the polygon/multi-polygon. While this is the default indexing technique, prefix trees can still be used by setting the tree or strategy parameters according to the appropriate Mapping Options. Note that these parameters are now deprecated and will be removed in a future version.
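To make the idea of decomposing a shape into triangles concrete, here is a minimal sketch using fan triangulation. It is not the tessellator Elasticsearch uses: fan triangulation only works for convex polygons without holes, and the function name is hypothetical.

```python
def fan_triangulate(ring):
    """Split a closed, convex ring [[lon, lat], ...] into triangles.

    Every triangle shares the first vertex of the ring. This only works
    for convex polygons without holes; a real tessellator also handles
    concave shapes and holes.
    """
    vertices = ring[:-1]  # drop the closing point, which repeats the first
    if len(vertices) < 3:
        raise ValueError("a polygon needs at least three distinct vertices")
    anchor = vertices[0]
    return [
        (anchor, vertices[i], vertices[i + 1])
        for i in range(1, len(vertices) - 1)
    ]


square = [[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0]]
triangles = fan_triangulate(square)
print(len(triangles))  # a ring with n distinct vertices yields n - 2 triangles
```

The number of triangles grows with the number of vertices, which is consistent with the note above that tessellation cost depends mainly on vertex count.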
IMPORTANT NOTES
CONTAINS relation query: when using the new default vector indexing strategy, geo_shape queries with relation defined as contains are supported for indices created with Elasticsearch 7.5.0 or higher.
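As an illustration, the body of such a query can be assembled as a plain dictionary. This sketch assumes the request shape of the geo_shape query (a field name mapping to a shape and a relation); the helper name is hypothetical, and sending the request is left to whichever client you use.

```python
import json


def contains_query(field, shape):
    """Build a geo_shape query body that asks for documents whose `field`
    contains the given GeoJSON shape."""
    return {
        "query": {
            "geo_shape": {
                field: {
                    "shape": shape,
                    "relation": "contains",
                }
            }
        }
    }


body = contains_query("location", {"type": "Point", "coordinates": [100.5, 0.5]})
print(json.dumps(body, indent=2))
```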
Prefix trees
[6.6] Deprecated in 6.6. PrefixTrees no longer used. To efficiently represent shapes in an inverted index, shapes are converted into a series of hashes representing grid squares (commonly referred to as “rasters”) using implementations of a PrefixTree. The tree notion comes from the fact that the PrefixTree uses multiple grid layers, each with an increasing level of precision to represent the Earth. This can be thought of as increasing the level of detail of a map or image at higher zoom levels. Since this approach causes precision issues with indexed shapes, it has been deprecated in favor of a vector indexing approach that indexes the shapes as a triangular mesh (see Indexing approach).
Multiple PrefixTree implementations are provided:
- GeohashPrefixTree - Uses geohashes for grid squares. Geohashes are base32-encoded strings of the interleaved bits of the latitude and longitude, so the longer the hash, the more precise it is. Each character added to the geohash represents another tree level and adds 5 bits of precision to the geohash. A geohash represents a rectangular area and has 32 sub-rectangles. The maximum number of levels in Elasticsearch is 24; the default is 9.
- QuadPrefixTree - Uses a quadtree for grid squares. Similar to geohash, quad trees interleave the bits of the latitude and longitude; the resulting hash is a bit set. A tree level in a quad tree represents 2 bits in this bit set, one for each coordinate. The maximum number of levels for the quad trees in Elasticsearch is 29; the default is 21.
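The geohash scheme described above can be sketched in a few lines. This is the generic, publicly documented geohash algorithm rather than code taken from Elasticsearch; the function name and the `levels` parameter are illustrative.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, levels):
    """Encode a point as a geohash with `levels` characters (tree levels)."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < levels:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, 3))  # prints "u4p"
```

Because each extra character only refines the previous cell, a longer hash always starts with the shorter one, which is what makes the hashes usable as paths in a prefix tree.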
Spatial strategies
[6.6] Deprecated in 6.6. PrefixTrees no longer used. The indexing implementation selected relies on a SpatialStrategy for choosing how to decompose the shapes (either as grid squares or a tessellated triangular mesh). Each strategy answers the following:
- What type of Shapes can be indexed?
- What types of Query Operations and Shapes can be used?
- Does it support more than one Shape per field?
The following Strategy implementations (with corresponding capabilities) are provided:
Strategy | Supported Shapes | Supported Queries | Multiple Shapes |
---|---|---|---|
recursive | All | INTERSECTS, DISJOINT, WITHIN, CONTAINS | Yes |
term | Points | INTERSECTS | Yes |
Accuracy
Recursive and Term strategies do not provide 100% accuracy and, depending on how they are configured, they may return some false positives for INTERSECTS, WITHIN and CONTAINS queries, and some false negatives for DISJOINT queries. To mitigate this, it is important to select an appropriate value for the tree_levels parameter and to adjust expectations accordingly. For example, a point may be near the border of a particular grid cell and may thus not match a query that only matches the cell right next to it, even though the shape is very close to the point.
Example
PUT /example
{
"mappings": {
"properties": {
"location": {
"type": "geo_shape"
}
}
}
}
This mapping definition maps the location field to the geo_shape type using the default vector implementation. It provides approximately 1e-7 decimal degree precision.
Performance considerations with Prefix Trees
[6.6] Deprecated in 6.6. PrefixTrees no longer used. With prefix trees, Elasticsearch uses the paths in the tree as terms in the inverted index and in queries. The higher the level (and thus the precision), the more terms are generated. Of course, calculating the terms, keeping them in memory, and storing them on disk all have a price. Especially with higher tree levels, indices can become extremely large even with a modest amount of data. Additionally, the size of the features also matters. Big, complex polygons can take up a lot of space at higher tree levels. Which setting is right depends on the use case. Generally one trades off accuracy against index size and query performance.
The defaults in Elasticsearch for both implementations are a compromise between index size and a reasonable level of precision of 50m at the equator. This allows for indexing tens of millions of shapes without bloating the resulting index too much relative to the input size.
Geo-shape queries on geo-shapes implemented with PrefixTrees will not be executed if search.allow_expensive_queries is set to false.
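The growth in terms with tree level can be illustrated with simple arithmetic. The sketch below assumes an equatorial circumference of roughly 40,075 km and counts grid cells only; it does not reproduce how Elasticsearch derives tree_levels from a precision value, and the function names are hypothetical.

```python
EQUATOR_M = 40_075_017  # approximate equatorial circumference in meters


def quadtree_cells(levels):
    """Number of grid cells at a quadtree level: each level splits a cell in 4."""
    return 4 ** levels


def geohash_cells(levels):
    """Number of grid cells at a geohash level: each character adds 5 bits."""
    return 32 ** levels


def quadtree_cell_width_m(levels):
    """Approximate east-west width of one quadtree cell at the equator."""
    return EQUATOR_M / (2 ** levels)


for level in (5, 10, 15, 20):
    print(level, quadtree_cells(level), round(quadtree_cell_width_m(level), 1))
```

Each additional level halves the cell width but multiplies the number of possible cells by four (quadtree) or thirty-two (geohash), which is the trade-off between accuracy and index size described above.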
Input Structure
Shapes can be represented using either the GeoJSON or Well-Known Text (WKT) format. The following table provides a mapping of GeoJSON and WKT to Elasticsearch types:
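GeoJSON Type | WKT Type | Elasticsearch Type | Description |
---|---|---|---|
Point | POINT | point | A single geographic coordinate. Note: Elasticsearch uses WGS-84 coordinates only. |
LineString | LINESTRING | linestring | An arbitrary line given two or more points. |
Polygon | POLYGON | polygon | A closed polygon whose first and last point must match, thus requiring n + 1 vertices to create an n-sided polygon and a minimum of 4 vertices. |
MultiPoint | MULTIPOINT | multipoint | An array of unconnected, but likely related points. |
MultiLineString | MULTILINESTRING | multilinestring | An array of separate linestrings. |
MultiPolygon | MULTIPOLYGON | multipolygon | An array of separate polygons. |
GeometryCollection | GEOMETRYCOLLECTION | geometrycollection | A GeoJSON shape similar to the multi* shapes except that multiple types can coexist (e.g., a Point and a LineString). |
N/A | BBOX | envelope | A bounding rectangle, or envelope, specified by only the top left and bottom right points. |
N/A | N/A | circle | A circle specified by a center point and radius with units, which default to METERS. |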
For all types, both the inner type and coordinates fields are required.
In GeoJSON and WKT, and therefore Elasticsearch, the correct coordinate order is longitude, latitude (X, Y) within coordinate arrays. This differs from many Geospatial APIs (e.g., Google Maps) that generally use the colloquial latitude, longitude (Y, X).
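Because swapped coordinates are an easy mistake to make, a small helper can perform the reordering in one place. This is a minimal sketch with a hypothetical function name; it takes latitude and longitude in the colloquial order and emits a GeoJSON point in longitude, latitude order.

```python
def geojson_point(lat, lon):
    """Build a GeoJSON Point from latitude/longitude given in colloquial order."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    # GeoJSON positions are [longitude, latitude] (X, Y).
    return {"type": "Point", "coordinates": [lon, lat]}


print(geojson_point(38.897676, -77.03653))
```

The range checks catch only some swaps (a latitude above 90 or below -90); values that are valid in both positions still need to be checked by the caller.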
A point is a single geographic coordinate, such as the location of a building or the current position given by a smartphone’s Geolocation API. The following is an example of a point in GeoJSON.
POST /example/_doc
{
"location" : {
"type" : "Point",
"coordinates" : [-77.03653, 38.897676]
}
}
The following is an example of a point in WKT:
POST /example/_doc
{
"location" : "POINT (-77.03653 38.897676)"
}
A linestring is defined by an array of two or more positions. By specifying only two points, the linestring will represent a straight line. Specifying more than two points creates an arbitrary path. The following is an example of a linestring in GeoJSON.
POST /example/_doc
{
"location" : {
"type" : "LineString",
"coordinates" : [[-77.03653, 38.897676], [-77.009051, 38.889939]]
}
}
The following is an example of a linestring in WKT:
POST /example/_doc
{
"location" : "LINESTRING (-77.03653 38.897676, -77.009051 38.889939)"
}
The above linestring would draw a straight line from the White House to the US Capitol Building.
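The correspondence between the GeoJSON and WKT forms is mechanical, as the following sketch suggests. It covers only Point, LineString and Polygon, the function names are hypothetical, and it is not a full WKT writer.

```python
def _position(pos):
    """Format one [lon, lat] position as WKT 'lon lat'."""
    return f"{pos[0]} {pos[1]}"


def _sequence(positions):
    """Format a list of positions as a parenthesized WKT coordinate list."""
    return "(" + ", ".join(_position(p) for p in positions) + ")"


def geojson_to_wkt(shape):
    """Convert a GeoJSON Point, LineString or Polygon dict to WKT."""
    kind, coords = shape["type"], shape["coordinates"]
    if kind == "Point":
        return f"POINT ({_position(coords)})"
    if kind == "LineString":
        return f"LINESTRING {_sequence(coords)}"
    if kind == "Polygon":
        return "POLYGON (" + ", ".join(_sequence(r) for r in coords) + ")"
    raise ValueError(f"unsupported GeoJSON type: {kind}")


line = {
    "type": "LineString",
    "coordinates": [[-77.03653, 38.897676], [-77.009051, 38.889939]],
}
print(geojson_to_wkt(line))
```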
A polygon is defined by a list of lists of points. The first and last points in each (outer) list must be the same (the polygon must be closed). The following is an example of a polygon in GeoJSON.
POST /example/_doc
{
"location" : {
"type" : "Polygon",
"coordinates" : [
[ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0] ]
]
}
}
The following is an example of a polygon in WKT:
POST /example/_doc
{
"location" : "POLYGON ((100.0 0.0, 101.0 0.0, 101.0 1.0, 100.0 1.0, 100.0 0.0))"
}
The first array represents the outer boundary of the polygon; the other arrays represent the interior shapes (“holes”). The following is a GeoJSON example of a polygon with a hole:
POST /example/_doc
{
"location" : {
"type" : "Polygon",
"coordinates" : [
[ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0] ],
[ [100.2, 0.2], [100.8, 0.2], [100.8, 0.8], [100.2, 0.8], [100.2, 0.2] ]
]
}
}
The following is an example of a polygon with a hole in WKT:
POST /example/_doc
{
"location" : "POLYGON ((100.0 0.0, 101.0 0.0, 101.0 1.0, 100.0 1.0, 100.0 0.0), (100.2 0.2, 100.8 0.2, 100.8 0.8, 100.2 0.8, 100.2 0.2))"
}
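The closure requirement can be checked before indexing. The following is a minimal sketch with hypothetical function names; the four-position minimum comes from the GeoJSON definition of a linear ring, and the check does not look for self-intersections or holes that fall outside the outer ring.

```python
def ring_problems(ring):
    """Return a list of problems found in one polygon ring."""
    problems = []
    if len(ring) < 4:
        problems.append("a ring needs at least 4 positions")
    if ring and ring[0] != ring[-1]:
        problems.append("first and last positions differ (ring is not closed)")
    return problems


def polygon_problems(coordinates):
    """Check every ring of a GeoJSON Polygon 'coordinates' array.

    The first ring is the outer boundary; any further rings are holes.
    """
    problems = []
    for index, ring in enumerate(coordinates):
        label = "outer ring" if index == 0 else f"hole {index}"
        problems.extend(f"{label}: {p}" for p in ring_problems(ring))
    return problems


unclosed = [[[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0]]]
print(polygon_problems(unclosed))
```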
Polygon orientation
A polygon’s orientation indicates the order of its vertices: RIGHT (counterclockwise) or LEFT (clockwise). Elasticsearch uses a polygon’s orientation to determine if it crosses the international dateline (+/-180° longitude).
You can set a default orientation for WKT polygons using the orientation mapping parameter. This is because the WKT specification doesn’t specify or enforce a default orientation.
GeoJSON polygons use a default orientation of RIGHT, regardless of the orientation mapping parameter’s value. This is because the GeoJSON specification mandates that an outer polygon use a counterclockwise orientation and interior shapes use a clockwise orientation.
You can override the default orientation for GeoJSON polygons using the document-level orientation parameter. For example, the following indexing request specifies a document-level orientation of LEFT.
POST /example/_doc
{
"location" : {
"type" : "Polygon",
"orientation" : "LEFT",
"coordinates" : [
[ [-177.0, 10.0], [176.0, 15.0], [172.0, 0.0], [176.0, -15.0], [-177.0, -10.0], [-177.0, 10.0] ]
]
}
}
Elasticsearch only uses a polygon’s orientation to determine if it crosses the international dateline. If the difference between a polygon’s minimum longitude and the maximum longitude is less than 180°, the polygon doesn’t cross the dateline and its orientation has no effect.
If the difference between a polygon’s minimum longitude and the maximum longitude is 180° or greater, Elasticsearch checks whether the polygon’s document-level orientation differs from the default orientation. If the orientation differs, Elasticsearch considers the polygon to cross the international dateline and splits the polygon at the dateline.
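The two quantities involved, vertex order and longitude span, are easy to compute for a ring. The sketch below treats longitude and latitude as planar coordinates and uses the shoelace formula. It does not reproduce Elasticsearch's dateline decision; it only shows when, per the rule above, orientation is consulted at all. All function names are hypothetical.

```python
def signed_area(ring):
    """Shoelace formula on raw lon/lat values; positive means counterclockwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def winding(ring):
    """Return 'RIGHT' (counterclockwise) or 'LEFT' (clockwise) for a closed ring."""
    return "RIGHT" if signed_area(ring) > 0 else "LEFT"


def longitude_span(ring):
    """Difference between the ring's maximum and minimum longitude."""
    lons = [lon for lon, _ in ring]
    return max(lons) - min(lons)


def orientation_matters(ring):
    """Orientation is only consulted when the span is 180 degrees or more."""
    return longitude_span(ring) >= 180.0


ring = [[-177.0, 10.0], [176.0, 15.0], [172.0, 0.0],
        [176.0, -15.0], [-177.0, -10.0], [-177.0, 10.0]]
print(winding(ring), longitude_span(ring), orientation_matters(ring))
```

For the ring from the indexing request above, the span is 353°, so orientation is relevant; for the small square used in the earlier polygon examples the span is 1° and orientation has no effect.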
The following is an example of a list of GeoJSON points:
POST /example/_doc
{
"location" : {
"type" : "MultiPoint",
"coordinates" : [
[102.0, 2.0], [103.0, 2.0]
]
}
}
The following is an example of a list of WKT points:
POST /example/_doc
{
"location" : "MULTIPOINT (102.0 2.0, 103.0 2.0)"
}
The following is an example of a list of GeoJSON linestrings:
POST /example/_doc
{
"location" : {
"type" : "MultiLineString",
"coordinates" : [
[ [102.0, 2.0], [103.0, 2.0], [103.0, 3.0], [102.0, 3.0] ],
[ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0] ],
[ [100.2, 0.2], [100.8, 0.2], [100.8, 0.8], [100.2, 0.8] ]
]
}
}
The following is an example of a list of WKT linestrings:
POST /example/_doc
{
"location" : "MULTILINESTRING ((102.0 2.0, 103.0 2.0, 103.0 3.0, 102.0 3.0), (100.0 0.0, 101.0 0.0, 101.0 1.0, 100.0 1.0), (100.2 0.2, 100.8 0.2, 100.8 0.8, 100.2 0.8))"
}
The following is an example of a list of GeoJSON polygons (second polygon contains a hole):
POST /example/_doc
{
"location" : {
"type" : "MultiPolygon",
"coordinates" : [
[ [[102.0, 2.0], [103.0, 2.0], [103.0, 3.0], [102.0, 3.0], [102.0, 2.0]] ],
[ [[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0]],
[[100.2, 0.2], [100.8, 0.2], [100.8, 0.8], [100.2, 0.8], [100.2, 0.2]] ]
]
}
}
The following is an example of a list of WKT polygons (second polygon contains a hole):
POST /example/_doc
{
"location" : "MULTIPOLYGON (((102.0 2.0, 103.0 2.0, 103.0 3.0, 102.0 3.0, 102.0 2.0)), ((100.0 0.0, 101.0 0.0, 101.0 1.0, 100.0 1.0, 100.0 0.0), (100.2 0.2, 100.8 0.2, 100.8 0.8, 100.2 0.8, 100.2 0.2)))"
}
The following is an example of a collection of GeoJSON geometry objects:
POST /example/_doc
{
"location" : {
"type": "GeometryCollection",
"geometries": [
{
"type": "Point",
"coordinates": [100.0, 0.0]
},
{
"type": "LineString",
"coordinates": [ [101.0, 0.0], [102.0, 1.0] ]
}
]
}
}
The following is an example of a collection of WKT geometry objects:
POST /example/_doc
{
"location" : "GEOMETRYCOLLECTION (POINT (100.0 0.0), LINESTRING (101.0 0.0, 102.0 1.0))"
}
Envelope
Elasticsearch supports an envelope type, which consists of coordinates for the upper left and lower right points of the shape to represent a bounding rectangle in the format [[minLon, maxLat], [maxLon, minLat]]:
POST /example/_doc
{
"location" : {
"type" : "envelope",
"coordinates" : [ [100.0, 1.0], [101.0, 0.0] ]
}
}
The following is an example of an envelope using the WKT BBOX format:
NOTE: The WKT specification expects the following order: minLon, maxLon, maxLat, minLat.
POST /example/_doc
{
"location" : "BBOX (100.0, 102.0, 2.0, 0.0)"
}
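Since the two envelope notations order their values differently, converting between them is a common source of mistakes. The following sketch, with a hypothetical function name, rearranges envelope coordinates into the BBOX order given in the note above.

```python
def envelope_to_bbox_wkt(coordinates):
    """Convert envelope coordinates [[minLon, maxLat], [maxLon, minLat]]
    to a WKT BBOX string ordered minLon, maxLon, maxLat, minLat."""
    (min_lon, max_lat), (max_lon, min_lat) = coordinates
    if min_lat > max_lat:
        raise ValueError("the first point must be the upper left corner")
    return f"BBOX ({min_lon}, {max_lon}, {max_lat}, {min_lat})"


print(envelope_to_bbox_wkt([[100.0, 1.0], [101.0, 0.0]]))
# prints "BBOX (100.0, 101.0, 1.0, 0.0)"
```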
Circle
Elasticsearch supports a circle type, which consists of a center point with a radius.
You cannot index the circle type using the default BKD tree indexing approach. Instead, use a circle ingest processor to approximate the circle as a polygon.
The circle type requires a geo_shape field mapping with the deprecated recursive Prefix Tree strategy.
PUT /circle-example
{
"mappings": {
"properties": {
"location": {
"type": "geo_shape",
"strategy": "recursive"
}
}
}
}
The following request indexes a circle geo-shape.
POST /circle-example/_doc
{
"location" : {
"type" : "circle",
"coordinates" : [101.0, 1.0],
"radius" : "100m"
}
}
Note: The inner radius field is required. If no units are specified, the units of the radius default to METERS.
NOTE: Neither GeoJSON nor WKT supports a point-radius circle type.
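To show what approximating a circle as a polygon means, here is a rough sketch. It is not the algorithm of the circle ingest processor. It assumes a mean Earth radius of about 6,371 km and a local flat-Earth approximation, so it is only reasonable for small radii away from the poles and the dateline. The function name and the `sides` parameter are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; an assumption of this sketch


def circle_to_polygon(lon, lat, radius_m, sides=32):
    """Approximate a circle as a closed GeoJSON Polygon with `sides` edges."""
    if sides < 3:
        raise ValueError("a polygon needs at least 3 sides")
    # Angular radius in degrees of latitude, then stretched for longitude.
    dlat = math.degrees(radius_m / EARTH_RADIUS_M)
    dlon = dlat / math.cos(math.radians(lat))
    ring = []
    for i in range(sides):
        angle = 2 * math.pi * i / sides  # counterclockwise, starting due east
        ring.append([lon + dlon * math.cos(angle), lat + dlat * math.sin(angle)])
    ring.append(list(ring[0]))  # repeat the first position to close the ring
    return {"type": "Polygon", "coordinates": [ring]}


polygon = circle_to_polygon(101.0, 1.0, 100.0)
print(len(polygon["coordinates"][0]))  # sides + 1 positions
```

More sides give a closer fit to the circle at the cost of more vertices to index.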
Sorting and Retrieving index Shapes
Due to the complex input structure and index representation of shapes, it is not currently possible to sort shapes or retrieve their fields directly. The geo_shape value is only retrievable through the _source field.