Data Types
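Elasticsearch type | Elasticsearch SQL type | SQL type | SQL precision |
Core types | | | |
null | null | NULL | 0 |
boolean | boolean | BOOLEAN | 1 |
byte | byte | TINYINT | 3 |
short | short | SMALLINT | 5 |
integer | integer | INTEGER | 10 |
long | long | BIGINT | 19 |
double | double | DOUBLE | 15 |
float | float | REAL | 7 |
half_float | half_float | FLOAT | 3 |
scaled_float | scaled_float | DOUBLE | 15 |
keyword | keyword | VARCHAR | 32,766 |
text | text | VARCHAR | 2,147,483,647 |
binary | binary | VARBINARY | 2,147,483,647 |
date | datetime | TIMESTAMP | 29 |
ip | ip | VARCHAR | 39 |
Complex types | | | |
object | object | STRUCT | 0 |
nested | nested | STRUCT | 0 |
Unsupported types | | | |
types not mentioned above | unsupported | OTHER | 0 |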
Most Elasticsearch data types are available in Elasticsearch SQL, as indicated above. Each Elasticsearch data type maps to the Elasticsearch SQL type of the same name, with the exception of the date data type, which is mapped to datetime. This avoids confusion with the ANSI SQL types DATE (date only) and TIME (time only), which Elasticsearch SQL also supports in queries (through CAST/CONVERT) but which do not correspond to an actual mapping in Elasticsearch (see the table below).
Not all types in Elasticsearch have an equivalent in SQL, and vice versa. Where the two differ, Elasticsearch SQL follows the data type particularities of Elasticsearch rather than those of SQL, since Elasticsearch is ultimately the backing store.
In addition to the types above, Elasticsearch SQL supports, at runtime, SQL-specific types that have no equivalent in Elasticsearch. Such types cannot be loaded from Elasticsearch (as it does not know about them), but they can be used inside Elasticsearch SQL in queries or their results.
The table below indicates these types:
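SQL type | SQL precision |
date | 29 |
time | 18 |
interval_year | 7 |
interval_month | 7 |
interval_day | 23 |
interval_hour | 23 |
interval_minute | 23 |
interval_second | 23 |
interval_year_to_month | 7 |
interval_day_to_hour | 23 |
interval_day_to_minute | 23 |
interval_day_to_second | 23 |
interval_hour_to_minute | 23 |
interval_hour_to_second | 23 |
interval_minute_to_second | 23 |
geo_point | 52 |
geo_shape | 2,147,483,647 |
shape | 2,147,483,647 |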
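As a rough illustration of how runtime-only types such as DATE and TIME are reached from a query, the Python sketch below builds a JSON request body whose SQL statement casts a datetime column to both. The index name library, the column release_date, and the helper build_sql_request are hypothetical and chosen only for this example. The body assumes the usual shape of a SQL search request, with the statement under a query key, and nothing is sent to a cluster.

```python
import json


def build_sql_request(index: str, column: str) -> str:
    """Build a JSON body for a SQL request that casts a datetime column.

    `index` and `column` are placeholders supplied by the caller; they are
    not validated or escaped, so this is only suitable for illustration.
    """
    query = (
        f"SELECT CAST({column} AS DATE) AS day_only, "
        f"CAST({column} AS TIME) AS time_only "
        f"FROM {index}"
    )
    # The SQL statement is carried as a string under the "query" key.
    return json.dumps({"query": query})


# Hypothetical index and column names, used only for the example.
body = build_sql_request("library", "release_date")
print(body)
```

The stored column keeps its datetime type. DATE and TIME exist only in the query and its results, which is why they appear as casts rather than as mapping types.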
SQL and multi-fields
A core concept in Elasticsearch is that of an analyzed field, that is, a full-text value that is interpreted in order to be effectively indexed. These fields are of type text and are not used for sorting or aggregations, because their actual value depends on the analyzer used. For this reason Elasticsearch also offers the keyword type for storing the exact value.
In most cases, and by default, both types are used for strings. Elasticsearch supports this through multi-fields, that is, the ability to index the same string in multiple ways: for example, as text for search and also as keyword for sorting and aggregations.
As SQL requires exact values, when it encounters a text field Elasticsearch SQL searches for an exact multi-field that it can use for comparisons, sorting and aggregations. To do that, it looks for the first keyword sub-field that is not normalized and uses it as the exact value of the original field.
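The lookup described above can be sketched in a few lines of Python. This is a minimal illustration of the rule as stated here, not the actual Elasticsearch SQL implementation: the function name find_exact_subfield is hypothetical, the mapping is assumed to be a plain dictionary in the shape of an Elasticsearch field mapping, and "not normalized" is assumed to mean that the sub-field has no normalizer setting.

```python
from typing import Optional


def find_exact_subfield(field_name: str, field_mapping: dict) -> Optional[str]:
    """Return the path of the first keyword sub-field without a normalizer.

    Returns None when the field has no such sub-field, in which case no
    exact value is available for comparisons, sorting or aggregations.
    """
    # Sub-fields are visited in the order they appear in the mapping.
    for sub_name, sub_mapping in field_mapping.get("fields", {}).items():
        is_keyword = sub_mapping.get("type") == "keyword"
        is_normalized = "normalizer" in sub_mapping
        if is_keyword and not is_normalized:
            return f"{field_name}.{sub_name}"
    return None


# A text field with a normalized keyword sub-field followed by a plain one.
mapping = {
    "type": "text",
    "fields": {
        "lower": {"type": "keyword", "normalizer": "lowercase"},
        "raw": {"type": "keyword"},
    },
}
print(find_exact_subfield("first_name", mapping))
```

The normalized sub-field is skipped because its stored value may differ from the original string, so it cannot stand in for the exact value.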
Consider the following string mapping:
{
  "first_name": {
    "type": "text",
    "fields": {
      "raw": {
        "type": "keyword"
      }
    }
  }
}
The following SQL query:
SELECT first_name FROM index WHERE first_name = 'John'
is identical to:
SELECT first_name FROM index WHERE first_name.raw = 'John'
as Elasticsearch SQL automatically picks up the raw multi-field of first_name for exact matching.