Mapping concepts across SQL and Elasticsearch
While SQL and Elasticsearch use different terms for the way data is organized (and have different semantics), their purpose is essentially the same.
Starting from the bottom, the concepts map roughly as follows:
SQL | Elasticsearch | Description |
---|---|---|
| column | field | In both cases, at the lowest level, data is stored in named entries, of a variety of data types, each containing one value. SQL calls such an entry a column, while Elasticsearch calls it a field. Note that in Elasticsearch a field can contain multiple values of the same type (essentially a list), while in SQL a column can contain exactly one value of said type. Elasticsearch SQL will do its best to preserve SQL semantics and, depending on the query, reject those that return fields with more than one value. |
| table | index | The target against which queries, whether in SQL or Elasticsearch, get executed. |
| schema | implicit | In RDBMS, |
| catalog or database | cluster instance | In SQL, |
| cluster | cluster (federated) | Traditionally in SQL, cluster refers to a single RDBMS instance which contains a number of catalogs or databases. While RDBMS tend to have only one running instance, on a single machine (not distributed), Elasticsearch goes the opposite way and, by default, is distributed and multi-instance. Furthermore, in Elasticsearch the term can mean either a single cluster (multiple Elasticsearch instances, typically distributed across machines, running within the same namespace) or multiple clusters (multiple clusters, each with its own namespace, connected to each other in a federated setup; see Cross-cluster search). |
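
The difference between a single-valued SQL column and a possibly multi-valued Elasticsearch field can be illustrated with a small sketch. The document contents and the helper `multi_valued_fields` below are hypothetical and only show the idea; this is not how Elasticsearch SQL implements its check.

```python
from typing import Any


def multi_valued_fields(document: dict[str, Any]) -> list[str]:
    """Return the names of fields that hold more than one value."""
    return [
        name
        for name, value in document.items()
        if isinstance(value, list) and len(value) > 1
    ]


# A hypothetical Elasticsearch document: "tags" holds a list of values,
# which has no direct equivalent in a single SQL column.
doc = {"title": "Dune", "page_count": 604, "tags": ["sci-fi", "classic"]}

offending = multi_valued_fields(doc)
print(offending)  # ['tags']
```

A SQL layer that wants to keep the one-value-per-column guarantee could use a check of this kind to decide whether a result row is representable or should be rejected.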
As one can see, while the mapping between the concepts is not exactly one-to-one and the semantics are somewhat different, the two have more in common than not. In fact, thanks to SQL's declarative nature, many concepts can move across to Elasticsearch transparently, and the terminology of the two is likely to be used interchangeably throughout the rest of the material.
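
As a rough illustration of how the terminology lines up, the sketch below builds two request bodies that ask a similar question: one for the SQL endpoint and one for the search endpoint. The index name `library` and the fields `author` and `page_count` are assumptions made for the example, and the two requests are only approximately equivalent (result format and ordering differ). Nothing is sent over the network.

```python
import json

# Hypothetical name: it plays the role of the table in SQL
# and of the index in Elasticsearch.
index_name = "library"

# Body for the SQL endpoint (POST /_sql): columns and a table.
sql_request = {
    "query": f"SELECT author FROM {index_name} WHERE page_count > 500"
}

# Body for the search endpoint: fields and an index.
search_path = f"/{index_name}/_search"
search_request = {
    "_source": ["author"],
    "query": {"range": {"page_count": {"gt": 500}}},
}

print("POST /_sql", json.dumps(sql_request))
print("POST", search_path, json.dumps(search_request))
```

The table name in the `FROM` clause and the index name in the search path are the same string, and the column names in the SQL text reappear as field names in the search body.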