Term-level queries
Term-level queries search an index for documents that contain an exact search term. Documents returned by a term-level query are not sorted by their relevance scores.
When working with text data, use term-level queries only for fields mapped as keyword.
Term-level queries are not suited for searching analyzed text fields. The search term is not analyzed, so it must exactly match a term stored in the index. To search analyzed fields, use a full-text query.
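The following Python sketch illustrates why an exact term rarely matches an analyzed text field. It is a simplified model, not OpenSearch code: the analyze function is a crude, hypothetical stand-in for a standard analyzer (lowercase, then split on non-alphanumeric characters), and term_matches is an assumed helper that models the exact comparison a term-level query performs.

```python
import re

def analyze(text):
    """Crude stand-in for an analyzer: lowercase, then split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def term_matches(search_term, indexed_terms):
    """Model of a term-level lookup: the search term is compared as is, unanalyzed."""
    return search_term in indexed_terms

source_value = "KING HENRY IV"

keyword_terms = [source_value]       # keyword field: the whole value is one term
text_terms = analyze(source_value)   # text field: ['king', 'henry', 'iv']

print(term_matches("KING HENRY IV", keyword_terms))  # True
print(term_matches("KING HENRY IV", text_terms))     # False
print(term_matches("henry", text_terms))             # True
```

In this model, the keyword field keeps the original value as a single term, so only the exact string matches it, while the analyzed field contains only the lowercase tokens.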
Term-level query types
The following table lists all term-level query types.
Query type | Description |
---|---|
term | Searches for documents with an exact term in a specific field. |
terms | Searches for documents with one or more terms in a specific field. |
terms_set | Searches for documents that match a minimum number of terms in a specific field. |
ids | Searches for documents by document ID. |
range | Searches for documents with field values in a specific range. |
prefix | Searches for documents with terms that begin with a specific prefix. |
exists | Searches for documents with any indexed value in a specific field. |
fuzzy | Searches for documents with terms that are similar to the search term within the maximum allowed Levenshtein distance. The Levenshtein distance measures the number of one-character changes needed to change one term to another term. |
wildcard | Searches for documents with terms that match a wildcard pattern. |
regexp | Searches for documents with terms that match a regular expression. |
Term
Use the term query to search for an exact term in a field.
GET shakespeare/_search
{
"query": {
"term": {
"line_id": {
"value": "61809"
}
}
}
}
Terms
Use the terms query to search for multiple terms in the same field.
GET shakespeare/_search
{
"query": {
"terms": {
"line_id": [
"61809",
"61810"
]
}
}
}
The query returns documents that match any of the terms.
Terms set
With a terms set query, you can search for documents that match a minimum number of exact terms in a specified field. The terms_set query is similar to the terms query, but you can specify the minimum number of matching terms that are required to return a document. You can specify this number either in a field in the index or with a script.
As an example, consider an index that contains students with classes they have taken. When setting up the mapping for this index, you need to provide a numeric field that specifies the minimum number of matching terms that are required to return a document:
PUT students
{
"mappings": {
"properties": {
"name": {
"type": "keyword"
},
"classes": {
"type": "keyword"
},
"min_required": {
"type": "integer"
}
}
}
}
Next, index two documents that correspond to students:
PUT students/_doc/1
{
"name": "Mary Major",
"classes": [ "CS101", "CS102", "MATH101" ],
"min_required": 2
}
PUT students/_doc/2
{
"name": "John Doe",
"classes": [ "CS101", "MATH101", "ENG101" ],
"min_required": 2
}
Now search for students who have taken at least two of the following classes: CS101, CS102, MATH101:
GET students/_search
{
"query": {
"terms_set": {
"classes": {
"terms": [ "CS101", "CS102", "MATH101" ],
"minimum_should_match_field": "min_required"
}
}
}
}
Both students have taken at least two of the listed classes, so the response contains both students:
{
"took" : 44,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.4544616,
"hits" : [
{
"_index" : "students",
"_id" : "1",
"_score" : 1.4544616,
"_source" : {
"name" : "Mary Major",
"classes" : [
"CS101",
"CS102",
"MATH101"
],
"min_required" : 2
}
},
{
"_index" : "students",
"_id" : "2",
"_score" : 0.5013843,
"_source" : {
"name" : "John Doe",
"classes" : [
"CS101",
"MATH101",
"ENG101"
],
"min_required" : 2
}
}
]
}
}
To specify the minimum number of terms a document should match with a script, provide the script in the minimum_should_match_script field:
GET students/_search
{
"query": {
"terms_set": {
"classes": {
"terms": [ "CS101", "CS102", "MATH101" ],
"minimum_should_match_script": {
"source": "Math.min(params.num_terms, doc['min_required'].value)"
}
}
}
}
}
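The matching rule behind both forms of the terms_set query can be sketched in plain Python. This illustrates the counting logic only, not how OpenSearch evaluates the query. The helper name terms_set_matches is hypothetical, and the sample data mirrors the two student documents indexed earlier in this section.

```python
def terms_set_matches(doc, field, terms, required):
    """Return True if doc[field] contains at least `required` of the query terms."""
    matched = len(set(terms) & set(doc[field]))
    return matched >= required

students = [
    {"name": "Mary Major", "classes": ["CS101", "CS102", "MATH101"], "min_required": 2},
    {"name": "John Doe", "classes": ["CS101", "MATH101", "ENG101"], "min_required": 2},
]
terms = ["CS101", "CS102", "MATH101"]

# Like minimum_should_match_field: the threshold is read from each document.
by_field = [
    s["name"] for s in students
    if terms_set_matches(s, "classes", terms, s["min_required"])
]

# Like the script above: the smaller of the number of query terms (num_terms)
# and the value of the document's min_required field.
by_script = [
    s["name"] for s in students
    if terms_set_matches(s, "classes", terms, min(len(terms), s["min_required"]))
]

print(by_field)   # ['Mary Major', 'John Doe']
print(by_script)  # ['Mary Major', 'John Doe']
```

Mary Major matches three of the query terms and John Doe matches two, so both meet the threshold of 2 in either form.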
IDs
Use the ids query to search for one or more document ID values.
GET shakespeare/_search
{
"query": {
"ids": {
"values": [
34229,
91296
]
}
}
}
Range
You can search for a range of values in a field with the range query.
To search for documents where the line_id value is >= 10 and <= 20:
GET shakespeare/_search
{
"query": {
"range": {
"line_id": {
"gte": 10,
"lte": 20
}
}
}
}
Parameter | Behavior |
---|---|
gte | Greater than or equal to. |
gt | Greater than. |
lte | Less than or equal to. |
lt | Less than. |
In addition to the range query parameters, you can provide date formats or relation operators such as “contains” or “within.” To see the supported field types for range queries, see Range query optional parameters. To see all date formats, see Formats.
Assume that you have a products index and you want to find all the products that were added in the year 2019:
GET products/_search
{
"query": {
"range": {
"created": {
"gte": "2019/01/01",
"lte": "2019/12/31"
}
}
}
}
Specify relative dates by using date math.
To subtract 1 year and 1 day from the specified date, use the following query:
GET products/_search
{
"query": {
"range": {
"created": {
"gte": "2019/01/01||-1y-1d"
}
}
}
}
The first date that we specify is the anchor date, which is the starting point for the date math. Follow the anchor date with two pipe symbols (||) and then a math expression. For example, you could add one day (+1d) or subtract two weeks (-2w). The math expression is relative to the anchor date that you specify.
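To make the arithmetic concrete, the following Python sketch resolves an anchor date followed by a date math expression. It is a simplified illustration, not the OpenSearch parser: the function names resolve_date_math and shift_months are hypothetical, only the units y, M, w, and d are handled, rounding is not implemented, and the yyyy/MM/dd anchor format is assumed from the examples in this section.

```python
import calendar
import re
from datetime import datetime, timedelta

def shift_months(dt, months):
    """Move dt by whole months, clamping the day to the target month's length."""
    index = dt.year * 12 + (dt.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    day = min(dt.day, calendar.monthrange(year, month)[1])
    return dt.replace(year=year, month=month, day=day)

def resolve_date_math(expression, fmt="%Y/%m/%d"):
    """Resolve 'anchor||math', applying each +N<unit> or -N<unit> step left to right."""
    anchor, _, math = expression.partition("||")
    result = datetime.strptime(anchor, fmt)
    for sign, amount, unit in re.findall(r"([+-])(\d+)([yMwd])", math):
        n = int(amount) if sign == "+" else -int(amount)
        if unit == "y":
            result = shift_months(result, 12 * n)
        elif unit == "M":
            result = shift_months(result, n)
        elif unit == "w":
            result += timedelta(weeks=n)
        else:  # unit == "d"
            result += timedelta(days=n)
    return result

print(resolve_date_math("2019/01/01||-1y-1d").date())  # 2017-12-31
print(resolve_date_math("2019/01/01||+1d").date())     # 2019-01-02
print(resolve_date_math("2019/01/01||-2w").date())     # 2018-12-18
```

Under these assumptions, the expression 2019/01/01||-1y-1d from the preceding query resolves to December 31, 2017, so the range matches documents created on or after that date.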
You can also round a date by appending a forward slash followed by a date or time unit.
To find products added in the last year, with the start date rounded to the month:
GET products/_search
{
"query": {
"range": {
"created": {
"gte": "now-1y/M"
}
}
}
}
The keyword now refers to the current date and time.
Prefix
Use the prefix query to search for terms that begin with a specific prefix.
GET shakespeare/_search
{
"query": {
"prefix": {
"speaker": "KING"
}
}
}
Exists
Use the exists query to search for documents that contain a specific field.
GET shakespeare/_search
{
"query": {
"exists": {
"field": "speaker"
}
}
}
Fuzzy
A fuzzy query searches for documents with terms that are similar to the search term within the maximum allowed Levenshtein distance. The Levenshtein distance measures the number of one-character changes needed to change one term to another term. These changes include:
- Replacements: cat to bat
- Insertions: cat to cats
- Deletions: cat to at
- Transpositions: cat to act
A fuzzy query creates a list of all possible expansions of the search term that fall within the Levenshtein distance. You can specify the maximum number of such expansions in the max_expansions field. Then it searches for documents that match any of the expansions.
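The edit distance described above, including transpositions, can be computed with a short dynamic-programming routine. The following Python sketch uses the optimal string alignment variant of the Damerau-Levenshtein distance. It is an illustration of the metric only, under the assumption that this variant is close enough for these examples; the function name edit_distance is hypothetical, and the sketch does not show how OpenSearch generates expansions.

```python
def edit_distance(a, b):
    """Edits (replace, insert, delete, adjacent transposition) needed to turn a into b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # replacement (or match)
            )
            # Adjacent transposition, for example "ca" to "ac".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]

for source, target in [("cat", "bat"), ("cat", "cats"), ("cat", "at"), ("cat", "act")]:
    print(source, "->", target, edit_distance(source, target))  # each distance is 1
```

Each of the four example changes is a single edit, so every pair has a distance of 1.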
The following example query searches for the speaker HALET (misspelled HAMLET). The maximum edit distance is not specified, so the default AUTO edit distance is used:
GET shakespeare/_search
{
"query": {
"fuzzy": {
"speaker": {
"value": "HALET"
}
}
}
}
The response contains all documents where HAMLET is the speaker.
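As a rough guide to why HAMLET is found, the AUTO setting chooses the allowed edit distance from the length of the search term. The following Python sketch models that rule under the assumption that the default thresholds of 3 and 6 characters apply. The function name auto_fuzziness and its low and high parameters are hypothetical names used for illustration.

```python
def auto_fuzziness(term, low=3, high=6):
    """Allowed edit distance under AUTO, assuming the default thresholds 3 and 6."""
    length = len(term)
    if length < low:
        return 0  # very short terms must match exactly
    if length < high:
        return 1  # terms of 3 to 5 characters allow one edit
    return 2      # longer terms allow two edits

print(auto_fuzziness("HALET"))   # 1
print(auto_fuzziness("HAMLET"))  # 2
print(auto_fuzziness("at"))      # 0
```

HALET has five characters, so one edit is allowed, and HAMLET is exactly one insertion away from it.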
The following example query searches for the speaker HALET with advanced parameters:
GET shakespeare/_search
{
"query": {
"fuzzy": {
"speaker": {
"value": "HALET",
"fuzziness": "2",
"max_expansions": 40,
"prefix_length": 0,
"transpositions": true,
"rewrite": "constant_score"
}
}
}
}
Wildcard
Use wildcard queries to search for terms that match a wildcard pattern.
Wildcard character | Behavior |
---|---|
* | Matches zero or more characters. |
? | Matches any single character. |
To search for terms that start with H and end with Y:
GET shakespeare/_search
{
"query": {
"wildcard": {
"speaker": {
"value": "H*Y"
}
}
}
}
If we change * to ?, we get no matches, because ? matches exactly one character.
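The difference between the two wildcard characters can be checked with a small Python sketch that translates a wildcard pattern into an anchored regular expression. It is an illustration of the matching semantics only; the helper names wildcard_to_regex and wildcard_matches are hypothetical, and the sample terms are made up for the example.

```python
import re

def wildcard_to_regex(pattern):
    """Translate * (zero or more characters) and ? (exactly one character) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # every other character matches itself
    return re.compile("".join(parts), re.DOTALL)

def wildcard_matches(pattern, term):
    """The pattern must match the whole term, not just part of it."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None

print(wildcard_matches("H*Y", "HENRY"))  # True
print(wildcard_matches("H?Y", "HENRY"))  # False
print(wildcard_matches("H?Y", "HAY"))    # True
```

With ?, only three-character terms that start with H and end with Y can match, which is why replacing * with ? narrows the results so sharply.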
Wildcard queries tend to be slow because they need to iterate over a lot of terms. Avoid placing wildcard characters at the beginning of a query because it could be a very expensive operation in terms of both resources and time.
Regexp
Use the regexp query to search for terms that match a regular expression.
The following regular expression matches any single uppercase or lowercase letter followed by amlet:
GET shakespeare/_search
{
"query": {
"regexp": {
"play_name": "[a-zA-Z]amlet"
}
}
}
A few important notes:
- Regular expressions are applied to the terms in the field (i.e. tokens), not the entire field.
- Regular expressions use the Lucene syntax, which differs from more standardized implementations. Test thoroughly to ensure that you receive the results you expect. To learn more, see the Lucene documentation.
- regexp queries can be expensive operations and require the search.allow_expensive_queries setting to be set to true. Before making frequent regexp queries, test their impact on cluster performance and examine alternative queries for achieving similar results.
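The first note can be illustrated with a short Python sketch. It relies on Python's re module, which is not the Lucene regular expression engine, so it is only an approximation; this simple pattern happens to be valid in both syntaxes. The sketch assumes that a pattern must match an entire term, which fullmatch models here. The helper name regexp_matches and the sample values are hypothetical.

```python
import re

# Same pattern as in the query above: one letter followed by "amlet".
pattern = re.compile(r"[a-zA-Z]amlet")

def regexp_matches(term):
    """Model of per-term matching: the pattern must cover the whole term."""
    return pattern.fullmatch(term) is not None

print(regexp_matches("Hamlet"))   # True
print(regexp_matches("Hamlets"))  # False: extra trailing character
print(regexp_matches("amlet"))    # False: the leading letter is required

# The pattern is tested against each term, not against the whole field value.
field_value = "Prince Hamlet"
terms = field_value.split()
print(regexp_matches(field_value))               # False
print(any(regexp_matches(t) for t in terms))     # True
```

In this model, a field whose value is split into several terms matches if any single term matches the pattern, even though the full value does not.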