Full-text queries

This page lists all full-text query types and common options. Given the sheer number of options and subtle behaviors, the best method of ensuring useful search results is to test different queries against representative indices and verify the output.



Common terms queries and the optional query field cutoff_frequency are now deprecated.

Match

Creates a boolean query that returns results if the search term is present in the field.

The most basic form of the query provides only a field (title) and a term (wind):

    GET _search
    {
      "query": {
        "match": {
          "title": "wind"
        }
      }
    }

For an example that uses curl, try:

    curl --insecure -XGET -u 'admin:admin' 'https://<host>:<port>/<index>/_search' \
      -H "content-type: application/json" \
      -d '{
        "query": {
          "match": {
            "title": "wind"
          }
        }
      }'

The query accepts the following options. For descriptions of each, see Options.

    GET _search
    {
      "query": {
        "match": {
          "title": {
            "query": "wind",
            "fuzziness": "AUTO",
            "fuzzy_transpositions": true,
            "operator": "or",
            "minimum_should_match": 1,
            "analyzer": "standard",
            "zero_terms_query": "none",
            "lenient": false,
            "prefix_length": 0,
            "max_expansions": 50,
            "boost": 1
          }
        }
      }
    }

Multi match

Similar to match, but searches multiple fields.

The ^ lets you "boost" certain fields. Boosts are multipliers that weigh matches in one field more heavily than matches in other fields. In the following example, a match for "wind" in the title field influences _score four times as much as a match in the plot field. The result is that films like The Wind Rises and Gone with the Wind are near the top of the search results, and films like Twister and Sharknado, which presumably have "wind" in their plot summaries, are near the bottom.

    GET _search
    {
      "query": {
        "multi_match": {
          "query": "wind",
          "fields": ["title^4", "plot"]
        }
      }
    }
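
If you assemble the request body in application code, the boost suffix can be generated from a mapping of field names to multipliers. The following is a minimal Python sketch under that assumption; the helper name build_multi_match is hypothetical and is not part of any OpenSearch client.

```python
import json


def build_multi_match(query, boosts):
    """Build a multi_match request body.

    `boosts` maps a field name to its multiplier (hypothetical helper,
    not an OpenSearch API).
    """
    fields = []
    for name, boost in boosts.items():
        # A boost of 1 is the default weight, so the ^ suffix is omitted.
        fields.append(name if boost == 1 else f"{name}^{boost:g}")
    return {"query": {"multi_match": {"query": query, "fields": fields}}}


body = build_multi_match("wind", {"title": 4, "plot": 1})
print(json.dumps(body, indent=2))
```

The printed body has the same shape as the request shown above, with "fields" set to ["title^4", "plot"].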

The query accepts the following options. For descriptions of each, see Options.

    GET _search
    {
      "query": {
        "multi_match": {
          "query": "wind",
          "fields": ["title^4", "description"],
          "type": "most_fields",
          "operator": "and",
          "minimum_should_match": 3,
          "tie_breaker": 0.0,
          "analyzer": "standard",
          "boost": 1,
          "fuzziness": "AUTO",
          "fuzzy_transpositions": true,
          "lenient": false,
          "prefix_length": 0,
          "max_expansions": 50,
          "auto_generate_synonyms_phrase_query": true,
          "zero_terms_query": "none"
        }
      }
    }

Match boolean prefix

Similar to match, but creates a prefix query out of the last term in the query string.

    GET _search
    {
      "query": {
        "match_bool_prefix": {
          "title": "rises wi"
        }
      }
    }

The query accepts the following options. For descriptions of each, see Options.

    GET _search
    {
      "query": {
        "match_bool_prefix": {
          "title": {
            "query": "rises wi",
            "fuzziness": "AUTO",
            "fuzzy_transpositions": true,
            "max_expansions": 50,
            "prefix_length": 0,
            "operator": "or",
            "minimum_should_match": 2,
            "analyzer": "standard"
          }
        }
      }
    }

Match phrase

Creates a phrase query that matches a sequence of terms.

    GET _search
    {
      "query": {
        "match_phrase": {
          "title": "the wind rises"
        }
      }
    }

The query accepts the following options. For descriptions of each, see Options.

    GET _search
    {
      "query": {
        "match_phrase": {
          "title": {
            "query": "wind rises the",
            "slop": 3,
            "analyzer": "standard",
            "zero_terms_query": "none"
          }
        }
      }
    }

Match phrase prefix

Similar to match phrase, but creates a prefix query out of the last term in the query string.

    GET _search
    {
      "query": {
        "match_phrase_prefix": {
          "title": "the wind ri"
        }
      }
    }

The query accepts the following options. For descriptions of each, see Options.

    GET _search
    {
      "query": {
        "match_phrase_prefix": {
          "title": {
            "query": "the wind ri",
            "analyzer": "standard",
            "max_expansions": 50,
            "slop": 3
          }
        }
      }
    }

Query string

The query string query splits the text at its operators and analyzes each resulting part individually.

If you search using HTTP request parameters (for example, _search?q=wind), OpenSearch creates a query string query.

    GET _search
    {
      "query": {
        "query_string": {
          "query": "the wind AND (rises OR rising)"
        }
      }
    }
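
When the same query is sent through the q request parameter instead of a request body, spaces and parentheses have to be percent-encoded. The following Python sketch builds such a URI with the standard library; the helper name search_uri is hypothetical, and it assumes the server decodes + as a space, which is the usual behavior for URL query strings.

```python
from urllib.parse import urlencode


def search_uri(query_string, index=None):
    """Return a _search URI that carries the query in the q parameter."""
    path = f"/{index}/_search" if index else "/_search"
    # urlencode percent-encodes reserved characters and turns spaces into +.
    params = urlencode({"q": query_string})
    return f"{path}?{params}"


print(search_uri("wind"))
# /_search?q=wind
print(search_uri("the wind AND (rises OR rising)"))
# /_search?q=the+wind+AND+%28rises+OR+rising%29
```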

The query accepts the following options. For descriptions of each, see Options.

    GET _search
    {
      "query": {
        "query_string": {
          "query": "the wind AND (rises OR rising)",
          "default_field": "title",
          "type": "best_fields",
          "fuzziness": "AUTO",
          "fuzzy_transpositions": true,
          "fuzzy_max_expansions": 50,
          "fuzzy_prefix_length": 0,
          "minimum_should_match": 1,
          "default_operator": "or",
          "analyzer": "standard",
          "lenient": false,
          "boost": 1,
          "allow_leading_wildcard": true,
          "enable_position_increments": true,
          "phrase_slop": 3,
          "max_determinized_states": 10000,
          "time_zone": "-08:00",
          "quote_field_suffix": "",
          "quote_analyzer": "standard",
          "analyze_wildcard": false,
          "auto_generate_synonyms_phrase_query": true
        }
      }
    }

Simple query string

The simple query string query is like the query string query, but it lets advanced users specify many arguments directly in the query string. The query discards any invalid portions of the query string.

    GET _search
    {
      "query": {
        "simple_query_string": {
          "query": "\"rises wind the\"~4 | *ising~2",
          "fields": ["title"]
        }
      }
    }

| Special character | Behavior |
| --- | --- |
| + | Acts as the and operator. |
| \| | Acts as the or operator. |
| * | Acts as a wildcard. |
| "" | Wraps several terms into a phrase. |
| () | Wraps a clause for precedence. |
| ~n | When used after a term (e.g. wnid~3), sets fuzziness. When used after a phrase, sets slop. See Options. |
| - | Negates the term. |
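
The backslashes in the example query come from JSON string escaping, not from the simple query string syntax itself. As an illustration, the following Python sketch composes the same query from the special characters listed above and lets json.dumps add the escapes. The helper names phrase and fuzzy are hypothetical.

```python
import json


def phrase(text, slop=None):
    """Wrap terms in double quotes; ~n after a phrase sets slop."""
    quoted = f'"{text}"'
    return quoted if slop is None else f"{quoted}~{slop}"


def fuzzy(term, distance):
    """~n after a single term sets fuzziness."""
    return f"{term}~{distance}"


# The | character acts as the or operator between the two clauses.
query = " | ".join([phrase("rises wind the", slop=4), fuzzy("*ising", 2)])
body = {"query": {"simple_query_string": {"query": query, "fields": ["title"]}}}

print(query)
# "rises wind the"~4 | *ising~2
print(json.dumps(body))
```

The serialized body contains \"rises wind the\"~4, matching the hand-written request.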

The query accepts the following options. For descriptions of each, see Options.

    GET _search
    {
      "query": {
        "simple_query_string": {
          "query": "\"rises wind the\"~4 | *ising~2",
          "fields": ["title"],
          "flags": "ALL",
          "fuzzy_transpositions": true,
          "fuzzy_max_expansions": 50,
          "fuzzy_prefix_length": 0,
          "minimum_should_match": 1,
          "default_operator": "or",
          "analyzer": "standard",
          "lenient": false,
          "quote_field_suffix": "",
          "analyze_wildcard": false,
          "auto_generate_synonyms_phrase_query": true
        }
      }
    }

Match all

Matches all documents. Can be useful for testing.

    GET _search
    {
      "query": {
        "match_all": {}
      }
    }

Match none

Matches no documents. Rarely useful.

    GET _search
    {
      "query": {
        "match_none": {}
      }
    }

Convert text with analyzers

OpenSearch provides the analyzer option to convert your structured text into the format that works best for your searches. You can use the following options with the analyzer field: standard, simple, whitespace, stop, keyword, pattern, fingerprint, and language. Different analyzers have different character filters, tokenizers, and token filters. The stop analyzer, for example, removes stop words (e.g., “an,” “but,” “this”) from the query string.

OpenSearch supports the following language values with the analyzer option: arabic, armenian, basque, bengali, brazilian, bulgarian, catalan, czech, danish, dutch, english, estonian, finnish, french, galician, german, greek, hindi, hungarian, indonesian, irish, italian, latvian, lithuanian, norwegian, persian, portuguese, romanian, russian, sorani, spanish, swedish, turkish, and thai.

To use an analyzer when you map an index, specify its value in the analyzer field of your request. For example, to map your index with the French language analyzer, specify the french value for the analyzer field:

    "analyzer": "french"

Sample Request

The following request creates an index in which the text field has a subfield, text.french, that uses the french language analyzer:

    PUT my-index-000001
    {
      "mappings": {
        "properties": {
          "text": {
            "type": "text",
            "fields": {
              "french": {
                "type": "text",
                "analyzer": "french"
              }
            }
          }
        }
      }
    }

Optional query fields

You can filter your query results by using some of the optional query fields, such as wildcards, analyzers, fuzzy query fields, and synonyms.

Use wildcards

| Option | Valid values | Description |
| --- | --- | --- |
| allow_leading_wildcard | Boolean | Whether * and ? are allowed as the first character of a search term. The default is true. |
| analyze_wildcard | Boolean | Whether OpenSearch should attempt to analyze wildcard terms. Some analyzers do a poor job at this task, so the default is false. |

Use built-in analyzers

| Option | Valid values | Description |
| --- | --- | --- |
| analyzer | standard, simple, whitespace, stop, keyword, pattern, language, fingerprint | The analyzer you want to use for the query. Different analyzers have different character filters, tokenizers, and token filters. The stop analyzer, for example, removes stop words (e.g., "an," "but," "this") from the query string. For a full list of acceptable language values, see Convert text with analyzers on this page. |
| quote_analyzer | String | This option lets you choose to use the standard analyzer without any options, such as language or other analyzers. Usage is "quote_analyzer": "standard". |
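
To make the stop-word example concrete, the following Python sketch imitates what a stop analyzer does to a query string. It is a toy model and not OpenSearch's implementation: it assumes whitespace tokenization, lowercasing, and a four-word stop list chosen for illustration, whereas the real analyzer uses a longer list.

```python
# Illustrative subset only; the built-in stop list is longer.
STOP_WORDS = {"an", "but", "this", "the"}


def toy_stop_analyzer(text):
    """Lowercase the text, split on whitespace, and drop stop words."""
    return [token for token in text.lower().split() if token not in STOP_WORDS]


print(toy_stop_analyzer("The wind rises"))
# ['wind', 'rises']
print(toy_stop_analyzer("an but this"))
# []
```

The second call leaves no terms at all, which is the situation that the zero_terms_query option covers.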

Run fuzzy queries

| Option | Valid values | Description |
| --- | --- | --- |
| fuzziness | AUTO, 0, or a positive integer | The number of character edits (insert, delete, substitute) that it takes to change one word to another when determining whether a term matched a value. For example, the distance between wined and wind is 1. The default, AUTO, chooses a value based on the length of each term and is a good choice for most use cases. |
| fuzzy_transpositions | Boolean | Setting fuzzy_transpositions to true (default) adds swaps of adjacent characters to the insert, delete, and substitute operations of the fuzziness option. For example, the distance between wind and wnid is 1 if fuzzy_transpositions is true (swap "n" and "i") and 2 if it is false (delete "n", insert "n"). If fuzzy_transpositions is false, rewind and wnid have the same distance (2) from wind, despite the more human-centric opinion that wnid is an obvious typo. The default is a good choice for most use cases. |
| fuzzy_max_expansions | Positive integer | Fuzzy queries "expand to" a number of matching terms that are within the distance specified in fuzziness. Then OpenSearch tries to match those terms against its indexes. |
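
The distances quoted for fuzziness and fuzzy_transpositions can be reproduced with a short dynamic-programming routine. The Python sketch below computes a plain edit distance and, optionally, counts a swap of two adjacent characters as a single edit (the optimal string alignment variant). It only illustrates the arithmetic and is not the code OpenSearch or Lucene runs.

```python
def edit_distance(a, b, transpositions=True):
    """Edit distance with insert, delete, substitute, and optional adjacent swaps."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # delete
                d[i][j - 1] + 1,         # insert
                d[i - 1][j - 1] + cost,  # substitute or match
            )
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                # Swap of two adjacent characters counts as one edit.
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(edit_distance("wined", "wind"))                       # 1
print(edit_distance("wind", "wnid"))                        # 1
print(edit_distance("wind", "wnid", transpositions=False))  # 2
print(edit_distance("rewind", "wind"))                      # 2
```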

Use synonyms with a query

You can also run multi-term queries that allow for generating synonyms. Use the auto_generate_synonyms_phrase_query Boolean field. By default it is set to true. It automatically generates phrase queries for multi-term synonyms. For example, if you have the synonym "ba, batting average" and search for “ba,” OpenSearch searches for ba OR "batting average" (if this option is true) or ba OR (batting AND average) (if this option is false).

Other optional query fields

You can also use the following optional query fields to filter your query results.

| Option | Valid values | Description |
| --- | --- | --- |
| boost | Floating-point | Boosts the clause by the given multiplier. Useful for weighing clauses in compound queries. The default is 1.0. |
| enable_position_increments | Boolean | When true, result queries are aware of position increments. This setting is useful when the removal of stop words leaves an unwanted "gap" between terms. The default is true. |
| fields | String array | The list of fields to search (e.g. "fields": ["title^4", "description"]). If unspecified, defaults to the index.query.default_field setting, which defaults to ["*"]. |
| flags | String | A \|-delimited string of flags to enable (e.g., AND\|OR\|NOT). The default is ALL. You can explicitly set the value for default_field. For example, to return all titles, set it to "default_field": "title". |
| lenient | Boolean | Setting lenient to true lets you ignore data type mismatches between the query and the document field. For example, a query string of "8.2" could match a field of type float. The default is false. |
| low_freq_operator | and, or | The operator for low-frequency terms. The default is or. See Common terms queries and operator in this table. |
| max_determinized_states | Positive integer | The maximum number of "states" (a measure of complexity) that Lucene can create for query strings that contain regular expressions (e.g. "query": "/wind.+?/"). Larger numbers allow for queries that use more memory. The default is 10,000. |
| max_expansions | Positive integer | The maximum number of terms to which the query can expand. The default is 50. |
| minimum_should_match | Positive or negative integer, positive or negative percentage, combination | If the query string contains multiple search terms and you used the or operator, the number of terms that need to match for the document to be considered a match. For example, if minimum_should_match is 2, "wind often rising" does not match "The Wind Rises." If minimum_should_match is 1, it matches. This option also has low_freq and high_freq properties for Common terms queries. |
| operator | or, and | If the query string contains multiple search terms, whether all terms need to match (and) or only one term needs to match (or) for a document to be considered a match. |
| phrase_slop | 0 (default) or a positive integer | See slop. |
| prefix_length | 0 (default) or a positive integer | The number of leading characters that are not considered in fuzziness. |
| quote_field_suffix | String | This option lets you search different fields depending on whether terms are wrapped in quotes. For example, if quote_field_suffix is ".exact" and you search for "lightly" (in quotes) in the title field, OpenSearch searches the title.exact field. This second field might use a different type (e.g. keyword rather than text) or a different analyzer. The default is null. |
| rewrite | constant_score, scoring_boolean, constant_score_boolean, top_terms_N, top_terms_boost_N, top_terms_blended_freqs_N | Determines how OpenSearch rewrites and scores multi-term queries. The default is constant_score. |
| slop | 0 (default) or a positive integer | Controls the degree to which words in a query can be misordered and still be considered a match. From the Lucene documentation: "The number of other words permitted between words in query phrase. For example, to switch the order of two words requires two moves (the first move places the words atop one another), so to permit re-orderings of phrases, the slop must be at least two. A value of zero requires an exact match." |
| tie_breaker | 0.0 (default) to 1.0 | Changes the way OpenSearch scores searches. For example, a type of best_fields typically uses the highest score from any one field. If you specify a tie_breaker value between 0.0 and 1.0, the score changes to highest score + tie_breaker * score for all other matching fields. If you specify a value of 1.0, OpenSearch adds together the scores for all matching fields (effectively defeating the purpose of best_fields). |
| time_zone | UTC offset hours | Specifies the number of hours to offset the desired time zone from UTC. You need to indicate the time zone offset number if the query string contains a date range. For example, set "time_zone": "-08:00" for a query with a date range such as "query": "wind rises release_date[2012-01-01 TO 2014-01-01]". The default time zone format used to specify number of offset hours is UTC. |
| type | best_fields, most_fields, cross_fields, phrase, phrase_prefix | Determines how OpenSearch executes the query and scores the results. The default is best_fields. |
| zero_terms_query | none, all | If the analyzer removes all terms from a query string, whether to match no documents (default) or all documents. For example, the stop analyzer removes all terms from the string "an but this." |
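
The tie_breaker arithmetic described in the table can be checked with a few lines of Python. The sketch below applies the stated formula (highest score plus tie_breaker times the scores of the other matching fields) to made-up per-field scores. The scores are hypothetical, and real _score values depend on the index contents.

```python
def combined_score(field_scores, tie_breaker=0.0):
    """Combine per-field scores the way the tie_breaker description states."""
    best = max(field_scores)
    others = sum(field_scores) - best
    return best + tie_breaker * others


scores = [2.0, 1.0, 0.5]  # hypothetical scores for three matching fields

print(combined_score(scores))                   # 2.0  (best field only)
print(combined_score(scores, tie_breaker=0.5))  # 2.75
print(combined_score(scores, tie_breaker=1.0))  # 3.5  (sum of all fields)
```

With tie_breaker set to 1.0, the result equals the plain sum of the field scores, which is why that value defeats the purpose of best_fields.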