Match Boolean prefix query
The match_bool_prefix
query analyzes the provided search string and creates a Boolean query from the string’s terms. It uses every term except the last term as a whole word for matching. The last term is used as a prefix. The match_bool_prefix
query returns documents that contain either the whole-word terms or terms that start with the prefix term, in any order.
The following example shows a basic match_bool_prefix
query:
GET _search
{
"query": {
"match_bool_prefix": {
"title": "the wind"
}
}
}
copy
To pass additional parameters, you can use the expanded syntax:
GET _search
{
"query": {
"match_bool_prefix": {
"title": {
"query": "the wind",
"analyzer": "stop"
}
}
}
}
copy
Example
For example, consider an index with the following documents:
PUT testindex/_doc/1
{
"title": "The wind rises"
}
copy
PUT testindex/_doc/2
{
"title": "Gone with the wind"
}
copy
The following match_bool_prefix
query searches for the whole word rises
and the words that start with wi
, in any order:
GET testindex/_search
{
"query": {
"match_bool_prefix": {
"title": "rises wi"
}
}
}
copy
The preceding query is equivalent to the following Boolean query:
GET testindex/_search
{
"query": {
"bool" : {
"should": [
{ "term": { "title": "rises" }},
{ "prefix": { "title": "wi"}}
]
}
}
}
The response contains both documents:
Response
{
"took": 15,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 1.73617,
"hits": [
{
"_index": "testindex",
"_id": "1",
"_score": 1.73617,
"_source": {
"title": "The wind rises"
}
},
{
"_index": "testindex",
"_id": "2",
"_score": 1,
"_source": {
"title": "Gone with the wind"
}
}
]
}
}
The match_bool_prefix
and match_phrase_prefix
queries
The match_bool_prefix
query matches terms in any position, while the match_phrase_prefix
query matches terms as a whole phrase. To illustrate the difference, once again consider the match_bool_prefix
query from the preceding section:
GET testindex/_search
{
"query": {
"match_bool_prefix": {
"title": "rises wi"
}
}
}
copy
Both The wind rises
and Gone with the wind
match the search terms, so the query returns both documents.
Now run a match_phrase_prefix
query on the same index:
GET testindex/_search
{
"query": {
"match_phrase_prefix": {
"title": "rises wi"
}
}
}
copy
The response returns no documents because none of the documents contain a phrase rises wi
in the specified order.
Analyzer
By default, when you run a query on a text
field, the search text is analyzed using the index analyzer associated with the field. You can specify a different search analyzer in the analyzer
parameter:
GET testindex/_search
{
"query": {
"match_bool_prefix": {
"title": {
"query": "rise the wi",
"analyzer": "stop"
}
}
}
}
copy
Parameters
The query accepts the name of the field (<field>
) as a top-level parameter:
GET _search
{
"query": {
"match_bool_prefix": {
"<field>": {
"query": "text to search for",
...
}
}
}
}
copy
The <field>
accepts the following parameters. All parameters except query
are optional.
Parameter | Data type | Description |
---|---|---|
query | String | The text, number, Boolean value, or date to use for search. Required. |
analyzer | String | The analyzer used to tokenize the query string text. Default is the index-time analyzer specified for the defaultfield . If no analyzer is specified for the default_field , the analyzer is the default analyzer for the index. |
fuzziness | AUTO , 0 , or a positive integer | The number of character edits (insert, delete, substitute) that it takes to change one word to another when determining whether a term matched a value. For example, the distance between wined and wind is 1. The default, AUTO , chooses a value based on the length of each term and is a good choice for most use cases. |
fuzzy_rewrite | String | Determines how OpenSearch rewrites the query. Valid values are constant_score , scoring_boolean , constant_score_boolean , top_terms_N , top_terms_boost_N , and top_terms_blended_freqs_N . If the fuzziness parameter is not 0 , the query uses a fuzzy_rewrite method of top_terms_blended_freqs${max_expansions} by default. Default is constant_score . |
fuzzy_transpositions | Boolean | Setting fuzzy_transpositions to true (default) adds swaps of adjacent characters to the insert, delete, and substitute operations of the fuzziness option. For example, the distance between wind and wnid is 1 if fuzzy_transpositions is true (swap “n” and “i”) and 2 if it is false (delete “n”, insert “n”). If fuzzy_transpositions is false, rewind and wnid have the same distance (2) from wind , despite the more human-centric opinion that wnid is an obvious typo. The default is a good choice for most use cases. |
max_expansions | Positive integer | The maximum number of terms to which the query can expand. Fuzzy queries “expand to” a number of matching terms that are within the distance specified in fuzziness . Then OpenSearch tries to match those terms. Default is 50 . |
minimum_should_match | Positive or negative integer, positive or negative percentage, combination | If the query string contains multiple search terms and you use the or operator, the number of terms that need to match for the document to be considered a match. For example, if minimum_should_match is 2, wind often rising does not match The Wind Rises. If minimum_should_match is 1 , it matches. For details, see Minimum should match. |
operator | String | If the query string contains multiple search terms, whether all terms need to match (and ) or only one term needs to match (or ) for a document to be considered a match. Valid values are or and and . Default is or . |
prefix_length | Non-negative integer | The number of leading characters that are not considered in fuzziness. Default is 0 . |
The fuzziness
, fuzzy_transpositions
, fuzzy_rewrite
, max_expansions
, and prefix_length
parameters can be applied to the term subqueries constructed for all terms except the final term. They do not have any effect on the prefix query constructed for the final term.