Boolean queries

The bool query lets you combine multiple search queries with boolean logic. You can use boolean logic between queries to either narrow or broaden your search results.

The bool query is a go-to query because it allows you to construct an advanced query by chaining together several simple ones.

Use the following clauses (subqueries) within the bool query:

| Clause | Behavior |
| --- | --- |
| must | The results must match the queries in this clause. If you have multiple queries, every single one must match. Acts as an and operator. |
| must_not | The opposite of must. Documents that match any query in this clause are excluded from the results. Acts as a not operator. |
| should | The results should, but don’t have to, match the queries. Each matching should clause increases the relevancy score. Optionally, use the minimum_should_match parameter to require a minimum number of should queries to match. The default is 0 if the bool query contains a must or filter clause and 1 otherwise. |
| filter | Filters reduce your dataset before applying the queries. A query within a filter clause is a yes-no option: if a document matches the query, it’s included in the results. Otherwise, it’s not. Filter queries do not affect the relevancy score that the results are sorted by. The results of a filter query are generally cached, so they tend to run faster. Use the filter query to filter the results based on exact matches, ranges, dates, numbers, and so on. |
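
To make the clause behaviors concrete, the following is a minimal in-memory sketch in Python of how the four clauses decide whether a document matches. It is an illustration only, not how OpenSearch is implemented: it ignores relevancy scoring, analysis, and caching, and the names `bool_matches`, `has_word`, and `equals` are hypothetical helpers invented for this example. The second test document is made up.

```python
import re
from typing import Callable, Dict, Optional, Sequence

Doc = Dict[str, str]
Predicate = Callable[[Doc], bool]


def has_word(field: str, word: str) -> Predicate:
    """Hypothetical stand-in for a match query: whole-word, case-insensitive."""
    def check(doc: Doc) -> bool:
        return word.lower() in re.findall(r"\w+", doc.get(field, "").lower())
    return check


def equals(field: str, value: str) -> Predicate:
    """Hypothetical stand-in for a term query: exact value comparison."""
    return lambda doc: doc.get(field) == value


def bool_matches(
    doc: Doc,
    must: Sequence[Predicate] = (),
    must_not: Sequence[Predicate] = (),
    should: Sequence[Predicate] = (),
    filter_: Sequence[Predicate] = (),
    minimum_should_match: Optional[int] = None,
) -> bool:
    """Decide matching only; scoring is deliberately not modeled."""
    if minimum_should_match is None:
        # Default: 1 when should stands alone, 0 when must or filter is present.
        minimum_should_match = 1 if (should and not must and not filter_) else 0
    if not all(p(doc) for p in must):       # every must query has to match
        return False
    if not all(p(doc) for p in filter_):    # filters are yes-no checks
        return False
    if any(p(doc) for p in must_not):       # any must_not match excludes the doc
        return False
    should_hits = sum(1 for p in should if p(doc))
    return should_hits >= minimum_should_match


paris = {
    "speaker": "PARIS",
    "play_name": "Romeo and Juliet",
    "text_entry": "O love! O life! not life, but love in death!",
}
# Made-up document, used only to exercise the must_not clause.
romeo = {
    "speaker": "ROMEO",
    "play_name": "Romeo and Juliet",
    "text_entry": "a made-up line about love and life",
}

clauses = dict(
    must=[has_word("text_entry", "love")],
    should=[has_word("text_entry", "life"), has_word("text_entry", "grace")],
    must_not=[has_word("speaker", "ROMEO")],
    filter_=[equals("play_name", "Romeo and Juliet")],
    minimum_should_match=1,
)

print(bool_matches(paris, **clauses))  # True
print(bool_matches(romeo, **clauses))  # False: excluded by must_not
```

In this sketch, must and filter behave identically because scoring is left out. In OpenSearch, the difference is that filter clauses do not contribute to the relevancy score.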

The structure of a bool query is as follows:

    GET _search
    {
      "query": {
        "bool": {
          "must": [
            {}
          ],
          "must_not": [
            {}
          ],
          "should": [
            {}
          ],
          "filter": {}
        }
      }
    }

For example, assume you have the complete works of Shakespeare indexed in an OpenSearch cluster. You want to construct a single query that meets the following requirements:

  1. The text_entry field must contain the word love and should contain either life or grace.
  2. The speaker field must not contain ROMEO.
  3. Filter these results to the play Romeo and Juliet without affecting the relevancy score.

Use the following query:

    GET shakespeare/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "text_entry": "love"
              }
            }
          ],
          "should": [
            {
              "match": {
                "text_entry": "life"
              }
            },
            {
              "match": {
                "text_entry": "grace"
              }
            }
          ],
          "minimum_should_match": 1,
          "must_not": [
            {
              "match": {
                "speaker": "ROMEO"
              }
            }
          ],
          "filter": {
            "term": {
              "play_name": "Romeo and Juliet"
            }
          }
        }
      }
    }

Sample output

    {
      "took": 12,
      "timed_out": false,
      "_shards": {
        "total": 4,
        "successful": 4,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": {
          "value": 1,
          "relation": "eq"
        },
        "max_score": 11.356054,
        "hits": [
          {
            "_index": "shakespeare",
            "_type": "_doc",
            "_id": "88020",
            "_score": 11.356054,
            "_source": {
              "type": "line",
              "line_id": 88021,
              "play_name": "Romeo and Juliet",
              "speech_number": 19,
              "line_number": "4.5.61",
              "speaker": "PARIS",
              "text_entry": "O love! O life! not life, but love in death!"
            }
          }
        ]
      }
    }

To identify which of these clauses actually caused a result to match, name each query with the _name parameter. To add the _name parameter, change the value of the field in the match query from a string to an object:

    GET shakespeare/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "text_entry": {
                  "query": "love",
                  "_name": "love-must"
                }
              }
            }
          ],
          "should": [
            {
              "match": {
                "text_entry": {
                  "query": "life",
                  "_name": "life-should"
                }
              }
            },
            {
              "match": {
                "text_entry": {
                  "query": "grace",
                  "_name": "grace-should"
                }
              }
            }
          ],
          "minimum_should_match": 1,
          "must_not": [
            {
              "match": {
                "speaker": {
                  "query": "ROMEO",
                  "_name": "ROMEO-must-not"
                }
              }
            }
          ],
          "filter": {
            "term": {
              "play_name": "Romeo and Juliet"
            }
          }
        }
      }
    }

For each hit, OpenSearch returns a matched_queries array that lists the named queries that the hit matched:

    "matched_queries": [
      "love-must",
      "life-should"
    ]

If you remove the queries that are not in this list, this hit is still returned. By examining which should clauses matched, you can better understand the relevancy score of the results.
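
When a search returns many hits, it can help to summarize the matched_queries arrays in client code. The following Python sketch groups hit IDs by query name. The function name `group_hits_by_named_query` is hypothetical, and the `response` dictionary is a trimmed, hand-written stand-in shaped like the sample output in this section, not a real response from a cluster.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_hits_by_named_query(response: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each _name label to the IDs of the hits that matched it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for hit in response.get("hits", {}).get("hits", []):
        # Hits from queries without _name labels have no matched_queries key.
        for name in hit.get("matched_queries", []):
            groups[name].append(hit["_id"])
    return dict(groups)


# Trimmed, hand-written stand-in for a search response (assumption).
response = {
    "hits": {
        "hits": [
            {
                "_id": "88020",
                "_score": 11.356054,
                "matched_queries": ["love-must", "life-should"],
            }
        ]
    }
}

print(group_hits_by_named_query(response))
# {'love-must': ['88020'], 'life-should': ['88020']}
```

A name that is absent from the summary, such as grace-should here, matched none of the returned hits.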

You can also construct complex boolean expressions by nesting bool queries. For example, to find a text_entry field that matches (love OR hate) AND (life OR grace) in the play Romeo and Juliet:

    GET shakespeare/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "bool": {
                "should": [
                  {
                    "match": {
                      "text_entry": "love"
                    }
                  },
                  {
                    "match": {
                      "text_entry": "hate"
                    }
                  }
                ]
              }
            },
            {
              "bool": {
                "should": [
                  {
                    "match": {
                      "text_entry": "life"
                    }
                  },
                  {
                    "match": {
                      "text_entry": "grace"
                    }
                  }
                ]
              }
            }
          ],
          "filter": {
            "term": {
              "play_name": "Romeo and Juliet"
            }
          }
        }
      }
    }

Sample output

    {
      "took": 10,
      "timed_out": false,
      "_shards": {
        "total": 2,
        "successful": 2,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 11.37006,
        "hits": [
          {
            "_index": "shakespeare",
            "_type": "doc",
            "_id": "88020",
            "_score": 11.37006,
            "_source": {
              "type": "line",
              "line_id": 88021,
              "play_name": "Romeo and Juliet",
              "speech_number": 19,
              "line_number": "4.5.61",
              "speaker": "PARIS",
              "text_entry": "O love! O life! not life, but love in death!"
            }
          }
        ]
      }
    }
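
Deeply nested bool queries are easy to get wrong when written by hand, so one option is to assemble the request body in client code. The following Python sketch builds the (love OR hate) AND (life OR grace) query from small helper functions. The helpers `match`, `any_of`, and `all_of` are hypothetical names invented for this example and are not part of any OpenSearch client library. The sketch only builds and prints the JSON body. It does not send a request.

```python
import json
from typing import Any, Dict, List, Optional

Query = Dict[str, Any]


def match(field: str, text: str) -> Query:
    """Build a match query on a single field."""
    return {"match": {field: text}}


def any_of(*queries: Query) -> Query:
    """OR: a bool query whose should clauses need at least one match."""
    # minimum_should_match is set explicitly so the intent is visible,
    # although 1 is already the default when should stands alone.
    return {"bool": {"should": list(queries), "minimum_should_match": 1}}


def all_of(*queries: Query, filters: Optional[List[Query]] = None) -> Query:
    """AND: a bool query whose must clauses all need to match."""
    body: Dict[str, Any] = {"must": list(queries)}
    if filters:
        body["filter"] = filters  # filters do not affect the relevancy score
    return {"bool": body}


query_body = {
    "query": all_of(
        any_of(match("text_entry", "love"), match("text_entry", "hate")),
        any_of(match("text_entry", "life"), match("text_entry", "grace")),
        filters=[{"term": {"play_name": "Romeo and Juliet"}}],
    )
}

# Print the body so it can be pasted after "GET shakespeare/_search".
print(json.dumps(query_body, indent=2))
```

Keeping each boolean operator in its own helper makes the nesting mirror the logical expression, which reduces the chance of querying the wrong field in one of the inner clauses.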