Search your data

In OpenSearch, there are several ways to search data:

  • Query domain-specific language (DSL): The primary OpenSearch query language, which you can use to create complex, fully customizable queries.
  • Query string query language: A scaled-down query language that you can use in a query parameter of a search request or in OpenSearch Dashboards.
  • SQL: A traditional query language that bridges the gap between traditional relational database concepts and the flexibility of OpenSearch’s document-oriented data storage.
  • Piped Processing Language (PPL): The primary language used for observability in OpenSearch. PPL uses a pipe syntax that chains commands into a query.
  • Dashboards Query Language (DQL): A simple text-based query language for filtering data in OpenSearch Dashboards.

Prepare the data

For this tutorial, you’ll need to index student data if you haven’t done so already. You can start by deleting the students index (DELETE /students) and then sending the following bulk request:

  POST _bulk
  { "create": { "_index": "students", "_id": "1" } }
  { "name": "John Doe", "gpa": 3.89, "grad_year": 2022 }
  { "create": { "_index": "students", "_id": "2" } }
  { "name": "Jonathan Powers", "gpa": 3.85, "grad_year": 2025 }
  { "create": { "_index": "students", "_id": "3" } }
  { "name": "Jane Doe", "gpa": 3.52, "grad_year": 2024 }
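If you send the bulk request from a script rather than from the Dev Tools console, the body has to be newline-delimited JSON that ends with a newline. The following Python sketch builds that body for the three student documents. The helper name build_bulk_body is hypothetical, and the sketch only constructs the payload; it does not send it.

```python
import json

# Student records from this tutorial, keyed by document ID.
students = {
    "1": {"name": "John Doe", "gpa": 3.89, "grad_year": 2022},
    "2": {"name": "Jonathan Powers", "gpa": 3.85, "grad_year": 2025},
    "3": {"name": "Jane Doe", "gpa": 3.52, "grad_year": 2024},
}


def build_bulk_body(index, docs):
    """Return an NDJSON string: one action line followed by one source line per document."""
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"create": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body("students", students)
print(bulk_body)
```

Each document contributes two lines, so the payload for three students has six lines.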

Retrieve all documents in an index

To retrieve all documents in an index, send the following request:

  GET /students/_search

The preceding request is equivalent to the match_all query, which matches all documents in an index:

  GET /students/_search
  {
    "query": {
      "match_all": {}
    }
  }

OpenSearch returns the matching documents:

  {
    "took": 12,
    "timed_out": false,
    "_shards": {
      "total": 1,
      "successful": 1,
      "skipped": 0,
      "failed": 0
    },
    "hits": {
      "total": {
        "value": 3,
        "relation": "eq"
      },
      "max_score": 1,
      "hits": [
        {
          "_index": "students",
          "_id": "1",
          "_score": 1,
          "_source": {
            "name": "John Doe",
            "gpa": 3.89,
            "grad_year": 2022
          }
        },
        {
          "_index": "students",
          "_id": "2",
          "_score": 1,
          "_source": {
            "name": "Jonathan Powers",
            "gpa": 3.85,
            "grad_year": 2025
          }
        },
        {
          "_index": "students",
          "_id": "3",
          "_score": 1,
          "_source": {
            "name": "Jane Doe",
            "gpa": 3.52,
            "grad_year": 2024
          }
        }
      ]
    }
  }

Response body fields

The preceding response contains the following fields.

took

The took field contains the amount of time the query took to run, in milliseconds.

timed_out

This field indicates whether the request timed out. If a request timed out, then OpenSearch returns the results that were gathered before the timeout. You can set the desired timeout value by providing the timeout query parameter:

  GET /students/_search?timeout=20ms

_shards

The _shards object specifies the total number of shards on which the query ran as well as the number of shards that succeeded or failed. A shard may fail if the shard itself and all its replicas are unavailable. If any of the involved shards fail, OpenSearch continues to run the query on the remaining shards.
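Because a search can time out or lose shards and still return hits, client code may want to check whether a response is complete before trusting it. The following Python sketch shows one way to do that using only the timed_out and _shards fields described above. The function name is_partial and the degraded sample response are hypothetical.

```python
def is_partial(response):
    """Return True if the response may be missing hits (timeout or failed shards)."""
    shards = response["_shards"]
    return response["timed_out"] or shards["failed"] > 0


# Mirrors the header fields of the response shown in this tutorial.
complete = {
    "timed_out": False,
    "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
}

# Hypothetical response in which one of three shards failed.
degraded = {
    "timed_out": False,
    "_shards": {"total": 3, "successful": 2, "skipped": 0, "failed": 1},
}

print(is_partial(complete))
print(is_partial(degraded))
```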

hits

The hits object contains the total number of matching documents and the documents themselves (listed in the hits array). Each matching document contains the _index and _id fields as well as the _source field, which contains the complete originally indexed document.

Each document is given a relevance score in the _score field. Because you ran a match_all search, all document scores are set to 1 (there is no difference in their relevance). The max_score field contains the highest score of any matching document.
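As a minimal sketch of how a client might read these fields, the following Python code walks the hits object of a response that has already been parsed into a dictionary. The sample data is a trimmed copy of the match_all response from this tutorial, and the helper name extract_hits is hypothetical.

```python
# Trimmed copy of the "hits" object from the match_all response.
response = {
    "hits": {
        "total": {"value": 3, "relation": "eq"},
        "max_score": 1,
        "hits": [
            {"_index": "students", "_id": "1", "_score": 1,
             "_source": {"name": "John Doe", "gpa": 3.89, "grad_year": 2022}},
            {"_index": "students", "_id": "2", "_score": 1,
             "_source": {"name": "Jonathan Powers", "gpa": 3.85, "grad_year": 2025}},
            {"_index": "students", "_id": "3", "_score": 1,
             "_source": {"name": "Jane Doe", "gpa": 3.52, "grad_year": 2024}},
        ],
    }
}


def extract_hits(response):
    """Return the total hit count and a list of (id, name, score) tuples."""
    hits = response["hits"]
    rows = [(hit["_id"], hit["_source"]["name"], hit["_score"]) for hit in hits["hits"]]
    return hits["total"]["value"], rows


total, rows = extract_hits(response)
print(total)
for doc_id, name, score in rows:
    print(doc_id, name, score)
```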

Query string queries

Query string queries are lightweight but powerful. You can send a query string query as a q query parameter. For example, the following query searches for students with the name john:

  GET /students/_search?q=name:john

OpenSearch returns the matching document:

  {
    "took": 18,
    "timed_out": false,
    "_shards": {
      "total": 1,
      "successful": 1,
      "skipped": 0,
      "failed": 0
    },
    "hits": {
      "total": {
        "value": 1,
        "relation": "eq"
      },
      "max_score": 0.9808291,
      "hits": [
        {
          "_index": "students",
          "_id": "1",
          "_score": 0.9808291,
          "_source": {
            "name": "John Doe",
            "gpa": 3.89,
            "grad_year": 2022
          }
        }
      ]
    }
  }

For more information about query string syntax, see Query string query language.
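When you build a search URL in code, query parameters such as q and timeout need to be URL-encoded. The following Python sketch assembles the URLs used so far with the standard library. The base URL http://localhost:9200 and the helper name search_url are assumptions for illustration; substitute your own cluster endpoint.

```python
from urllib.parse import urlencode

# Assumed address of a local development cluster.
BASE_URL = "http://localhost:9200"


def search_url(index, **params):
    """Build a _search URL for an index, URL-encoding any query parameters."""
    url = f"{BASE_URL}/{index}/_search"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


print(search_url("students"))
print(search_url("students", q="name:john"))
print(search_url("students", q="name:john", timeout="20ms"))
```

The colon in name:john is percent-encoded as %3A, which the server decodes back to the original query string.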

Query DSL

Using Query DSL, you can create more complex and customized queries.

You can run a full-text search on fields mapped as text. By default, text fields are analyzed by the standard analyzer, which splits text into terms and converts them to lowercase. For more information about OpenSearch analyzers, see Analyzers.

For example, the following query searches for students with the name john:

  GET /students/_search
  {
    "query": {
      "match": {
        "name": "john"
      }
    }
  }

The response contains the matching document:

  {
    "took": 13,
    "timed_out": false,
    "_shards": {
      "total": 1,
      "successful": 1,
      "skipped": 0,
      "failed": 0
    },
    "hits": {
      "total": {
        "value": 1,
        "relation": "eq"
      },
      "max_score": 0.9808291,
      "hits": [
        {
          "_index": "students",
          "_id": "1",
          "_score": 0.9808291,
          "_source": {
            "name": "John Doe",
            "gpa": 3.89,
            "grad_year": 2022
          }
        }
      ]
    }
  }

Notice that the query text is lowercase while the text in the field is not, but the query still returns the matching document.
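The case-insensitive match happens because both the indexed text and the query text pass through the analyzer before they are compared. The following Python sketch imitates that effect with a very rough stand-in that lowercases text and splits it on non-word characters. It is a simplification for illustration, not the analyzer OpenSearch actually runs, and the function name analyze is hypothetical.

```python
import re


def analyze(text):
    """Rough stand-in for analysis: lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


indexed_terms = analyze("John Doe")
query_terms = analyze("john")

print(indexed_terms)
print(query_terms)
# The query matches when its term appears among the indexed terms.
print(set(query_terms) <= set(indexed_terms))
```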

You can reorder the terms in the search string. For example, the following query searches for doe john:

  GET /students/_search
  {
    "query": {
      "match": {
        "name": "doe john"
      }
    }
  }

The response contains two matching documents:

  {
    "took": 14,
    "timed_out": false,
    "_shards": {
      "total": 1,
      "successful": 1,
      "skipped": 0,
      "failed": 0
    },
    "hits": {
      "total": {
        "value": 2,
        "relation": "eq"
      },
      "max_score": 1.4508327,
      "hits": [
        {
          "_index": "students",
          "_id": "1",
          "_score": 1.4508327,
          "_source": {
            "name": "John Doe",
            "gpa": 3.89,
            "grad_year": 2022
          }
        },
        {
          "_index": "students",
          "_id": "3",
          "_score": 0.4700036,
          "_source": {
            "name": "Jane Doe",
            "gpa": 3.52,
            "grad_year": 2024
          }
        }
      ]
    }
  }

The match query type uses OR as an operator by default, so the query is functionally doe OR john. Both John Doe and Jane Doe matched the word doe, but John Doe is scored higher because it also matched john.
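To make the OR behavior concrete, the following Python sketch reproduces it locally on the three student names and ranks documents by how many query terms they contain. Counting matched terms is only a crude proxy for the relevance scores OpenSearch computes, and the helper names are hypothetical. The sketch also builds a request body that sets the match query's operator parameter to and, which requires every term to match.

```python
import json

names = {"1": "John Doe", "2": "Jonathan Powers", "3": "Jane Doe"}


def matched_terms(query, name):
    """Return the query terms that also appear in the name, ignoring case."""
    return set(query.lower().split()) & set(name.lower().split())


def local_match(query, operator="or"):
    """Return (doc_id, matched term count) pairs, best match first."""
    # "and" requires every query term to match; "or" requires at least one.
    needed = len(set(query.lower().split())) if operator == "and" else 1
    results = []
    for doc_id, name in names.items():
        count = len(matched_terms(query, name))
        if count >= needed:
            results.append((doc_id, count))
    return sorted(results, key=lambda item: item[1], reverse=True)


print(local_match("doe john"))
print(local_match("doe john", "and"))

# Request body that asks OpenSearch to require both terms.
and_query = {"query": {"match": {"name": {"query": "doe john", "operator": "and"}}}}
print(json.dumps(and_query))
```

With the default OR behavior, documents 1 and 3 match and document 1 ranks first; with and, only document 1 remains.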

The name field contains the name.keyword subfield, which OpenSearch adds automatically when it maps a string field dynamically. The following request searches the name.keyword field in the same way as the previous request:

  GET /students/_search
  {
    "query": {
      "match": {
        "name.keyword": "john"
      }
    }
  }

This request returns no hits because keyword fields are not analyzed, so the query must match the entire field value exactly.

However, if you search for the exact text John Doe:

  GET /students/_search
  {
    "query": {
      "match": {
        "name.keyword": "John Doe"
      }
    }
  }

OpenSearch returns the matching document:

  {
    "took": 19,
    "timed_out": false,
    "_shards": {
      "total": 1,
      "successful": 1,
      "skipped": 0,
      "failed": 0
    },
    "hits": {
      "total": {
        "value": 1,
        "relation": "eq"
      },
      "max_score": 0.9808291,
      "hits": [
        {
          "_index": "students",
          "_id": "1",
          "_score": 0.9808291,
          "_source": {
            "name": "John Doe",
            "gpa": 3.89,
            "grad_year": 2022
          }
        }
      ]
    }
  }
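The difference between the two name.keyword requests above comes down to whole-value comparison. The following Python sketch models a keyword field as a plain string equality check, assuming no normalizer is configured on the field. It is a local illustration rather than how OpenSearch stores or compares keyword values, and the function name keyword_match is hypothetical.

```python
names = ["John Doe", "Jonathan Powers", "Jane Doe"]


def keyword_match(query):
    """Return the names whose entire value equals the query, including case."""
    return [name for name in names if name == query]


print(keyword_match("john"))
print(keyword_match("John Doe"))
print(keyword_match("john doe"))
```

Only the exact value John Doe produces a hit; a partial value or a different letter case does not.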

Filters

Using a Boolean query, you can add a filter clause to your query for fields with exact values.

Term filters match specific terms. For example, the following Boolean query searches for students whose graduation year is 2022:

  GET /students/_search
  {
    "query": {
      "bool": {
        "filter": [
          { "term": { "grad_year": 2022 } }
        ]
      }
    }
  }

With range filters, you can specify a range of values. For example, the following Boolean query searches for students whose GPA is greater than 3.6:

  GET /students/_search
  {
    "query": {
      "bool": {
        "filter": [
          { "range": { "gpa": { "gt": 3.6 } } }
        ]
      }
    }
  }

For more information about filters, see Query and filter context.
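To check which students each of the two filters should return, you can apply the same conditions to the sample data locally. The following Python sketch does that with list comprehensions. It mirrors the logic of the term and range filters on this tutorial's three documents and is not a substitute for running the queries against a cluster.

```python
students = [
    {"name": "John Doe", "gpa": 3.89, "grad_year": 2022},
    {"name": "Jonathan Powers", "gpa": 3.85, "grad_year": 2025},
    {"name": "Jane Doe", "gpa": 3.52, "grad_year": 2024},
]

# Equivalent of { "term": { "grad_year": 2022 } }: exact value match.
graduated_2022 = [s["name"] for s in students if s["grad_year"] == 2022]

# Equivalent of { "range": { "gpa": { "gt": 3.6 } } }: strictly greater than 3.6.
gpa_above_3_6 = [s["name"] for s in students if s["gpa"] > 3.6]

print(graduated_2022)
print(gpa_above_3_6)
```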

Compound queries

A compound query lets you combine multiple query or filter clauses. A Boolean query is an example of a compound query.

For example, to search for students whose name matches doe and filter by graduation year and GPA, use the following request:

  GET /students/_search
  {
    "query": {
      "bool": {
        "must": [
          {
            "match": {
              "name": "doe"
            }
          },
          { "range": { "gpa": { "gte": 3.6, "lte": 3.9 } } },
          { "term": { "grad_year": 2022 } }
        ]
      }
    }
  }

For more information about Boolean and other compound queries, see Compound queries.
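As a variation on the preceding request, the exact-value conditions can be placed in a filter clause so that only the name match contributes to the relevance score. The following Python sketch builds such a request body. The function name student_query and its parameters are hypothetical, and the variant is expected to select the same documents as the must-only request, although the scores may differ.

```python
import json


def student_query(name, min_gpa, max_gpa, grad_year):
    """Build a Boolean query: full-text match in must, exact-value conditions in filter."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"name": name}},
                ],
                "filter": [
                    {"range": {"gpa": {"gte": min_gpa, "lte": max_gpa}}},
                    {"term": {"grad_year": grad_year}},
                ],
            }
        }
    }


body = student_query("doe", 3.6, 3.9, 2022)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a GET /students/_search request.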

Search methods

Along with the traditional full-text search described in this tutorial, OpenSearch supports a range of machine learning (ML)-powered search methods, including k-NN, semantic, multimodal, sparse, hybrid, and conversational search. For information about all OpenSearch-supported search methods, see Search.

Next steps

  • For information about available query types, see Query DSL.
  • For information about available search methods, see Search.