This version of the OpenSearch documentation is no longer maintained. For the latest version, see the current documentation. For information about OpenSearch version maintenance, see Release Schedule and Maintenance Policy.

Query DSL

OpenSearch provides a search language called query domain-specific language (DSL) that you can use to search your data. Query DSL is a flexible language with a JSON interface.

With query DSL, you specify a query in the query parameter of the search request body. One of the simplest searches in OpenSearch uses the match_all query, which matches all documents in an index:

  GET testindex/_search
  {
    "query": {
      "match_all": {}
    }
  }

A query can consist of many query clauses. You can combine query clauses to produce complex queries.

Broadly, you can classify queries into two categories—leaf queries and compound queries:

  • Leaf queries: Leaf queries search for a specified value in a certain field or fields. You can use leaf queries on their own. They include the following query types:

    • Full-text queries: Use full-text queries to search text documents. For an analyzed text field search, full-text queries split the query string into terms using the same analyzer that was used when the field was indexed. For an exact value search, full-text queries look for the specified value without applying text analysis.

    • Term-level queries: Use term-level queries to search documents for an exact term, such as an ID or value range. Term-level queries do not analyze search terms or sort results by relevance score.

    • Geographic and xy queries: Use geographic queries to search documents that include geographic data. Use xy queries to search documents that include points and shapes in a two-dimensional coordinate system.

    • Joining queries: Use joining queries to search nested fields or return parent and child documents that match a specific query. Types of joining queries include nested, has_child, has_parent, and parent_id queries.

    • Span queries: Use span queries to perform precise positional searches. Span queries are low-level, specific queries that provide control over the order and proximity of specified query terms. They are primarily used to search legal documents.

    • Specialized queries: Specialized queries include all other query types (distance_feature, more_like_this, percolate, rank_feature, script, script_score, and wrapper).

  • Compound queries: Compound queries serve as wrappers for multiple leaf or compound clauses, either to combine their results or to modify their behavior. They include the Boolean, disjunction max, constant score, function score, and boosting query types. To learn more, see Compound queries.
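
As a rough illustration of how leaf and compound queries fit together, the following sketch wraps two leaf clauses in a Boolean query: a full-text match clause that contributes to the relevance score and a term-level term clause used as a filter. The field names title and status and their values are hypothetical and are not part of any index described on this page.

```json
GET testindex/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "wind turbine" } }
      ],
      "filter": [
        { "term": { "status": "published" } }
      ]
    }
  }
}
```

Either leaf clause could also be sent on its own as the value of the query parameter. The compound wrapper only determines how the clauses are combined.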

A note on Unicode special characters in text fields

Because of word boundaries associated with Unicode special characters, the Unicode standard analyzer cannot index a text field type value as a whole value when it includes one of these special characters. As a result, a text field value that includes a special character is parsed by the standard analyzer as multiple values separated by the special character, effectively tokenizing the different elements on either side of it. This can lead to unintentional filtering of documents and potentially compromise control over their access.

The following examples illustrate values containing special characters that will be parsed improperly by the standard analyzer. In these examples, the hyphen/minus sign in each value prevents the analyzer from distinguishing between the two different users for user.id, so it interprets them as being one and the same:

  {
    "bool": {
      "must": {
        "match": {
          "user.id": "User-1"
        }
      }
    }
  }

  {
    "bool": {
      "must": {
        "match": {
          "user.id": "User-2"
        }
      }
    }
  }
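
One way to check how a value is tokenized is to pass it to the analyze API. The following sketch assumes that the field uses the default standard analyzer. The response should list the parts on either side of the hyphen as separate tokens rather than a single User-1 token.

```json
GET _analyze
{
  "analyzer": "standard",
  "text": "User-1"
}
```

Running the same request with User-2 as the text makes it easier to see which tokens the two values share.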

To avoid this circumstance when using either query DSL or the REST API, you can use a custom analyzer or map the field as keyword, which supports exact-match searches. See Keyword field type for the latter option.
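
The keyword option can be sketched as follows. The index name testindex-keyword is hypothetical, and the example assumes that the index does not exist yet, because the type of an existing field cannot be changed in place. The first request maps user.id as a keyword field. The second request uses a term query to search for the exact value.

```json
PUT testindex-keyword
{
  "mappings": {
    "properties": {
      "user": {
        "properties": {
          "id": { "type": "keyword" }
        }
      }
    }
  }
}

GET testindex-keyword/_search
{
  "query": {
    "term": {
      "user.id": "User-1"
    }
  }
}
```

With this mapping, the whole value is indexed as a single term, so User-1 and User-2 remain distinct.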

For a list of characters that should be avoided when using text field types, see Word Boundaries.

Expensive queries

Expensive queries can consume a lot of memory and lead to a decline in cluster performance. The following queries may be resource consuming:

To disallow expensive queries, you can disable the search.allow_expensive_queries cluster setting as follows:

  PUT _cluster/settings
  {
    "persistent": {
      "search.allow_expensive_queries": false
    }
  }

To track expensive queries, enable slow logs.
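
As a minimal sketch, search slow logs can be enabled for an index by setting slow log thresholds in the index settings. The threshold values shown here are arbitrary placeholders, not recommendations. Choose values that reflect what counts as slow for your workload.

```json
PUT testindex/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.query.info": "5s",
  "index.search.slowlog.threshold.fetch.warn": "1s"
}
```

Queries whose query or fetch phase exceeds a threshold are then written to the slow log at the corresponding log level.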