Retrieve specific fields

When you run a basic search in OpenSearch, the response by default includes, in the _source object of each hit, the original JSON document that was provided at index time. This can lead to large amounts of data being transferred over the network, increasing latency and costs. There are several ways to limit the response to only the required information.

Disabling _source

You can set _source to false in a search request to exclude the _source field from the response:

  GET /index1/_search
  {
    "_source": false,
    "query": {
      "match_all": {}
    }
  }

Because no fields were selected in the preceding search, each retrieved hit includes only its _index, _id, and _score:

  {
    "hits" : {
      "total" : {
        "value" : 2,
        "relation" : "eq"
      },
      "max_score" : 1.0,
      "hits" : [
        {
          "_index" : "index1",
          "_id" : "41",
          "_score" : 1.0
        },
        {
          "_index" : "index1",
          "_id" : "51",
          "_score" : 1.0
        }
      ]
    }
  }

The _source can also be disabled in index mappings by using the following configuration:

  "mappings": {
    "_source": {
      "enabled": false
    }
  }

If _source is disabled in the index mappings, the original document is not available for retrieval, so searching with docvalue fields and searching with stored fields become the primary ways to return field values.

Specifying the fields to retrieve

You can list the fields you want to retrieve in the fields parameter. Wildcard patterns are also accepted:

  GET /index1/_search?pretty
  {
    "_source": false,
    "fields": ["age", "nam*"],
    "query": {
      "match_all": {}
    }
  }

The response contains the name and age fields:

  {
    "hits" : {
      "total" : {
        "value" : 2,
        "relation" : "eq"
      },
      "max_score" : 1.0,
      "hits" : [
        {
          "_index" : "index1",
          "_id" : "41",
          "_score" : 1.0,
          "fields" : {
            "name" : [
              "John Doe"
            ],
            "age" : [
              30
            ]
          }
        },
        {
          "_index" : "index1",
          "_id" : "51",
          "_score" : 1.0,
          "fields" : {
            "name" : [
              "Jane Smith"
            ],
            "age" : [
              25
            ]
          }
        }
      ]
    }
  }
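
As the preceding response shows, the fields parameter returns every value as an array, even for single-valued fields. Client code often unwraps these arrays before use. The following Python sketch shows one way to do that. The helper name flatten_fields and the trimmed sample response are illustrative assumptions and are not part of any OpenSearch client library.

```python
def flatten_fields(hit):
    """Return a hit's "fields" with one-element lists unwrapped."""
    flat = {}
    for name, values in hit.get("fields", {}).items():
        # Keep multi-valued fields as lists; unwrap single values.
        flat[name] = values[0] if len(values) == 1 else values
    return flat


# Trimmed copy of the response shown above.
response = {
    "hits": {
        "hits": [
            {"_index": "index1", "_id": "41", "_score": 1.0,
             "fields": {"name": ["John Doe"], "age": [30]}},
            {"_index": "index1", "_id": "51", "_score": 1.0,
             "fields": {"name": ["Jane Smith"], "age": [25]}},
        ]
    }
}

rows = [flatten_fields(hit) for hit in response["hits"]["hits"]]
print(rows)
```

A hit without a fields object, such as one returned when only _source is set to false, yields an empty dictionary.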

Extracting fields with a custom format

You can also use object notation to apply a custom format to the chosen field.

Consider the following document:

  {
    "_index": "my_index",
    "_type": "_doc",
    "_id": "1",
    "_source": {
      "title": "Document 1",
      "date": "2023-07-04T12:34:56Z"
    }
  }

You can retrieve the date field in a custom format by using the fields parameter:

  GET /my_index/_search
  {
    "query": {
      "match_all": {}
    },
    "fields": {
      "date": {
        "format": "yyyy-MM-dd"
      }
    },
    "_source": false
  }

Additionally, you can use most fields and field aliases in the fields parameter because it consults both the document _source and the index mappings.
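
To illustrate what the yyyy-MM-dd format in the preceding request does to the sample date, the following Python sketch applies the equivalent transformation on the client side. It only emulates the formatting with the standard library and assumes the timestamp is in UTC; OpenSearch applies the format itself when building the response.

```python
from datetime import datetime

# Date string from the sample document's "date" field.
raw = "2023-07-04T12:34:56Z"

# Parse the UTC timestamp, then keep only the calendar date,
# which corresponds to the "yyyy-MM-dd" pattern.
parsed = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%SZ")
formatted = parsed.strftime("%Y-%m-%d")

print(formatted)  # 2023-07-04
```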

Searching with docvalue_fields

To retrieve specific fields from the index, you can also use the docvalue_fields parameter. This parameter works differently from the fields parameter: it retrieves values from doc values rather than from the _source field, which is more efficient for fields that are not analyzed, such as keyword, date, and numeric fields. Doc values use a columnar, on-disk storage format that is optimized for sorting and aggregations. When you use docvalue_fields, OpenSearch reads the values directly from this storage format. The parameter is therefore useful for retrieving the values of fields that are primarily used for sorting, aggregations, and scripts.
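
The difference between row-oriented and columnar storage can be pictured with a toy model. The following Python sketch is only a conceptual illustration with made-up in-memory structures; it does not reflect the actual on-disk format of doc values.

```python
# Row-oriented view: one JSON object per document, as in _source.
rows = {
    1: {"author": "John Doe", "price": 29.99},
    2: {"author": "Jane Smith", "price": 39.99},
}

# Column-oriented view: one sequence of values per field, as with doc values.
columns = {}
for doc_id in sorted(rows):
    for field, value in rows[doc_id].items():
        columns.setdefault(field, []).append(value)

# Reading one field for every document touches a single column
# instead of parsing each full document.
print(columns["price"])  # [29.99, 39.99]
```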

The following example demonstrates how to use the docvalue_fields parameter.

  1. Create an index with the following mappings:

    PUT my_index
    {
      "mappings": {
        "properties": {
          "title": { "type": "text" },
          "author": { "type": "keyword" },
          "publication_date": { "type": "date" },
          "price": { "type": "double" }
        }
      }
    }

  2. Index the following documents into the newly created index:

    POST my_index/_doc/1
    {
      "title": "OpenSearch Basics",
      "author": "John Doe",
      "publication_date": "2021-01-01",
      "price": 29.99
    }

    POST my_index/_doc/2
    {
      "title": "Advanced OpenSearch",
      "author": "Jane Smith",
      "publication_date": "2022-01-01",
      "price": 39.99
    }

  3. Retrieve only the author and publication_date fields using docvalue_fields:

    POST my_index/_search
    {
      "_source": false,
      "docvalue_fields": ["author", "publication_date"],
      "query": {
        "match_all": {}
      }
    }

The response contains the author and publication_date fields:

  {
    "hits": {
      "total": {
        "value": 2,
        "relation": "eq"
      },
      "max_score": 1.0,
      "hits": [
        {
          "_index": "my_index",
          "_id": "1",
          "_score": 1.0,
          "fields": {
            "author": ["John Doe"],
            "publication_date": ["2021-01-01T00:00:00.000Z"]
          }
        },
        {
          "_index": "my_index",
          "_id": "2",
          "_score": 1.0,
          "fields": {
            "author": ["Jane Smith"],
            "publication_date": ["2022-01-01T00:00:00.000Z"]
          }
        }
      ]
    }
  }
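
In the preceding response, the publication_date values come back as full timestamps even though the documents were indexed with date-only strings. If your application needs the calendar date, it can convert the value on the client side. The following Python sketch shows one way to do this, assuming the timestamp shape shown in the response (UTC with millisecond precision).

```python
from datetime import datetime

# Value returned by docvalue_fields for a document indexed with "2021-01-01".
returned = "2021-01-01T00:00:00.000Z"

# Parse the timestamp (UTC, millisecond precision) and recover the calendar date.
parsed = datetime.strptime(returned, "%Y-%m-%dT%H:%M:%S.%fZ")
print(parsed.date().isoformat())  # 2021-01-01
```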

Using docvalue_fields with nested objects

To retrieve doc values for fields inside nested objects, you cannot use the top-level docvalue_fields parameter because it returns an empty array. Instead, use the inner_hits parameter with its own docvalue_fields property, as shown in the following example.

  1. Define the index mappings:

    PUT my_index
    {
      "mappings": {
        "properties": {
          "title": { "type": "text" },
          "author": { "type": "keyword" },
          "comments": {
            "type": "nested",
            "properties": {
              "username": { "type": "keyword" },
              "content": { "type": "text" },
              "created_at": { "type": "date" }
            }
          }
        }
      }
    }

  2. Index your data:

    POST my_index/_doc/1
    {
      "title": "OpenSearch Basics",
      "author": "John Doe",
      "comments": [
        {
          "username": "alice",
          "content": "Great article!",
          "created_at": "2023-01-01T12:00:00Z"
        },
        {
          "username": "bob",
          "content": "Very informative.",
          "created_at": "2023-01-02T12:00:00Z"
        }
      ]
    }

  3. Perform a search with inner_hits and docvalue_fields, referencing each nested field by its full path:

    POST my_index/_search
    {
      "query": {
        "nested": {
          "path": "comments",
          "query": {
            "match_all": {}
          },
          "inner_hits": {
            "docvalue_fields": ["comments.username", "comments.created_at"]
          }
        }
      }
    }

The following is the expected response:

  {
    "hits": {
      "total": {
        "value": 1,
        "relation": "eq"
      },
      "max_score": 1.0,
      "hits": [
        {
          "_index": "my_index",
          "_id": "1",
          "_score": 1.0,
          "_source": {
            "title": "OpenSearch Basics",
            "author": "John Doe",
            "comments": [
              {
                "username": "alice",
                "content": "Great article!",
                "created_at": "2023-01-01T12:00:00Z"
              },
              {
                "username": "bob",
                "content": "Very informative.",
                "created_at": "2023-01-02T12:00:00Z"
              }
            ]
          },
          "inner_hits": {
            "comments": {
              "hits": {
                "total": {
                  "value": 2,
                  "relation": "eq"
                },
                "max_score": 1.0,
                "hits": [
                  {
                    "_index": "my_index",
                    "_id": "1",
                    "_nested": {
                      "field": "comments",
                      "offset": 0
                    },
                    "fields": {
                      "comments.username": ["alice"],
                      "comments.created_at": ["2023-01-01T12:00:00.000Z"]
                    }
                  },
                  {
                    "_index": "my_index",
                    "_id": "1",
                    "_nested": {
                      "field": "comments",
                      "offset": 1
                    },
                    "fields": {
                      "comments.username": ["bob"],
                      "comments.created_at": ["2023-01-02T12:00:00.000Z"]
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }

Searching with stored_fields

By default, OpenSearch stores the entire document in the _source field and uses it to return document contents in search results. However, you might also want to store certain fields separately for more efficient retrieval. You can explicitly store and retrieve specific document fields separately from the _source field by using stored_fields.

Unlike _source, stored fields must be explicitly enabled in the mappings by setting store to true for each field you want to store separately. This can be useful if you frequently need to retrieve only a small subset of fields and want to avoid loading the entire _source field. The following example demonstrates how to use the stored_fields parameter.

  1. Create an index with the following mappings, setting store to true for the title and author fields so that they are stored separately:

    PUT my_index
    {
      "mappings": {
        "properties": {
          "title": {
            "type": "text",
            "store": true
          },
          "author": {
            "type": "keyword",
            "store": true
          },
          "publication_date": {
            "type": "date"
          },
          "price": {
            "type": "double"
          }
        }
      }
    }

  2. Index your data:

    POST my_index/_doc/1
    {
      "title": "OpenSearch Basics",
      "author": "John Doe",
      "publication_date": "2022-01-01",
      "price": 29.99
    }

    POST my_index/_doc/2
    {
      "title": "Advanced OpenSearch",
      "author": "Jane Smith",
      "publication_date": "2023-01-01",
      "price": 39.99
    }

  3. Perform a search with stored_fields:

    POST my_index/_search
    {
      "_source": false,
      "stored_fields": ["title", "author"],
      "query": {
        "match_all": {}
      }
    }

The following is the expected response:

  {
    "hits": {
      "total": {
        "value": 2,
        "relation": "eq"
      },
      "max_score": 1.0,
      "hits": [
        {
          "_index": "my_index",
          "_id": "1",
          "_score": 1.0,
          "fields": {
            "title": ["OpenSearch Basics"],
            "author": ["John Doe"]
          }
        },
        {
          "_index": "my_index",
          "_id": "2",
          "_score": 1.0,
          "fields": {
            "title": ["Advanced OpenSearch"],
            "author": ["Jane Smith"]
          }
        }
      ]
    }
  }

The stored_fields parameter can be disabled completely by setting stored_fields to _none_.
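
If you generate index mappings programmatically, you can mark only the fields that should be stored separately. The following Python sketch builds a request body equivalent to the mappings in the preceding example. The function name build_mappings and its parameters are hypothetical helpers introduced for illustration; the sketch only constructs the JSON body and does not send it to a cluster.

```python
import json


def build_mappings(field_types, stored):
    """Build index mappings, adding "store": true only for the stored fields."""
    properties = {}
    for name, field_type in field_types.items():
        properties[name] = {"type": field_type}
        if name in stored:
            properties[name]["store"] = True
    return {"mappings": {"properties": properties}}


body = build_mappings(
    {"title": "text", "author": "keyword",
     "publication_date": "date", "price": "double"},
    stored={"title", "author"},
)
print(json.dumps(body, indent=2))
```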

Searching stored_fields with nested objects

To retrieve stored fields for nested objects, you cannot use the top-level stored_fields parameter because no data will be returned. Instead, use the inner_hits parameter with its own stored_fields property, as shown in the following example.

  1. Create an index with the following mappings:

    PUT my_index
    {
      "mappings": {
        "properties": {
          "title": { "type": "text" },
          "author": { "type": "keyword" },
          "comments": {
            "type": "nested",
            "properties": {
              "username": { "type": "keyword", "store": true },
              "content": { "type": "text", "store": true },
              "created_at": { "type": "date", "store": true }
            }
          }
        }
      }
    }

  2. Index your data:

    POST my_index/_doc/1
    {
      "title": "OpenSearch Basics",
      "author": "John Doe",
      "comments": [
        {
          "username": "alice",
          "content": "Great article!",
          "created_at": "2023-01-01T12:00:00Z"
        },
        {
          "username": "bob",
          "content": "Very informative.",
          "created_at": "2023-01-02T12:00:00Z"
        }
      ]
    }

  3. Perform a search with inner_hits and stored_fields:

    POST my_index/_search
    {
      "_source": false,
      "query": {
        "nested": {
          "path": "comments",
          "query": {
            "match_all": {}
          },
          "inner_hits": {
            "stored_fields": ["comments.username", "comments.content", "comments.created_at"]
          }
        }
      }
    }

The following is the expected response:

  {
    "hits": {
      "total": {
        "value": 1,
        "relation": "eq"
      },
      "max_score": 1.0,
      "hits": [
        {
          "_index": "my_index",
          "_id": "1",
          "_score": 1.0,
          "inner_hits": {
            "comments": {
              "hits": {
                "total": {
                  "value": 2,
                  "relation": "eq"
                },
                "max_score": 1.0,
                "hits": [
                  {
                    "_index": "my_index",
                    "_id": "1",
                    "_nested": {
                      "field": "comments",
                      "offset": 0
                    },
                    "fields": {
                      "comments.username": ["alice"],
                      "comments.content": ["Great article!"],
                      "comments.created_at": ["2023-01-01T12:00:00.000Z"]
                    }
                  },
                  {
                    "_index": "my_index",
                    "_id": "1",
                    "_nested": {
                      "field": "comments",
                      "offset": 1
                    },
                    "fields": {
                      "comments.username": ["bob"],
                      "comments.content": ["Very informative."],
                      "comments.created_at": ["2023-01-02T12:00:00.000Z"]
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
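
Inner hits are nested several levels deep in the response, so client code usually needs a small helper to reach them. The following Python sketch collects the stored fields of each nested comment from a response shaped like the preceding one. The helper name nested_fields and the trimmed sample hit are illustrative assumptions, and the sketch assumes every returned field has at least one value.

```python
def nested_fields(hit, path):
    """Collect the "fields" of every inner hit found under one nested path."""
    prefix = path + "."
    inner = hit.get("inner_hits", {}).get(path, {})
    results = []
    for inner_hit in inner.get("hits", {}).get("hits", []):
        # Strip the "comments." prefix and take the first value of each array.
        results.append({
            name[len(prefix):] if name.startswith(prefix) else name: values[0]
            for name, values in inner_hit.get("fields", {}).items()
        })
    return results


# Trimmed copy of the top-level hit from the response above.
hit = {
    "_id": "1",
    "inner_hits": {"comments": {"hits": {"hits": [
        {"_nested": {"field": "comments", "offset": 0},
         "fields": {"comments.username": ["alice"],
                    "comments.content": ["Great article!"]}},
        {"_nested": {"field": "comments", "offset": 1},
         "fields": {"comments.username": ["bob"],
                    "comments.content": ["Very informative."]}},
    ]}}},
}

print(nested_fields(hit, "comments"))
```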

Using source filtering

Source filtering is a way to control which parts of the _source field are included in the search response. Including only the necessary fields in the response can help reduce the amount of data transferred over the network and improve performance.

You can include or exclude specific fields from the _source field in the search response using complete field names or simple wildcard patterns. The following example demonstrates how to include specific fields.

  1. Index your data:

    PUT my_index/_doc/1
    {
      "title": "OpenSearch Basics",
      "author": "John Doe",
      "publication_date": "2021-01-01",
      "price": 29.99
    }

  2. Perform a search using source filtering:

    POST my_index/_search
    {
      "_source": ["title", "author"],
      "query": {
        "match_all": {}
      }
    }

The following is the expected response:

  {
    "hits": {
      "total": {
        "value": 1,
        "relation": "eq"
      },
      "max_score": 1.0,
      "hits": [
        {
          "_index": "my_index",
          "_id": "1",
          "_score": 1.0,
          "_source": {
            "title": "OpenSearch Basics",
            "author": "John Doe"
          }
        }
      ]
    }
  }

Excluding fields with source filtering

You can choose to exclude fields by using the "excludes" parameter in a search request, as shown in the following example:

  POST my_index/_search
  {
    "_source": {
      "excludes": ["price"]
    },
    "query": {
      "match_all": {}
    }
  }

The following is the expected response:

  {
    "hits": {
      "total": {
        "value": 1,
        "relation": "eq"
      },
      "max_score": 1.0,
      "hits": [
        {
          "_index": "my_index",
          "_id": "1",
          "_score": 1.0,
          "_source": {
            "title": "OpenSearch Basics",
            "author": "John Doe",
            "publication_date": "2021-01-01"
          }
        }
      ]
    }
  }

In some cases, you may need both the includes and excludes parameters. The following example demonstrates how to include and exclude fields in the same search.

Consider a products index containing the following document:

  {
    "product_id": "123",
    "name": "Smartphone",
    "category": "Electronics",
    "price": 699.99,
    "description": "A powerful smartphone with a sleek design.",
    "reviews": [
      {
        "user": "john_doe",
        "rating": 5,
        "comment": "Great phone!",
        "date": "2023-01-01"
      },
      {
        "user": "jane_doe",
        "rating": 4,
        "comment": "Good value for money.",
        "date": "2023-02-15"
      }
    ],
    "supplier": {
      "name": "TechCorp",
      "contact_email": "support@techcorp.com",
      "address": {
        "street": "123 Tech St",
        "city": "Techville",
        "zipcode": "12345"
      }
    },
    "inventory": {
      "stock": 50,
      "warehouse_location": "A1"
    }
  }

To perform a search on this index while including only the name, price, reviews, and supplier fields in the response, and excluding the contact_email field from the supplier object and the comment field from the reviews object, execute the following search:

  GET /products/_search
  {
    "_source": {
      "includes": ["name", "price", "reviews.*", "supplier.*"],
      "excludes": ["reviews.comment", "supplier.contact_email"]
    },
    "query": {
      "match": {
        "category": "Electronics"
      }
    }
  }

The following is the expected response:

  {
    "hits": {
      "hits": [
        {
          "_source": {
            "name": "Smartphone",
            "price": 699.99,
            "reviews": [
              {
                "user": "john_doe",
                "rating": 5,
                "date": "2023-01-01"
              },
              {
                "user": "jane_doe",
                "rating": 4,
                "date": "2023-02-15"
              }
            ],
            "supplier": {
              "name": "TechCorp",
              "address": {
                "street": "123 Tech St",
                "city": "Techville",
                "zipcode": "12345"
              }
            }
          }
        }
      ]
    }
  }
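
To make the interaction between includes and excludes concrete, the following Python sketch emulates the filtering on the client side for the products document shown earlier. It is a simplified model under stated assumptions: the function names matches and filter_source are hypothetical, patterns are matched with the standard library's fnmatch module, and a field is kept only if its path (or a parent path) matches an includes pattern and no excludes pattern. OpenSearch's own matching rules may differ in edge cases.

```python
from fnmatch import fnmatchcase


def matches(path, patterns):
    """Return True if the dotted path or any parent path matches a pattern."""
    parts = path.split(".")
    prefixes = [".".join(parts[:i]) for i in range(1, len(parts) + 1)]
    return any(fnmatchcase(p, pat) for p in prefixes for pat in patterns)


def filter_source(value, includes, excludes, path=""):
    """Keep only leaf values whose path is included and not excluded."""
    if isinstance(value, dict):
        kept = {}
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else key
            result = filter_source(child, includes, excludes, child_path)
            if result is not None:
                kept[key] = result
        return kept or None  # drop objects left empty by filtering
    if isinstance(value, list):
        # Array elements share the path of the array itself.
        items = [filter_source(v, includes, excludes, path) for v in value]
        items = [v for v in items if v is not None]
        return items or None
    if matches(path, excludes) or not matches(path, includes):
        return None
    return value


product = {
    "product_id": "123",
    "name": "Smartphone",
    "category": "Electronics",
    "price": 699.99,
    "reviews": [
        {"user": "john_doe", "rating": 5,
         "comment": "Great phone!", "date": "2023-01-01"},
        {"user": "jane_doe", "rating": 4,
         "comment": "Good value for money.", "date": "2023-02-15"},
    ],
    "supplier": {
        "name": "TechCorp",
        "contact_email": "support@techcorp.com",
        "address": {"street": "123 Tech St", "city": "Techville",
                    "zipcode": "12345"},
    },
    "inventory": {"stock": 50, "warehouse_location": "A1"},
}

filtered = filter_source(
    product,
    includes=["name", "price", "reviews.*", "supplier.*"],
    excludes=["reviews.comment", "supplier.contact_email"],
)
print(filtered)
```

In this model, the top-level name pattern does not match supplier.name; that field is kept because of the supplier.* pattern.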

Using scripted fields

The script_fields parameter allows you to include custom fields whose values are computed using scripts in your search results. This can be useful for calculating values dynamically based on the document data. You can also retrieve derived fields by using a similar approach. For more information, see Retrieving fields.

Suppose you have an index of products in which each product document contains the price and discount_percentage fields. You can use the script_fields parameter to include a custom field in the search results that displays the discounted price of each product. The following example demonstrates how to use the script_fields parameter:

  1. Index the data:

    PUT /products/_doc/123
    {
      "product_id": "123",
      "name": "Smartphone",
      "price": 699.99,
      "discount_percentage": 10,
      "category": "Electronics",
      "description": "A powerful smartphone with a sleek design."
    }

  2. Use the script_fields parameter to include a custom field called discounted_price in the search results. This field is calculated from the price and discount_percentage fields using a script. The divisor is written as 100.0 so that the division is performed in floating point; dividing an integer discount_percentage by the integer 100 would truncate the result to 0:

    GET /products/_search
    {
      "_source": ["product_id", "name", "price", "discount_percentage"],
      "query": {
        "match": {
          "category": "Electronics"
        }
      },
      "script_fields": {
        "discounted_price": {
          "script": {
            "lang": "painless",
            "source": "doc[\"price\"].value * (1 - doc[\"discount_percentage\"].value / 100.0)"
          }
        }
      }
    }

You should receive the following response:

  {
    "hits": {
      "total": {
        "value": 1,
        "relation": "eq"
      },
      "max_score": 1.0,
      "hits": [
        {
          "_index": "products",
          "_id": "123",
          "_score": 1.0,
          "_source": {
            "product_id": "123",
            "name": "Smartphone",
            "price": 699.99,
            "discount_percentage": 10
          },
          "fields": {
            "discounted_price": [629.991]
          }
        }
      ]
    }
  }
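
The arithmetic behind discounted_price can be checked outside OpenSearch. The following Python sketch reproduces the calculation with the values from the sample document. It also illustrates a pitfall worth keeping in mind: Painless follows Java-style arithmetic, so dividing one integer value by another truncates the result, which is why this sketch assumes a floating-point divisor. The variable names are illustrative.

```python
price = 699.99
discount_percentage = 10

# Floating-point division keeps the fractional discount (0.1).
discounted = price * (1 - discount_percentage / 100.0)

# Integer division, which Java-style languages apply when both operands
# are integers, truncates 10 / 100 to 0 and leaves the price unchanged.
truncated = price * (1 - discount_percentage // 100)

print(round(discounted, 3))  # 629.991
print(truncated)             # 699.99
```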