Neural search tutorial

By default, OpenSearch calculates document scores using the Okapi BM25 algorithm. BM25 is a keyword-based algorithm that performs well on queries containing keywords but fails to capture the semantic meaning of the query terms. Semantic search, unlike keyword-based search, takes into account the meaning of the query in the search context. Thus, semantic search performs well when a query requires natural language understanding.
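
For reference, BM25 scores a document D against a query Q using only exact term matches, roughly as follows (the standard Okapi formulation; Lucene's implementation defaults to $k_1 = 1.2$ and $b = 0.75$):

$$\mathrm{score}(D, Q) = \sum_{t \in Q} \mathrm{IDF}(t) \cdot \frac{f(t, D) \cdot (k_1 + 1)}{f(t, D) + k_1 \cdot \left(1 - b + b \cdot \frac{|D|}{\mathrm{avgdl}}\right)}$$

Here, $f(t, D)$ is the frequency of term $t$ in $D$, $|D|$ is the document length, and avgdl is the average document length in the index. Because the formula depends only on term statistics, two texts that express the same idea with different words receive no credit, which is the gap semantic search fills.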

In this tutorial, you’ll learn how to use neural search to:

  • Implement semantic search in OpenSearch.
  • Implement hybrid search by combining semantic and keyword search to improve search relevance.

Terminology

It’s helpful to understand the following terms before starting this tutorial:

  • Neural search: Facilitates vector search at ingestion time and at search time:

    • At ingestion time, neural search uses language models to generate vector embeddings from the text fields in the document. The documents containing both the original text field and the vector embedding of the field are then indexed in a k-NN index, as shown in the following diagram.

    Neural search at ingestion time diagram

    • At search time, when you use a neural query, the query text is passed through a language model, and the resulting vector embeddings are compared with the document text vector embeddings to find the most relevant results, as shown in the following diagram.

    Neural search at search time diagram

  • Semantic search: Employs neural search to determine the intent of the user’s query in the search context, thereby improving search relevance.

  • Hybrid search: Combines semantic and keyword search to improve search relevance.

In this tutorial, you’ll implement semantic search using the following OpenSearch components:

  • An ML language model and a model group
  • An ingest pipeline containing a text_embedding processor
  • A k-NN index
  • The neural and hybrid query types
  • A search pipeline containing a normalization-processor

You’ll find descriptions of all these components as you follow the tutorial, so don’t worry if you’re not familiar with some of them.

Prerequisites

For this simple setup, you’ll use an OpenSearch-provided machine learning (ML) model and a cluster with no dedicated ML nodes. To ensure that this basic local setup works, send the following request to update ML-related cluster settings:

  PUT _cluster/settings
  {
    "persistent": {
      "plugins": {
        "ml_commons": {
          "only_run_on_ml_node": "false",
          "model_access_control_enabled": "true",
          "native_memory_threshold": "99"
        }
      }
    }
  }

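Optionally, you can verify that the settings were applied by retrieving the current cluster settings (the flat_settings parameter makes the response easier to scan):

  GET /_cluster/settings?flat_settings=true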

Advanced

For a custom local model setup, note the following requirements:

  • To register a custom local model, you need to specify an additional "allow_registering_model_via_url": "true" cluster setting.
  • In production, it’s best practice to separate the workloads by having dedicated ML nodes. On clusters with dedicated ML nodes, specify "only_run_on_ml_node": "true" for improved performance, as shown in the example that follows.
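
For example, the cluster settings for a production setup with dedicated ML nodes and custom model registration enabled might look like the following sketch:

  PUT _cluster/settings
  {
    "persistent": {
      "plugins": {
        "ml_commons": {
          "only_run_on_ml_node": "true",
          "allow_registering_model_via_url": "true"
        }
      }
    }
  }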

For more information about ML-related cluster settings, see ML Commons cluster settings.

Tutorial overview

This tutorial consists of the following steps:

  1. Set up an ML language model.
    1. Choose a language model.
    2. Register a model group.
    3. Register the model to the model group.
    4. Deploy the model.
  2. Ingest data with neural search.
    1. Create an ingest pipeline for neural search.
    2. Create a k-NN index.
    3. Ingest documents into the index.
  3. Search the data.

Some steps in the tutorial contain optional Test it sections. You can ensure that the step was successful by running requests in these sections.

After you’re done, follow the steps in the Clean up section to delete all created components.

Tutorial

You can follow this tutorial using your command line or the OpenSearch Dashboards Dev Tools console.

Step 1: Set up an ML language model

Neural search requires a language model to generate vector embeddings from text fields, both at ingestion time and at query time.

Step 1(a): Choose a language model

For this tutorial, you’ll use the DistilBERT model from Hugging Face. It is one of the pretrained sentence transformer models available in OpenSearch that has shown some of the best results in benchmarking tests (for details, see this blog post). You’ll need the name, version, and dimension of the model to register it. You can find this information in the pretrained model table by selecting the config_url link corresponding to the model’s TorchScript artifact:

  • The model name is huggingface/sentence-transformers/msmarco-distilbert-base-tas-b.
  • The model version is 1.0.1.
  • The number of dimensions for this model is 768.

Take note of the dimensionality of the model because you’ll need it when you set up a k-NN index.

Advanced: Using a different model

Alternatively, you can use a different pretrained model provided by OpenSearch or register a custom model of your own. For information about choosing a model, see Further reading.

Step 1(b): Register a model group

For access control, models are organized into model groups (collections of versions of a particular model). Model group names must be unique within the cluster. Registering a model group ensures the uniqueness of the model group name.

If you are registering the first version of a model without first registering the model group, a new model group is created automatically. For more information, see Model access control.

To register a model group with the access mode set to public, send the following request:

  POST /_plugins/_ml/model_groups/_register
  {
    "name": "NLP_model_group",
    "description": "A model group for NLP models",
    "access_mode": "public"
  }

OpenSearch sends back the model group ID:

  {
    "model_group_id": "Z1eQf4oB5Vm0Tdw8EIP2",
    "status": "CREATED"
  }

You’ll use this ID to register the chosen model to the model group.

Test it

Search for the newly created model group by providing its model group ID in the request:

  POST /_plugins/_ml/model_groups/_search
  {
    "query": {
      "match": {
        "_id": "Z1eQf4oB5Vm0Tdw8EIP2"
      }
    }
  }

The response contains the model group:

  {
    "took": 0,
    "timed_out": false,
    "_shards": {
      "total": 1,
      "successful": 1,
      "skipped": 0,
      "failed": 0
    },
    "hits": {
      "total": {
        "value": 1,
        "relation": "eq"
      },
      "max_score": 1,
      "hits": [
        {
          "_index": ".plugins-ml-model-group",
          "_id": "Z1eQf4oB5Vm0Tdw8EIP2",
          "_version": 1,
          "_seq_no": 14,
          "_primary_term": 2,
          "_score": 1,
          "_source": {
            "created_time": 1694357262582,
            "access": "public",
            "latest_version": 0,
            "last_updated_time": 1694357262582,
            "name": "NLP_model_group",
            "description": "A model group for NLP models"
          }
        }
      ]
    }
  }

Step 1(c): Register the model to the model group

To register the model to the model group, provide the model group ID in the register request:

  POST /_plugins/_ml/models/_register
  {
    "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
    "version": "1.0.1",
    "model_group_id": "Z1eQf4oB5Vm0Tdw8EIP2",
    "model_format": "TORCH_SCRIPT"
  }

Registering a model is an asynchronous task. OpenSearch sends back a task ID for this task:

  {
    "task_id": "aFeif4oB5Vm0Tdw8yoN7",
    "status": "CREATED"
  }

OpenSearch downloads the model configuration file and the model contents from the URL. Because the model is larger than 10 MB, OpenSearch splits it into chunks of up to 10 MB and saves those chunks in the model index. You can check the status of the task by using the Tasks API:

  GET /_plugins/_ml/tasks/aFeif4oB5Vm0Tdw8yoN7

Once the task is complete, the task state will be COMPLETED and the Tasks API response will contain a model ID for the registered model:

  {
    "model_id": "aVeif4oB5Vm0Tdw8zYO2",
    "task_type": "REGISTER_MODEL",
    "function_name": "TEXT_EMBEDDING",
    "state": "COMPLETED",
    "worker_node": [
      "4p6FVOmJRtu3wehDD74hzQ"
    ],
    "create_time": 1694358489722,
    "last_update_time": 1694358499139,
    "is_async": true
  }

You’ll need the model ID to use this model in several of the following steps.

Test it

Search for the newly created model by providing its ID in the request:

  GET /_plugins/_ml/models/aVeif4oB5Vm0Tdw8zYO2

The response contains the model:

  {
    "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
    "model_group_id": "Z1eQf4oB5Vm0Tdw8EIP2",
    "algorithm": "TEXT_EMBEDDING",
    "model_version": "1",
    "model_format": "TORCH_SCRIPT",
    "model_state": "REGISTERED",
    "model_content_size_in_bytes": 266352827,
    "model_content_hash_value": "acdc81b652b83121f914c5912ae27c0fca8fabf270e6f191ace6979a19830413",
    "model_config": {
      "model_type": "distilbert",
      "embedding_dimension": 768,
      "framework_type": "SENTENCE_TRANSFORMERS",
      "all_config": """{"_name_or_path":"old_models/msmarco-distilbert-base-tas-b/0_Transformer","activation":"gelu","architectures":["DistilBertModel"],"attention_dropout":0.1,"dim":768,"dropout":0.1,"hidden_dim":3072,"initializer_range":0.02,"max_position_embeddings":512,"model_type":"distilbert","n_heads":12,"n_layers":6,"pad_token_id":0,"qa_dropout":0.1,"seq_classif_dropout":0.2,"sinusoidal_pos_embds":false,"tie_weights_":true,"transformers_version":"4.7.0","vocab_size":30522}"""
    },
    "created_time": 1694482261832,
    "last_updated_time": 1694482324282,
    "last_registered_time": 1694482270216,
    "last_deployed_time": 1694482324282,
    "total_chunks": 27,
    "planning_worker_node_count": 1,
    "current_worker_node_count": 1,
    "planning_worker_nodes": [
      "4p6FVOmJRtu3wehDD74hzQ"
    ],
    "deploy_to_all_nodes": true
  }

The response contains the model information. You can see that the model_state is REGISTERED. Additionally, the model was split into 27 chunks, as shown in the total_chunks field.

Advanced: Registering a custom model

To register a custom model, you must provide a model configuration in the register request. For example, the following is a register request containing the full format for the model used in this tutorial:

  POST /_plugins/_ml/models/_register
  {
    "name": "sentence-transformers/msmarco-distilbert-base-tas-b",
    "version": "1.0.1",
    "description": "This is a port of the DistilBert TAS-B Model to sentence-transformers model: It maps sentences & paragraphs to a 768 dimensional dense vector space and is optimized for the task of semantic search.",
    "model_task_type": "TEXT_EMBEDDING",
    "model_format": "ONNX",
    "model_content_size_in_bytes": 266291330,
    "model_content_hash_value": "a3c916f24239fbe32c43be6b24043123d49cd2c41b312fc2b29f2fc65e3c424c",
    "model_config": {
      "model_type": "distilbert",
      "embedding_dimension": 768,
      "framework_type": "huggingface_transformers",
      "pooling_mode": "CLS",
      "normalize_result": false,
      "all_config": "{\"_name_or_path\":\"old_models/msmarco-distilbert-base-tas-b/0_Transformer\",\"activation\":\"gelu\",\"architectures\":[\"DistilBertModel\"],\"attention_dropout\":0.1,\"dim\":768,\"dropout\":0.1,\"hidden_dim\":3072,\"initializer_range\":0.02,\"max_position_embeddings\":512,\"model_type\":\"distilbert\",\"n_heads\":12,\"n_layers\":6,\"pad_token_id\":0,\"qa_dropout\":0.1,\"seq_classif_dropout\":0.2,\"sinusoidal_pos_embds\":false,\"tie_weights_\":true,\"transformers_version\":\"4.7.0\",\"vocab_size\":30522}"
    },
    "created_time": 1676074079195,
    "model_group_id": "Z1eQf4oB5Vm0Tdw8EIP2",
    "url": "https://artifacts.opensearch.org/models/ml-models/huggingface/sentence-transformers/msmarco-distilbert-base-tas-b/1.0.1/onnx/sentence-transformers_msmarco-distilbert-base-tas-b-1.0.1-onnx.zip"
  }

For more information, see Using ML models within OpenSearch.

Step 1(d): Deploy the model

Once the model is registered, it is saved in the model index. Next, you’ll need to deploy the model. Deploying a model creates a model instance and caches the model in memory. To deploy the model, provide its model ID to the _deploy endpoint:

  POST /_plugins/_ml/models/aVeif4oB5Vm0Tdw8zYO2/_deploy

Like the register operation, the deploy operation is asynchronous, so you’ll get a task ID in the response:

  {
    "task_id": "ale6f4oB5Vm0Tdw8NINO",
    "status": "CREATED"
  }

You can check the status of the task by using the Tasks API:

  GET /_plugins/_ml/tasks/ale6f4oB5Vm0Tdw8NINO

Once the task is complete, the task state will be COMPLETED:

  {
    "model_id": "aVeif4oB5Vm0Tdw8zYO2",
    "task_type": "DEPLOY_MODEL",
    "function_name": "TEXT_EMBEDDING",
    "state": "COMPLETED",
    "worker_node": [
      "4p6FVOmJRtu3wehDD74hzQ"
    ],
    "create_time": 1694360024141,
    "last_update_time": 1694360027940,
    "is_async": true
  }

Test it

Search for the deployed model by providing its ID in the request:

  GET /_plugins/_ml/models/aVeif4oB5Vm0Tdw8zYO2

The response shows the model state as DEPLOYED:

  {
    "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
    "model_group_id": "Z1eQf4oB5Vm0Tdw8EIP2",
    "algorithm": "TEXT_EMBEDDING",
    "model_version": "1",
    "model_format": "TORCH_SCRIPT",
    "model_state": "DEPLOYED",
    "model_content_size_in_bytes": 266352827,
    "model_content_hash_value": "acdc81b652b83121f914c5912ae27c0fca8fabf270e6f191ace6979a19830413",
    "model_config": {
      "model_type": "distilbert",
      "embedding_dimension": 768,
      "framework_type": "SENTENCE_TRANSFORMERS",
      "all_config": """{"_name_or_path":"old_models/msmarco-distilbert-base-tas-b/0_Transformer","activation":"gelu","architectures":["DistilBertModel"],"attention_dropout":0.1,"dim":768,"dropout":0.1,"hidden_dim":3072,"initializer_range":0.02,"max_position_embeddings":512,"model_type":"distilbert","n_heads":12,"n_layers":6,"pad_token_id":0,"qa_dropout":0.1,"seq_classif_dropout":0.2,"sinusoidal_pos_embds":false,"tie_weights_":true,"transformers_version":"4.7.0","vocab_size":30522}"""
    },
    "created_time": 1694482261832,
    "last_updated_time": 1694482324282,
    "last_registered_time": 1694482270216,
    "last_deployed_time": 1694482324282,
    "total_chunks": 27,
    "planning_worker_node_count": 1,
    "current_worker_node_count": 1,
    "planning_worker_nodes": [
      "4p6FVOmJRtu3wehDD74hzQ"
    ],
    "deploy_to_all_nodes": true
  }

You can also receive statistics for all deployed models in your cluster by sending a Models Profile API request:

  GET /_plugins/_ml/profile/models

Step 2: Ingest data with neural search

Neural search uses a language model to transform text into vector embeddings. At ingestion time, it creates vector embeddings for the text fields in the request. At search time, you can generate vector embeddings for the query text using the same model, allowing you to perform a vector similarity search on the documents.

Step 2(a): Create an ingest pipeline for neural search

Now that you have deployed a model, you can use it to configure neural search. First, you need to create an ingest pipeline, which contains one processor: a task that transforms document fields before documents are ingested into an index. For neural search, you’ll set up a text_embedding processor that creates vector embeddings from text. You’ll need the model_id of the model you set up in the previous section and a field_map, which specifies the name of the field from which to take the text (text) and the name of the field in which to record embeddings (passage_embedding):

  PUT /_ingest/pipeline/nlp-ingest-pipeline
  {
    "description": "An NLP ingest pipeline",
    "processors": [
      {
        "text_embedding": {
          "model_id": "aVeif4oB5Vm0Tdw8zYO2",
          "field_map": {
            "text": "passage_embedding"
          }
        }
      }
    ]
  }

Test it

Search for the created ingest pipeline by using the Ingest API:

  GET /_ingest/pipeline

The response contains the ingest pipeline:

  {
    "nlp-ingest-pipeline": {
      "description": "An NLP ingest pipeline",
      "processors": [
        {
          "text_embedding": {
            "model_id": "aVeif4oB5Vm0Tdw8zYO2",
            "field_map": {
              "text": "passage_embedding"
            }
          }
        }
      ]
    }
  }
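
You can also test the pipeline on a sample document before ingesting any data by using the Simulate API (an optional check; the sample text here is arbitrary):

  POST /_ingest/pipeline/nlp-ingest-pipeline/_simulate
  {
    "docs": [
      {
        "_source": {
          "text": "A cowboy rides a bucking bronco at a rodeo ."
        }
      }
    ]
  }

If the pipeline is configured correctly, the response contains a generated passage_embedding vector for the sample document.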

Step 2(b): Create a k-NN index

Now you’ll create a k-NN index with a field named text, which contains an image description, and a knn_vector field named passage_embedding, which contains the vector embedding of the text. Additionally, set the default ingest pipeline to the nlp-ingest-pipeline you created in the previous step:

  PUT /my-nlp-index
  {
    "settings": {
      "index.knn": true,
      "default_pipeline": "nlp-ingest-pipeline"
    },
    "mappings": {
      "properties": {
        "id": {
          "type": "text"
        },
        "passage_embedding": {
          "type": "knn_vector",
          "dimension": 768,
          "method": {
            "engine": "lucene",
            "space_type": "l2",
            "name": "hnsw",
            "parameters": {}
          }
        },
        "text": {
          "type": "text"
        }
      }
    }
  }

Setting up a k-NN index allows you to later perform a vector search on the passage_embedding field.

Test it

Use the following requests to get the settings and the mappings of the created index:

  GET /my-nlp-index/_settings

  GET /my-nlp-index/_mappings

Step 2(c): Ingest documents into the index

In this step, you’ll ingest several sample documents into the index. The sample data is taken from the Flickr image dataset. Each document contains a text field corresponding to the image description and an id field corresponding to the image ID:

  PUT /my-nlp-index/_doc/1
  {
    "text": "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .",
    "id": "4319130149.jpg"
  }

  PUT /my-nlp-index/_doc/2
  {
    "text": "A wild animal races across an uncut field with a minimal amount of trees .",
    "id": "1775029934.jpg"
  }

  PUT /my-nlp-index/_doc/3
  {
    "text": "People line the stands which advertise Freemont 's orthopedics , a cowboy rides a light brown bucking bronco .",
    "id": "2664027527.jpg"
  }

  PUT /my-nlp-index/_doc/4
  {
    "text": "A man who is riding a wild horse in the rodeo is very near to falling off .",
    "id": "4427058951.jpg"
  }

  PUT /my-nlp-index/_doc/5
  {
    "text": "A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse .",
    "id": "2691147709.jpg"
  }
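
Alternatively, you can ingest all five documents in a single request by using the Bulk API. The following is a sketch of the equivalent bulk request; the index's default ingest pipeline applies to bulk-indexed documents as well:

  POST /my-nlp-index/_bulk
  { "index": { "_id": "1" } }
  { "text": "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .", "id": "4319130149.jpg" }
  { "index": { "_id": "2" } }
  { "text": "A wild animal races across an uncut field with a minimal amount of trees .", "id": "1775029934.jpg" }
  { "index": { "_id": "3" } }
  { "text": "People line the stands which advertise Freemont 's orthopedics , a cowboy rides a light brown bucking bronco .", "id": "2664027527.jpg" }
  { "index": { "_id": "4" } }
  { "text": "A man who is riding a wild horse in the rodeo is very near to falling off .", "id": "4427058951.jpg" }
  { "index": { "_id": "5" } }
  { "text": "A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse .", "id": "2691147709.jpg" }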

When the documents are ingested into the index, the text_embedding processor creates an additional field that contains vector embeddings and adds that field to the document. To see an example document that is indexed, search for document 1:

  GET /my-nlp-index/_doc/1

The response includes the document _source containing the original text and id fields and the added passage_embedding field:

  {
    "_index": "my-nlp-index",
    "_id": "1",
    "_version": 1,
    "_seq_no": 0,
    "_primary_term": 1,
    "found": true,
    "_source": {
      "passage_embedding": [
        0.04491629,
        -0.34105563,
        0.036822468,
        -0.14139028,
        ...
      ],
      "text": "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .",
      "id": "4319130149.jpg"
    }
  }

Step 3: Search the data

Now you’ll search the index using keyword search, neural search, and a combination of the two.

Search using a keyword search

To search using a keyword search, use a match query. You’ll exclude embeddings from the results:

  GET /my-nlp-index/_search
  {
    "_source": {
      "excludes": [
        "passage_embedding"
      ]
    },
    "query": {
      "match": {
        "text": {
          "query": "wild west"
        }
      }
    }
  }

Document 3 is not returned because it does not contain the specified keywords. Documents containing the words rodeo and cowboy are scored lower because semantic meaning is not considered:

Results

  {
    "took": 647,
    "timed_out": false,
    "_shards": {
      "total": 1,
      "successful": 1,
      "skipped": 0,
      "failed": 0
    },
    "hits": {
      "total": {
        "value": 4,
        "relation": "eq"
      },
      "max_score": 1.7878418,
      "hits": [
        {
          "_index": "my-nlp-index",
          "_id": "1",
          "_score": 1.7878418,
          "_source": {
            "text": "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .",
            "id": "4319130149.jpg"
          }
        },
        {
          "_index": "my-nlp-index",
          "_id": "2",
          "_score": 0.58093566,
          "_source": {
            "text": "A wild animal races across an uncut field with a minimal amount of trees .",
            "id": "1775029934.jpg"
          }
        },
        {
          "_index": "my-nlp-index",
          "_id": "5",
          "_score": 0.55228686,
          "_source": {
            "text": "A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse .",
            "id": "2691147709.jpg"
          }
        },
        {
          "_index": "my-nlp-index",
          "_id": "4",
          "_score": 0.53899646,
          "_source": {
            "text": "A man who is riding a wild horse in the rodeo is very near to falling off .",
            "id": "4427058951.jpg"
          }
        }
      ]
    }
  }

Search using a neural search

To search using neural search, use a neural query and provide the model ID of the model you set up earlier so that vector embeddings for the query text are generated using the same model that was used at ingestion time:

  GET /my-nlp-index/_search
  {
    "_source": {
      "excludes": [
        "passage_embedding"
      ]
    },
    "query": {
      "neural": {
        "passage_embedding": {
          "query_text": "wild west",
          "model_id": "aVeif4oB5Vm0Tdw8zYO2",
          "k": 5
        }
      }
    }
  }

This time, the response not only contains all five documents, but the document order is also improved because neural search considers semantic meaning:

Results

  {
    "took": 25,
    "timed_out": false,
    "_shards": {
      "total": 1,
      "successful": 1,
      "skipped": 0,
      "failed": 0
    },
    "hits": {
      "total": {
        "value": 5,
        "relation": "eq"
      },
      "max_score": 0.01585195,
      "hits": [
        {
          "_index": "my-nlp-index",
          "_id": "4",
          "_score": 0.01585195,
          "_source": {
            "text": "A man who is riding a wild horse in the rodeo is very near to falling off .",
            "id": "4427058951.jpg"
          }
        },
        {
          "_index": "my-nlp-index",
          "_id": "2",
          "_score": 0.015748845,
          "_source": {
            "text": "A wild animal races across an uncut field with a minimal amount of trees.",
            "id": "1775029934.jpg"
          }
        },
        {
          "_index": "my-nlp-index",
          "_id": "5",
          "_score": 0.015177963,
          "_source": {
            "text": "A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse .",
            "id": "2691147709.jpg"
          }
        },
        {
          "_index": "my-nlp-index",
          "_id": "1",
          "_score": 0.013272902,
          "_source": {
            "text": "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .",
            "id": "4319130149.jpg"
          }
        },
        {
          "_index": "my-nlp-index",
          "_id": "3",
          "_score": 0.011347735,
          "_source": {
            "text": "People line the stands which advertise Freemont 's orthopedics , a cowboy rides a light brown bucking bronco .",
            "id": "2664027527.jpg"
          }
        }
      ]
    }
  }

Search using a hybrid search

Hybrid search combines keyword and neural search to improve search relevance. To implement hybrid search, you need to set up a search pipeline that runs at search time. The search pipeline you’ll configure intercepts search results at an intermediate stage and applies the normalization-processor to them. The normalization-processor normalizes and combines the document scores from multiple query clauses, rescoring the documents according to the chosen normalization and combination techniques.

Step 1: Configure a search pipeline

To configure a search pipeline with a normalization-processor, use the following request. The normalization technique in the processor is set to min_max, and the combination technique is set to arithmetic_mean. The weights array specifies the weights assigned to each query clause as decimal percentages:

  PUT /_search/pipeline/nlp-search-pipeline
  {
    "description": "Post processor for hybrid search",
    "phase_results_processors": [
      {
        "normalization-processor": {
          "normalization": {
            "technique": "min_max"
          },
          "combination": {
            "technique": "arithmetic_mean",
            "parameters": {
              "weights": [
                0.3,
                0.7
              ]
            }
          }
        }
      }
    ]
  }

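To make the scoring concrete, consider a hypothetical document that receives a raw score of 0.6 from the keyword clause (whose scores range from 0.55 to 1.8) and 0.0157 from the neural clause (whose scores range from 0.0152 to 0.0159). These numbers are illustrative only and are not produced by this tutorial's data:

  normalized = (score - min) / (max - min)
  keyword:  (0.6 - 0.55) / (1.8 - 0.55) = 0.04
  neural:   (0.0157 - 0.0152) / (0.0159 - 0.0152) ≈ 0.71
  combined = 0.3 * 0.04 + 0.7 * 0.71 ≈ 0.51

The document's final score is the weighted arithmetic mean of its min_max-normalized clause scores.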

Step 2: Search with the hybrid query

You’ll use the hybrid query to combine the match and neural query clauses. Make sure to apply the previously created nlp-search-pipeline to the request using the search_pipeline query parameter:

  GET /my-nlp-index/_search?search_pipeline=nlp-search-pipeline
  {
    "_source": {
      "excludes": [
        "passage_embedding"
      ]
    },
    "query": {
      "hybrid": {
        "queries": [
          {
            "match": {
              "text": {
                "query": "cowboy rodeo bronco"
              }
            }
          },
          {
            "neural": {
              "passage_embedding": {
                "query_text": "wild west",
                "model_id": "aVeif4oB5Vm0Tdw8zYO2",
                "k": 5
              }
            }
          }
        ]
      }
    }
  }

Not only does OpenSearch return documents that match the semantic meaning of wild west, but now the documents containing words related to the wild west theme are also scored higher relative to the others:

Results

  {
    "took": 27,
    "timed_out": false,
    "_shards": {
      "total": 1,
      "successful": 1,
      "skipped": 0,
      "failed": 0
    },
    "hits": {
      "total": {
        "value": 5,
        "relation": "eq"
      },
      "max_score": 0.86481035,
      "hits": [
        {
          "_index": "my-nlp-index",
          "_id": "5",
          "_score": 0.86481035,
          "_source": {
            "text": "A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse .",
            "id": "2691147709.jpg"
          }
        },
        {
          "_index": "my-nlp-index",
          "_id": "4",
          "_score": 0.7003,
          "_source": {
            "text": "A man who is riding a wild horse in the rodeo is very near to falling off .",
            "id": "4427058951.jpg"
          }
        },
        {
          "_index": "my-nlp-index",
          "_id": "2",
          "_score": 0.6839765,
          "_source": {
            "text": "A wild animal races across an uncut field with a minimal amount of trees.",
            "id": "1775029934.jpg"
          }
        },
        {
          "_index": "my-nlp-index",
          "_id": "3",
          "_score": 0.3007,
          "_source": {
            "text": "People line the stands which advertise Freemont 's orthopedics , a cowboy rides a light brown bucking bronco .",
            "id": "2664027527.jpg"
          }
        },
        {
          "_index": "my-nlp-index",
          "_id": "1",
          "_score": 0.29919013,
          "_source": {
            "text": "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .",
            "id": "4319130149.jpg"
          }
        }
      ]
    }
  }

Instead of specifying the search pipeline in every request, you can set it as a default search pipeline for the index as follows:

  PUT /my-nlp-index/_settings
  {
    "index.search.default_pipeline": "nlp-search-pipeline"
  }

You can now experiment with different weights, normalization techniques, and combination techniques. For more information, see the normalization-processor and hybrid query documentation.
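
For example, a variant of the search pipeline that uses l2 normalization and a geometric mean combination might look like the following sketch (the name nlp-search-pipeline-l2 is arbitrary; because no weights are specified, the query clauses are weighted equally):

  PUT /_search/pipeline/nlp-search-pipeline-l2
  {
    "description": "Hybrid search with l2 normalization",
    "phase_results_processors": [
      {
        "normalization-processor": {
          "normalization": {
            "technique": "l2"
          },
          "combination": {
            "technique": "geometric_mean"
          }
        }
      }
    ]
  }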

Advanced

You can parameterize the search by using search templates. Search templates hide implementation details, reducing the number of nested levels and thus the query complexity. For more information, see search templates.
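
As a minimal sketch, the following stored search template parameterizes the query text of a neural query (the template ID nlp-neural-template is a placeholder introduced for this example):

  POST /_scripts/nlp-neural-template
  {
    "script": {
      "lang": "mustache",
      "source": "{\"query\":{\"neural\":{\"passage_embedding\":{\"query_text\":\"{{query_text}}\",\"model_id\":\"aVeif4oB5Vm0Tdw8zYO2\",\"k\":5}}}}"
    }
  }

You can then run the template with different parameters:

  GET /my-nlp-index/_search/template
  {
    "id": "nlp-neural-template",
    "params": {
      "query_text": "wild west"
    }
  }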

Clean up

After you’re done, delete the components you’ve created in this tutorial from the cluster:

  DELETE /my-nlp-index

  DELETE /_search/pipeline/nlp-search-pipeline

  DELETE /_ingest/pipeline/nlp-ingest-pipeline

  POST /_plugins/_ml/models/aVeif4oB5Vm0Tdw8zYO2/_undeploy

  DELETE /_plugins/_ml/models/aVeif4oB5Vm0Tdw8zYO2

  DELETE /_plugins/_ml/model_groups/Z1eQf4oB5Vm0Tdw8EIP2

Further reading