Neural Sparse Search tool

Introduced 2.13

The NeuralSparseSearchTool performs sparse vector retrieval. For more information about neural sparse search, see Neural sparse search.

Step 1: Register and deploy a sparse encoding model

OpenSearch supports several pretrained sparse encoding models. You can either use one of those models or your own custom model. For a list of supported pretrained models, see Sparse encoding models. For more information, see OpenSearch-provided pretrained models and Custom local models.

In this example, you’ll use the amazon/neural-sparse/opensearch-neural-sparse-encoding-v1 pretrained model for both ingestion and search. To register and deploy the model to OpenSearch, send the following request:

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-v1",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
```

OpenSearch responds with a task ID for the model registration and deployment task:

```json
{
  "task_id": "M_9KY40Bk4MTqirc5lP8",
  "status": "CREATED"
}
```

You can monitor the status of the task by calling the Tasks API:

```json
GET _plugins/_ml/tasks/M_9KY40Bk4MTqirc5lP8
```

Once the model is registered and deployed, the task state changes to COMPLETED and OpenSearch returns a model ID for the model:

```json
{
  "model_id": "Nf9KY40Bk4MTqirc6FO7",
  "task_type": "REGISTER_MODEL",
  "function_name": "SPARSE_ENCODING",
  "state": "COMPLETED",
  "worker_node": [
    "UyQSTQ3nTFa3IP6IdFKoug"
  ],
  "create_time": 1706767869692,
  "last_update_time": 1706767935556,
  "is_async": true
}
```
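Because registration and deployment run asynchronously, a client typically polls the Tasks API until the task reaches the `COMPLETED` state. The following Python sketch shows one way to do that; `fetch_task` is a hypothetical callable standing in for the HTTP `GET _plugins/_ml/tasks/<task_id>` call, not part of any OpenSearch client library:

```python
import time

def wait_for_model(fetch_task, poll_interval=1.0, max_attempts=30):
    """Poll a registration/deployment task until it completes.

    `fetch_task` is a hypothetical callable that performs
    GET _plugins/_ml/tasks/<task_id> and returns the parsed JSON body.
    Returns the model ID once the task state is COMPLETED.
    """
    for _ in range(max_attempts):
        task = fetch_task()
        state = task.get("state")
        if state == "COMPLETED":
            return task["model_id"]
        if state in ("FAILED", "COMPLETED_WITH_ERROR"):
            raise RuntimeError(f"Model deployment failed: {task}")
        time.sleep(poll_interval)
    raise TimeoutError("Model did not deploy within the allotted attempts")
```

With a real cluster, `fetch_task` would issue the HTTP request shown above and return its JSON response.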

Step 2: Ingest data into an index

First, create an ingest pipeline that encodes documents using the sparse encoding model deployed in the previous step:

```json
PUT /_ingest/pipeline/pipeline-sparse
{
  "description": "A sparse encoding ingest pipeline",
  "processors": [
    {
      "sparse_encoding": {
        "model_id": "Nf9KY40Bk4MTqirc6FO7",
        "field_map": {
          "passage_text": "passage_embedding"
        }
      }
    }
  ]
}
```

Next, create an index specifying the pipeline as the default pipeline:

```json
PUT index_for_neural_sparse
{
  "settings": {
    "default_pipeline": "pipeline-sparse"
  },
  "mappings": {
    "properties": {
      "passage_embedding": {
        "type": "rank_features"
      },
      "passage_text": {
        "type": "text"
      }
    }
  }
}
```
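At ingest time, the `sparse_encoding` processor reads each source field named in `field_map`, encodes it with the model, and stores the resulting token-weight map in the corresponding target field (here, the `rank_features` field `passage_embedding`). The following Python sketch mimics that transformation; the encoder and its weights are illustrative stand-ins, not real model output:

```python
def apply_sparse_encoding(doc, field_map, encode):
    """Mimic the sparse_encoding processor: for each source field in
    field_map, store the model's token-weight map in the target field."""
    enriched = dict(doc)
    for source, target in field_map.items():
        if source in doc:
            enriched[target] = encode(doc[source])
    return enriched

# Hypothetical encoder returning made-up token weights; a real sparse
# encoding model produces learned weights over its vocabulary.
fake_encode = lambda text: {token: 1.0 for token in text.split()}

doc = {"passage_text": "company AAA has over 7000 employees"}
indexed = apply_sparse_encoding(
    doc, {"passage_text": "passage_embedding"}, fake_encode
)
# `indexed` now holds the original text plus a passage_embedding
# token-weight map, matching the rank_features mapping above.
```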

Finally, ingest data into the index by sending a bulk request:

```json
POST _bulk
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "1" } }
{ "passage_text" : "company AAA has a history of 123 years" }
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "2" } }
{ "passage_text" : "company AAA has over 7000 employees" }
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "3" } }
{ "passage_text" : "Jack and Mark established company AAA" }
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "4" } }
{ "passage_text" : "company AAA has a net profit of 13 millions in 2022" }
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "5" } }
{ "passage_text" : "company AAA focus on the large language models domain" }
```
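The bulk body above is NDJSON: an action line followed by a document line per passage, terminated by a trailing newline. A minimal Python helper for building such a body (the function name is ours, not part of any OpenSearch client) can be sketched as:

```python
import json

def build_bulk_body(index, passages):
    """Build an NDJSON _bulk request body: one action line plus one
    document line per passage. The Bulk API requires a trailing newline."""
    lines = []
    for doc_id, text in passages:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps({"passage_text": text}))
    return "\n".join(lines) + "\n"

body = build_bulk_body(
    "index_for_neural_sparse",
    [("1", "company AAA has a history of 123 years"),
     ("2", "company AAA has over 7000 employees")],
)
```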

Step 3: Register a flow agent that will run the NeuralSparseSearchTool

A flow agent runs a sequence of tools in order and returns the last tool’s output. To create a flow agent, send the following request, providing the model ID for the model set up in Step 1. This model will encode your queries into sparse vector embeddings:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Neural_Sparse_Agent_For_RAG",
  "type": "flow",
  "tools": [
    {
      "type": "NeuralSparseSearchTool",
      "parameters": {
        "description": "use this tool to search data from the knowledge base of company AAA",
        "model_id": "Nf9KY40Bk4MTqirc6FO7",
        "index": "index_for_neural_sparse",
        "embedding_field": "passage_embedding",
        "source_field": ["passage_text"],
        "input": "${parameters.question}",
        "doc_size": 2
      }
    }
  ]
}
```

For parameter descriptions, see Register parameters.

OpenSearch responds with an agent ID:

```json
{
  "agent_id": "9X7xWI0Bpc3sThaJdY9i"
}
```

Step 4: Run the agent

Before you run the agent, make sure that the sample documents from Step 2 have been ingested into the index_for_neural_sparse index.

Then, run the agent by sending the following request:

```json
POST /_plugins/_ml/agents/9X7xWI0Bpc3sThaJdY9i/_execute
{
  "parameters": {
    "question": "how many employees does AAA have?"
  }
}
```

OpenSearch returns the inference results:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": """{"_index":"index_for_neural_sparse","_source":{"passage_text":"company AAA has over 7000 employees"},"_id":"2","_score":30.586042}
{"_index":"index_for_neural_sparse","_source":{"passage_text":"company AAA has a history of 123 years"},"_id":"1","_score":16.088133}
"""
        }
      ]
    }
  ]
}
```
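Note that the `result` field is a single string containing one JSON document per line rather than a JSON array. A client consuming this response can split and parse it; a minimal Python sketch (the helper name is ours):

```python
import json

def parse_tool_result(result):
    """Parse the NeuralSparseSearchTool `result` string, which holds
    newline-delimited JSON, into a list of matched-document dicts."""
    return [json.loads(line) for line in result.splitlines() if line.strip()]

result = (
    '{"_index":"index_for_neural_sparse","_source":{"passage_text":'
    '"company AAA has over 7000 employees"},"_id":"2","_score":30.586042}\n'
    '{"_index":"index_for_neural_sparse","_source":{"passage_text":'
    '"company AAA has a history of 123 years"},"_id":"1","_score":16.088133}\n'
)
docs = parse_tool_result(result)
```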

Register parameters

The following table lists all tool parameters that are available when registering an agent.

| Parameter | Type | Required/Optional | Description |
| :--- | :--- | :--- | :--- |
| `model_id` | String | Required | The model ID of the sparse encoding model to use at search time. |
| `index` | String | Required | The index to search. |
| `embedding_field` | String | Required | When the neural sparse model encodes raw text documents, the encoding result is saved in a field. Specify this field as the `embedding_field`. Neural sparse search matches documents to the query by calculating the similarity score between the query text and the text in the document's `embedding_field`. |
| `source_field` | String | Required | The document field or fields to return. You can provide a list of multiple fields as an array of strings, for example, `["field1", "field2"]`. |
| `input` | String | Required for flow agent | Runtime input sourced from flow agent parameters. If using a large language model (LLM), this field is populated with the LLM response. |
| `name` | String | Optional | The tool name. Useful when an LLM needs to select an appropriate tool for a task. |
| `description` | String | Optional | A description of the tool. Useful when an LLM needs to select an appropriate tool for a task. |
| `doc_size` | Integer | Optional | The number of documents to fetch. Default is `2`. |
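The constraints in the table above can be checked client-side before sending the registration request. The following sketch is ours, not part of any OpenSearch client, and only encodes the rules listed in the table:

```python
# Required parameters per the table above.
REQUIRED = {"model_id", "index", "embedding_field", "source_field"}

def validate_tool_parameters(params, is_flow_agent=True):
    """Check NeuralSparseSearchTool parameters against the table above.
    Returns a list of problems; an empty list means the parameters
    satisfy the documented constraints."""
    problems = [f"missing required parameter: {name}"
                for name in sorted(REQUIRED - params.keys())]
    if is_flow_agent and "input" not in params:
        problems.append("missing required parameter: input (required for flow agents)")
    doc_size = params.get("doc_size", 2)  # default is 2
    if not isinstance(doc_size, int) or doc_size < 1:
        problems.append("doc_size must be a positive integer")
    return problems
```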

Execute parameters

The following table lists all tool parameters that are available when running the agent.

| Parameter | Type | Required/Optional | Description |
| :--- | :--- | :--- | :--- |
| `question` | String | Required | The natural language question to send to the LLM. |