Neural sparse search
Introduced 2.11
Semantic search relies on dense retrieval based on text embedding models. However, dense methods use k-NN search, which consumes a large amount of memory and CPU resources. Neural sparse search, an alternative to dense semantic search, is implemented using an inverted index and is therefore as efficient as BM25. Neural sparse search is facilitated by sparse embedding models. When you perform a neural sparse search, it creates a sparse vector (a list of `token: weight` key-value pairs representing an entry and its weight) and ingests the data into a rank features index.
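As a sketch of what this storage looks like, the following request creates an index whose embedding field uses the `rank_features` field type, which stores sparse vectors in an inverted index (the index and field names here are placeholders, not names from this document):

```json
PUT /my-nlp-index
{
  "mappings": {
    "properties": {
      "passage_text": {
        "type": "text"
      },
      "passage_embedding": {
        "type": "rank_features"
      }
    }
  }
}
```

A document's `passage_embedding` value is then a JSON object of token-to-weight pairs, such as `{"hello": 1.52, "world": 0.94}`.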
To further boost search relevance, you can combine neural sparse search with dense semantic search using a hybrid query.
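As a hedged sketch of such a combination, the following `hybrid` query runs a `neural_sparse` subquery and a dense `neural` subquery together (the index name, field names, and model IDs are placeholders; a hybrid query also requires a search pipeline with a normalization processor to combine the subquery scores):

```json
GET /my-nlp-index/_search
{
  "query": {
    "hybrid": {
      "queries": [
        {
          "neural_sparse": {
            "passage_sparse_embedding": {
              "query_text": "What is neural sparse search?",
              "model_id": "<sparse encoding model ID>"
            }
          }
        },
        {
          "neural": {
            "passage_dense_embedding": {
              "query_text": "What is neural sparse search?",
              "model_id": "<dense embedding model ID>",
              "k": 10
            }
          }
        }
      ]
    }
  }
}
```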
You can configure neural sparse search in the following ways:
- Generate vector embeddings within OpenSearch: Configure an ingest pipeline to generate and store sparse vector embeddings from document text at ingestion time. At query time, input plain text, which will be automatically converted into vector embeddings for search. For complete setup steps, see Configuring ingest pipelines for neural sparse search.
- Ingest raw sparse vectors and search using sparse vectors directly. For complete setup steps, see Ingesting and searching raw vectors.
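For the second option, a minimal sketch (index and field names, token strings, and weights are illustrative placeholders) ingests a precomputed sparse vector and then searches it directly using the `query_tokens` parameter of the `neural_sparse` query:

```json
PUT /my-nlp-index/_doc/1
{
  "passage_text": "Hello world",
  "passage_embedding": {
    "hello": 1.52,
    "world": 0.94
  }
}

GET /my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_tokens": {
          "hello": 1.1,
          "world": 0.8
        }
      }
    }
  }
}
```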
To learn more about splitting long text into passages for neural search, see Text chunking.
Accelerating neural sparse search
Starting with OpenSearch version 2.15, you can significantly accelerate the search process by creating a search pipeline with a `neural_sparse_two_phase_processor`.
To create a search pipeline with a two-phase processor for neural sparse search, use the following request:
PUT /_search/pipeline/two_phase_search_pipeline
{
  "request_processors": [
    {
      "neural_sparse_two_phase_processor": {
        "tag": "neural-sparse",
        "description": "Creates a two-phase processor for neural sparse search."
      }
    }
  ]
}
Then choose the index you want to configure with the search pipeline and set `index.search.default_pipeline` to the pipeline name, as shown in the following example:
PUT /my-nlp-index/_settings
{
  "index.search.default_pipeline": "two_phase_search_pipeline"
}
For information about `two_phase_search_pipeline`, see Neural sparse query two-phase processor.
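Once the default pipeline is set, an ordinary `neural_sparse` query against the index is accelerated automatically; no query changes are needed. A sketch of such a query follows (the field name and model ID are placeholders):

```json
GET /my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "What is neural sparse search?",
        "model_id": "<sparse encoding model ID>"
      }
    }
  }
}
```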
Further reading
- Learn more about how sparse encoding models work and explore OpenSearch neural sparse search benchmarks in Improving document retrieval with sparse semantic encoders.
- Learn the fundamentals of neural sparse search and its efficiency in A deep dive into faster semantic sparse retrieval in OpenSearch 2.12.