Reranking search results using the Cohere Rerank model

A reranking pipeline can rerank search results, providing a relevance score for each document in the search results with respect to the search query. The relevance score is calculated by a cross-encoder model.

This tutorial illustrates how to use the Cohere Rerank model in a reranking pipeline.

Replace the placeholders beginning with the prefix your_ with your own values.

Step 1: Register a Cohere Rerank model

Create a connector for the Cohere Rerank model:

  POST /_plugins/_ml/connectors/_create
  {
    "name": "cohere-rerank",
    "description": "The connector to Cohere reranker model",
    "version": "1",
    "protocol": "http",
    "credential": {
      "cohere_key": "your_cohere_api_key"
    },
    "parameters": {
      "model": "rerank-english-v2.0"
    },
    "actions": [
      {
        "action_type": "predict",
        "method": "POST",
        "url": "https://api.cohere.ai/v1/rerank",
        "headers": {
          "Authorization": "Bearer ${credential.cohere_key}"
        },
        "request_body": "{ \"documents\": ${parameters.documents}, \"query\": \"${parameters.query}\", \"model\": \"${parameters.model}\", \"top_n\": ${parameters.top_n} }",
        "pre_process_function": "connector.pre_process.cohere.rerank",
        "post_process_function": "connector.post_process.cohere.rerank"
      }
    ]
  }
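
The request_body field is a template whose ${parameters.*} placeholders are filled in at prediction time. The following Python sketch illustrates the idea only; it is not the connector's actual implementation. The fill_request_body helper is hypothetical, and the second example document is shortened for brevity.

```python
import json
import re

# Template copied from the connector's "request_body" field above.
REQUEST_BODY_TEMPLATE = (
    '{ "documents": ${parameters.documents}, "query": "${parameters.query}", '
    '"model": "${parameters.model}", "top_n": ${parameters.top_n} }'
)

PLACEHOLDER = re.compile(r"\$\{parameters\.(\w+)\}")


def fill_request_body(template, parameters):
    """Hypothetical helper: substitute ${parameters.<name>} placeholders."""

    def substitute(match):
        value = parameters[match.group(1)]
        encoded = json.dumps(value)
        # The template already wraps string placeholders in quotes, so drop
        # the quotes that json.dumps adds while keeping its escaping.
        return encoded[1:-1] if isinstance(value, str) else encoded

    return PLACEHOLDER.sub(substitute, template)


example_parameters = {
    "model": "rerank-english-v2.0",
    "query": "What is the capital of the United States?",
    "documents": [
        "Carson City is the capital city of the American state of Nevada.",
        "Washington, D.C. is the capital of the United States.",
    ],
    "top_n": 2,
}

filled = fill_request_body(REQUEST_BODY_TEMPLATE, example_parameters)
payload = json.loads(filled)  # The filled template parses as JSON.
print(json.dumps(payload, indent=2))
```

In this sketch, the model value comes from the connector's parameters object, while query, documents, and top_n are supplied with each Predict API call.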

Use the connector ID from the response to register a Cohere Rerank model:

  POST /_plugins/_ml/models/_register?deploy=true
  {
    "name": "cohere rerank model",
    "function_name": "remote",
    "description": "test rerank model",
    "connector_id": "your_connector_id"
  }

Note the model ID in the response; you’ll use it in the following steps.

Test the model by calling the Predict API:

  POST /_plugins/_ml/models/your_model_id/_predict
  {
    "parameters": {
      "query": "What is the capital of the United States?",
      "documents": [
        "Carson City is the capital city of the American state of Nevada.",
        "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
        "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.",
        "Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states."
      ],
      "top_n": 4
    }
  }

To ensure compatibility with the rerank pipeline, the top_n value must be the same as the length of the documents list.
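
One way to keep top_n and the documents list in sync is to derive top_n from the list when building the request body. The following Python sketch assumes a hypothetical build_predict_body helper; it only constructs the JSON body and does not call OpenSearch. The second example document is shortened for brevity.

```python
import json


def build_predict_body(query, documents):
    """Hypothetical helper: build a Predict API body for the rerank model."""
    documents = list(documents)
    if not documents:
        raise ValueError("documents must contain at least one entry")
    return {
        "parameters": {
            "query": query,
            "documents": documents,
            # top_n always equals the number of documents, as required
            # for compatibility with the rerank pipeline.
            "top_n": len(documents),
        }
    }


predict_body = build_predict_body(
    "What is the capital of the United States?",
    [
        "Carson City is the capital city of the American state of Nevada.",
        "Washington, D.C. is the capital of the United States.",
    ],
)
print(json.dumps(predict_body, indent=2))
```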

You can customize the number of top documents returned in the response by providing the size parameter. For more information, see Step 2.3.

OpenSearch responds with the inference results:

  {
    "inference_results": [
      {
        "output": [
          {
            "name": "similarity",
            "data_type": "FLOAT32",
            "shape": [
              1
            ],
            "data": [
              0.10194652
            ]
          },
          {
            "name": "similarity",
            "data_type": "FLOAT32",
            "shape": [
              1
            ],
            "data": [
              0.0721122
            ]
          },
          {
            "name": "similarity",
            "data_type": "FLOAT32",
            "shape": [
              1
            ],
            "data": [
              0.98005307
            ]
          },
          {
            "name": "similarity",
            "data_type": "FLOAT32",
            "shape": [
              1
            ],
            "data": [
              0.27904198
            ]
          }
        ],
        "status_code": 200
      }
    ]
  }

The response contains four similarity objects, one for each input document. The data array of each similarity object contains the relevance score of the corresponding document with respect to the query. The similarity objects are provided in the order of the input documents; the first object pertains to the first document. This differs from the default output of the Cohere Rerank model, which orders documents by relevance score. The connector.post_process.cohere.rerank post-processing function changes the document order to make the output compatible with a reranking pipeline.
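
Because the scores arrive in input order, they can be paired with the input documents by position. The following Python sketch shows one way to do this on the client side. The function names are hypothetical, the response is trimmed to the fields the code reads, and short labels stand in for the full document text.

```python
def scores_in_input_order(response):
    """Return one relevance score per input document, in input order."""
    outputs = response["inference_results"][0]["output"]
    return [item["data"][0] for item in outputs]


def rank_documents(documents, response):
    """Pair each document with its score and sort by descending score."""
    scores = scores_in_input_order(response)
    if len(scores) != len(documents):
        raise ValueError("expected one score per document")
    return sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)


# Predict API response from above, trimmed to the "name" and "data" fields.
response = {
    "inference_results": [
        {
            "output": [
                {"name": "similarity", "data": [0.10194652]},
                {"name": "similarity", "data": [0.0721122]},
                {"name": "similarity", "data": [0.98005307]},
                {"name": "similarity", "data": [0.27904198]},
            ],
            "status_code": 200,
        }
    ]
}

# Short labels for the four input documents, in the order they were sent.
documents = ["Carson City", "Saipan", "Washington, D.C.", "Capital punishment"]

ranked = rank_documents(documents, response)
for label, score in ranked:
    print(f"{score:.8f}  {label}")
```

With these scores, the Washington, D.C. document ranks first, followed by the capital punishment, Carson City, and Saipan documents.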

Step 2: Configure a reranking pipeline

Follow these steps to configure a reranking pipeline.

Step 2.1: Ingest test data

Send a bulk request to ingest test data:

  POST _bulk
  { "index": { "_index": "my-test-data" } }
  { "passage_text" : "Carson City is the capital city of the American state of Nevada." }
  { "index": { "_index": "my-test-data" } }
  { "passage_text" : "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan." }
  { "index": { "_index": "my-test-data" } }
  { "passage_text" : "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district." }
  { "index": { "_index": "my-test-data" } }
  { "passage_text" : "Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states." }

Step 2.2: Create a reranking pipeline

Create a reranking pipeline with the Cohere Rerank model:

  PUT /_search/pipeline/rerank_pipeline_cohere
  {
    "description": "Pipeline for reranking with Cohere Rerank model",
    "response_processors": [
      {
        "rerank": {
          "ml_opensearch": {
            "model_id": "your_model_id_created_in_step1"
          },
          "context": {
            "document_fields": ["passage_text"]
          }
        }
      }
    ]
  }

Step 2.3: Test the reranking

To limit the number of returned results, you can specify the size parameter. For example, set "size": 2 to return the top two documents. The following request sets "size": 4 so that all four test documents are returned:

  GET my-test-data/_search?search_pipeline=rerank_pipeline_cohere
  {
    "query": {
      "match_all": {}
    },
    "size": 4,
    "ext": {
      "rerank": {
        "query_context": {
          "query_text": "What is the capital of the United States?"
        }
      }
    }
  }
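
If you issue this search from a script, the request path and body can be assembled from the index name, the pipeline name, the query text, and the desired size. The following Python sketch assumes a hypothetical build_rerank_search helper; it only builds the request and leaves sending it to whichever HTTP client you use.

```python
import json
from urllib.parse import urlencode


def build_rerank_search(index, pipeline, query_text, size):
    """Hypothetical helper: build the path and body of a reranked search."""
    if size < 1:
        raise ValueError("size must be at least 1")
    # The search pipeline is selected with the search_pipeline URL parameter.
    path = f"/{index}/_search?" + urlencode({"search_pipeline": pipeline})
    body = {
        "query": {"match_all": {}},
        "size": size,  # Caps the number of hits in the response.
        "ext": {"rerank": {"query_context": {"query_text": query_text}}},
    }
    return path, body


path, body = build_rerank_search(
    index="my-test-data",
    pipeline="rerank_pipeline_cohere",
    query_text="What is the capital of the United States?",
    size=2,
)
print("GET", path)
print(json.dumps(body, indent=2))
```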

The response contains all four documents, ordered by relevance score from highest to lowest:

  {
    "took": 0,
    "timed_out": false,
    "_shards": {
      "total": 1,
      "successful": 1,
      "skipped": 0,
      "failed": 0
    },
    "hits": {
      "total": {
        "value": 4,
        "relation": "eq"
      },
      "max_score": 0.98005307,
      "hits": [
        {
          "_index": "my-test-data",
          "_id": "zbUOw40B8vrNLhb9vBif",
          "_score": 0.98005307,
          "_source": {
            "passage_text": "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district."
          }
        },
        {
          "_index": "my-test-data",
          "_id": "zrUOw40B8vrNLhb9vBif",
          "_score": 0.27904198,
          "_source": {
            "passage_text": "Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states."
          }
        },
        {
          "_index": "my-test-data",
          "_id": "y7UOw40B8vrNLhb9vBif",
          "_score": 0.10194652,
          "_source": {
            "passage_text": "Carson City is the capital city of the American state of Nevada."
          }
        },
        {
          "_index": "my-test-data",
          "_id": "zLUOw40B8vrNLhb9vBif",
          "_score": 0.0721122,
          "_source": {
            "passage_text": "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan."
          }
        }
      ]
    },
    "profile": {
      "shards": []
    }
  }

To compare these results to results without reranking, run the search without a reranking pipeline:

  GET my-test-data/_search
  {
    "query": {
      "match_all": {}
    },
    "ext": {
      "rerank": {
        "query_context": {
          "query_text": "What is the capital of the United States?"
        }
      }
    }
  }

The first document in the response pertains to Carson City, which is not the capital of the United States:

  {
    "took": 0,
    "timed_out": false,
    "_shards": {
      "total": 1,
      "successful": 1,
      "skipped": 0,
      "failed": 0
    },
    "hits": {
      "total": {
        "value": 4,
        "relation": "eq"
      },
      "max_score": 1,
      "hits": [
        {
          "_index": "my-test-data",
          "_id": "y7UOw40B8vrNLhb9vBif",
          "_score": 1,
          "_source": {
            "passage_text": "Carson City is the capital city of the American state of Nevada."
          }
        },
        {
          "_index": "my-test-data",
          "_id": "zLUOw40B8vrNLhb9vBif",
          "_score": 1,
          "_source": {
            "passage_text": "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan."
          }
        },
        {
          "_index": "my-test-data",
          "_id": "zbUOw40B8vrNLhb9vBif",
          "_score": 1,
          "_source": {
            "passage_text": "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district."
          }
        },
        {
          "_index": "my-test-data",
          "_id": "zrUOw40B8vrNLhb9vBif",
          "_score": 1,
          "_source": {
            "passage_text": "Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states."
          }
        }
      ]
    }
  }