Reranking by a field using an externally hosted cross-encoder model

Introduced 2.18

In this tutorial, you’ll learn how to use a cross-encoder model hosted on Amazon SageMaker to rerank search results and improve search relevance.

To rerank documents, you’ll configure a search pipeline that processes search results at query time. The pipeline intercepts search results and passes them to the ml_inference search response processor, which invokes the cross-encoder model. The model generates a relevance score for each matching document, and the rerank processor then reorders the documents by that score using the by_field rerank type.

Prerequisite: Deploy a model on Amazon SageMaker

Run the following code to deploy the ms-marco-MiniLM-L-6-v2 Hugging Face cross-encoder model on Amazon SageMaker. The example uses a CPU instance (ml.m5.xlarge); we recommend using a GPU instance for better performance:

    import sagemaker
    import boto3
    from sagemaker.huggingface import HuggingFaceModel

    sess = sagemaker.Session()
    # IAM role that SageMaker uses to create the endpoint
    role = sagemaker.get_execution_role()

    # Hugging Face Hub model ID and task
    hub = {
        'HF_MODEL_ID': 'cross-encoder/ms-marco-MiniLM-L-6-v2',
        'HF_TASK': 'text-classification'
    }

    # Describe the model container
    huggingface_model = HuggingFaceModel(
        transformers_version='4.37.0',
        pytorch_version='2.1.0',
        py_version='py310',
        env=hub,
        role=role,
    )

    # Deploy the model to a SageMaker inference endpoint
    predictor = huggingface_model.deploy(
        initial_instance_count=1,     # number of instances
        instance_type='ml.m5.xlarge'  # EC2 instance type
    )

After deploying the model, you can find the model endpoint by opening the Amazon SageMaker console in the AWS Management Console and selecting Inference > Endpoints in the left navigation pane. Note the URL of the created endpoint; you’ll use it to create a connector.
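
SageMaker real-time endpoints are invoked at a URL built from the AWS Region and the endpoint name. As a minimal sketch, the following hypothetical helper (not part of the SageMaker SDK) assembles that URL. The Region and endpoint name shown are placeholders, not values from this tutorial; in the deployment script above, the name should be available as predictor.endpoint_name:

```python
def sagemaker_invocation_url(region: str, endpoint_name: str) -> str:
    """Build the HTTPS invocation URL for a SageMaker real-time endpoint."""
    return (
        f"https://runtime.sagemaker.{region}.amazonaws.com"
        f"/endpoints/{endpoint_name}/invocations"
    )


# Placeholder values for illustration only.
url = sagemaker_invocation_url("us-east-1", "my-cross-encoder-endpoint")
print(url)
```

The resulting string is the value to supply as <YOUR_SAGEMAKER_ENDPOINT_URL> when you create the connector.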

Running a search with reranking

To run a search with reranking, follow these steps:

  1. Create a connector.
  2. Register the model.
  3. Ingest documents into an index.
  4. Create a search pipeline.
  5. Search using reranking.

Step 1: Create a connector

Create a connector to the cross-encoder model by providing the model URL in the actions.url parameter:

    POST /_plugins/_ml/connectors/_create
    {
      "name": "SageMaker cross-encoder model",
      "description": "Test connector for SageMaker cross-encoder hosted model",
      "version": 1,
      "protocol": "aws_sigv4",
      "credential": {
        "access_key": "<YOUR_ACCESS_KEY>",
        "secret_key": "<YOUR_SECRET_KEY>",
        "session_token": "<YOUR_SESSION_TOKEN>"
      },
      "parameters": {
        "region": "<REGION>",
        "service_name": "sagemaker"
      },
      "actions": [
        {
          "action_type": "predict",
          "method": "POST",
          "url": "<YOUR_SAGEMAKER_ENDPOINT_URL>",
          "headers": {
            "content-type": "application/json"
          },
          "request_body": "{ \"inputs\": { \"text\": \"${parameters.text}\", \"text_pair\": \"${parameters.text_pair}\" }}"
        }
      ]
    }

Note the connector ID contained in the response; you’ll use it in the following step.
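
The request_body in the connector is a template: at prediction time, each ${parameters.<name>} placeholder is filled with the corresponding parameter value. The following Python sketch only illustrates that idea. The render_request_body function is hypothetical and is not the substitution code that OpenSearch runs:

```python
import json
import re

# The request_body template from the connector, unescaped.
REQUEST_BODY = (
    '{ "inputs": { "text": "${parameters.text}", '
    '"text_pair": "${parameters.text_pair}" }}'
)


def render_request_body(template, parameters):
    """Replace each ${parameters.<name>} placeholder with a JSON-escaped value."""
    def substitute(match):
        value = parameters[match.group(1)]
        # json.dumps escapes quotes and backslashes; drop the surrounding
        # quotes because the template already supplies them.
        return json.dumps(value)[1:-1]

    return re.sub(r"\$\{parameters\.(\w+)\}", substitute, template)


body = render_request_body(REQUEST_BODY, {
    "text": "Astoria is home to many artists and has a large Greek-American community.",
    "text_pair": "artists art creative community",
})
payload = json.loads(body)  # the rendered body is valid JSON
print(payload["inputs"]["text_pair"])
```

The rendered payload pairs one document text with the query text, which is the input shape this tutorial sends to the cross-encoder endpoint.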

Step 2: Register the model

To register the model, provide the connector ID in the connector_id parameter:

    POST /_plugins/_ml/models/_register
    {
      "name": "Cross encoder model",
      "version": "1.0.1",
      "function_name": "remote",
      "description": "Using a SageMaker endpoint to apply a cross encoder model",
      "connector_id": "<YOUR_CONNECTOR_ID>"
    }

Note the model ID contained in the response; you’ll use it in Step 4.

Step 3: Ingest documents into an index

Create an index and ingest sample documents containing facts about the New York City boroughs:

    POST /nyc_areas/_bulk
    { "index": { "_id": 1 } }
    { "borough": "Queens", "area_name": "Astoria", "description": "Astoria is a neighborhood in the western part of Queens, New York City, known for its diverse community and vibrant cultural scene.", "population": 93000, "facts": "Astoria is home to many artists and has a large Greek-American community. The area also boasts some of the best Mediterranean food in NYC." }
    { "index": { "_id": 2 } }
    { "borough": "Queens", "area_name": "Flushing", "description": "Flushing is a neighborhood in the northern part of Queens, famous for its Asian-American population and bustling business district.", "population": 227000, "facts": "Flushing is one of the most ethnically diverse neighborhoods in NYC, with a large Chinese and Korean population. It is also home to the USTA Billie Jean King National Tennis Center." }
    { "index": { "_id": 3 } }
    { "borough": "Brooklyn", "area_name": "Williamsburg", "description": "Williamsburg is a trendy neighborhood in Brooklyn known for its hipster culture, vibrant art scene, and excellent restaurants.", "population": 150000, "facts": "Williamsburg is a hotspot for young professionals and artists. The neighborhood has seen rapid gentrification over the past two decades." }
    { "index": { "_id": 4 } }
    { "borough": "Manhattan", "area_name": "Harlem", "description": "Harlem is a historic neighborhood in Upper Manhattan, known for its significant African-American cultural heritage.", "population": 116000, "facts": "Harlem was the birthplace of the Harlem Renaissance, a cultural movement that celebrated Black culture through art, music, and literature." }
    { "index": { "_id": 5 } }
    { "borough": "The Bronx", "area_name": "Riverdale", "description": "Riverdale is a suburban-like neighborhood in the Bronx, known for its leafy streets and affluent residential areas.", "population": 48000, "facts": "Riverdale is one of the most affluent areas in the Bronx, with beautiful parks, historic homes, and excellent schools." }
    { "index": { "_id": 6 } }
    { "borough": "Staten Island", "area_name": "St. George", "description": "St. George is the main commercial and cultural center of Staten Island, offering stunning views of Lower Manhattan.", "population": 15000, "facts": "St. George is home to the Staten Island Ferry terminal and is a gateway to Staten Island, offering stunning views of the Statue of Liberty and Ellis Island." }

Step 4: Create a search pipeline

Next, create a search pipeline for reranking. In the search pipeline configuration, the input_map and output_map define how the input data is prepared for the cross-encoder model and how the model’s output is interpreted for reranking:

  • The input_map specifies which fields in the search documents and the query should be used as model inputs:

    • The text field maps to the facts field in the indexed documents. It provides the document-specific content that the model will analyze.
    • The text_pair field dynamically retrieves the search query text (multi_match.query) from the search request.

    The combination of text (document facts) and text_pair (search query) allows the cross-encoder model to compare the relevance of the document to the query, considering their semantic relationship.

  • The output_map field specifies how the output of the model is mapped to the fields in the response:

    • The rank_score field in the response will store the model’s relevance score, which will be used to perform reranking.

When using the by_field rerank type, the rerank processor sets each document’s _score to the value of the rank_score field, so after reranking both fields contain the same score. To remove the redundant rank_score field from the search results, set remove_target_field to true. To include the original BM25 score for debugging purposes, set keep_previous_score to true; the score is returned in the previous_score field. This allows you to compare the original score with the reranked score to evaluate improvements in search relevance.
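
The effect of these options can be illustrated with a small Python simulation. This is a sketch of the behavior described above, not the rerank processor's implementation, and rerank_by_field is a hypothetical name. The scores are the BM25 and cross-encoder scores from the sample response in Step 5:

```python
def rerank_by_field(hits, target_field,
                    remove_target_field=False, keep_previous_score=False):
    """Return new hits sorted by _source[target_field], highest first."""
    reranked = []
    for hit in hits:
        source = dict(hit["_source"])  # copy so the input is not mutated
        new_score = source[target_field]
        if keep_previous_score:
            # Preserve the original (BM25) score for comparison.
            source["previous_score"] = hit["_score"]
        if remove_target_field:
            # The value now lives in _score, so the field is redundant.
            del source[target_field]
        reranked.append({**hit, "_score": new_score, "_source": source})
    return sorted(reranked, key=lambda h: h["_score"], reverse=True)


# Hits in BM25 order, each already carrying a model score in rank_score.
hits = [
    {"_id": "1", "_score": 2.519608,
     "_source": {"area_name": "Astoria", "rank_score": 0.0090838}},
    {"_id": "4", "_score": 1.6489418,
     "_source": {"area_name": "Harlem", "rank_score": 0.03418137}},
    {"_id": "3", "_score": 1.5632852,
     "_source": {"area_name": "Williamsburg", "rank_score": 0.0032599436}},
]

reranked_hits = rerank_by_field(
    hits, "rank_score", remove_target_field=True, keep_previous_score=True
)
for hit in reranked_hits:
    print(hit["_source"]["area_name"], hit["_score"],
          hit["_source"]["previous_score"])
```

In this simulation, Harlem moves ahead of Astoria, each hit keeps its BM25 score in previous_score, and rank_score no longer appears in _source.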

To create the search pipeline, send the following request:

    PUT /_search/pipeline/my_pipeline
    {
      "response_processors": [
        {
          "ml_inference": {
            "tag": "ml_inference",
            "description": "This processor runs ml inference during search response",
            "model_id": "<model_id_from_step_2>",
            "function_name": "REMOTE",
            "input_map": [
              {
                "text": "facts",
                "text_pair": "$._request.query.multi_match.query"
              }
            ],
            "output_map": [
              {
                "rank_score": "$.score"
              }
            ],
            "full_response_path": false,
            "model_config": {},
            "ignore_missing": false,
            "ignore_failure": false,
            "one_to_one": true
          },
          "rerank": {
            "by_field": {
              "target_field": "rank_score",
              "remove_target_field": true,
              "keep_previous_score": true
            }
          }
        }
      ]
    }
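
To make the input_map and output_map in this request concrete, the following Python sketch mimics how one model input is assembled per search hit and how each returned score is written back to the hit. The function names are hypothetical, and the shape of the model output (an object with a score key) is an assumption based on the "$.score" path in the output_map:

```python
def build_model_inputs(search_request, hits, document_field="facts"):
    """Pair each hit's document field with the query text (one input per hit)."""
    # Mirrors "$._request.query.multi_match.query" from the input_map.
    query_text = search_request["query"]["multi_match"]["query"]
    return [
        {"text": hit["_source"][document_field], "text_pair": query_text}
        for hit in hits
    ]


def apply_output_map(hits, model_outputs, target_field="rank_score"):
    """Write each model output's score into the matching hit (mirrors "$.score")."""
    for hit, output in zip(hits, model_outputs):
        hit["_source"][target_field] = output["score"]
    return hits


search_request = {
    "query": {
        "multi_match": {
            "query": "artists art creative community",
            "fields": ["description", "facts"],
        }
    }
}
hits = [
    {"_id": "4", "_source": {"facts": "Harlem was the birthplace of the Harlem Renaissance, a cultural movement that celebrated Black culture through art, music, and literature."}},
    {"_id": "1", "_source": {"facts": "Astoria is home to many artists and has a large Greek-American community. The area also boasts some of the best Mediterranean food in NYC."}},
]

model_inputs = build_model_inputs(search_request, hits)
# Assumed model outputs, one per input; the values are taken from the
# reranked scores in this tutorial's sample response.
model_outputs = [{"score": 0.03418137}, {"score": 0.0090838}]
scored_hits = apply_output_map(hits, model_outputs)
for hit in scored_hits:
    print(hit["_id"], hit["_source"]["rank_score"])
```

Because one_to_one is set to true, the sketch produces a separate model input for each document instead of batching all documents into a single request.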

Step 5: Search using reranking

Use the following request to search indexed documents and rerank them using the cross-encoder model. The request retrieves documents containing any of the specified terms in the description or facts fields. The pipeline then passes the query text and each matched document’s facts field to the model, and the documents are reranked by the resulting scores:

    POST /nyc_areas/_search?search_pipeline=my_pipeline
    {
      "query": {
        "multi_match": {
          "query": "artists art creative community",
          "fields": ["description", "facts"]
        }
      }
    }

In the response, the previous_score field contains the document’s BM25 score, which the document would have received if you hadn’t applied the pipeline. Note that while BM25 ranked “Astoria” the highest, the cross-encoder model prioritized “Harlem” because it scores the semantic relationship between the query and each document’s facts field rather than counting matching terms:

    {
      "took": 4,
      "timed_out": false,
      "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": {
          "value": 3,
          "relation": "eq"
        },
        "max_score": 0.03418137,
        "hits": [
          {
            "_index": "nyc_areas",
            "_id": "4",
            "_score": 0.03418137,
            "_source": {
              "area_name": "Harlem",
              "description": "Harlem is a historic neighborhood in Upper Manhattan, known for its significant African-American cultural heritage.",
              "previous_score": 1.6489418,
              "borough": "Manhattan",
              "facts": "Harlem was the birthplace of the Harlem Renaissance, a cultural movement that celebrated Black culture through art, music, and literature.",
              "population": 116000
            }
          },
          {
            "_index": "nyc_areas",
            "_id": "1",
            "_score": 0.0090838,
            "_source": {
              "area_name": "Astoria",
              "description": "Astoria is a neighborhood in the western part of Queens, New York City, known for its diverse community and vibrant cultural scene.",
              "previous_score": 2.519608,
              "borough": "Queens",
              "facts": "Astoria is home to many artists and has a large Greek-American community. The area also boasts some of the best Mediterranean food in NYC.",
              "population": 93000
            }
          },
          {
            "_index": "nyc_areas",
            "_id": "3",
            "_score": 0.0032599436,
            "_source": {
              "area_name": "Williamsburg",
              "description": "Williamsburg is a trendy neighborhood in Brooklyn known for its hipster culture, vibrant art scene, and excellent restaurants.",
              "previous_score": 1.5632852,
              "borough": "Brooklyn",
              "facts": "Williamsburg is a hotspot for young professionals and artists. The neighborhood has seen rapid gentrification over the past two decades.",
              "population": 150000
            }
          }
        ]
      },
      "profile": {
        "shards": []
      }
    }
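
To compare the two rankings programmatically, you can sort the returned hits by previous_score and set that order against the order in which they were returned. The following Python sketch assumes the response has already been parsed into a dictionary. The compare_rankings function is hypothetical, and the response shown is trimmed to the fields the function reads:

```python
def compare_rankings(response, label_field="area_name"):
    """Return (BM25 order, reranked order) as lists of document labels."""
    hits = response["hits"]["hits"]
    # Hits are returned in reranked order.
    reranked_order = [hit["_source"][label_field] for hit in hits]
    # Sorting by previous_score reconstructs the original BM25 order.
    by_bm25 = sorted(
        hits, key=lambda hit: hit["_source"]["previous_score"], reverse=True
    )
    bm25_order = [hit["_source"][label_field] for hit in by_bm25]
    return bm25_order, reranked_order


# Trimmed copy of the response above.
response = {
    "hits": {
        "hits": [
            {"_id": "4", "_score": 0.03418137,
             "_source": {"area_name": "Harlem", "previous_score": 1.6489418}},
            {"_id": "1", "_score": 0.0090838,
             "_source": {"area_name": "Astoria", "previous_score": 2.519608}},
            {"_id": "3", "_score": 0.0032599436,
             "_source": {"area_name": "Williamsburg", "previous_score": 1.5632852}},
        ]
    }
}

bm25_order, reranked_order = compare_rankings(response)
print("BM25 order:    ", bm25_order)
print("Reranked order:", reranked_order)
```

For this response, the BM25 order is Astoria, Harlem, Williamsburg, while the reranked order is Harlem, Astoria, Williamsburg.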