Radial search

Radial search extends the k-NN plugin’s capabilities beyond approximate top-k searches. Instead of returning a fixed number of nearest neighbors, radial search returns all points in a vector space that lie within a specified maximum distance of a query point or that meet a minimum score threshold relative to it.

Parameter type

max_distance allows users to specify a physical distance within the vector space, identifying all points that are within this distance from the query point. This approach is particularly useful for applications requiring spatial proximity or absolute distance measurements.

min_score enables the specification of a similarity score, facilitating the retrieval of points that meet or exceed this score in relation to the query point. This method is ideal in scenarios where relative similarity, based on a specific metric, is more critical than physical proximity.

A k-NN query must specify exactly one of the query variables k, max_distance, or min_score. For radial search, specify either max_distance or min_score. For more information about the vector spaces, see Spaces.
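The one-variable rule can be checked on the client before a request is sent. The following is a minimal sketch under that assumption: the helper name build_knn_query is hypothetical and is not part of any OpenSearch client library. It only assembles a request body in the same shape as the examples later in this section.

```python
import json


def build_knn_query(field, vector, k=None, max_distance=None, min_score=None):
    """Build a k-NN query body with exactly one of k, max_distance, or min_score.

    Hypothetical client-side helper for illustration; it does not contact OpenSearch.
    """
    options = {"k": k, "max_distance": max_distance, "min_score": min_score}
    # Keep only the query variables that the caller actually provided.
    provided = {name: value for name, value in options.items() if value is not None}
    if len(provided) != 1:
        names = ", ".join(sorted(provided)) or "none"
        raise ValueError(
            "Specify exactly one of k, max_distance, or min_score; got: " + names
        )
    return {"query": {"knn": {field: {"vector": list(vector), **provided}}}}


# Build the body for a radial search with max_distance.
print(json.dumps(build_knn_query("my_vector", [7.1, 8.3], max_distance=2), indent=2))
```

Passing both max_distance and min_score, or none of the three variables, raises a ValueError instead of producing an ambiguous request.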

Supported cases

You can perform radial search with either the Lucene or Faiss engines. The following table summarizes radial search use cases by engine.

Engine supported   Filter supported   Nested field supported   Search type
Lucene             true               false                    approximate
Faiss              true               true                     approximate

Spaces

A space corresponds to the function used to measure the distance between two points in order to determine the k-nearest neighbors. When using k-NN, a lower score equates to a closer and better result. This is the opposite of how OpenSearch scores results, where a greater score equates to a better result. To convert distances to OpenSearch scores, radial search uses the following formula: 1 / (1 + distance). The k-NN plugin supports the following spaces. Not every method supports each of these spaces. Be sure to refer to the method documentation to verify that the space you want to use is supported.

Space type: l1
Distance function (d): \[ d(\mathbf{x}, \mathbf{y}) = \sum_{i=1}^n |x_i - y_i| \]
OpenSearch score: \[ score = {1 \over 1 + d} \]

Space type: l2
Distance function (d): \[ d(\mathbf{x}, \mathbf{y}) = \sum_{i=1}^n (x_i - y_i)^2 \]
OpenSearch score: \[ score = {1 \over 1 + d} \]

Space type: linf
Distance function (d): \[ d(\mathbf{x}, \mathbf{y}) = \max_i(|x_i - y_i|) \]
OpenSearch score: \[ score = {1 \over 1 + d} \]

Space type: cosinesimil
Distance function (d): \[ d(\mathbf{x}, \mathbf{y}) = 1 - \cos{\theta} = 1 - {\mathbf{x} \cdot \mathbf{y} \over \|\mathbf{x}\| \cdot \|\mathbf{y}\|} = 1 - {\sum_{i=1}^n x_i y_i \over \sqrt{\sum_{i=1}^n x_i^2} \cdot \sqrt{\sum_{i=1}^n y_i^2}} \] where \(\|\mathbf{x}\|\) and \(\|\mathbf{y}\|\) represent the norms of vectors x and y, respectively.
OpenSearch score (nmslib and Faiss): \[ score = {1 \over 1 + d} \]
OpenSearch score (Lucene): \[ score = {2 - d \over 2} \]

Space type: innerproduct (supported for Lucene in OpenSearch version 2.13 and later)
Distance function (d): \[ d(\mathbf{x}, \mathbf{y}) = -\mathbf{x} \cdot \mathbf{y} = -\sum_{i=1}^n x_i y_i \]
Distance function (d), Lucene: \[ d(\mathbf{x}, \mathbf{y}) = \mathbf{x} \cdot \mathbf{y} = \sum_{i=1}^n x_i y_i \]
OpenSearch score: \[ \text{If } d \ge 0, \; score = {1 \over 1 + d} \] \[ \text{If } d < 0, \; score = -d + 1 \]
OpenSearch score, Lucene: \[ \text{If } d > 0, \; score = d + 1 \] \[ \text{If } d \le 0, \; score = {1 \over 1 + (-1 \cdot d)} \]
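The score formulas listed above can be written out as small functions to make the conversions concrete. This is an illustrative sketch only: the function names are hypothetical, and the code mirrors the formulas in this section rather than the plugin's actual implementation.

```python
def distance_score(d):
    """Score for l1, l2, and linf, and for cosinesimil with nmslib and Faiss."""
    return 1.0 / (1.0 + d)


def cosine_score_lucene(d):
    """Score for cosinesimil with Lucene, where d = 1 - cosine similarity."""
    return (2.0 - d) / 2.0


def innerproduct_score(d):
    """Score for innerproduct where d is the negated dot product."""
    return 1.0 / (1.0 + d) if d >= 0 else -d + 1.0


def innerproduct_score_lucene(d):
    """Score for innerproduct with Lucene, where d is the dot product itself."""
    return d + 1.0 if d > 0 else 1.0 / (1.0 + (-1.0 * d))


# A dot product of 3 gives d = -3 for the negated form and d = 3 for Lucene;
# both forms map it to the same score.
print(innerproduct_score(-3.0), innerproduct_score_lucene(3.0))
```

Note that the two inner product variants agree once the sign convention of d is taken into account: a larger dot product always produces a larger score.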

The cosine similarity formula does not include the 1 - prefix. However, because similarity search libraries equate lower scores with closer results, they return 1 - cosineSimilarity for the cosine similarity space. This is why 1 - is included in the distance function.

With cosine similarity, it is not valid to pass a zero vector ([0, 0, ...]) as an input. This is because the magnitude of such a vector is 0, which raises a divide by 0 exception in the corresponding formula. Requests containing a zero vector will be rejected, and a corresponding exception will be thrown.
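The zero-vector restriction follows directly from the distance function, as the following sketch shows. It is a plain Python illustration of the formula, not the plugin's code, and the function name cosine_distance is an assumption made for this example.

```python
import math


def cosine_distance(x, y):
    """Return 1 - cosine similarity, rejecting zero vectors up front."""
    if len(x) != len(y):
        raise ValueError("Vectors must have the same dimension")
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    # A zero vector has magnitude 0, so the division below would be undefined.
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("A zero vector is not valid input for cosine similarity")
    dot = sum(a * b for a, b in zip(x, y))
    return 1.0 - dot / (norm_x * norm_y)


print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
```

Validating vectors in the same way on the client avoids sending requests that would be rejected.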

Examples

The following examples can help you to get started with radial search.

Prerequisites

To use a k-NN index with radial search, create a k-NN index by setting index.knn to true. Specify one or more fields of the knn_vector data type, as shown in the following example:

PUT knn-index-test
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1,
    "index.knn": true
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "dimension": 2,
        "method": {
          "name": "hnsw",
          "space_type": "l2",
          "engine": "faiss",
          "parameters": {
            "ef_construction": 100,
            "m": 16,
            "ef_search": 100
          }
        }
      }
    }
  }
}


After you create the index, add some data similar to the following:

PUT _bulk?refresh=true
{"index": {"_index": "knn-index-test", "_id": "1"}}
{"my_vector": [7.0, 8.2], "price": 4.4}
{"index": {"_index": "knn-index-test", "_id": "2"}}
{"my_vector": [7.1, 7.4], "price": 14.2}
{"index": {"_index": "knn-index-test", "_id": "3"}}
{"my_vector": [7.3, 8.3], "price": 19.1}
{"index": {"_index": "knn-index-test", "_id": "4"}}
{"my_vector": [6.5, 8.8], "price": 1.2}
{"index": {"_index": "knn-index-test", "_id": "5"}}
{"my_vector": [5.7, 7.9], "price": 16.5}


Example: Radial search with max_distance

The following example shows a radial search performed with max_distance:

GET knn-index-test/_search
{
  "query": {
    "knn": {
      "my_vector": {
        "vector": [
          7.1,
          8.3
        ],
        "max_distance": 2
      }
    }
  }
}


All documents that fall within the squared Euclidean distance (l2^2) of 2 are returned, as shown in the following response:

Results

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 4,
      "relation": "eq"
    },
    "max_score": 0.98039204,
    "hits": [
      {
        "_index": "knn-index-test",
        "_id": "1",
        "_score": 0.98039204,
        "_source": {
          "my_vector": [
            7.0,
            8.2
          ],
          "price": 4.4
        }
      },
      {
        "_index": "knn-index-test",
        "_id": "3",
        "_score": 0.9615384,
        "_source": {
          "my_vector": [
            7.3,
            8.3
          ],
          "price": 19.1
        }
      },
      {
        "_index": "knn-index-test",
        "_id": "4",
        "_score": 0.62111807,
        "_source": {
          "my_vector": [
            6.5,
            8.8
          ],
          "price": 1.2
        }
      },
      {
        "_index": "knn-index-test",
        "_id": "2",
        "_score": 0.5524861,
        "_source": {
          "my_vector": [
            7.1,
            7.4
          ],
          "price": 14.2
        }
      }
    ]
  }
}
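The hits in this response can be reproduced by hand. The following brute-force sketch recomputes the squared Euclidean distance from the query vector to each of the five sample documents and converts it to a score with 1 / (1 + distance). It is an exact client-side check rather than the approximate search the engine performs, and treating the max_distance boundary as inclusive is an assumption made here.

```python
QUERY = [7.1, 8.3]
DOCS = {
    "1": [7.0, 8.2],
    "2": [7.1, 7.4],
    "3": [7.3, 8.3],
    "4": [6.5, 8.8],
    "5": [5.7, 7.9],
}


def squared_l2(x, y):
    """Squared Euclidean distance, matching the l2 space definition."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def brute_force_radial(query, docs, max_distance):
    """Return (id, distance, score) for every document within max_distance."""
    hits = []
    for doc_id, vector in docs.items():
        d = squared_l2(query, vector)
        if d <= max_distance:  # inclusive boundary is an assumption
            hits.append((doc_id, d, 1.0 / (1.0 + d)))
    # Highest score first, as in an OpenSearch response.
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


for doc_id, d, score in brute_force_radial(QUERY, DOCS, max_distance=2):
    print(f"id={doc_id} distance={d:.2f} score={score:.4f}")
```

The IDs come out in the order 1, 3, 4, 2, matching the response. Document 5 is excluded because its squared distance from the query is approximately 2.12, which exceeds the max_distance of 2.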

Example: Radial search with max_distance and a filter

The following example shows a radial search performed with max_distance and a response filter:

GET knn-index-test/_search
{
  "query": {
    "knn": {
      "my_vector": {
        "vector": [7.1, 8.3],
        "max_distance": 2,
        "filter": {
          "range": {
            "price": {
              "gte": 1,
              "lte": 5
            }
          }
        }
      }
    }
  }
}


All documents that fall within the squared Euclidean distance (l2^2) of 2 and have a price within the range of 1 to 5 are returned, as shown in the following response:

Results

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 0.98039204,
    "hits": [
      {
        "_index": "knn-index-test",
        "_id": "1",
        "_score": 0.98039204,
        "_source": {
          "my_vector": [
            7.0,
            8.2
          ],
          "price": 4.4
        }
      },
      {
        "_index": "knn-index-test",
        "_id": "4",
        "_score": 0.62111807,
        "_source": {
          "my_vector": [
            6.5,
            8.8
          ],
          "price": 1.2
        }
      }
    ]
  }
}

Example: Radial search with min_score

The following example shows a radial search performed with min_score:

GET knn-index-test/_search
{
  "query": {
    "knn": {
      "my_vector": {
        "vector": [7.1, 8.3],
        "min_score": 0.95
      }
    }
  }
}


All documents with a score of 0.95 or higher are returned, as shown in the following response:

Results

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 0.98039204,
    "hits": [
      {
        "_index": "knn-index-test",
        "_id": "1",
        "_score": 0.98039204,
        "_source": {
          "my_vector": [
            7.0,
            8.2
          ],
          "price": 4.4
        }
      },
      {
        "_index": "knn-index-test",
        "_id": "3",
        "_score": 0.9615384,
        "_source": {
          "my_vector": [
            7.3,
            8.3
          ],
          "price": 19.1
        }
      }
    ]
  }
}
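For spaces scored as 1 / (1 + distance), such as the l2 space used by this index, a min_score threshold corresponds to a maximum distance obtained by inverting that formula. The following sketch shows the arithmetic; the function name is hypothetical, and the conversion is offered as a way to reason about thresholds, not as a description of how the plugin evaluates min_score internally.

```python
def min_score_to_max_distance(min_score):
    """Invert score = 1 / (1 + d) to find the largest distance that meets min_score."""
    if not 0.0 < min_score <= 1.0:
        raise ValueError("min_score must be in (0, 1] for this formula")
    return 1.0 / min_score - 1.0


threshold = min_score_to_max_distance(0.95)
print(f"min_score 0.95 corresponds to a distance of at most {threshold:.4f}")
```

A min_score of 0.95 corresponds to a squared Euclidean distance of roughly 0.0526. Documents 1 and 3 lie at squared distances of about 0.02 and 0.04 from the query vector, so they are returned, while document 4 at about 0.61 is not.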

Example: Radial search with min_score and a filter

The following example shows a radial search performed with min_score and a response filter:

GET knn-index-test/_search
{
  "query": {
    "knn": {
      "my_vector": {
        "vector": [
          7.1,
          8.3
        ],
        "min_score": 0.95,
        "filter": {
          "range": {
            "price": {
              "gte": 1,
              "lte": 5
            }
          }
        }
      }
    }
  }
}


All documents that have a score of 0.95 or higher and a price within the range of 1 to 5 are returned, as shown in the following response:

Results

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.98039204,
    "hits": [
      {
        "_index": "knn-index-test",
        "_id": "1",
        "_score": 0.98039204,
        "_source": {
          "my_vector": [
            7.0,
            8.2
          ],
          "price": 4.4
        }
      }
    ]
  }
}