Sparse vector data type

Sparse vector data type

Deprecated in 7.6.

The sparse_vector type is deprecated and will be removed in 8.0.

A sparse_vector field stores sparse vectors of float values. The maximum number of dimensions that can be in a vector should not exceed 1024. The number of dimensions can be different across documents. A sparse_vector field is a single-valued field.

These vectors can be used for document scoring. For example, a document score can represent a distance between a given query vector and the indexed document vector.

You represent a sparse vector as an object, where object fields are dimensions, and fields values are values for these dimensions. Dimensions are integer values from 0 to 65535 encoded as strings. Dimensions don’t need to be in order.

  1. PUT my-index-000001
  2. {
  3. "mappings": {
  4. "properties": {
  5. "my_vector": {
  6. "type": "sparse_vector"
  7. },
  8. "my_text" : {
  9. "type" : "keyword"
  10. }
  11. }
  12. }
  13. }
  1. PUT my-index-000001/_doc/1
  2. {
  3. "my_text" : "text1",
  4. "my_vector" : {"1": 0.5, "5": -0.5, "100": 1}
  5. }
  6. PUT my-index-000001/_doc/2
  7. {
  8. "my_text" : "text2",
  9. "my_vector" : {"103": 0.5, "4": -0.5, "5": 1, "11" : 1.2}
  10. }

Internally, each document’s sparse vector is encoded as a binary doc value. Its size in bytes is equal to 6 * NUMBER_OF_DIMENSIONS + 4, where NUMBER_OF_DIMENSIONS - number of the vector’s dimensions.