Flatten graph token filter

The flatten_graph token filter converts a token graph into a flat, linear token stream that can be indexed. Some token filters, like synonym_graph and word_delimiter_graph, generate multi-position tokens: tokens that overlap or span multiple positions. The resulting token graphs are useful for search queries but are not directly supported during indexing. The flatten_graph token filter resolves multi-position tokens into a linear sequence of tokens so that the output is compatible with the indexing process.

Token graph flattening is a lossy process. Whenever possible, avoid using the flatten_graph filter. Instead, apply graph token filters exclusively in search analyzers, removing the need for the flatten_graph filter.
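As a rough illustration of the search-analyzer-only approach described above, the following sketch applies a synonym_graph filter at search time and leaves the index analyzer as standard, so flatten_graph is not needed. The index name search_only_index, the field name title, the analyzer and filter names, and the synonym rule are all hypothetical and chosen only for this example:

```json
PUT /search_only_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_search_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_synonym_filter"
          ]
        }
      },
      "filter": {
        "my_synonym_filter": {
          "type": "synonym_graph",
          "synonyms": [
            "laptop, notebook computer"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "standard",
        "search_analyzer": "my_search_analyzer"
      }
    }
  }
}
```

In this setup, the graph filter runs only when query text is analyzed, so the token graph never reaches the indexing process.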

Example

The following example request creates a new index named test_index and configures an analyzer with a flatten_graph filter:

  PUT /test_index
  {
    "settings": {
      "analysis": {
        "analyzer": {
          "my_index_analyzer": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": [
              "my_custom_filter",
              "flatten_graph"
            ]
          }
        },
        "filter": {
          "my_custom_filter": {
            "type": "word_delimiter_graph",
            "catenate_all": true
          }
        }
      }
    }
  }


Generated tokens

Use the following request to examine the tokens generated using the analyzer:

  POST /test_index/_analyze
  {
    "analyzer": "my_index_analyzer",
    "text": "OpenSearch helped many employers"
  }


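The response contains the generated tokens. The catenated token OpenSearch has a positionLength of 2, so it spans the positions occupied by Open and Search: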

  {
    "tokens": [
      {
        "token": "OpenSearch",
        "start_offset": 0,
        "end_offset": 10,
        "type": "<ALPHANUM>",
        "position": 0,
        "positionLength": 2
      },
      {
        "token": "Open",
        "start_offset": 0,
        "end_offset": 4,
        "type": "<ALPHANUM>",
        "position": 0
      },
      {
        "token": "Search",
        "start_offset": 4,
        "end_offset": 10,
        "type": "<ALPHANUM>",
        "position": 1
      },
      {
        "token": "helped",
        "start_offset": 11,
        "end_offset": 17,
        "type": "<ALPHANUM>",
        "position": 2
      },
      {
        "token": "many",
        "start_offset": 18,
        "end_offset": 22,
        "type": "<ALPHANUM>",
        "position": 3
      },
      {
        "token": "employers",
        "start_offset": 23,
        "end_offset": 32,
        "type": "<ALPHANUM>",
        "position": 4
      }
    ]
  }
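
To check whether flattening changes anything for a given input, one option is to run the same filter chain without flatten_graph and compare the position and positionLength values against the response above. The following sketch assumes the inline filter definitions accepted by the _analyze API, so it does not depend on test_index or its analyzer:

```json
POST /_analyze
{
  "tokenizer": "standard",
  "filter": [
    {
      "type": "word_delimiter_graph",
      "catenate_all": true
    }
  ],
  "text": "OpenSearch helped many employers"
}
```

Any differences between the two responses show where flattening altered token positions. If the responses match, the graph for that input was already linear enough to index as is.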