Limit token filter

The `limit` token filter limits the number of tokens passed through the analysis chain.
Parameters

The `limit` token filter can be configured with the following parameters.
Parameter | Required/Optional | Data type | Description
---|---|---|---
`max_token_count` | Optional | Integer | The maximum number of tokens to be generated. Default is `1`.
`consume_all_tokens` | Optional | Boolean | (Expert-level setting) Consumes all tokens from the tokenizer, even if the result exceeds `max_token_count`. When this parameter is set to `true`, the output still contains only the number of tokens specified by `max_token_count`; however, all tokens generated by the tokenizer are processed. Default is `false`.
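You can try out filter parameters without creating an index by defining the filter inline in an `_analyze` request. The following request is an illustrative sketch (the sample text and parameter values are arbitrary): with `consume_all_tokens` set to `true`, the tokenizer processes the entire input, but only the first two tokens appear in the response:

```json
GET /_analyze
{
  "tokenizer": "standard",
  "filter": [
    {
      "type": "limit",
      "max_token_count": 2,
      "consume_all_tokens": true
    }
  ],
  "text": "one two three four"
}
```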
Example

The following example request creates a new index named `my_index` and configures an analyzer with a `limit` filter:
```json
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "three_token_limit": {
          "tokenizer": "standard",
          "filter": [ "custom_token_limit" ]
        }
      },
      "filter": {
        "custom_token_limit": {
          "type": "limit",
          "max_token_count": 3
        }
      }
    }
  }
}
```
Generated tokens
Use the following request to examine the tokens generated using the analyzer:
```json
GET /my_index/_analyze
{
  "analyzer": "three_token_limit",
  "text": "OpenSearch is a powerful and flexible search engine."
}
```
The response contains the generated tokens:
```json
{
  "tokens": [
    {
      "token": "OpenSearch",
      "start_offset": 0,
      "end_offset": 10,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "is",
      "start_offset": 11,
      "end_offset": 13,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "a",
      "start_offset": 14,
      "end_offset": 15,
      "type": "<ALPHANUM>",
      "position": 2
    }
  ]
}
```