Lowercase token filter
The lowercase token filter converts all characters in the token stream to lowercase, making searches case insensitive.
Parameters
The lowercase token filter can be configured with the following parameter.
Parameter | Required/Optional | Description |
---|---|---|
language | Optional | Specifies a language-specific token filter. Valid values are greek, irish, and turkish. If omitted, the Lucene LowerCaseFilter is used. |
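To see why a language-specific variant can matter, consider Turkish, which distinguishes dotted and dotless I. The following Python sketch only illustrates that casing rule and is not the Lucene implementation; the helper name turkish_lower is hypothetical.

```python
def turkish_lower(text: str) -> str:
    """Lowercase text using the Turkish dotted/dotless I rule (illustrative only)."""
    # Map the two Turkish-specific capitals first, then lowercase the rest.
    return text.replace("İ", "i").replace("I", "ı").lower()


# Generic lowercasing maps "I" to "i", which changes the word in Turkish.
print("ISPARTA".lower())          # isparta
print(turkish_lower("ISPARTA"))   # ısparta
print(turkish_lower("İSTANBUL"))  # istanbul
```

A generic lowercase step would index "ISPARTA" as "isparta", so a query for the correctly spelled "ısparta" would not match it.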
Example
The following example request creates a new index named custom_lowercase_example. It configures an analyzer with a lowercase filter and specifies greek as the language:
PUT /custom_lowercase_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "greek_lowercase_example": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["greek_lowercase"]
        }
      },
      "filter": {
        "greek_lowercase": {
          "type": "lowercase",
          "language": "greek"
        }
      }
    }
  }
}
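If the request body is assembled in application code, the same structure can be generated programmatically. The following Python sketch builds the body as a plain dictionary and rejects values not listed for the language parameter; the helper build_lowercase_settings and the default_lowercase filter name are hypothetical, and sending the body to a cluster is deliberately left out.

```python
import json

# Values listed for the language parameter in the table above.
VALID_LANGUAGES = {"greek", "irish", "turkish"}


def build_lowercase_settings(language=None):
    """Return an index-creation body with a custom lowercase filter (hypothetical helper)."""
    if language is not None and language not in VALID_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    token_filter = {"type": "lowercase"}
    if language is not None:
        token_filter["language"] = language
    # Filter and analyzer names follow the naming pattern used in the example request.
    filter_name = f"{language}_lowercase" if language else "default_lowercase"
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    f"{filter_name}_example": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": [filter_name],
                    }
                },
                "filter": {filter_name: token_filter},
            }
        }
    }


print(json.dumps(build_lowercase_settings("greek"), indent=2))
```

Calling the helper with "greek" produces the same settings as the request body in this example.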
Generated tokens
Use the following request to examine the tokens generated using the analyzer:
GET /custom_lowercase_example/_analyze
{
  "analyzer": "greek_lowercase_example",
  "text": "Αθήνα ΕΛΛΑΔΑ"
}
The response contains the generated tokens:
{
  "tokens": [
    {
      "token": "αθηνα",
      "start_offset": 0,
      "end_offset": 5,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "ελλαδα",
      "start_offset": 6,
      "end_offset": 12,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}
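Note that the first token is returned as αθηνα, without the accent on ή, so in this example the greek variant does more than change letter case. The following Python sketch reproduces the two tokens by lowercasing and then dropping combining accents. It is a rough approximation for this input only, not the Lucene implementation: it splits on whitespace instead of using the standard tokenizer, it strips every combining mark (which may be broader than what the filter does), and the function name is hypothetical.

```python
import unicodedata


def approximate_greek_lowercase(token: str) -> str:
    """Lowercase a token and drop combining accents (rough approximation)."""
    # Decompose accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", token.lower())
    # Keep only characters that are not combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


text = "Αθήνα ΕΛΛΑΔΑ"
print([approximate_greek_lowercase(t) for t in text.split()])
# ['αθηνα', 'ελλαδα']
```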