Uppercase token filter
Uppercase token filter
Changes token text to uppercase. For example, you can use the uppercase
filter to change the Lazy DoG
to THE LAZY DOG
.
This filter uses Lucene’s UpperCaseFilter.
Depending on the language, an uppercase character can map to multiple lowercase characters. Using the uppercase
filter could result in the loss of lowercase character information.
To avoid this loss but still have a consistent letter case, use the lowercase filter instead.
Example
The following analyze API request uses the default uppercase
filter to change the the Quick FoX JUMPs
to uppercase:
GET _analyze
{
"tokenizer" : "standard",
"filter" : ["uppercase"],
"text" : "the Quick FoX JUMPs"
}
The filter produces the following tokens:
[ THE, QUICK, FOX, JUMPS ]
Add to an analyzer
The following create index API request uses the uppercase
filter to configure a new custom analyzer.
PUT uppercase_example
{
"settings": {
"analysis": {
"analyzer": {
"whitespace_uppercase": {
"tokenizer": "whitespace",
"filter": [ "uppercase" ]
}
}
}
}
}