Lowercase token filter
Changes token text to lowercase. For example, you can use the lowercase filter to change THE Lazy DoG to the lazy dog.
In addition to a default filter, the lowercase token filter provides access to Lucene’s language-specific lowercase filters for Greek, Irish, and Turkish.
Example
The following analyze API request uses the default lowercase filter to change THE Quick FoX JUMPs to lowercase:
resp = client.indices.analyze(
tokenizer="standard",
filter=[
"lowercase"
],
text="THE Quick FoX JUMPs",
)
print(resp)
response = client.indices.analyze(
body: {
tokenizer: 'standard',
filter: [
'lowercase'
],
text: 'THE Quick FoX JUMPs'
}
)
puts response
const response = await client.indices.analyze({
tokenizer: "standard",
filter: ["lowercase"],
text: "THE Quick FoX JUMPs",
});
console.log(response);
GET _analyze
{
"tokenizer" : "standard",
"filter" : ["lowercase"],
"text" : "THE Quick FoX JUMPs"
}
The filter produces the following tokens:
[ the, quick, fox, jumps ]
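As a rough local illustration of the result above, the following Python sketch lowercases each token without contacting a cluster. It is not how Elasticsearch implements the filter: it assumes whitespace splitting as a stand-in for the standard tokenizer and Python's str.lower() as a stand-in for Lucene’s LowerCaseFilter. The helper name lowercase_tokens is hypothetical.

```python
def lowercase_tokens(text: str) -> list[str]:
    """Approximate the default lowercase filter on whitespace-separated tokens.

    Assumption: str.split() stands in for the standard tokenizer, which is
    only equivalent for simple, punctuation-free text like the example here.
    """
    return [token.lower() for token in text.split()]


print(lowercase_tokens("THE Quick FoX JUMPs"))
# ['the', 'quick', 'fox', 'jumps']
```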
Add to an analyzer
The following create index API request uses the lowercase filter to configure a new custom analyzer.
resp = client.indices.create(
index="lowercase_example",
settings={
"analysis": {
"analyzer": {
"whitespace_lowercase": {
"tokenizer": "whitespace",
"filter": [
"lowercase"
]
}
}
}
},
)
print(resp)
response = client.indices.create(
index: 'lowercase_example',
body: {
settings: {
analysis: {
analyzer: {
whitespace_lowercase: {
tokenizer: 'whitespace',
filter: [
'lowercase'
]
}
}
}
}
}
)
puts response
const response = await client.indices.create({
index: "lowercase_example",
settings: {
analysis: {
analyzer: {
whitespace_lowercase: {
tokenizer: "whitespace",
filter: ["lowercase"],
},
},
},
},
});
console.log(response);
PUT lowercase_example
{
"settings": {
"analysis": {
"analyzer": {
"whitespace_lowercase": {
"tokenizer": "whitespace",
"filter": [ "lowercase" ]
}
}
}
}
}
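One way to check the new analyzer is to run the analyze API against the index and collect the token text. The sketch below assumes the Python client used in the examples above and a response shaped as a "tokens" list whose entries carry a "token" field. The helper name analyzed_tokens is hypothetical.

```python
def analyzed_tokens(client, index: str, analyzer: str, text: str) -> list[str]:
    """Run an index-level analyzer on text and return only the token strings.

    Assumption: `client` is an Elasticsearch Python client (or any object
    exposing the same indices.analyze call).
    """
    resp = client.indices.analyze(index=index, analyzer=analyzer, text=text)
    # Each entry in resp["tokens"] describes one token; keep just its text.
    return [entry["token"] for entry in resp["tokens"]]
```

With a running cluster, calling analyzed_tokens(client, "lowercase_example", "whitespace_lowercase", "THE Quick FoX JUMPs") should return the lowercased tokens.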
Configurable parameters
language
(Optional, string) Language-specific lowercase token filter to use. Valid values include:
greek: Uses Lucene’s GreekLowerCaseFilter
irish: Uses Lucene’s IrishLowerCaseFilter
turkish: Uses Lucene’s TurkishLowerCaseFilter
If not specified, defaults to Lucene’s LowerCaseFilter.
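To see why a language-specific filter can matter, consider Turkish, where uppercase I lowercases to dotless ı and uppercase İ lowercases to i. The Python sketch below illustrates that difference only. It is a simplification and not Lucene’s TurkishLowerCaseFilter (for instance, it ignores combining dot characters), and the helper name turkish_lower is hypothetical.

```python
def turkish_lower(token: str) -> str:
    """Lowercase a token using the Turkish rules for dotted and dotless I.

    Simplified sketch: handles only the precomposed letters I and İ,
    then falls back to Python's generic lowercasing for everything else.
    """
    # Map the two Turkish-specific capitals before the generic pass,
    # because str.lower() would turn "I" into "i" instead of "ı".
    return token.replace("I", "ı").replace("İ", "i").lower()


print("ISPARTA".lower())          # generic lowercasing: isparta
print(turkish_lower("ISPARTA"))   # Turkish lowercasing: ısparta
print(turkish_lower("İSTANBUL"))  # istanbul
```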
Customize
To customize the lowercase filter, duplicate it to create the basis for a new custom token filter. You can modify the filter using its configurable parameters.
For example, the following request creates a custom lowercase filter for the Greek language:
resp = client.indices.create(
index="custom_lowercase_example",
settings={
"analysis": {
"analyzer": {
"greek_lowercase_example": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"greek_lowercase"
]
}
},
"filter": {
"greek_lowercase": {
"type": "lowercase",
"language": "greek"
}
}
}
},
)
print(resp)
response = client.indices.create(
index: 'custom_lowercase_example',
body: {
settings: {
analysis: {
analyzer: {
greek_lowercase_example: {
type: 'custom',
tokenizer: 'standard',
filter: [
'greek_lowercase'
]
}
},
filter: {
greek_lowercase: {
type: 'lowercase',
language: 'greek'
}
}
}
}
}
)
puts response
const response = await client.indices.create({
index: "custom_lowercase_example",
settings: {
analysis: {
analyzer: {
greek_lowercase_example: {
type: "custom",
tokenizer: "standard",
filter: ["greek_lowercase"],
},
},
filter: {
greek_lowercase: {
type: "lowercase",
language: "greek",
},
},
},
},
});
console.log(response);
PUT custom_lowercase_example
{
"settings": {
"analysis": {
"analyzer": {
"greek_lowercase_example": {
"type": "custom",
"tokenizer": "standard",
"filter": ["greek_lowercase"]
}
},
"filter": {
"greek_lowercase": {
"type": "lowercase",
"language": "greek"
}
}
}
}
}