Conditional token filter
Applies a set of token filters to tokens that match conditions in a provided predicate script.
This filter uses Lucene’s ConditionalTokenFilter.
Example
The following analyze API request uses the condition filter to match tokens with fewer than 5 characters in THE QUICK BROWN FOX. It then applies the lowercase filter to those matching tokens, converting them to lowercase.
resp = client.indices.analyze(
    tokenizer="standard",
    filter=[
        {
            "type": "condition",
            "filter": [
                "lowercase"
            ],
            "script": {
                "source": "token.getTerm().length() < 5"
            }
        }
    ],
    text="THE QUICK BROWN FOX",
)
print(resp)
response = client.indices.analyze(
  body: {
    tokenizer: 'standard',
    filter: [
      {
        type: 'condition',
        filter: [
          'lowercase'
        ],
        script: {
          source: 'token.getTerm().length() < 5'
        }
      }
    ],
    text: 'THE QUICK BROWN FOX'
  }
)
puts response
const response = await client.indices.analyze({
  tokenizer: "standard",
  filter: [
    {
      type: "condition",
      filter: ["lowercase"],
      script: {
        source: "token.getTerm().length() < 5",
      },
    },
  ],
  text: "THE QUICK BROWN FOX",
});
console.log(response);
GET /_analyze
{
  "tokenizer": "standard",
  "filter": [
    {
      "type": "condition",
      "filter": [ "lowercase" ],
      "script": {
        "source": "token.getTerm().length() < 5"
      }
    }
  ],
  "text": "THE QUICK BROWN FOX"
}
The filter produces the following tokens:
[ the, QUICK, BROWN, fox ]
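The behaviour above can be approximated locally with a minimal Python sketch. This is not how Elasticsearch or Lucene implements the filter; the function name conditional_filter is hypothetical, and str.split() stands in for the standard tokenizer, which happens to give the same tokens for this input.

```python
from typing import Callable, List, Sequence


def conditional_filter(
    tokens: Sequence[str],
    predicate: Callable[[str], bool],
    filters: Sequence[Callable[[str], str]],
) -> List[str]:
    """Apply each filter, in order, only to tokens that satisfy the predicate."""
    output = []
    for token in tokens:
        if predicate(token):
            # Matching tokens pass through every wrapped filter in sequence.
            for apply_filter in filters:
                token = apply_filter(token)
        # Non-matching tokens are emitted unchanged.
        output.append(token)
    return output


# Assumption: whitespace splitting stands in for the standard tokenizer.
tokens = "THE QUICK BROWN FOX".split()

# Predicate mirrors the script source: token.getTerm().length() < 5
result = conditional_filter(tokens, lambda term: len(term) < 5, [str.lower])
print(result)  # ['the', 'QUICK', 'BROWN', 'fox']
```

Tokens that fail the predicate bypass the wrapped filters entirely, which is why QUICK and BROWN keep their original case.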
Configurable parameters
filter
(Required, array of token filters) Array of token filters. If a token matches the predicate script in the script parameter, these filters are applied to the token in the order provided.
These filters can include custom token filters defined in the index settings.
script
(Required, script object) Predicate script used to apply token filters. If a token matches this script, the filters in the filter parameter are applied to the token.
For valid parameters, see How to write scripts. Only inline scripts are supported. Painless scripts are executed in the analysis predicate context and require a token property.
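Because both parameters are required and the script must be inline, it can help to assemble the filter definition in one place before sending it. The following is a rough sketch only; build_condition_filter is a hypothetical helper, not part of any Elasticsearch client, and its checks are limited to what the parameter descriptions above state.

```python
from typing import Any, Dict, List, Sequence, Union

FilterSpec = Union[str, Dict[str, Any]]


def build_condition_filter(
    filters: Sequence[FilterSpec], source: str
) -> Dict[str, Any]:
    """Build a condition filter definition with an inline predicate script.

    Hypothetical helper: it only checks that both required parameters
    are present and non-empty; it does not validate the script itself.
    """
    if not filters:
        raise ValueError("'filter' is required and must not be empty")
    if not source or not source.strip():
        raise ValueError("'script' requires a non-empty inline source")
    return {
        "type": "condition",
        # Order is preserved: filters run in the order provided.
        "filter": list(filters),
        # Only inline scripts are supported, so only 'source' is set.
        "script": {"source": source},
    }


definition = build_condition_filter(
    ["lowercase"], "token.getTerm().length() < 5"
)
print(definition)
```

The resulting dictionary has the same shape as the filter object in the analyze API request shown earlier.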
Customize and add to an analyzer
To customize the condition filter, duplicate it to create the basis for a new custom token filter. You can modify the filter using its configurable parameters.
For example, the following create index API request uses a custom condition filter to configure a new custom analyzer. The custom condition filter matches the first token in a stream. It then reverses that matching token using the reverse filter.
resp = client.indices.create(
    index="palindrome_list",
    settings={
        "analysis": {
            "analyzer": {
                "whitespace_reverse_first_token": {
                    "tokenizer": "whitespace",
                    "filter": [
                        "reverse_first_token"
                    ]
                }
            },
            "filter": {
                "reverse_first_token": {
                    "type": "condition",
                    "filter": [
                        "reverse"
                    ],
                    "script": {
                        "source": "token.getPosition() === 0"
                    }
                }
            }
        }
    },
)
print(resp)
response = client.indices.create(
  index: 'palindrome_list',
  body: {
    settings: {
      analysis: {
        analyzer: {
          whitespace_reverse_first_token: {
            tokenizer: 'whitespace',
            filter: [
              'reverse_first_token'
            ]
          }
        },
        filter: {
          reverse_first_token: {
            type: 'condition',
            filter: [
              'reverse'
            ],
            script: {
              source: 'token.getPosition() === 0'
            }
          }
        }
      }
    }
  }
)
puts response
const response = await client.indices.create({
  index: "palindrome_list",
  settings: {
    analysis: {
      analyzer: {
        whitespace_reverse_first_token: {
          tokenizer: "whitespace",
          filter: ["reverse_first_token"],
        },
      },
      filter: {
        reverse_first_token: {
          type: "condition",
          filter: ["reverse"],
          script: {
            source: "token.getPosition() === 0",
          },
        },
      },
    },
  },
});
console.log(response);
PUT /palindrome_list
{
  "settings": {
    "analysis": {
      "analyzer": {
        "whitespace_reverse_first_token": {
          "tokenizer": "whitespace",
          "filter": [ "reverse_first_token" ]
        }
      },
      "filter": {
        "reverse_first_token": {
          "type": "condition",
          "filter": [ "reverse" ],
          "script": {
            "source": "token.getPosition() === 0"
          }
        }
      }
    }
  }
}
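As a rough illustration of what the whitespace_reverse_first_token analyzer is meant to produce, the following Python sketch simulates it locally. It is not the Elasticsearch implementation: the function name and sample text are made up for this example, str.split() stands in for the whitespace tokenizer, and positions are assumed to start at 0, matching the token.getPosition() === 0 predicate above.

```python
from typing import List


def reverse_first_token(text: str) -> List[str]:
    """Split on whitespace and reverse only the token at position 0."""
    tokens = text.split()
    return [
        # Assumption: the first token in the stream has position 0.
        term[::-1] if position == 0 else term
        for position, term in enumerate(tokens)
    ]


result = reverse_first_token("quick brown fox")
print(result)  # ['kciuq', 'brown', 'fox']
```

Only the first token is reversed; every later token fails the position check and passes through unchanged.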