Lowercase token filter

Changes token text to lowercase. For example, you can use the lowercase filter to change THE Lazy DoG to the lazy dog.

In addition to a default filter, the lowercase token filter provides access to Lucene’s language-specific lowercase filters for Greek, Irish, and Turkish.

Example

The following analyze API request uses the default lowercase filter to change THE Quick FoX JUMPs to lowercase:

Python:

    resp = client.indices.analyze(
        tokenizer="standard",
        filter=[
            "lowercase"
        ],
        text="THE Quick FoX JUMPs",
    )
    print(resp)

Ruby:

    response = client.indices.analyze(
      body: {
        tokenizer: 'standard',
        filter: [
          'lowercase'
        ],
        text: 'THE Quick FoX JUMPs'
      }
    )
    puts response

JavaScript:

    const response = await client.indices.analyze({
      tokenizer: "standard",
      filter: ["lowercase"],
      text: "THE Quick FoX JUMPs",
    });
    console.log(response);

Console:

    GET _analyze
    {
      "tokenizer" : "standard",
      "filter" : ["lowercase"],
      "text" : "THE Quick FoX JUMPs"
    }

The filter produces the following tokens:

    [ the, quick, fox, jumps ]
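
To make the transformation concrete, the following plain-Python sketch approximates what the request above does without calling Elasticsearch. It assumes that splitting on whitespace is a good enough stand-in for the standard tokenizer on this input. The function name `lowercase_tokens` is hypothetical and not part of any Elasticsearch client.

```python
def lowercase_tokens(text: str) -> list[str]:
    """Approximate a tokenizer followed by the default lowercase filter.

    Assumption: splitting on whitespace stands in for the standard
    tokenizer, which holds for simple space-separated ASCII words.
    """
    tokens = text.split()               # tokenizer step (simplified)
    return [t.lower() for t in tokens]  # lowercase filter step


tokens = lowercase_tokens("THE Quick FoX JUMPs")
print(tokens)
```

The real filter runs inside the analysis chain on each token emitted by the tokenizer, so the order of the two steps in the sketch mirrors the order in the request.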

Add to an analyzer

The following create index API request uses the lowercase filter to configure a new custom analyzer.

Python:

    resp = client.indices.create(
        index="lowercase_example",
        settings={
            "analysis": {
                "analyzer": {
                    "whitespace_lowercase": {
                        "tokenizer": "whitespace",
                        "filter": [
                            "lowercase"
                        ]
                    }
                }
            }
        },
    )
    print(resp)

Ruby:

    response = client.indices.create(
      index: 'lowercase_example',
      body: {
        settings: {
          analysis: {
            analyzer: {
              whitespace_lowercase: {
                tokenizer: 'whitespace',
                filter: [
                  'lowercase'
                ]
              }
            }
          }
        }
      }
    )
    puts response

JavaScript:

    const response = await client.indices.create({
      index: "lowercase_example",
      settings: {
        analysis: {
          analyzer: {
            whitespace_lowercase: {
              tokenizer: "whitespace",
              filter: ["lowercase"],
            },
          },
        },
      },
    });
    console.log(response);

Console:

    PUT lowercase_example
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "whitespace_lowercase": {
              "tokenizer": "whitespace",
              "filter": [ "lowercase" ]
            }
          }
        }
      }
    }
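
Once the index exists, one way to check the new analyzer is to run the analyze API against that index and name the analyzer. The sketch below assumes a client object that behaves like the `client` in the Python examples on this page. The helper name `analyzed_tokens` and the sample text are hypothetical.

```python
def analyzed_tokens(client, index: str, analyzer: str, text: str) -> list[str]:
    """Run the analyze API with a named analyzer and return the token texts.

    Assumption: `client` behaves like the Elasticsearch Python client used
    in the examples on this page, and the response has a "tokens" list
    whose items carry the token text under the "token" key.
    """
    resp = client.indices.analyze(index=index, analyzer=analyzer, text=text)
    return [item["token"] for item in resp["tokens"]]


# Hypothetical usage against a running cluster:
# analyzed_tokens(client, "lowercase_example", "whitespace_lowercase", "THE Quick FoX")
```

Because the whitespace tokenizer splits only on whitespace, a call like the commented one would be expected to return the lowercased words in their original order.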

Configurable parameters

language

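(Optional, string) Language-specific lowercase token filter to use. Valid values include:

greek
    Uses Lucene’s GreekLowerCaseFilter

irish
    Uses Lucene’s IrishLowerCaseFilter

turkish
    Uses Lucene’s TurkishLowerCaseFilter

If not specified, defaults to Lucene’s LowerCaseFilter.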
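
Language-specific variants exist because lowercasing rules differ between languages. Turkish, for example, distinguishes dotted and dotless i, so a capital I lowercases to ı rather than i. The sketch below is plain Python, not Lucene’s implementation. The function name `turkish_lower` is hypothetical and the rules are a simplified approximation.

```python
def turkish_lower(token: str) -> str:
    """Lowercase a token using simplified Turkish casing rules.

    Assumption: only the dotted/dotless i pairs need special handling;
    everything else falls back to Python's default str.lower().
    """
    token = token.replace("I\u0307", "i")  # I + combining dot above -> i
    token = token.replace("\u0130", "i")   # İ (dotted capital I) -> i
    token = token.replace("I", "\u0131")   # I (dotless capital I) -> ı
    return token.lower()


print("ISPARTA".lower())         # default casing rules
print(turkish_lower("ISPARTA"))  # Turkish casing rules
```

The two printed values differ only in their first letter, which is enough to make a query term miss an indexed term if the wrong rules are applied.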

Customize

To customize the lowercase filter, duplicate it to create the basis for a new custom token filter. You can modify the filter using its configurable parameters.

For example, the following request creates a custom lowercase filter for the Greek language:

Python:

    resp = client.indices.create(
        index="custom_lowercase_example",
        settings={
            "analysis": {
                "analyzer": {
                    "greek_lowercase_example": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": [
                            "greek_lowercase"
                        ]
                    }
                },
                "filter": {
                    "greek_lowercase": {
                        "type": "lowercase",
                        "language": "greek"
                    }
                }
            }
        },
    )
    print(resp)

Ruby:

    response = client.indices.create(
      index: 'custom_lowercase_example',
      body: {
        settings: {
          analysis: {
            analyzer: {
              greek_lowercase_example: {
                type: 'custom',
                tokenizer: 'standard',
                filter: [
                  'greek_lowercase'
                ]
              }
            },
            filter: {
              greek_lowercase: {
                type: 'lowercase',
                language: 'greek'
              }
            }
          }
        }
      }
    )
    puts response

JavaScript:

    const response = await client.indices.create({
      index: "custom_lowercase_example",
      settings: {
        analysis: {
          analyzer: {
            greek_lowercase_example: {
              type: "custom",
              tokenizer: "standard",
              filter: ["greek_lowercase"],
            },
          },
          filter: {
            greek_lowercase: {
              type: "lowercase",
              language: "greek",
            },
          },
        },
      },
    });
    console.log(response);

Console:

    PUT custom_lowercase_example
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "greek_lowercase_example": {
              "type": "custom",
              "tokenizer": "standard",
              "filter": ["greek_lowercase"]
            }
          },
          "filter": {
            "greek_lowercase": {
              "type": "lowercase",
              "language": "greek"
            }
          }
        }
      }
    }
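
To give a feel for why Greek may need its own filter, the plain-Python sketch below shows one possible Greek-aware normalization: lowercase the token, drop accent marks, and fold final sigma (ς) to regular sigma (σ). This is an approximation based on how Lucene’s Greek filter is generally described, not a port of it, so real output from Elasticsearch may differ. The function name `greek_lower` is hypothetical.

```python
import unicodedata


def greek_lower(token: str) -> str:
    """Approximate Greek-aware lowercasing for a single token.

    Assumptions: every combining mark is dropped (the real filter may
    remove only some diacritics), and final sigma is folded to sigma.
    """
    lowered = token.lower()
    # Decompose accented letters so the accent becomes a separate mark.
    decomposed = unicodedata.normalize("NFD", lowered)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.replace("\u03c2", "\u03c3")  # ς -> σ


print(greek_lower("ΟΔΌΣ"))
```

Normalizing accents and sigma forms at index and search time means that variants of the same word map to one token, which is the usual motivation for choosing a language-specific filter over the default.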