Classic token filter

Performs optional post-processing of terms generated by the classic tokenizer.

This filter removes the English possessive ('s) from the end of words and removes the dots from acronyms. It uses Lucene’s ClassicFilter.
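The two rules described above can be approximated in plain Python. This is a minimal sketch, not Lucene's implementation: the real filter relies on token types assigned by the classic tokenizer, whereas the hypothetical `classic_filter_sketch` function below guesses at acronyms with a regular expression (two or more letter-plus-dot pairs) and only handles the straight apostrophe.

```python
import re

# Assumption: an acronym is two or more "letter + dot" pairs, e.g. "Q.U.I.C.K.".
# The real filter uses the token type set by the classic tokenizer instead.
_ACRONYM = re.compile(r"(?:[A-Za-z]\.){2,}")


def classic_filter_sketch(token: str) -> str:
    """Approximate the classic token filter for a single token."""
    # Strip a trailing English possessive: "dog's" -> "dog".
    if len(token) > 2 and token[-2:] in ("'s", "'S"):
        return token[:-2]
    # Remove the dots from an acronym: "Q.U.I.C.K." -> "QUICK".
    if _ACRONYM.fullmatch(token):
        return token.replace(".", "")
    # Leave every other token unchanged.
    return token


# Tokens as the classic tokenizer would be expected to emit them, before filtering.
tokens = ["The", "2", "Q.U.I.C.K.", "Brown", "Foxes", "jumped",
          "over", "the", "lazy", "dog's", "bone"]
print([classic_filter_sketch(t) for t in tokens])
```

Running the sketch on the unfiltered tokens changes only `Q.U.I.C.K.` and `dog's`, which become `QUICK` and `dog`.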

Example

The following analyze API request demonstrates how the classic token filter works.

Python:

    resp = client.indices.analyze(
        tokenizer="classic",
        filter=[
            "classic"
        ],
        text="The 2 Q.U.I.C.K. Brown-Foxes jumped over the lazy dog's bone.",
    )
    print(resp)

Ruby:

    response = client.indices.analyze(
      body: {
        tokenizer: 'classic',
        filter: [
          'classic'
        ],
        text: "The 2 Q.U.I.C.K. Brown-Foxes jumped over the lazy dog's bone."
      }
    )
    puts response

JavaScript:

    const response = await client.indices.analyze({
      tokenizer: "classic",
      filter: ["classic"],
      text: "The 2 Q.U.I.C.K. Brown-Foxes jumped over the lazy dog's bone.",
    });
    console.log(response);

Console:

    GET /_analyze
    {
      "tokenizer" : "classic",
      "filter" : ["classic"],
      "text" : "The 2 Q.U.I.C.K. Brown-Foxes jumped over the lazy dog's bone."
    }

The filter produces the following tokens:

    [ The, 2, QUICK, Brown, Foxes, jumped, over, the, lazy, dog, bone ]
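The token list above is a condensed view: the analyze API returns each token as an object inside a `tokens` array. As a rough sketch, a small helper (the name `token_texts` is hypothetical) can reduce a response to just the token strings. The sample dictionary is a hand-written stand-in for part of a response, not captured output.

```python
def token_texts(analyze_response):
    """Return the token strings from an analyze API response body."""
    return [entry["token"] for entry in analyze_response["tokens"]]


# Hand-written stand-in for part of a response. Real entries also carry
# offsets, a type, and a position alongside the "token" field.
sample = {"tokens": [{"token": "The"}, {"token": "2"}, {"token": "QUICK"}]}
print(token_texts(sample))
```

With the Python request shown earlier, this would be called as `token_texts(resp)`, assuming the response object supports dictionary-style access.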

Add to an analyzer

The following create index API request uses the classic token filter to configure a new custom analyzer.

Python:

    resp = client.indices.create(
        index="classic_example",
        settings={
            "analysis": {
                "analyzer": {
                    "classic_analyzer": {
                        "tokenizer": "classic",
                        "filter": [
                            "classic"
                        ]
                    }
                }
            }
        },
    )
    print(resp)

Ruby:

    response = client.indices.create(
      index: 'classic_example',
      body: {
        settings: {
          analysis: {
            analyzer: {
              classic_analyzer: {
                tokenizer: 'classic',
                filter: [
                  'classic'
                ]
              }
            }
          }
        }
      }
    )
    puts response

JavaScript:

    const response = await client.indices.create({
      index: "classic_example",
      settings: {
        analysis: {
          analyzer: {
            classic_analyzer: {
              tokenizer: "classic",
              filter: ["classic"],
            },
          },
        },
      },
    });
    console.log(response);

Console:

    PUT /classic_example
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "classic_analyzer": {
              "tokenizer": "classic",
              "filter": [ "classic" ]
            }
          }
        }
      }
    }
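
Once the index exists, the custom analyzer can be checked by pointing the analyze API at the index and naming the analyzer. The helper below is a minimal sketch: the function name `analyze_with_classic_analyzer` is hypothetical, `client` is assumed to be an already configured Elasticsearch Python client, and the index and analyzer names are the ones used in the request above.

```python
def analyze_with_classic_analyzer(client, text,
                                  index="classic_example",
                                  analyzer="classic_analyzer"):
    """Run text through the index's custom analyzer and return the token strings."""
    # Assumption: `client` is a configured Elasticsearch Python client and the
    # index was created with the settings shown above.
    resp = client.indices.analyze(index=index, analyzer=analyzer, text=text)
    return [entry["token"] for entry in resp["tokens"]]
```

Calling `analyze_with_classic_analyzer(client, "The lazy dog's bone.")` against a running cluster should return tokens with the possessive removed, in line with the earlier example.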