ignore_malformed

Sometimes you don’t have much control over the data that you receive. One user may send a login field that is a date, and another sends a login field that is an email address.

By default, trying to index the wrong data type into a field throws an exception and rejects the whole document. Setting the ignore_malformed parameter to true causes the exception to be ignored: the malformed field is not indexed, but the other fields in the document are processed normally.

For example:

Python:

    resp = client.indices.create(
        index="my-index-000001",
        mappings={
            "properties": {
                "number_one": {
                    "type": "integer",
                    "ignore_malformed": True
                },
                "number_two": {
                    "type": "integer"
                }
            }
        },
    )
    print(resp)

    resp1 = client.index(
        index="my-index-000001",
        id="1",
        document={
            "text": "Some text value",
            "number_one": "foo"
        },
    )
    print(resp1)

    resp2 = client.index(
        index="my-index-000001",
        id="2",
        document={
            "text": "Some text value",
            "number_two": "foo"
        },
    )
    print(resp2)

Ruby:

    response = client.indices.create(
      index: 'my-index-000001',
      body: {
        mappings: {
          properties: {
            number_one: {
              type: 'integer',
              ignore_malformed: true
            },
            number_two: {
              type: 'integer'
            }
          }
        }
      }
    )
    puts response

    response = client.index(
      index: 'my-index-000001',
      id: 1,
      body: {
        text: 'Some text value',
        number_one: 'foo'
      }
    )
    puts response

    response = client.index(
      index: 'my-index-000001',
      id: 2,
      body: {
        text: 'Some text value',
        number_two: 'foo'
      }
    )
    puts response

JavaScript:

    const response = await client.indices.create({
      index: "my-index-000001",
      mappings: {
        properties: {
          number_one: {
            type: "integer",
            ignore_malformed: true,
          },
          number_two: {
            type: "integer",
          },
        },
      },
    });
    console.log(response);

    const response1 = await client.index({
      index: "my-index-000001",
      id: 1,
      document: {
        text: "Some text value",
        number_one: "foo",
      },
    });
    console.log(response1);

    const response2 = await client.index({
      index: "my-index-000001",
      id: 2,
      document: {
        text: "Some text value",
        number_two: "foo",
      },
    });
    console.log(response2);

Console:

    PUT my-index-000001
    {
      "mappings": {
        "properties": {
          "number_one": {
            "type": "integer",
            "ignore_malformed": true
          },
          "number_two": {
            "type": "integer"
          }
        }
      }
    }

    PUT my-index-000001/_doc/1
    {
      "text": "Some text value",
      "number_one": "foo"
    }

    PUT my-index-000001/_doc/2
    {
      "text": "Some text value",
      "number_two": "foo"
    }

Document 1 is indexed with its text field, but its number_one field is not indexed.

Document 2 is rejected because number_two does not allow malformed values.
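The per-field behaviour described above can be approximated offline. The following is only a rough simulation, not how Elasticsearch parses documents: the `simulate_index` function and the `MAPPING` dictionary are hypothetical names introduced for illustration, and integer parsing is reduced to Python's `int()`.

```python
from typing import Any

# Hypothetical stand-in for the mapping created in the example above.
MAPPING = {
    "number_one": {"type": "integer", "ignore_malformed": True},
    "number_two": {"type": "integer"},
}


def simulate_index(doc: dict[str, Any], mapping: dict[str, dict]) -> dict[str, Any]:
    """Return the fields that would be indexed plus the names of ignored fields.

    Raises ValueError when a malformed value reaches a field that does not
    enable ignore_malformed, mirroring the rejection of the whole document.
    """
    indexed: dict[str, Any] = {}
    ignored: list[str] = []
    for name, value in doc.items():
        field = mapping.get(name)
        if field is None or field["type"] != "integer":
            # Simplification: unmapped or non-integer fields pass through unchanged.
            indexed[name] = value
            continue
        try:
            indexed[name] = int(value)
        except (TypeError, ValueError):
            if field.get("ignore_malformed", False):
                ignored.append(name)  # drop only this field, keep the rest
            else:
                raise ValueError(f"malformed value for field [{name}]") from None
    return {"indexed": indexed, "_ignored": ignored}


doc1 = simulate_index({"text": "Some text value", "number_one": "foo"}, MAPPING)
print(doc1)

try:
    simulate_index({"text": "Some text value", "number_two": "foo"}, MAPPING)
except ValueError as err:
    print("rejected:", err)
```

The first call keeps the text field and records number_one as ignored; the second raises, which corresponds to the whole document being rejected.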

The ignore_malformed setting is currently supported by the following mapping types:

Numeric: long, integer, short, byte, double, float, half_float, scaled_float

Boolean: boolean

Date: date

Date nanoseconds: date_nanos

Geopoint: geo_point for lat/lon points

Geoshape: geo_shape for complex shapes like polygons

IP: ip for IPv4 and IPv6 addresses

The ignore_malformed setting value can be updated on existing fields using the update mapping API.
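As a minimal sketch of such an update, the helper below builds the properties body for an update mapping request. The function name `ignore_malformed_update` is hypothetical, and the sketch assumes the field's existing type has to be restated alongside the changed parameter. The client call is shown only as a comment because it needs a running cluster.

```python
def ignore_malformed_update(field: str, field_type: str, enabled: bool) -> dict:
    """Build the `properties` body that changes ignore_malformed on one field."""
    # Assumption: the existing field type is repeated unchanged in the request.
    return {field: {"type": field_type, "ignore_malformed": enabled}}


properties = ignore_malformed_update("number_two", "integer", True)
print(properties)

# With a connected client from the official Python package, the request would
# look roughly like this (not executed here):
# client.indices.put_mapping(index="my-index-000001", properties=properties)
```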

Index-level default

The index.mapping.ignore_malformed setting can be set on the index level to ignore malformed content globally across all allowed mapping types. Mapping types that don’t support the setting will ignore it if set on the index level.

Python:

    resp = client.indices.create(
        index="my-index-000001",
        settings={
            "index.mapping.ignore_malformed": True
        },
        mappings={
            "properties": {
                "number_one": {
                    "type": "byte"
                },
                "number_two": {
                    "type": "integer",
                    "ignore_malformed": False
                }
            }
        },
    )
    print(resp)

Ruby:

    response = client.indices.create(
      index: 'my-index-000001',
      body: {
        settings: {
          'index.mapping.ignore_malformed' => true
        },
        mappings: {
          properties: {
            number_one: {
              type: 'byte'
            },
            number_two: {
              type: 'integer',
              ignore_malformed: false
            }
          }
        }
      }
    )
    puts response

JavaScript:

    const response = await client.indices.create({
      index: "my-index-000001",
      settings: {
        "index.mapping.ignore_malformed": true,
      },
      mappings: {
        properties: {
          number_one: {
            type: "byte",
          },
          number_two: {
            type: "integer",
            ignore_malformed: false,
          },
        },
      },
    });
    console.log(response);

Console:

    PUT my-index-000001
    {
      "settings": {
        "index.mapping.ignore_malformed": true
      },
      "mappings": {
        "properties": {
          "number_one": {
            "type": "byte"
          },
          "number_two": {
            "type": "integer",
            "ignore_malformed": false
          }
        }
      }
    }

The number_one field inherits the index-level setting.

The number_two field overrides the index-level setting to turn off ignore_malformed.
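The precedence between the two levels can be sketched as a small lookup. This is an illustration of the rule as stated here, not Elasticsearch code; `effective_ignore_malformed` and `SUPPORTED_TYPES` are hypothetical names, and the type set is copied from the list of supported mapping types earlier on this page.

```python
# Mapping types listed on this page as supporting ignore_malformed.
SUPPORTED_TYPES = {
    "long", "integer", "short", "byte", "double", "float", "half_float",
    "scaled_float", "boolean", "date", "date_nanos", "geo_point",
    "geo_shape", "ip",
}


def effective_ignore_malformed(settings: dict, field: dict) -> bool:
    """Resolve whether malformed values are ignored for a single field."""
    if field["type"] not in SUPPORTED_TYPES:
        return False  # unsupported types ignore the index-level setting
    index_default = settings.get("index.mapping.ignore_malformed", False)
    # A field-level value, when present, wins over the index-level default.
    return field.get("ignore_malformed", index_default)


settings = {"index.mapping.ignore_malformed": True}
print(effective_ignore_malformed(settings, {"type": "byte"}))  # True
print(effective_ignore_malformed(settings, {"type": "integer", "ignore_malformed": False}))  # False
print(effective_ignore_malformed(settings, {"type": "keyword"}))  # False
```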

Dealing with malformed fields

Malformed fields are silently ignored at indexing time when ignore_malformed is turned on. Whenever possible, keep the number of documents that have a malformed field low, or queries on this field will become meaningless. To check how many documents have malformed fields, use exists, term, or terms queries on the special _ignored field.
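The query bodies for those checks might look like the sketch below. The helper names `ignored_any` and `ignored_fields` are hypothetical; only the query shapes (exists, term, and terms on _ignored) come from the text above. The count request is left as a comment because it needs a running cluster.

```python
def ignored_any() -> dict:
    """Match documents in which at least one field was ignored."""
    return {"exists": {"field": "_ignored"}}


def ignored_fields(*fields: str) -> dict:
    """Match documents in which any of the named fields was ignored."""
    if not fields:
        raise ValueError("at least one field name is required")
    if len(fields) == 1:
        return {"term": {"_ignored": fields[0]}}
    return {"terms": {"_ignored": list(fields)}}


print(ignored_any())
print(ignored_fields("number_one"))
print(ignored_fields("number_one", "number_two"))

# With a connected client, a count could be requested roughly like this
# (not executed here):
# client.count(index="my-index-000001", query=ignored_fields("number_one"))
```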

Limits for JSON Objects

You can’t use ignore_malformed with the following data types: nested, object, and range.

You also can’t use ignore_malformed to ignore JSON objects submitted to fields of the wrong data type. A JSON object is any data surrounded by curly brackets "{}" and includes data mapped to the nested, object, and range data types.

If you submit a JSON object to an unsupported field, Elasticsearch will return an error and reject the entire document regardless of the ignore_malformed setting.
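Because such documents are rejected regardless of the setting, one option is to check for them before sending. The sketch below is a hypothetical client-side pre-check, not part of Elasticsearch: `object_valued_fields` and `SCALAR_TYPES` are illustrative names, only top-level fields are inspected, and geo_point and geo_shape are left out on the assumption that they accept object-form values.

```python
# Field types that this sketch treats as unable to accept a JSON object.
SCALAR_TYPES = {
    "long", "integer", "short", "byte", "double", "float", "half_float",
    "scaled_float", "boolean", "date", "date_nanos", "ip",
}


def object_valued_fields(doc: dict, properties: dict) -> list[str]:
    """List top-level fields where a JSON object targets a scalar field."""
    return [
        name
        for name, value in doc.items()
        if isinstance(value, dict)
        and properties.get(name, {}).get("type") in SCALAR_TYPES
    ]


properties = {"number_one": {"type": "integer", "ignore_malformed": True}}
print(object_valued_fields({"number_one": {"value": 1}}, properties))  # ['number_one']
print(object_valued_fields({"number_one": "foo"}, properties))  # []
```

A non-empty result flags a document that would be rejected as a whole even though number_one enables ignore_malformed, whereas the string "foo" would only cause that one field to be skipped.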