Ingest processor reference

An ingest pipeline is made up of a sequence of processors that are applied to documents as they are ingested into an index. Each processor performs a specific task, such as filtering, transforming, or enriching data.

Each successive processor depends on the output of the previous processor, so the order of processors is important. The modified documents are indexed into Elasticsearch after all processors are applied.
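The ordering rule can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not how Elasticsearch implements pipelines; `set_field`, `uppercase`, and `run_pipeline` are hypothetical names made up for this example.

```python
from copy import deepcopy
from typing import Callable

Doc = dict
Processor = Callable[[Doc], Doc]


def set_field(field: str, value) -> Processor:
    """Toy stand-in for a processor that sets a field (hypothetical helper)."""
    def apply(doc: Doc) -> Doc:
        doc[field] = value
        return doc
    return apply


def uppercase(field: str) -> Processor:
    """Toy stand-in for a processor that uppercases a string field."""
    def apply(doc: Doc) -> Doc:
        doc[field] = doc[field].upper()
        return doc
    return apply


def run_pipeline(processors: list[Processor], doc: Doc) -> Doc:
    """Apply each processor in order to a copy of the document."""
    result = deepcopy(doc)
    for processor in processors:
        result = processor(result)
    return result


doc = {"env": "staging"}

# uppercase runs first, then set_field overwrites its result.
first = run_pipeline([uppercase("env"), set_field("env", "prod")], doc)

# set_field runs first, so uppercase sees the new value.
second = run_pipeline([set_field("env", "prod"), uppercase("env")], doc)

print(first)   # {'env': 'prod'}
print(second)  # {'env': 'PROD'}
```

The same two steps produce different documents depending on their order, which is why the processor list in a pipeline definition should be read top to bottom.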

Elasticsearch includes over 40 configurable processors. The subpages in this section contain reference documentation for each processor. To get a list of available processors, use the nodes info API.

Python:

    resp = client.nodes.info(
        node_id="ingest",
        filter_path="nodes.*.ingest.processors",
    )
    print(resp)

Ruby:

    response = client.nodes.info(
      node_id: 'ingest',
      filter_path: 'nodes.*.ingest.processors'
    )
    puts response

JavaScript:

    const response = await client.nodes.info({
      node_id: "ingest",
      filter_path: "nodes.*.ingest.processors",
    });
    console.log(response);

Console:

    GET _nodes/ingest?filter_path=nodes.*.ingest.processors
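The response is keyed by node, so the same processor type can appear once per node. The sketch below flattens such a response into one sorted list. It assumes the filtered response has the shape `nodes` → node ID → `ingest` → `processors`, with each entry carrying a `type` key; the helper name `processor_types` and the sample data are hypothetical, so check the shape against your own cluster's output.

```python
def processor_types(nodes_info: dict) -> list[str]:
    """Return the sorted, de-duplicated processor types found across all nodes."""
    types = set()
    for node in nodes_info.get("nodes", {}).values():
        for processor in node.get("ingest", {}).get("processors", []):
            types.add(processor["type"])
    return sorted(types)


# Hypothetical, heavily truncated response; the node IDs are made up.
sample = {
    "nodes": {
        "node-a": {"ingest": {"processors": [{"type": "append"}, {"type": "set"}]}},
        "node-b": {"ingest": {"processors": [{"type": "set"}, {"type": "grok"}]}},
    }
}

print(processor_types(sample))  # ['append', 'grok', 'set']
```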

Ingest processors by category

We’ve categorized the available processors on this page and summarized their functions. This will help you find the right processor for your use case.

Data enrichment processors

General outcomes

append processor

Appends a value to a field.

date_index_name processor

Points documents to the right time-based index based on a date or timestamp field.

enrich processor

Enriches documents with data from another index.

Refer to Enrich your data for detailed examples of how to use the enrich processor to add data from your existing indices to incoming documents during ingest.

inference processor

Uses machine learning to classify and tag text fields.

Specific outcomes

attachment processor

Parses and indexes binary data, such as PDFs and Word documents.

circle processor

Converts circle definitions of shapes to regular polygons that approximate them.

community_id processor

Computes the Community ID for network flow data.

fingerprint processor

Computes a hash of the document’s content.

geo_grid processor

Converts geo-grid definitions of grid tiles or cells to regular bounding boxes or polygons which describe their shape.

geoip processor

Adds information about the geographical location of an IPv4 or IPv6 address from a MaxMind database.

ip_location processor

Adds information about the geographical location of an IPv4 or IPv6 address from an IP geolocation database.

network_direction processor

Calculates the network direction given a source IP address, destination IP address, and a list of internal networks.

registered_domain processor

Extracts the registered domain (also known as the effective top-level domain plus one, or eTLD+1), sub-domain, and top-level domain from a fully qualified domain name (FQDN).

set_security_user processor

Adds details about the currently authenticated user (such as username, roles, email, full_name, metadata, api_key, realm, and authentication_type) to the document during ingest.

uri_parts processor

Parses a Uniform Resource Identifier (URI) string and extracts its components as an object.

urldecode processor

URL-decodes a string.

user_agent processor

Parses user-agent strings to extract information about web clients.
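As a rough illustration of how enrichment processors are combined, the sketch below builds a request body of the kind accepted by the simulate pipeline API (`POST _ingest/pipeline/_simulate`). The field names (`ip`, `agent`, `url_string`), the description, and the sample document are hypothetical; the script only prints the body and does not contact a cluster.

```python
import json

# Pipeline definition using three enrichment processors.
# All field names here are placeholders chosen for this sketch.
pipeline = {
    "description": "Hypothetical enrichment pipeline for web access logs",
    "processors": [
        {"geoip": {"field": "ip"}},
        {"user_agent": {"field": "agent"}},
        {"uri_parts": {"field": "url_string", "target_field": "url"}},
    ],
}

# Simulate request body: the pipeline plus sample documents to run through it.
simulate_body = {
    "pipeline": pipeline,
    "docs": [
        {
            "_source": {
                "ip": "89.160.20.128",
                "agent": "Mozilla/5.0 (X11; Linux x86_64)",
                "url_string": "https://example.com/search?q=ingest",
            }
        }
    ],
}

print(json.dumps(simulate_body, indent=2))
```

Simulating against sample documents is a low-risk way to see which fields each processor adds before the pipeline is attached to an index.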

Data transformation processors

General outcomes

convert processor

Converts a field in the currently ingested document to a different type, such as converting a string to an integer.

dissect processor

Extracts structured fields out of a single text field within a document. Unlike the grok processor, dissect does not use regular expressions. This makes dissect a simpler and often faster alternative.

grok processor

Extracts structured fields out of a single text field within a document, using the Grok regular expression dialect that supports reusable aliased expressions.

gsub processor

Converts a string field by applying a regular expression and a replacement.

redact processor

Uses the Grok rules engine to obscure text in the input document matching the given Grok patterns.

rename processor

Renames an existing field.

set processor

Sets a value on a field.
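The trade-off between dissect and grok can be sketched by analogy in plain Python. This is not the actual dissect or Grok pattern syntax; `dissect_like` and `grok_like` are hypothetical helpers that only mimic the two approaches (literal delimiters versus regular expressions).

```python
import re

line = "1.2.3.4 GET /index.html 200"


def dissect_like(text: str) -> dict:
    """Delimiter-based extraction: split on spaces, no regular expressions."""
    client_ip, method, path, status = text.split(" ")
    return {"client_ip": client_ip, "method": method, "path": path, "status": status}


# Regex-based extraction: each named group also validates the shape of its value.
GROK_LIKE = re.compile(
    r"(?P<client_ip>\d{1,3}(?:\.\d{1,3}){3}) "
    r"(?P<method>[A-Z]+) "
    r"(?P<path>\S+) "
    r"(?P<status>\d{3})"
)


def grok_like(text: str):
    """Return the named groups, or None when the line does not match."""
    match = GROK_LIKE.fullmatch(text)
    return match.groupdict() if match else None


print(dissect_like(line) == grok_like(line))  # True

# The delimiter-based version accepts any four space-separated tokens,
# while the regex-based version rejects a malformed IP address.
bad = "not-an-ip GET / 200"
print(dissect_like(bad)["client_ip"])  # not-an-ip
print(grok_like(bad))                  # None
```

The delimiter-based version does less work per line but cannot validate values, which mirrors why dissect tends to suit uniformly structured input and grok suits input that varies.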

Specific outcomes

bytes processor

Converts a human-readable byte value to its value in bytes (for example, 1kb becomes 1024).

csv processor

Extracts a single line of CSV data from a text field.

date processor

Extracts and converts date fields.

dot_expand processor

Expands a field with dots into an object field.

html_strip processor

Removes HTML tags from a field.

join processor

Joins each element of an array into a single string using a separator character between each element.

kv processor

Parses messages (or specific event fields) containing key-value pairs.

lowercase processor and uppercase processor

Converts a string field to lowercase or uppercase.

split processor

Splits a field into an array of values.

trim processor

Trims whitespace from a field.
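The arithmetic behind the bytes conversion (1kb becomes 1024) can be sketched in plain Python. This is an illustration of binary multipliers only, not the processor's implementation; the helper `to_bytes` is hypothetical, and its handling of whitespace and fractional values is an assumption of this sketch.

```python
import re

# Binary multipliers: each unit is 1024 times the previous one.
_UNITS = {
    "b": 1,
    "kb": 1024,
    "mb": 1024**2,
    "gb": 1024**3,
    "tb": 1024**4,
    "pb": 1024**5,
}
_PATTERN = re.compile(r"\s*(\d+(?:\.\d+)?)\s*([a-z]+)\s*", re.IGNORECASE)


def to_bytes(value: str) -> int:
    """Convert a string such as '1kb' or '5 MB' to a number of bytes."""
    match = _PATTERN.fullmatch(value)
    if match is None or match.group(2).lower() not in _UNITS:
        raise ValueError(f"cannot parse byte value: {value!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit.lower()])


print(to_bytes("1kb"))   # 1024
print(to_bytes("5 MB"))  # 5242880
```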

Data filtering processors

drop processor

Drops the document without raising any errors.

remove processor

Removes fields from documents.
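A minimal sketch of a filtering pipeline definition follows, written as a Python dictionary that serializes to the JSON body of a pipeline. The condition, the field names, and the description are hypothetical; `if` takes a Painless condition and is available on every processor.

```python
import json

pipeline = {
    "description": "Hypothetical filtering pipeline",
    "processors": [
        # Drop the whole document when the condition is true.
        {"drop": {"if": "ctx.network_name == 'Guest'"}},
        # For documents that were not dropped, strip fields that should not be indexed.
        {"remove": {"field": ["user_agent", "url"], "ignore_missing": True}},
    ],
}

print(json.dumps(pipeline, indent=2))
```

Placing the drop processor first avoids spending work on documents that will be discarded anyway.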

Pipeline handling processors

fail processor

Raises an exception. Useful when you expect a pipeline to fail and want to relay a specific message to the requester.

pipeline processor

Executes another pipeline.

reroute processor

Reroutes documents to another target index or data stream.

terminate processor

Terminates the current ingest pipeline, causing no further processors to be run.
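The sketch below shows one way pipeline handling processors might be combined: an outer pipeline delegates to another pipeline and then fails with a custom message when a condition is not met. The pipeline name `normalize-fields`, the description, and the `tags` field are hypothetical.

```python
import json

pipeline = {
    "description": "Hypothetical outer pipeline",
    "processors": [
        # Run a separately defined pipeline first (the name is a placeholder).
        {"pipeline": {"name": "normalize-fields"}},
        # Fail with a message for the requester when the condition is true.
        {
            "fail": {
                "if": "ctx.tags.contains('production') != true",
                "message": "The production tag is not present, found tags: {{{tags}}}",
            }
        },
    ],
}

print(json.dumps(pipeline, indent=2))
```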

Array/JSON handling processors

foreach processor

Runs an ingest processor on each element of an array or object.

json processor

Converts a JSON string into a structured JSON object.

script processor

Runs an inline or stored script on incoming documents. The script runs in the Painless ingest processor context.

sort processor

Sorts the elements of an array in ascending or descending order.
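A hedged sketch of how these processors might be chained: parse a JSON string, uppercase each element of an array inside it, then sort that array. The field names `payload`, `parsed`, and `parsed.tags` are hypothetical. In pipeline definitions the per-element processor type is written `foreach`, and the current element is addressed as `_ingest._value`.

```python
import json

pipeline = {
    "description": "Hypothetical array and JSON handling pipeline",
    "processors": [
        # Parse the JSON string in `payload` into a structured object under `parsed`.
        {"json": {"field": "payload", "target_field": "parsed"}},
        # Apply the uppercase processor to every element of the array.
        {
            "foreach": {
                "field": "parsed.tags",
                "processor": {"uppercase": {"field": "_ingest._value"}},
            }
        },
        # Sort the array in descending order.
        {"sort": {"field": "parsed.tags", "order": "desc"}},
    ],
}

print(json.dumps(pipeline, indent=2))
```

The json processor has to run before the other two, because `parsed.tags` does not exist until the string has been parsed.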

Add additional processors

You can install additional processors as plugins.

You must install any plugin processors on all nodes in your cluster. Otherwise, Elasticsearch will fail to create pipelines containing the processor.

Mark a plugin as mandatory by setting plugin.mandatory in elasticsearch.yml. A node will fail to start if a mandatory plugin is not installed.

    plugin.mandatory: my-ingest-plugin