Ingest processor reference
An ingest pipeline is made up of a sequence of processors that are applied to documents as they are ingested into an index. Each processor performs a specific task, such as filtering, transforming, or enriching data.
Each successive processor depends on the output of the previous processor, so the order of processors is important. The modified documents are indexed into Elasticsearch after all processors are applied.
Elasticsearch includes over 40 configurable processors. The subpages in this section contain reference documentation for each processor. To get a list of available processors, use the nodes info API.
Python:
resp = client.nodes.info(
    node_id="ingest",
    filter_path="nodes.*.ingest.processors",
)
print(resp)

Ruby:
response = client.nodes.info(
  node_id: 'ingest',
  filter_path: 'nodes.*.ingest.processors'
)
puts response

JavaScript:
const response = await client.nodes.info({
  node_id: "ingest",
  filter_path: "nodes.*.ingest.processors",
});
console.log(response);

Console:
GET _nodes/ingest?filter_path=nodes.*.ingest.processors
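To illustrate how processors chain together, the following sketch defines a two-processor pipeline body of the kind you would pass to the create pipeline API. The field name, value, and pipeline ID are assumptions for this example; the commented call at the end assumes an elasticsearch-py client named client.

```python
# Illustrative pipeline body: a `set` processor followed by a `lowercase`
# processor. Processors run in order, each seeing the previous one's output,
# so "Production" is first written and then lowercased to "production".
pipeline_body = {
    "description": "Example: tag documents, then normalize the tag",
    "processors": [
        {"set": {"field": "env", "value": "Production"}},
        {"lowercase": {"field": "env"}},
    ],
}

# With an elasticsearch-py client, the pipeline would be registered like this
# (requires a running cluster, so it is commented out in this sketch):
# client.ingest.put_pipeline(id="my-pipeline", **pipeline_body)
```

Because the second processor operates on the output of the first, swapping their order would change the result, which is why processor order matters.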
Ingest processors by category
We’ve categorized the available processors on this page and summarized their functions. This will help you find the right processor for your use case.
- Data enrichment processors
- Data transformation processors
- Data filtering processors
- Pipeline handling processors
- Array/JSON handling processors
Data enrichment processors
General outcomes
Appends a value to a field.
Points documents to the right time-based index based on a date or timestamp field.
Enriches documents with data from another index.
Refer to Enrich your data for detailed examples of how to use the enrich processor to add data from your existing indices to incoming documents during ingest.
Uses machine learning to classify and tag text fields.
Specific outcomes
Parses and indexes binary data, such as PDFs and Word documents.
Converts a location field to a Geo-Point field.
Computes the Community ID for network flow data.
Computes a hash of the document’s content.
Converts geo-grid definitions of grid tiles or cells to regular bounding boxes or polygons which describe their shape.
Adds information about the geographical location of an IPv4 or IPv6 address from a Maxmind database.
Adds information about the geographical location of an IPv4 or IPv6 address from an IP geolocation database.
Calculates the network direction given a source IP address, destination IP address, and a list of internal networks.
Extracts the registered domain (also known as the effective top-level domain or eTLD), sub-domain, and top-level domain from a fully qualified domain name (FQDN).
Sets user-related details (such as username, roles, email, full_name, metadata, api_key, realm, and authentication_type) from the current authenticated user to the current document by pre-processing the ingest.
Parses a Uniform Resource Identifier (URI) string and extracts its components as an object.
URL-decodes a string.
Parses user-agent strings to extract information about web clients.
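As a sketch of how the URI- and user-agent-related processors above might be combined, the following pipeline body parses a request URI into components and a user-agent string into client details. The field names are assumptions for this example, and the URL-decoding step is emulated with the standard library purely for illustration.

```python
from urllib.parse import unquote

# Illustrative pipeline body combining uri_parts and user_agent.
# "request.uri" and "request.agent" are hypothetical field names.
enrichment_pipeline = {
    "description": "Example: parse request URI and user agent",
    "processors": [
        {"uri_parts": {"field": "request.uri", "target_field": "url"}},
        {"user_agent": {"field": "request.agent"}},
    ],
}

# The urldecode processor's effect on a field value, emulated locally:
decoded = unquote("search%20term%2Fpath")
print(decoded)  # search term/path
```

The real processors run server-side during ingest; the unquote call only shows the kind of transformation URL-decoding performs.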
Data transformation processors
General outcomes
Converts a field in the currently ingested document to a different type, such as converting a string to an integer.
Extracts structured fields out of a single text field within a document. Unlike the grok processor, dissect does not use regular expressions, which makes dissect a simpler and often faster alternative.
Extracts structured fields out of a single text field within a document, using the Grok regular expression dialect that supports reusable aliased expressions.
Converts a string field by applying a regular expression and a replacement.
Uses the Grok rules engine to obscure text in the input document matching the given Grok patterns.
Renames an existing field.
Sets a value on a field.
Specific outcomes
Converts a human-readable byte value to its value in bytes (for example, 1kb becomes 1024).
Extracts a single line of CSV data from a text field.
Extracts and converts date fields.
dot_expander processor
Expands a field with dots into an object field.
Removes HTML tags from a field.
Joins each element of an array into a single string using a separator character between each element.
Parses messages (or specific event fields) containing key-value pairs.
lowercase processor and uppercase processor
Converts a string field to lowercase or uppercase.
Splits a field into an array of values.
Trims whitespace from a field.
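The byte conversion described above (1kb becomes 1024) can be emulated in plain Python. This is an illustrative re-implementation of the idea, not the processor's actual code; the real processor's unit handling is more complete.

```python
def human_to_bytes(value: str) -> int:
    """Convert a human-readable size like '1kb' to a byte count (1024).

    Illustrative only: supports a handful of binary units, case-insensitively.
    """
    units = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}
    s = value.strip().lower()
    # Check longer suffixes first so "kb" is matched before "b".
    for suffix in sorted(units, key=len, reverse=True):
        if s.endswith(suffix):
            return int(float(s[: -len(suffix)]) * units[suffix])
    return int(s)  # bare number: already a byte count

print(human_to_bytes("1kb"))  # 1024
```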
Data filtering processors
Drops the document without raising any errors.
Removes fields from documents.
Pipeline handling processors
Raises an exception. Useful when you expect a pipeline to fail and want to relay a specific message to the requester.
Executes another pipeline.
Reroutes documents to another target index or data stream.
Terminates the current ingest pipeline, causing no further processors to be run.
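A sketch of how two of the pipeline-handling processors above might be combined: the pipeline processor delegates to another pipeline, and a fail processor in the on_failure handler surfaces a descriptive error to the requester. The pipeline name and message are assumptions for this example.

```python
# Illustrative pipeline body: delegate to a hypothetical "common-cleanup"
# pipeline; if any processor fails, reject the document with a clear message.
handling_pipeline = {
    "description": "Example: delegate, and fail loudly on error",
    "processors": [
        {"pipeline": {"name": "common-cleanup"}},
    ],
    "on_failure": [
        {"fail": {"message": "cleanup pipeline failed; rejecting document"}},
    ],
}
```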
Array/JSON handling processors
Runs an ingest processor on each element of an array or object.
Converts a JSON string into a structured JSON object.
Runs an inline or stored script on incoming documents. The script runs in the Painless ingest context.
Sorts the elements of an array in ascending or descending order.
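The json and foreach behaviors above can be emulated with the standard library for illustration: a JSON string field becomes a structured object, and an operation is applied to each element of an array field. The document shape and field names are assumptions for this example.

```python
import json

# What the json processor does, emulated locally: parse a JSON string field
# into a structured object. The "_source" shape mimics an ingest document.
doc = {"_source": {"payload": '{"user": "kim", "tags": ["a", "b"]}'}}
doc["_source"]["payload"] = json.loads(doc["_source"]["payload"])

# And the foreach idea: apply an operation (here, uppercasing) to each
# element of an array field.
doc["_source"]["payload"]["tags"] = [
    t.upper() for t in doc["_source"]["payload"]["tags"]
]
print(doc["_source"]["payload"]["tags"])  # ['A', 'B']
```

In a real pipeline these steps would be a json processor followed by a foreach processor wrapping, for example, an uppercase processor.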
Add additional processors
You can install additional processors as plugins.
You must install any plugin processors on all nodes in your cluster. Otherwise, Elasticsearch will fail to create pipelines containing the processor.
Mark a plugin as mandatory by setting plugin.mandatory in elasticsearch.yml. A node will fail to start if a mandatory plugin is not installed.
plugin.mandatory: my-ingest-plugin