Error Tracking
Airflow can be set up to send errors to Sentry.
Setup
First, install the Sentry extra:
pip install 'apache-airflow[sentry]'
After that, enable the integration by setting the sentry_on option in the [sentry] section to True.
Add your SENTRY_DSN to the [sentry] section of your configuration file (e.g. airflow.cfg). Its template resembles the following: {PROTOCOL}://{PUBLIC_KEY}@{HOST}/{PROJECT_ID}
[sentry]
sentry_on = True
sentry_dsn = http://foo@sentry.io/123
Note
If this value is not provided, the SDK will try to read it from the SENTRY_DSN environment variable.
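To make the DSN template concrete, the following minimal sketch splits a DSN into the parts named above using only the Python standard library. The helper name parse_dsn is hypothetical and is not part of Airflow or the Sentry SDK; it only illustrates the shape of the value and can serve as a quick sanity check before you paste a DSN into airflow.cfg.

```python
from urllib.parse import urlsplit


def parse_dsn(dsn: str) -> dict:
    """Split a DSN of the form {PROTOCOL}://{PUBLIC_KEY}@{HOST}/{PROJECT_ID}.

    Hypothetical helper for illustration only; the Sentry SDK does its own parsing.
    """
    parts = urlsplit(dsn)
    # Keep an explicit port as part of the host, if one is given.
    host = parts.hostname
    if host is not None and parts.port is not None:
        host = f"{host}:{parts.port}"
    # The project ID is the last segment of the path.
    project_id = parts.path.rsplit("/", 1)[-1]
    if not (parts.scheme and parts.username and host and project_id):
        raise ValueError(f"DSN does not match the expected template: {dsn!r}")
    return {
        "protocol": parts.scheme,
        "public_key": parts.username,
        "host": host,
        "project_id": project_id,
    }


# The example DSN from the configuration above.
dsn_parts = parse_dsn("http://foo@sentry.io/123")
```

A value that lacks any of the four parts raises ValueError, which is usually easier to diagnose than events silently failing to arrive.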
The before_send option can be used to modify or drop events before they are sent to Sentry. To set this option, provide a dotted path to a before_send function that the Sentry SDK should be configured to use.
[sentry]
before_send = path.to.my.sentry.before_send
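As an illustration, a function at that dotted path might look like the sketch below. The Sentry Python SDK calls before_send with the event dictionary and a hint dictionary, and drops the event when the function returns None. The ignored exception name and the "password" key are assumptions chosen for this example, not values defined by Airflow.

```python
from typing import Any, Optional

# Hypothetical set of exception class names that should not be reported.
IGNORED_EXCEPTION_TYPES = {"AirflowSkipException"}


def before_send(event: dict, hint: dict) -> Optional[dict]:
    """Drop events for ignored exception types and scrub one sensitive field."""
    # For captured exceptions, the hint carries an (type, value, traceback) tuple.
    exc_info: Any = hint.get("exc_info")
    if exc_info is not None and exc_info[0].__name__ in IGNORED_EXCEPTION_TYPES:
        return None  # returning None tells the SDK to drop the event

    # Remove a hypothetical sensitive key from the event's extra data, if present.
    extra = event.get("extra")
    if isinstance(extra, dict):
        extra.pop("password", None)
    return event
```

The module containing the function must be importable by the Airflow processes that run your tasks, so that the dotted path can be resolved.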
You can supply additional configuration options, based on those of the Sentry Python platform, via the [sentry] section. The following options are not supported: integrations, in_app_include, in_app_exclude, ignore_errors, before_breadcrumb, transport.
Tags
Name | Description |
---|---|
| Dag name of the dag that failed |
| Task name of the task that failed |
| Start of data interval when the task failed |
| End of data interval when the task failed |
| Operator name of the task that failed |
For backward compatibility, an additional tag execution_date is also available to represent the logical date. The tag should be considered deprecated in favor of data_interval_start.
Breadcrumbs
When a task fails with an error, breadcrumbs are added for the other tasks in the current DAG run.
Name | Description |
---|---|
| Task ID of task that executed before failed task |
| Final state of task that executed before failed task (only Success and Failed states are captured) |
| Task operator of task that executed before failed task |
| Duration in seconds of task that executed before failed task |
Impact of Sentry on Environment variables passed to Subprocess Hook
When Sentry is enabled, by default it patches the standard library so that all environment variables are passed to subprocesses opened by Airflow. This changes the default behaviour of airflow.hooks.subprocess.SubprocessHook: when a subprocess is executed with a specific set of environment variables, all existing environment variables are always passed to it as well. In this case, not only the specified environment variables are passed, but also all existing environment variables, with the SUBPROCESS_ prefix added. The same applies to all other subprocesses.
This behaviour can be disabled by setting the default_integrations Sentry configuration parameter to False, which disables StdlibIntegration. However, this also disables the other default integrations, so you need to enable them manually if you want them to remain enabled (see Sentry Default Integrations).
[sentry]
default_integrations = False