Outbox Event Router

Example outbox message

To understand how the Debezium outbox event router SMT is configured, review the following example of a Debezium outbox message:

  # Kafka Topic: outbox.event.order
  # Kafka Message key: "1"
  # Kafka Message Headers: "id=4d47e190-0402-4048-bc2c-89dd54343cdc"
  # Kafka Message Timestamp: 1556890294484
  {
    "{\"id\": 1, \"lineItems\": [{\"id\": 1, \"item\": \"Debezium in Action\", \"status\": \"ENTERED\", \"quantity\": 2, \"totalPrice\": 39.98}, {\"id\": 2, \"item\": \"Debezium for Dummies\", \"status\": \"ENTERED\", \"quantity\": 1, \"totalPrice\": 29.99}], \"orderDate\": \"2019-01-31T12:13:01\", \"customerId\": 123}"
  }

A Debezium connector that is configured to apply the outbox event router SMT generates the above message by transforming a Debezium raw message like this:

  # Kafka Message key: "406c07f3-26f0-4eea-a50c-109940064b8f"
  # Kafka Message Headers: ""
  # Kafka Message Timestamp: 1556890294484
  {
    "before": null,
    "after": {
      "id": "406c07f3-26f0-4eea-a50c-109940064b8f",
      "aggregateid": "1",
      "aggregatetype": "Order",
      "payload": "{\"id\": 1, \"lineItems\": [{\"id\": 1, \"item\": \"Debezium in Action\", \"status\": \"ENTERED\", \"quantity\": 2, \"totalPrice\": 39.98}, {\"id\": 2, \"item\": \"Debezium for Dummies\", \"status\": \"ENTERED\", \"quantity\": 1, \"totalPrice\": 29.99}], \"orderDate\": \"2019-01-31T12:13:01\", \"customerId\": 123}",
      "timestamp": 1556890294344,
      "type": "OrderCreated"
    },
    "source": {
      "version": "2.5.4.Final",
      "connector": "postgresql",
      "name": "dbserver1-bare",
      "db": "orderdb",
      "ts_usec": 1556890294448870,
      "txId": 584,
      "lsn": 24064704,
      "schema": "inventory",
      "table": "outboxevent",
      "snapshot": false,
      "last_snapshot_record": null,
      "xmin": null
    },
    "op": "c",
    "ts_ms": 1556890294484
  }

This example of a Debezium outbox message is based on the default outbox event router configuration, which assumes an outbox table structure and event routing based on aggregates. To customize behavior, the outbox event router SMT provides numerous configuration options.

Basic outbox table

The default outbox event router SMT configuration assumes that your outbox table has the following columns:

      Column     |          Type          | Modifiers
  ---------------+------------------------+-----------
   id            | uuid                   | not null
   aggregatetype | character varying(255) | not null
   aggregateid   | character varying(255) | not null
   type          | character varying(255) | not null
   payload       | jsonb                  |

Table 1. Descriptions of expected outbox table columns
Column | Effect

id

Contains the unique ID of the event. In an outbox message, this value is a header. You can use this ID, for example, to remove duplicate messages.

To obtain the unique ID of the event from a different outbox table column, set the table.field.event.id SMT option in the connector configuration.

aggregatetype

Contains a value that the SMT appends to the name of the topic to which the connector emits an outbox message. The default behavior is that this value replaces the default ${routedByValue} variable in the route.topic.replacement SMT option.

For example, in a default configuration, the route.by.field SMT option is set to aggregatetype and the route.topic.replacement SMT option is set to outbox.event.${routedByValue}. Suppose that your application adds two records to the outbox table. In the first record, the value in the aggregatetype column is customers. In the second record, the value in the aggregatetype column is orders. The connector emits the first record to the outbox.event.customers topic. The connector emits the second record to the outbox.event.orders topic.

To obtain this value from a different outbox table column, set the route.by.field SMT option in the connector configuration.

aggregateid

Contains the event key, which provides an ID for the payload. The SMT uses this value as the key in the emitted outbox message. This is important for maintaining correct order in Kafka partitions.

To obtain the event key from a different outbox table column, set the table.field.event.key SMT option in the connector configuration.

payload

A representation of the outbox change event. The default structure is JSON. By default, the Kafka message value consists solely of the payload value. However, if the outbox event is configured to include additional fields, the Kafka message value contains an envelope that encapsulates both the payload and the additional fields, and each field is represented separately. For more information, see Emitting messages with additional fields.

To obtain the event payload from a different outbox table column, set the table.field.event.payload SMT option in the connector configuration.

Additional custom columns

Any additional columns from the outbox table can be added to outbox events either within the payload section or as a message header.

One example could be a column eventType which conveys a user-defined value that helps to categorize or organize events.

Basic configuration

To configure a Debezium connector to support the outbox pattern, configure the outbox.EventRouter SMT. To obtain the default behavior of the SMT, add it to the connector configuration without specifying any options, as in the following example:

  transforms=outbox,...
  transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
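When the outbox table does not use the default column names, the mapping options described in Table 1 can be set together. The following is a minimal sketch, assuming a hypothetical outbox table whose columns are named event_id, order_id, body, and event_category. Only the option names come from the descriptions above.

```properties
transforms=outbox
transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
# Hypothetical column names; replace them with the columns of your own outbox table.
transforms.outbox.table.field.event.id=event_id
transforms.outbox.table.field.event.key=order_id
transforms.outbox.table.field.event.payload=body
# Route by a column other than aggregatetype (hypothetical column name).
transforms.outbox.route.by.field=event_category
# Default topic pattern, written out for clarity.
transforms.outbox.route.topic.replacement=outbox.event.${routedByValue}
```

With this sketch, a record whose event_category value is orders would be routed to the outbox.event.orders topic, following the same rule as the aggregatetype example in Table 1.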

Customizing the configuration

The connector might emit many types of event messages (for example, heartbeat messages, tombstone messages, or metadata messages about transactions or schema changes). To apply the transformation only to events that originate in the outbox table, define an SMT predicate statement that selectively applies the transformation to those events only.

Options for applying the transformation selectively

In addition to the change event messages that a Debezium connector emits when a database change occurs, the connector also emits other types of messages, including heartbeat messages, and metadata messages about schema changes and transactions. Because the structure of these other messages differs from the structure of the change event messages that the SMT is designed to process, it’s best to configure the connector to selectively apply the SMT, so that it processes only the intended data change messages. You can use one of the following methods to configure the connector to apply the SMT selectively:

  • Configure an SMT predicate for the transformation.

  • Use the route.topic.regex configuration option for the SMT.
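As an illustration of the predicate approach, the following sketch uses the TopicNameMatches predicate that ships with Apache Kafka Connect (predicates are available since Kafka 2.6). The predicate alias IsOutboxTable and the topic pattern are assumptions. The pattern presumes that raw change events for the outbox table arrive on a topic whose name ends in .outboxevent, so adjust it to your own topic naming.

```properties
transforms=outbox
transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
# Apply the SMT only to records for which the predicate below is true.
transforms.outbox.predicate=IsOutboxTable
predicates=IsOutboxTable
predicates.IsOutboxTable.type=org.apache.kafka.connect.transforms.predicates.TopicNameMatches
# Assumed pattern: matches topics such as <prefix>.<schema>.outboxevent.
# [.] is used instead of \. because a backslash is an escape character in properties files.
predicates.IsOutboxTable.pattern=.*[.]outboxevent
```

Records on other topics, such as heartbeat or transaction metadata topics, would then pass through the connector without being handled by the event router.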

Using Avro as the payload format

The outbox event router SMT supports arbitrary payload formats. The payload column value in an outbox table is passed on transparently. An alternative to working with JSON is to use Avro. This can be beneficial for message format governance and for ensuring that outbox event schemas evolve in a backwards-compatible way.

How a source application produces Avro formatted content for outbox message payloads is out of the scope of this documentation. One possibility is to leverage the KafkaAvroSerializer class to serialize GenericRecord instances. To ensure that the Kafka message value is the exact Avro binary data, apply the following configuration to the connector:

  transforms=outbox,...
  transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
  value.converter=io.debezium.converters.BinaryDataConverter

By default, the payload column value (the Avro data) is the only message value. Configuration of BinaryDataConverter as the value converter propagates the payload column value as-is into the Kafka message value.

Debezium connectors can be configured to emit heartbeat, transaction metadata, or schema change events (support varies by connector). The BinaryDataConverter cannot serialize these events, so you must provide additional configuration that tells the converter how to serialize them. As an example, the following configuration uses the Apache Kafka JsonConverter with schemas disabled:

  transforms=outbox,...
  transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
  value.converter=io.debezium.converters.BinaryDataConverter
  value.converter.delegate.converter.type=org.apache.kafka.connect.json.JsonConverter
  value.converter.delegate.converter.type.schemas.enable=false

The delegate.converter.type option specifies the delegate Converter implementation. Any extra configuration options that the delegate converter needs can also be specified, such as schemas.enable=false, which disables schemas in the preceding example.

The io.debezium.converters.ByteBufferConverter converter was deprecated in Debezium 1.9 and removed in 2.0. If you use Kafka Connect, you must update the connector’s configuration before you upgrade to Debezium 2.x.

Emitting messages with additional fields

Your outbox table might contain columns whose values you want to add to the emitted outbox messages. For example, consider an outbox table that has a value of purchase-order in the aggregatetype column and another column, eventType, whose possible values are order-created and order-shipped. Additional fields can be added with the syntax column:placement:alias.

The allowed values for placement are:

  • header

  • envelope

  • partition

To emit the eventType column value in the outbox message header, configure the SMT like this:

  transforms=outbox,...
  transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
  transforms.outbox.table.fields.additional.placement=eventType:header:type

The result will be a header on the Kafka message with type as its key, and the value of the eventType column as its value.

To emit the eventType column value in the outbox message envelope, configure the SMT like this:

  transforms=outbox,...
  transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
  transforms.outbox.table.fields.additional.placement=eventType:envelope:type

To control which partition the outbox message is produced on, configure the SMT like this:

  transforms=outbox,...
  transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
  transforms.outbox.table.fields.additional.placement=partitionColumn:partition

Note that for the partition placement, adding an alias will have no effect.
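The configuration options table describes this option as a comma-separated list, so several columns can be placed in one setting. The following is a minimal sketch, assuming that the outbox table has a hypothetical tenantId column in addition to the eventType and partitionColumn columns used above. Whether all three placements are combined in one connector depends on your use case.

```properties
transforms=outbox
transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
# eventType       -> message header named "type"
# tenantId        -> envelope field (hypothetical column, no alias)
# partitionColumn -> target partition (an alias would have no effect here)
transforms.outbox.table.fields.additional.placement=eventType:header:type,tenantId:envelope,partitionColumn:partition
```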

Expanding escaped JSON String as JSON

You might have noticed that the Debezium outbox message represents the payload as a String. When this string actually contains JSON, it appears escaped in the resulting Kafka message, as shown below:

  # Kafka Topic: outbox.event.order
  # Kafka Message key: "1"
  # Kafka Message Headers: "id=4d47e190-0402-4048-bc2c-89dd54343cdc"
  # Kafka Message Timestamp: 1556890294484
  {
    "{\"id\": 1, \"lineItems\": [{\"id\": 1, \"item\": \"Debezium in Action\", \"status\": \"ENTERED\", \"quantity\": 2, \"totalPrice\": 39.98}, {\"id\": 2, \"item\": \"Debezium for Dummies\", \"status\": \"ENTERED\", \"quantity\": 1, \"totalPrice\": 29.99}], \"orderDate\": \"2019-01-31T12:13:01\", \"customerId\": 123}"
  }

The outbox event router can expand this message content to “real” JSON, deducing the companion schema from the JSON document itself. The resulting Kafka message then looks like this:

  # Kafka Topic: outbox.event.order
  # Kafka Message key: "1"
  # Kafka Message Headers: "id=4d47e190-0402-4048-bc2c-89dd54343cdc"
  # Kafka Message Timestamp: 1556890294484
  {
    "id": 1, "lineItems": [{"id": 1, "item": "Debezium in Action", "status": "ENTERED", "quantity": 2, "totalPrice": 39.98}, {"id": 2, "item": "Debezium for Dummies", "status": "ENTERED", "quantity": 1, "totalPrice": 29.99}], "orderDate": "2019-01-31T12:13:01", "customerId": 123
  }

To enable this transformation, set the table.expand.json.payload option to true and use the JsonConverter, as in the following example:

  transforms=outbox,...
  transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
  transforms.outbox.table.expand.json.payload=true
  value.converter=org.apache.kafka.connect.json.JsonConverter

Configuration options

The following table describes the options that you can specify for the outbox event router SMT. In the table, the Group column indicates a configuration option classification for Kafka.

Table 2. Descriptions of outbox event router SMT configuration options
Option | Default | Group | Description

table.op.invalid.behavior

warn

Table

Determines the behavior of the SMT when there is an UPDATE operation on the outbox table. Possible settings are:

  • warn - The SMT logs a warning and continues to the next outbox table record.

  • error - The SMT logs an error and continues to the next outbox table record.

  • fatal - The SMT logs an error and the connector stops processing.

All changes in an outbox table are expected to be INSERT operations. That is, an outbox table functions as a queue; updates to records in an outbox table are not allowed. The SMT automatically filters out DELETE operations on an outbox table.

table.field.event.id

id

Table

Specifies the outbox table column that contains the unique event ID. This ID will be stored in the emitted event’s headers under the id key.

table.field.event.key

aggregateid

Table

Specifies the outbox table column that contains the event key. When this column contains a value, the SMT uses that value as the key in the emitted outbox message. This is important for maintaining correct order in Kafka partitions.

table.field.event.timestamp

Table

By default, the timestamp in the emitted outbox message is the Debezium event timestamp. To use a different timestamp in outbox messages, set this option to an outbox table column that contains the timestamp that you want to be in emitted outbox messages.

table.field.event.payload

payload

Table

Specifies the outbox table column that contains the event payload.

table.expand.json.payload

false

Table

Specifies whether to expand a String payload into JSON. If no content is found, or if a parsing error occurs, the content is kept “as is”.

For more details, see Expanding escaped JSON String as JSON.

table.json.payload.null.behavior

ignore

Table

When JSON expansion is enabled through the table.expand.json.payload property, determines how the SMT handles a JSON payload from the outbox table that includes a null value. Possible settings are:

  • ignore - Ignore the null value.

  • optional_bytes - Keep the null value, and treat it as an optional bytes value in Kafka Connect.

table.fields.additional.placement

Table, Envelope

Specifies one or more outbox table columns that you want to add to outbox message headers or envelopes. Specify a comma-separated list of pairs. In each pair, specify the name of a column and whether you want the value to be in the header or the envelope. Separate the values in the pair with a colon, for example:

id:header,my-field:envelope

To specify an alias for the column, specify a trio with the alias as the third value, for example:

id:header,my-field:envelope:my-alias

The second value is the placement, and it must always be header, envelope, or partition.

table.fields.additional.error.on.missing

true

Table, Envelope

Specifies whether this transformation throws an error if a field specified by the table.fields.additional.placement property is not found in the Outbox payload.

table.field.event.schema.version

Table, Schema

When set, this value is used as the schema version as described in the Kafka Connect Schema Javadoc.

route.by.field

aggregatetype

Router

Specifies the name of a column in the outbox table. The default behavior is that the value in this column becomes a part of the name of the topic to which the connector emits the outbox messages. An example is in the description of the expected outbox table.

route.topic.regex

(?<routedByValue>.*)

Router

Specifies a regular expression that the outbox SMT applies in the RegexRouter to outbox table records. This regular expression is part of the setting of the route.topic.replacement SMT option.

The default behavior is that the SMT replaces the default ${routedByValue} variable in the setting of the route.topic.replacement SMT option with the setting of the route.by.field outbox SMT option.

route.topic.replacement

outbox.event.${routedByValue}

Router

Specifies the name of the topic to which the connector emits outbox messages. The default topic name is outbox.event. followed by the aggregatetype column value in the outbox table record. For example, if the aggregatetype value is customers, the topic name is outbox.event.customers.

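To change the topic name, you can:

  • Set the route.by.field option to a different column.

  • Set the route.topic.regex option to a different regular expression.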

route.tombstone.on.empty.payload

false

Router

Indicates whether an empty or null payload causes the connector to emit a tombstone event.

tracing.span.context.field

tracingspancontext

Tracing

The name of the field containing tracing span context.

tracing.operation.name

debezium-read

Tracing

The operation name representing the Debezium processing span.

tracing.with.context.field.only

false

Tracing

When true, only events that have a serialized context field are traced.
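To show how several of the options in Table 2 fit together, the following sketch stops the connector on unexpected updates, takes the message timestamp from an outbox column, and emits tombstones for empty payloads. The option names table.op.invalid.behavior, table.field.event.timestamp, and route.tombstone.on.empty.payload are the names used in the Debezium reference documentation, so verify them against the version that you run. The created_at column is hypothetical.

```properties
transforms=outbox
transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
# Stop processing if an outbox record is updated instead of inserted.
transforms.outbox.table.op.invalid.behavior=fatal
# Take the message timestamp from an outbox table column (hypothetical column name).
transforms.outbox.table.field.event.timestamp=created_at
# Emit a tombstone event when the payload is empty or null.
transforms.outbox.route.tombstone.on.empty.payload=true
```

Choosing fatal rather than the default warn is a design decision: it surfaces application bugs that modify outbox rows, at the cost of halting the connector until the cause is fixed.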

Distributed tracing

The outbox event routing SMT has support for distributed tracing. See tracing documentation for more details.
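As a minimal sketch, the tracing-related options can be written out with the defaults listed in Table 2. The option names tracing.span.context.field, tracing.operation.name, and tracing.with.context.field.only are assumed from the Debezium reference documentation. The tracing setup itself is outside the scope of this sketch.

```properties
transforms=outbox
transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
# Outbox table column that carries the serialized span context (default shown).
transforms.outbox.tracing.span.context.field=tracingspancontext
# Operation name that represents the Debezium processing span (default shown).
transforms.outbox.tracing.operation.name=debezium-read
# Set to true to trace only events that have the serialized context field.
transforms.outbox.tracing.with.context.field.only=false
```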