Post-commit Callback

Apache Hudi can post a callback notification after a write commit. This is useful if you need an event notification stream to trigger actions in other services once a Hudi write commit completes. You can push a write commit callback notification to an HTTP endpoint, a Kafka topic, or a Pulsar topic.

HTTP Endpoints

You can push a commit notification to an HTTP URL. To send custom values, extend the callback class described in the "Bring your own implementation" section.

| Config | Description | Required | Default |
| --- | --- | --- | --- |
| TURN_CALLBACK_ON | Turn commit callback on/off | optional | false (callbacks off) |
| CALLBACK_HTTP_URL | Callback host URL to which callback messages are sent | required | N/A |
| CALLBACK_HTTP_TIMEOUT_IN_SECONDS | Callback timeout in seconds | optional | 3 |
| CALLBACK_CLASS_NAME | Fully qualified name of the callback class; must be a subclass of HoodieWriteCommitCallback | optional | org.apache.hudi.callback.impl.HoodieWriteCommitHttpCallback |
| CALLBACK_HTTP_API_KEY_VALUE | HTTP callback API key | optional | hudi_write_commit_http_callback |
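
As a rough sketch of how these settings might be passed to a writer, the snippet below builds an options dictionary for the HTTP callback. The `hoodie.write.commit.callback.*` property keys are assumptions about what the constants in the table map to (the table lists only the constant names), `http_callback_options` is a hypothetical helper, and the URL is a placeholder. Check the keys against your Hudi version before relying on them.

```python
# Assumed property keys for the constants in the table above;
# they are not stated in this document, so verify them for your Hudi version.
TURN_CALLBACK_ON = "hoodie.write.commit.callback.on"
CALLBACK_HTTP_URL = "hoodie.write.commit.callback.http.url"
CALLBACK_HTTP_TIMEOUT_IN_SECONDS = "hoodie.write.commit.callback.http.timeout.seconds"
CALLBACK_HTTP_API_KEY_VALUE = "hoodie.write.commit.callback.http.api.key"


def http_callback_options(url, timeout_seconds=3, api_key=None):
    """Build writer options that enable the HTTP commit callback (hypothetical helper)."""
    if not url:
        raise ValueError("CALLBACK_HTTP_URL is required")
    options = {
        TURN_CALLBACK_ON: "true",  # callbacks are off by default
        CALLBACK_HTTP_URL: url,
        CALLBACK_HTTP_TIMEOUT_IN_SECONDS: str(timeout_seconds),
    }
    # Leave the API key unset to fall back to the default listed in the table.
    if api_key is not None:
        options[CALLBACK_HTTP_API_KEY_VALUE] = api_key
    return options


# Placeholder endpoint, for illustration only.
opts = http_callback_options("http://localhost:8080/hudi/commits")
```

With PySpark, a dictionary like this could be merged into the usual write options, for example `df.write.format("hudi").options(**hudi_options, **opts)`, where `hudi_options` stands for your existing table settings.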

Kafka Endpoints

You can push a commit notification to a Kafka topic so that other real-time systems can consume it.

| Config | Description | Required | Default |
| --- | --- | --- | --- |
| TOPIC | Kafka topic name to publish timeline activity into. | required | N/A |
| PARTITION | It may be desirable to serialize all changes into a single Kafka partition for providing strict ordering. By default, Kafka messages are keyed by table name, which guarantees ordering at the table level, but not globally (or when new partitions are added). | required | N/A |
| RETRIES | Number of times to retry the produce | optional | 3 |
| ACKS | Kafka acks level, all by default to ensure strong durability | optional | all |
| BOOTSTRAP_SERVERS | Bootstrap servers of the Kafka cluster, to be used for publishing commit metadata | required | N/A |
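
The required/optional split in the table can be sketched as a small validation step before a write. In this sketch the `hoodie.write.commit.callback.kafka.` prefix, the lowercase setting names, the two generic callback keys, and the Kafka callback class name are all assumptions not stated in this document, and `kafka_callback_options` is a hypothetical helper; the topic and broker address are placeholders.

```python
# Assumed prefix and class name; neither appears in the table above.
KAFKA_PREFIX = "hoodie.write.commit.callback.kafka."
KAFKA_CALLBACK_CLASS = (
    "org.apache.hudi.utilities.callback.kafka.HoodieWriteCommitKafkaCallback"
)

# Required settings and defaults, taken from the table.
REQUIRED = ("topic", "partition", "bootstrap.servers")
DEFAULTS = {"retries": "3", "acks": "all"}


def kafka_callback_options(settings):
    """Validate Kafka callback settings and expand them to writer options."""
    missing = [name for name in REQUIRED if name not in settings]
    if missing:
        raise ValueError(f"missing required Kafka callback settings: {missing}")
    # User-supplied values override the defaults; everything is passed as a string.
    merged = {**DEFAULTS, **{k: str(v) for k, v in settings.items()}}
    options = {KAFKA_PREFIX + name: value for name, value in merged.items()}
    # Assumed generic keys to switch callbacks on and select the Kafka callback.
    options["hoodie.write.commit.callback.on"] = "true"
    options["hoodie.write.commit.callback.class"] = KAFKA_CALLBACK_CLASS
    return options


# Placeholder topic and broker, for illustration only.
kafka_opts = kafka_callback_options(
    {"topic": "hudi-commits", "partition": 0, "bootstrap.servers": "localhost:9092"}
)
```

Failing fast on a missing required setting keeps a misconfigured callback from surfacing only after a commit has already been written.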

Pulsar Endpoints

You can push a commit notification to a Pulsar topic so that other real-time systems can consume it.

| Config | Description | Required | Default |
| --- | --- | --- | --- |
| hoodie.write.commit.callback.pulsar.broker.service.url | Service URL of the Pulsar cluster used to publish commit metadata. | required | N/A |
| hoodie.write.commit.callback.pulsar.topic | Pulsar topic name to publish timeline activity into. | required | N/A |
| hoodie.write.commit.callback.pulsar.producer.route-mode | Message routing logic for producers on partitioned topics. | optional | RoundRobinPartition |
| hoodie.write.commit.callback.pulsar.producer.pending-queue-size | The maximum size of a queue holding pending messages. | optional | 1000 |
| hoodie.write.commit.callback.pulsar.producer.pending-total-size | The maximum number of pending messages across partitions. | required | 50000 |
| hoodie.write.commit.callback.pulsar.producer.block-if-queue-full | When the queue is full, the send call blocks instead of throwing an exception. | optional | true |
| hoodie.write.commit.callback.pulsar.producer.send-timeout | Timeout for each send to Pulsar. | optional | 30s |
| hoodie.write.commit.callback.pulsar.operation-timeout | How long to wait for an operation to complete. | optional | 30s |
| hoodie.write.commit.callback.pulsar.connection-timeout | How long to wait for a connection to a broker to be established. | optional | 10s |
| hoodie.write.commit.callback.pulsar.request-timeout | How long to wait for a request to complete. | optional | 60s |
| hoodie.write.commit.callback.pulsar.keepalive-interval | Keep-alive interval for each client-broker connection. | optional | 30s |
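
Because the Pulsar keys are long and easy to mistype, one option is to assemble them from short names and reject anything not listed in the table. The sketch below does that; `pulsar_callback_options` is a hypothetical helper, and the broker URL and topic are placeholders. It covers only the Pulsar-specific keys from the table, so turning the callback on and selecting a Pulsar callback class are not shown.

```python
PULSAR_PREFIX = "hoodie.write.commit.callback.pulsar."

# Tunable settings from the table, with the common prefix stripped.
TUNABLE = {
    "producer.route-mode",
    "producer.pending-queue-size",
    "producer.pending-total-size",
    "producer.block-if-queue-full",
    "producer.send-timeout",
    "operation-timeout",
    "connection-timeout",
    "request-timeout",
    "keepalive-interval",
}


def pulsar_callback_options(broker_service_url, topic, overrides=None):
    """Build the Pulsar-specific callback options, rejecting unknown names."""
    overrides = overrides or {}
    unknown = sorted(set(overrides) - TUNABLE)
    if unknown:
        raise ValueError(f"unknown Pulsar callback settings: {unknown}")
    options = {
        PULSAR_PREFIX + "broker.service.url": broker_service_url,
        PULSAR_PREFIX + "topic": topic,
    }
    # Only explicitly overridden settings are emitted; the rest keep their defaults.
    options.update({PULSAR_PREFIX + k: str(v) for k, v in overrides.items()})
    return options


# Placeholder broker URL and topic, for illustration only.
pulsar_opts = pulsar_callback_options(
    "pulsar://localhost:6650", "hudi-commits", {"producer.send-timeout": "10s"}
)
```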

Bring your own implementation

You can extend the HoodieWriteCommitCallback class to implement your own way to asynchronously handle the callback of a successful write. Use this public API:

https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/callback/HoodieWriteCommitCallback.java