Pulsar connector overview
当一起使用消息系统和外部系统(例如,数据库或其它消息系统)时,消息系统的功能十分强大。
Pulsar IO connectors enable you to easily create, deploy, and manage connectors that interact with external systems, such as Apache Cassandra, Aerospike, and many others.
概念
Pulsar IO connectors come in two types: source and sink.
下图说明了 source、Pulasr 和 sink 之间的关系:
Source
Sources feed data from external systems into Pulsar.
通用的 source 包括其它消息系统和流水式数据管道 API。
想要了解 Pulsar 内置 source 连接器的完整列表,参阅 source connector。
Sink
Sinks feed data from Pulsar into external systems.
Common sinks include other messaging systems and SQL and NoSQL databases.
了解 Pulsar 内置 sink 连接器的完整列表,参阅 sink connector。
Processing guarantee
Processing guarantees 用于处理向 Pulsar 主题写入消息时发生的错误。
Pulsar connectors and Functions use the same processing guarantees as below.
传递语义 | Description |
---|---|
at-most-once | Each message sent to a connector is to be processed once or not to be processed. |
at-least-once | Each message sent to a connector is to be processed once or more than once. |
effectively-once | Each message sent to a connector has one output associated with it. |
Processing guarantees for connectors not just rely on Pulsar guarantee but also relate to external systems, that is, the implementation of source and sink.
Source: Pulsar ensures that writing messages to Pulsar topics respects to the processing guarantees. It is within Pulsar’s control.
Sink: the processing guarantees rely on the sink implementation. If the sink implementation does not handle retries in an idempotent way, the sink does not respect to the processing guarantees.
Set
创建连接器时,可以使用以下语义设置 processing guarantee:
ATLEAST_ONCE
ATMOST_ONCE
EFFECTIVELY_ONCE
如果在创建连接器时,没有指定
--processing-guarantees
,则默认语义为ATLEAST_ONCE
。
Here takes Admin CLI as an example. For more information about REST API or JAVA Admin API, see here.
Source
Sink
$ bin/pulsar-admin sources create \ --processing-guarantees ATMOST_ONCE \ # Other source configs
了解更多关于 pulsar-admin 源代码创建
的信息,参阅这里。
$ bin/pulsar-admin sinks create \ --processing-guarantees EFFECTIVELY_ONCE \ # Other sink configs
了解更多关于 pulsar-admin sinks 创建
的信息,参阅这里。
更新
创建连接器后,可以使用以下语义更新 processing guarantee:
ATLEAST_ONCE
ATMOST_ONCE
EFFECTIVELY_ONCE
Here takes Admin CLI as an example. For more information about REST API or JAVA Admin API, see here.
Source
Sink
$ bin/pulsar-admin sources update \ --processing-guarantees EFFECTIVELY_ONCE \ # Other source configs
了解更多关于 pulsar-admin 源代码更新
的信息,参阅这里。
$ bin/pulsar-admin sinks update \ --processing-guarantees ATMOST_ONCE \ # Other sink configs
了解更多关于 pulsar-admin sinks 更新
的信息,参阅这里。
使用连接器
可以通过 Connector Admin CLI、sources 和 sinks 子命令来管理 Pulsar 连接器(例如,创建、更新、启动、停止、重启、重载等操作)。
连接器(sources 和 sinks)和 Functions 是实例的组成部分,都在 Functions workers 上运行。 通过 Connector Admin CLI 或 Functions Admin CLI 管理 source 时,实例在 worker 上启动。 了解更多信息,参阅 Functions worker。