Pulsar Terminology
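Here is a glossary of terms related to Apache Pulsar:
Concepts
Pulsar
Pulsar is a distributed messaging system originally created by Yahoo and now a top-level project of the Apache Software Foundation.
Message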
Messages are the basic unit of Pulsar. They’re what producers publish to topics and what consumers then consume from topics.
Topic
A named channel that receives the messages published by producers and passes them to the consumers that process them.
Partitioned Topic
A topic that is served by multiple Pulsar brokers, which enables higher throughput.
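As a rough illustration of how a partitioned topic spreads load, the sketch below routes keyed messages to partitions by hashing the key. The `-partition-N` suffix mirrors how Pulsar names the internal topics behind a partitioned topic. The helper names, the example topic, and the use of CRC32 as the hash are assumptions made for illustration; Pulsar clients use their own configurable hashing schemes.

```python
import zlib


def partition_for_key(key: str, num_partitions: int) -> int:
    """Pick a partition index for a message key (CRC32 is a stand-in hash)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_topic_name(topic: str, index: int) -> str:
    """Name of the internal topic that backs one partition."""
    return f"{topic}-partition-{index}"


# Hypothetical topic name, used only for this example.
topic = "persistent://my-tenant/my-namespace/my-topic"
for key in ("user-1", "user-2", "user-3"):
    idx = partition_for_key(key, 4)
    print(key, "->", partition_topic_name(topic, idx))
```

Because the partition is derived from the key alone, messages that share a key always land on the same partition, which preserves per-key ordering.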
Namespace
A grouping mechanism for related topics.
Namespace Bundle
A virtual group of topics that belong to the same namespace. A namespace bundle is defined as a range between two 32-bit hashes, such as 0x00000000 and 0xffffffff.
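To make the hash-range idea concrete, here is a minimal sketch that splits the full 32-bit range into equal bundles and finds the bundle a topic name falls into. The function names are hypothetical and CRC32 is only a stand-in for whatever 32-bit hash the broker actually applies, so the assignments it prints will not match a real cluster.

```python
import bisect
import zlib
from typing import List

FULL_RANGE_END = 0xFFFFFFFF


def bundle_boundaries(num_bundles: int) -> List[int]:
    """Split 0x00000000..0xffffffff into num_bundles contiguous ranges."""
    if num_bundles <= 0:
        raise ValueError("num_bundles must be positive")
    step = 2**32 // num_bundles
    bounds = [i * step for i in range(num_bundles)]
    bounds.append(FULL_RANGE_END)
    return bounds


def bundle_for_topic(topic: str, bounds: List[int]) -> str:
    """Return the bundle range, as 'lower_upper', that the topic hashes into."""
    h = zlib.crc32(topic.encode("utf-8"))  # assumption: stand-in 32-bit hash
    # Last boundary <= h, capped so that 0xffffffff falls into the final bundle.
    i = min(bisect.bisect_right(bounds, h) - 1, len(bounds) - 2)
    return f"0x{bounds[i]:08x}_0x{bounds[i + 1]:08x}"


bounds = bundle_boundaries(4)
print([f"0x{b:08x}" for b in bounds])
print(bundle_for_topic("persistent://my-tenant/my-namespace/my-topic", bounds))
```

With a single bundle, every topic in the namespace maps to the one range `0x00000000_0xffffffff`; splitting the range lets different brokers own different slices of the namespace.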
Tenant
An administrative unit for allocating capacity and enforcing an authentication/authorization scheme.
Subscription
A lease on a topic established by a group of consumers. Pulsar has four subscription modes (exclusive, shared, failover and key_shared).
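The four subscription modes differ mainly in which consumer receives a given message. The toy dispatcher below sketches that difference. It is a simplified model, not the broker's dispatch logic: `make_dispatcher` is a hypothetical helper, the first consumer in the list is assumed to be the active one for failover, and CRC32 stands in for the key hash.

```python
import itertools
import zlib


def make_dispatcher(mode, consumers):
    """Return a function that maps a message key to the receiving consumer."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    if mode == "exclusive":
        # Only a single consumer may be attached to the subscription.
        if len(consumers) != 1:
            raise ValueError("exclusive allows a single consumer")
        return lambda key: consumers[0]
    if mode == "failover":
        # Assumption: the first consumer is active, the rest are standbys.
        return lambda key: consumers[0]
    if mode == "shared":
        # Messages are spread across consumers in round-robin order.
        rr = itertools.cycle(consumers)
        return lambda key: next(rr)
    if mode == "key_shared":
        # Messages with the same key always go to the same consumer.
        return lambda key: consumers[zlib.crc32(key.encode("utf-8")) % len(consumers)]
    raise ValueError(f"unknown subscription mode: {mode}")


shared = make_dispatcher("shared", ["c1", "c2"])
print([shared("any-key") for _ in range(4)])

key_shared = make_dispatcher("key_shared", ["c1", "c2", "c3"])
print(key_shared("order-42") == key_shared("order-42"))
```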
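Pub-Sub
A messaging pattern in which producer processes publish messages to topics and consumer processes then consume (process) those messages.
Producer
A process that publishes messages to a Pulsar topic.
Consumer
A process that subscribes to a Pulsar topic and processes the messages that producers publish to that topic.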
Reader
A message processor that is much like a Pulsar consumer, with two crucial differences:
- you can specify where on a topic readers begin processing messages (consumers always begin with the latest available unacked message);
- readers don't retain data or acknowledge messages.
Cursor
The subscription position for a consumer.
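The contrast between a consumer's cursor and a reader's caller-chosen start position can be sketched with an in-memory log. `ToyTopic` and its methods are hypothetical names. The model assumes that a new subscription starts at the beginning of the log and that delivery counts as an immediate acknowledgment, neither of which is meant to describe Pulsar's actual defaults.

```python
class ToyTopic:
    """In-memory stand-in for a topic; not Pulsar's implementation."""

    def __init__(self):
        self.log = []      # append-only list of messages
        self.cursors = {}  # subscription name -> index of the next message

    def publish(self, msg):
        self.log.append(msg)

    def consume(self, subscription):
        """Consumers resume from the cursor stored for their subscription."""
        pos = self.cursors.setdefault(subscription, 0)
        if pos >= len(self.log):
            return None
        self.cursors[subscription] = pos + 1  # assumption: delivery acks at once
        return self.log[pos]

    def read_from(self, start):
        """Readers choose their own start position and leave no cursor behind."""
        return iter(self.log[start:])


topic = ToyTopic()
for m in ("m0", "m1", "m2"):
    topic.publish(m)

print(topic.consume("sub-a"), topic.consume("sub-a"))
print(list(topic.read_from(1)))
print(topic.cursors)
```

The reader call leaves `cursors` untouched, which is the point of the second difference listed under Reader: position tracking is the caller's job, not the topic's.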
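Acknowledgment (ack)
A message sent to a Pulsar broker by a consumer to indicate that a message has been processed successfully. Pulsar uses acknowledgments to decide whether a message can be deleted from the system; an unacknowledged message is retained until it has been processed.
Negative Acknowledgment (nack)
When an application fails to process a particular message, it can send a negative acknowledgment (nack) to Pulsar to signal that the message should be redelivered later. (By default, failed messages are redelivered after a one-minute delay.) Note that negative acknowledgment on ordered subscription types, such as Exclusive, Failover and Key_Shared, can cause failed messages to reach consumers out of their original order.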
Unacknowledged
A message that has been delivered to a consumer for processing but not yet confirmed as processed by the consumer.
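The ack, nack and unacknowledged states fit together as a small state machine. The following toy model is a sketch only: `ToySubscription` is a hypothetical class, time is passed in explicitly, and the 60-second delay is taken from the default mentioned above. It shows why a negatively acknowledged message can arrive after messages that were published later.

```python
import heapq


class ToySubscription:
    """In-memory model of ack / nack / unacked states (not Pulsar's code)."""

    def __init__(self, messages, redelivery_delay=60.0):
        self.backlog = list(enumerate(messages))  # (message id, payload)
        self.unacked = {}     # delivered but not yet acknowledged
        self.redeliver = []   # heap of (ready_time, message id, payload)
        self.delay = redelivery_delay

    def receive(self, now):
        """Deliver a due redelivery if there is one, else the next backlog entry."""
        if self.redeliver and self.redeliver[0][0] <= now:
            _, mid, payload = heapq.heappop(self.redeliver)
        elif self.backlog:
            mid, payload = self.backlog.pop(0)
        else:
            return None
        self.unacked[mid] = payload
        return mid, payload

    def ack(self, mid):
        """Processing succeeded: the message no longer needs to be tracked."""
        self.unacked.pop(mid, None)

    def nack(self, mid, now):
        """Processing failed: schedule the message for redelivery."""
        payload = self.unacked.pop(mid)
        heapq.heappush(self.redeliver, (now + self.delay, mid, payload))


sub = ToySubscription(["a", "b", "c"])
first = sub.receive(now=0)
sub.nack(first[0], now=0)        # "a" becomes eligible again at t=60
second = sub.receive(now=1)
sub.ack(second[0])
third = sub.receive(now=2)
sub.ack(third[0])
replayed = sub.receive(now=61)   # "a" arrives after "b" and "c"
print([first, second, third, replayed])
```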
Retention Policy
Size and time limits that you can set on a namespace to retain messages that have already been acknowledged.
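One way to picture how a size limit and a time limit combine is the sketch below, which keeps acknowledged messages only while both limits hold. This is an assumed, per-message reading of the policy; the function and parameter names are hypothetical, and a real broker may apply the limits at a coarser granularity.

```python
from typing import List, Tuple


def retained(acked: List[Tuple[float, int]], now: float,
             max_age_s: float, max_bytes: int) -> List[Tuple[float, int]]:
    """Return the acknowledged messages that are still kept.

    acked holds (publish_time_s, size_bytes) pairs, oldest first.
    """
    kept = []
    total = 0
    for publish_time, size in reversed(acked):  # walk from newest to oldest
        if now - publish_time > max_age_s or total + size > max_bytes:
            break  # this message and everything older may be removed
        kept.append((publish_time, size))
        total += size
    kept.reverse()  # restore oldest-first order
    return kept


acked = [(0.0, 100), (50.0, 100), (90.0, 100)]
print(retained(acked, now=100.0, max_age_s=60.0, max_bytes=1000))
print(retained(acked, now=100.0, max_age_s=60.0, max_bytes=150))
```

In the first call the time limit removes the oldest message; in the second the size limit is the tighter constraint and only the newest message survives.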
Multi-Tenancy
The ability to isolate namespaces, specify quotas, and configure authentication and authorization on a per-tenant basis.
Architecture
Standalone
A lightweight Pulsar broker in which all components run in a single Java Virtual Machine (JVM) process. Standalone clusters can be run on a single machine and are useful for development purposes.
Cluster
A set of Pulsar brokers and BookKeeper servers (aka bookies). Clusters can reside in different geographical regions and replicate messages to one another in a process called geo-replication.
Instance
A group of Pulsar clusters that act together as a single unit.
Geo-Replication
Replication of messages across Pulsar clusters, potentially in different datacenters or geographical regions.
Configuration Store
Pulsar's configuration store (previously known as global ZooKeeper) is a ZooKeeper quorum that is used for configuration-specific tasks. A multi-cluster Pulsar installation requires just one configuration store across all clusters.
Topic Lookup
A service provided by Pulsar brokers that enables connecting clients to automatically determine which Pulsar broker is responsible for a topic (and thus where message traffic for the topic needs to be routed).
Service Discovery
A mechanism provided by Pulsar that enables connecting clients to use just a single URL to interact with all the brokers in a cluster.
Broker
A stateless component of Pulsar clusters that runs two other components: an HTTP server exposing a REST interface for administration and topic lookup and a dispatcher that handles all message transfers. Pulsar clusters typically consist of multiple brokers.
Dispatcher
An asynchronous TCP server used for all data transfers in and out of a Pulsar broker. The Pulsar dispatcher uses a custom binary protocol for all communications.
Storage
BookKeeper
Apache BookKeeper is a scalable, low-latency persistent log storage service that Pulsar uses to store data.
Bookie
Bookie is the name of an individual BookKeeper server. It is effectively the storage server of Pulsar.
Ledger
An append-only data structure in BookKeeper that is used to persistently store messages in Pulsar topics.
Functions
Pulsar Functions are lightweight functions that can consume messages from Pulsar topics, apply custom processing logic, and, if desired, publish results to topics.