Monitor
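You can monitor a Pulsar cluster in different ways. Metrics on topic usage and on the overall health of each cluster component together indicate whether the cluster is healthy.
Collect metrics
You can collect stats for brokers, ZooKeeper, and BookKeeper.
Broker stats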
You can collect Pulsar broker metrics from brokers and export the metrics in JSON format. The Pulsar broker metrics mainly have two types:
Destination dumps, which contain stats for each individual topic. You can fetch the destination dumps using the command below:
bin/pulsar-admin broker-stats destinations
Broker metrics, which contain the broker information and topics stats aggregated at namespace level. You can fetch the broker metrics by using the following command:
bin/pulsar-admin broker-stats monitoring-metrics
All metrics are updated once per minute.
The aggregated broker metrics are also exposed in the Prometheus format at:
http://$BROKER_ADDRESS:8080/metrics
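As a rough illustration of consuming this endpoint, the sketch below fetches the Prometheus text output and parses it into simple tuples. It is a minimal parser for the common `name{labels} value` line shape, not a full implementation of the exposition format. The function names, the sample text, and the `example_*` metric names are hypothetical and are not taken from Pulsar.

```python
import re
import urllib.request

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces: key="value" (escaped quotes allowed).
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def fetch_metrics(broker_address, port=8080):
    """Download the raw metrics text from http://<broker_address>:<port>/metrics."""
    url = f"http://{broker_address}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def parse_metrics(text):
    """Parse Prometheus text lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this simple parser does not understand
        name, raw_labels, raw_value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


# Hypothetical sample text, used here instead of a live broker.
SAMPLE = """\
# TYPE example_rate_in gauge
example_rate_in{cluster="standalone",namespace="public/default"} 12.5
example_rate_in{cluster="standalone",namespace="public/other"} 3
example_up 1
"""

if __name__ == "__main__":
    for name, labels, value in parse_metrics(SAMPLE):
        print(name, labels, value)
```

Against a running broker, `parse_metrics(fetch_metrics("localhost"))` would apply the same parsing to live data, assuming the broker is reachable on the default port.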
ZooKeeper stats
The local ZooKeeper, the configuration store server, and the clients shipped with Pulsar can all expose detailed stats through Prometheus.
http://$LOCAL_ZK_SERVER:8000/metrics
http://$GLOBAL_ZK_SERVER:8001/metrics
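The default port of the local ZooKeeper is 8000 and the default port of the configuration store is 8001. You can change the default port of the local ZooKeeper and the configuration store by modifying the stats_server_port setting.
BookKeeper stats
You can change the stats framework for BookKeeper by modifying the statsProviderClass setting in the conf/bookkeeper.conf file.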
The default BookKeeper configuration enables the Prometheus exporter. The configuration is included with Pulsar distribution.
http://$BOOKIE_ADDRESS:8000/metrics
The default port for bookie is 8000. You can change the port by configuring prometheusStatsHttpPort in the conf/bookkeeper.conf file.
Managed cursor acknowledgment state
The acknowledgment state is persisted to the ledger first. If persisting it to the ledger fails, it is persisted to ZooKeeper instead. To track the acknowledgment stats, you can configure the metrics for the managed cursor.
brk_ml_cursor_persistLedgerSucceed(namespace="", ledger_name="", cursor_name="")
brk_ml_cursor_persistLedgerErrors(namespace="", ledger_name="", cursor_name="")
brk_ml_cursor_persistZookeeperSucceed(namespace="", ledger_name="", cursor_name="")
brk_ml_cursor_persistZookeeperErrors(namespace="", ledger_name="", cursor_name="")
brk_ml_cursor_nonContiguousDeletedMessagesRange(namespace="", ledger_name="", cursor_name="")
These metrics are exposed through the Prometheus interface, so you can monitor and check them in Grafana.
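One way to use the managed cursor metrics is to compare ledger persistence errors with successes for each cursor. The sketch below assumes the samples have already been scraped into `(name, labels, value)` tuples and that the values behave like counts; neither assumption is confirmed by this guide. The function name and the sample values are hypothetical.

```python
from collections import defaultdict

SUCCEED = "brk_ml_cursor_persistLedgerSucceed"
ERRORS = "brk_ml_cursor_persistLedgerErrors"


def ledger_persist_error_ratio(samples):
    """Return {(namespace, ledger_name, cursor_name): errors / attempts}."""
    succeeded = defaultdict(float)
    failed = defaultdict(float)
    for name, labels, value in samples:
        key = (
            labels.get("namespace", ""),
            labels.get("ledger_name", ""),
            labels.get("cursor_name", ""),
        )
        if name == SUCCEED:
            succeeded[key] += value
        elif name == ERRORS:
            failed[key] += value

    ratios = {}
    for key in set(succeeded) | set(failed):
        attempts = succeeded[key] + failed[key]
        # A cursor with no recorded attempts gets a ratio of 0.0.
        ratios[key] = failed[key] / attempts if attempts else 0.0
    return ratios


# Hypothetical sample values for a single cursor.
EXAMPLE_LABELS = {
    "namespace": "public/default",
    "ledger_name": "orders",
    "cursor_name": "billing-sub",
}
EXAMPLE_SAMPLES = [
    (SUCCEED, EXAMPLE_LABELS, 3.0),
    (ERRORS, EXAMPLE_LABELS, 1.0),
]

if __name__ == "__main__":
    for key, ratio in ledger_persist_error_ratio(EXAMPLE_SAMPLES).items():
        print(key, ratio)
```

A rising ratio would suggest that acknowledgment state is falling back to ZooKeeper more often, which the persistZookeeper metrics could then confirm.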
Function and connector stats
You can collect functions worker stats from functions-worker and export the metrics in JSON format. The output contains the functions worker JVM metrics.
pulsar-admin functions-worker monitoring-metrics
You can collect functions and connectors metrics from functions-worker and export the metrics in JSON format.
pulsar-admin functions-worker function-stats
The aggregated functions and connectors metrics can be exposed in the Prometheus format at the address below. You can get FUNCTIONS_WORKER_ADDRESS and WORKER_PORT from the functions_worker.yml file.
http://$FUNCTIONS_WORKER_ADDRESS:$WORKER_PORT/metrics
Configure Prometheus
You can use Prometheus to collect all the metrics exposed by Pulsar components and use Grafana to display them. Together they let you monitor your Pulsar cluster. For details, refer to Prometheus guide.
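When you run Pulsar on bare metal, you need to provide the list of nodes to be probed. When you deploy Pulsar in a Kubernetes cluster, the monitoring is set up automatically. For details, refer to Kubernetes instructions.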
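For a bare-metal deployment, the node list can be generated rather than written by hand. The sketch below builds a structure shaped like the `scrape_configs` section of a Prometheus configuration, using the default ports mentioned in this guide. The host names, job names, and function name are hypothetical, and the result would still need to be written out as YAML and merged into your own Prometheus configuration.

```python
import json

# Default metrics ports as stated in this guide.
DEFAULT_PORTS = {
    "broker": 8080,
    "local_zookeeper": 8000,
    "configuration_store": 8001,
    "bookie": 8000,
}


def build_scrape_configs(hosts_by_component, ports=DEFAULT_PORTS):
    """Build one static scrape job per component from a host list."""
    jobs = []
    for component, hosts in hosts_by_component.items():
        port = ports[component]
        jobs.append(
            {
                "job_name": component,  # hypothetical job naming scheme
                "static_configs": [
                    {"targets": [f"{host}:{port}" for host in hosts]}
                ],
            }
        )
    return {"scrape_configs": jobs}


if __name__ == "__main__":
    # Hypothetical host names for illustration only.
    config = build_scrape_configs(
        {
            "broker": ["broker-1.example.com"],
            "bookie": ["bookie-1.example.com", "bookie-2.example.com"],
        }
    )
    print(json.dumps(config, indent=2))
```

Because Prometheus scrapes the `/metrics` path by default, the targets here only need a host and a port.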
Dashboards
When you collect time series statistics, the main challenge is to keep the number of dimensions attached to the data from exploding. For that reason, time series metrics are collected only after being aggregated at the namespace level.
Pulsar per-topic dashboard
Pulsar Manager provides a per-topic dashboard.
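To illustrate why namespace-level aggregation keeps the number of series small, the sketch below rolls per-topic values up to their namespace. It relies on the Pulsar topic name layout `persistent://tenant/namespace/topic`. The function names, topic names, and rates are hypothetical, and this is not how the broker itself performs the aggregation.

```python
from collections import defaultdict


def namespace_of(topic):
    """Extract 'tenant/namespace' from 'persistent://tenant/namespace/topic'."""
    _, _, rest = topic.partition("://")
    tenant, namespace, _ = rest.split("/", 2)
    return f"{tenant}/{namespace}"


def aggregate_by_namespace(topic_rates):
    """Sum per-topic values into one value per namespace."""
    totals = defaultdict(float)
    for topic, rate in topic_rates.items():
        totals[namespace_of(topic)] += rate
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical per-topic message rates.
    rates = {
        "persistent://public/default/orders": 10.0,
        "persistent://public/default/payments": 5.0,
        "persistent://acme/billing/invoices": 2.5,
    }
    print(aggregate_by_namespace(rates))
```

Three topic-level series collapse into two namespace-level series here; with thousands of topics per namespace, the reduction is correspondingly larger.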
Grafana
You can use Grafana to create dashboards driven by the data stored in Prometheus.
When you deploy Pulsar on Kubernetes, a pulsar-grafana Docker image is enabled by default. You can use the Docker image with the principal dashboards.
Enter the command below to use the dashboard manually:
docker run -p3000:3000 \
-e PROMETHEUS_URL=http://$PROMETHEUS_HOST:9090/ \
apachepulsar/pulsar-grafana:latest
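The following are some Grafana dashboard examples:
- pulsar-grafana: a Grafana dashboard that displays metrics collected in Prometheus for Pulsar clusters running on Kubernetes.
- apache-pulsar-grafana-dashboard: a collection of Grafana dashboard templates for different Pulsar components, usable both on Kubernetes and on local machines.
Alerting rules
You can set alerting rules according to your Pulsar environment. To configure alerting rules for Apache Pulsar, refer to the StreamNative platform examples or the Alert Manager alerting rules.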