Kafka Exporter
grafana-agent ships with a built-in kafka_exporter for collecting Kafka metrics.
We strongly recommend running grafana-agent under a dedicated account, granting it only the minimum permissions needed to access your Kafka instances; over-broad authorization is a security risk. See the documentation for more details.
Configure and enable kafka_exporter
kafka_exporter:
  enabled: true
  # Address array (host:port) of Kafka servers
  kafka_uris: ['xxx','yyy']
  # Explicit instance label, required because kafka_uris has more than one entry (see below)
  instance: 'my-kafka-cluster'
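For orientation, the snippet above shows only the integration block itself. The sketch below places it in a complete grafana-agent configuration file; the wal_directory, remote_write endpoint, and broker addresses are illustrative assumptions, not values prescribed by this integration:

prometheus:
  wal_directory: /tmp/grafana-agent-wal   # assumed path
  global:
    scrape_interval: 60s

integrations:
  # Where integration metrics are remote-written (endpoint is an assumption)
  prometheus_remote_write:
    - url: https://prometheus.example.com/api/v1/write
  kafka_exporter:
    enabled: true
    # Assumed broker addresses; replace with your own host:port pairs
    kafka_uris: ['kafka-1:9092', 'kafka-2:9092']
    # Explicit instance label, required when kafka_uris has more than one entry
    instance: 'my-kafka-cluster'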
Key metrics collected
kafka_brokers: Number of brokers in the Kafka cluster (gauge)
kafka_topic_partitions: Number of partitions for this Topic (gauge)
kafka_topic_partition_current_offset: Current Offset of a Broker at Topic/Partition (gauge)
kafka_consumergroup_current_offset: Current Offset of a ConsumerGroup at Topic/Partition (gauge)
kafka_consumer_lag_millis: Current approximation of consumer lag for a ConsumerGroup at Topic/Partition (gauge)
kafka_topic_partition_under_replicated_partition: 1 if Topic/Partition is under-replicated (gauge)
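These metrics are typically consumed through Prometheus queries or alerting rules. Below is a sketch of an alerting-rules file built on two of the metrics above; the group name, thresholds, durations, and label names are assumptions based on the exporter's usual labels:

groups:
  - name: kafka-alerts            # assumed rule group name
    rules:
      # Consumer group more than 5 minutes behind (threshold is an assumption)
      - alert: KafkaConsumerLagHigh
        expr: kafka_consumer_lag_millis > 300000
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: 'Consumer group {{ $labels.consumergroup }} is lagging on {{ $labels.topic }}'
      # Any under-replicated topic/partition
      - alert: KafkaUnderReplicatedPartition
        expr: kafka_topic_partition_under_replicated_partition == 1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: 'Topic {{ $labels.topic }} partition {{ $labels.partition }} is under-replicated'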
Full configuration options
# Enables the kafka_exporter integration, allowing the Agent to automatically
# collect metrics from the configured Kafka server addresses
[enabled: <boolean> | default = false]
# Sets an explicit value for the instance label when the integration is
# self-scraped. Overrides inferred values.
#
# The default value for this integration is inferred from the hostname
# portion of the first kafka_uris value. If there is more than one string
# in kafka_uris, the integration will fail to load and an instance value
# must be manually provided.
[instance: <string>]
# Automatically collect metrics from this integration. If disabled,
# the kafka_exporter integration will be run but not scraped and thus not
# remote-written. Metrics for the integration will be exposed at
# /integrations/kafka_exporter/metrics and can be scraped by an external
# process.
[scrape_integration: <boolean> | default = <integrations_config.scrape_integrations>]
# How often should the metrics be collected? Defaults to
# prometheus.global.scrape_interval.
[scrape_interval: <duration> | default = <global_config.scrape_interval>]
# The timeout before considering the scrape a failure. Defaults to
# prometheus.global.scrape_timeout.
[scrape_timeout: <duration> | default = <global_config.scrape_timeout>]
# Allows for relabeling labels on the target.
relabel_configs:
[- <relabel_config> ... ]
# Relabel metrics coming from the integration, allowing you to drop series
# from the integration that you don't care about.
metric_relabel_configs:
[ - <relabel_config> ... ]
# How frequent to truncate the WAL for this integration.
[wal_truncate_frequency: <duration> | default = "60m"]
# Monitor the exporter itself and include those metrics in the results.
[include_exporter_metrics: <bool> | default = false]
# Address array (host:port) of Kafka servers
[kafka_uris: <[]string>]
# Connect using SASL/PLAIN
[use_sasl: <bool>]
# Only set this to false if using a non-Kafka SASL proxy
[use_sasl_handshake: <bool> | default = true]
# SASL user name
[sasl_username: <string>]
# SASL user password
[sasl_password: <string>]
# The SASL SCRAM SHA algorithm to use as the mechanism: sha256 or sha512
[sasl_mechanism: <string>]
# Connect using TLS
[use_tls: <bool>]
# The optional certificate authority file for TLS client authentication
[ca_file: <string>]
# The optional certificate file for TLS client authentication
[cert_file: <string>]
# The optional key file for TLS client authentication
[key_file: <string>]
# If true, the server's certificate will not be checked for validity. This makes your TLS connections insecure
[insecure_skip_verify: <bool>]
# Kafka broker version
[kafka_version: <string> | default = "2.0.0"]
# If you need to use a group from ZooKeeper
[use_zookeeper_lag: <bool>]
# Address array (hosts) of ZooKeeper servers.
[zookeeper_uris: <[]string>]
# Kafka cluster name
[kafka_cluster_name: <string>]
# Metadata refresh interval
[metadata_refresh_interval: <duration> | default = "1m"]
# If true, all scrapes will trigger Kafka operations; otherwise, they will share results. WARNING: this should be disabled on large clusters
[allow_concurrency: <bool> | default = true]
# Maximum number of offsets to store in the interpolation table for a partition
[max_offsets: <int> | default = 1000]
# How frequently should the interpolation table be pruned, in seconds
[prune_interval_seconds: <int> | default = 30]
# Regex filter for topics to be monitored
[topics_filter_regex: <string> | default = ".*"]
# Regex filter for consumer groups to be monitored
[groups_filter_regex: <string> | default = ".*"]
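Putting the connection and filtering options together, the following sketch configures the integration against a SASL/SCRAM cluster over TLS and restricts monitoring to a subset of topics and consumer groups. All hostnames, credentials, file paths, and regular expressions are placeholders:

kafka_exporter:
  enabled: true
  # Assumed TLS listener address
  kafka_uris: ['kafka-1.example.com:9093']
  kafka_version: '2.8.0'
  # SASL/SCRAM authentication (credentials are placeholders)
  use_sasl: true
  sasl_username: 'agent'
  sasl_password: 'REDACTED'
  sasl_mechanism: 'sha512'
  # TLS client authentication (paths are placeholders)
  use_tls: true
  ca_file: /etc/kafka/certs/ca.pem
  cert_file: /etc/kafka/certs/client.pem
  key_file: /etc/kafka/certs/client-key.pem
  # Monitor only topics and groups with the app- prefix (example regexes)
  topics_filter_regex: '^app-.*'
  groups_filter_regex: '^app-.*'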