grafana-agent ships with a built-in mongodb_exporter that collects metrics from MongoDB.

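This mongodb_exporter cannot be configured with multiple MongoDB nodes at once; it currently collects data from exactly one MongoDB node. In addition, you need to use relabel_configs to set two custom labels: service_name, which identifies the MongoDB node (for example ReplicaSet1-Node1), and mongodb_cluster, which identifies the MongoDB cluster (for example prod-cluster).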

An example of relabel_configs:

    relabel_configs:
      - source_labels: [__address__]
        target_label: service_name
        replacement: 'replicaset1-node1'
      - source_labels: [__address__]
        target_label: mongodb_cluster
        replacement: 'prod-cluster'

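We strongly recommend creating a dedicated account for grafana-agent to access your MongoDB, to avoid the security risks of granting excessive privileges. For details, see the official documentation.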

Configure and enable mongodb_exporter

    # Configuration of grafana-agent itself
    server:
      log_level: info
      http_listen_port: 12345
    # Metrics scraping configuration for grafana-agent (similar to Prometheus scrape_configs)
    metrics:
      global:
        scrape_interval: 15s
        scrape_timeout: 10s
        remote_write:
          - url: https://n9e-server:19000/prometheus/v1/write
            basic_auth:
              username: <string>
              password: <string>
    integrations:
      mongodb_exporter:
        enabled: true
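
The configuration above enables the integration but does not show where the connection string and the two labels go. The following is a minimal sketch of the integrations section that combines them. The host name mongodb-node1, the account monitor_user, and its password are placeholders assumed for illustration and do not come from this document; the label values reuse the relabel_configs example.

```yaml
integrations:
  mongodb_exporter:
    enabled: true
    # Placeholder connection string (assumed host and credentials); replace with your own.
    mongodb_uri: mongodb://monitor_user:monitor_password@mongodb-node1:27017/
    relabel_configs:
      # Identifies this MongoDB node.
      - source_labels: [__address__]
        target_label: service_name
        replacement: 'replicaset1-node1'
      # Identifies the MongoDB cluster the node belongs to.
      - source_labels: [__address__]
        target_label: mongodb_cluster
        replacement: 'prod-cluster'
```

Because one integration instance scrapes a single node, each additional node would need its own grafana-agent configuration with a different mongodb_uri and service_name.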

List of key metrics collected

    # Whether MongoDB is up (the instance is alive).
    # Gauge
    mongodb_up
    # The number of seconds that the current MongoDB process has been active.
    # Counter
    mongodb_instance_uptime_seconds
    # The amount of memory, in mebibytes (MiB), currently used by the database process.
    # Gauge
    mongodb_memory
    # The total combined latency of operations, in microseconds.
    # Counter
    mongodb_mongod_op_latencies_latency_total
    # The total number of operations performed since startup.
    # Counter
    mongodb_mongod_op_latencies_ops_total
    # The total number of operations received since the mongod instance last started
    # (incremented even when an operation does not succeed).
    # Counter
    mongodb_op_counters_total
    # The number of incoming connections from clients to the database server.
    # This number includes the current shell session.
    # Gauge
    mongodb_connections
    # The number of open cursors.
    # Gauge
    mongodb_mongod_metrics_cursor_open
    # The total number of document access and modification operations.
    # Counter
    mongodb_mongod_metrics_document_total
    # The number of operations currently queued waiting for the lock.
    # Gauge
    mongodb_mongod_global_lock_current_queue
    # The total number of (index or document) items scanned during queries and query-plan evaluation.
    # Counter
    mongodb_mongod_metrics_query_executor_total
    # The number of assertions raised since the MongoDB process started.
    # Counter
    mongodb_asserts_total
    # The total number of getLastError operations with a specified write concern (i.e. w)
    # that wait for one or more members of a replica set to acknowledge the write operation
    # (i.e. a w value greater than 1).
    # Counter
    mongodb_mongod_metrics_get_last_error_wtime_num_total
    # The number of times that write concern operations have timed out as a result of the
    # wtimeout threshold to getLastError. This number increments for both default and
    # non-default write concern specifications.
    # Counter
    mongodb_mongod_metrics_get_last_error_wtimeouts_total
    # Size, in bytes, of the data currently in the cache.
    # Gauge
    mongodb_mongod_wiredtiger_cache_bytes
    # Size, in bytes, of the data read into or written from the cache.
    # Counter
    mongodb_mongod_wiredtiger_cache_bytes_total
    # The number of pages currently held in the cache.
    # Gauge
    mongodb_mongod_wiredtiger_cache_pages
    # The total number of pages (modified or unmodified) evicted from the cache.
    # Counter
    mongodb_mongod_wiredtiger_cache_evicted_total
    # The total number of page faults.
    # Counter
    mongodb_extra_info_page_faults_total
    # The total number of bytes that the server has sent over network connections
    # initiated by clients or other mongod or mongos instances.
    # Counter
    mongodb_ss_network_bytesOut
    # The total number of bytes that the server has received over network connections
    # initiated by clients or other mongod or mongos instances.
    # Counter
    mongodb_ss_network_bytesIn
    # The timestamp at which the node was elected as the replica set primary.
    # Gauge
    mongodb_mongod_replset_member_election_date
    # The replication lag, in seconds, that this member has with the primary.
    # Gauge
    mongodb_mongod_replset_member_replication_lag
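
As an illustration of how these metrics might be used, the sketch below expresses two checks in the Prometheus rule-file format. It assumes that your alerting backend accepts Prometheus-style rules and that the service_name and mongodb_cluster labels have been set through relabel_configs. The group name, alert names, thresholds, and durations are illustrative assumptions, not recommendations from this document.

```yaml
groups:
  - name: mongodb-example            # hypothetical group name
    rules:
      # Fires when the exporter reports the node as down.
      - alert: MongodbDown
        expr: mongodb_up == 0
        for: 1m                      # assumed duration
        labels:
          severity: critical
        annotations:
          summary: "MongoDB node {{ $labels.service_name }} in {{ $labels.mongodb_cluster }} is down"
      # Fires when a replica set member lags behind the primary.
      - alert: MongodbReplicationLag
        expr: mongodb_mongod_replset_member_replication_lag > 10   # assumed threshold, in seconds
        for: 5m                      # assumed duration
        labels:
          severity: warning
        annotations:
          summary: "Replication lag on {{ $labels.service_name }} exceeds 10 seconds"
```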

Full configuration reference

    # Enables the mongodb_exporter integration
    [enabled: <boolean> | default = false]
    # Sets an explicit value for the instance label when the integration is
    # self-scraped. Overrides inferred values.
    #
    # The default value for this integration is inferred from the hostname
    # portion of the mongodb_uri field.
    [instance: <string>]
    # Automatically collect metrics from this integration. If disabled,
    # the mongodb_exporter integration will be run but not scraped and thus not
    # remote-written. Metrics for the integration will be exposed at
    # /integrations/mongodb_exporter/metrics and can be scraped by an external
    # process.
    [scrape_integration: <boolean> | default = <integrations_config.scrape_integrations>]
    # How often should the metrics be collected? Defaults to
    # metrics.global.scrape_interval.
    [scrape_interval: <duration> | default = <global_config.scrape_interval>]
    # The timeout before considering the scrape a failure. Defaults to
    # metrics.global.scrape_timeout.
    [scrape_timeout: <duration> | default = <global_config.scrape_timeout>]
    # Allows for relabeling labels on the target.
    relabel_configs:
      [- <relabel_config> ... ]
    # Relabel metrics coming from the integration, allowing you to drop series
    # from the integration that you don't care about.
    metric_relabel_configs:
      [ - <relabel_config> ... ]
    # How frequently to truncate the WAL for this integration.
    [wal_truncate_frequency: <duration> | default = "60m"]
    # Monitor the exporter itself and include those metrics in the results.
    [include_exporter_metrics: <bool> | default = false]
    #
    # Exporter-specific configuration options
    #
    # MongoDB node connection URL, which must be in the [`Standard Connection String Format`](https://docs.mongodb.com/manual/reference/connection-string/#std-label-connections-standard-connection-string-format)
    [mongodb_uri: <string>]
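
To show how several of these options fit together, here is a minimal sketch that overrides the scrape interval for this integration and drops one family of series with metric_relabel_configs. The host in mongodb_uri is a placeholder, and the choice of 30s and of the WiredTiger cache metrics as the series to drop is purely illustrative.

```yaml
integrations:
  mongodb_exporter:
    enabled: true
    # Placeholder connection string (assumed host); replace with your own.
    mongodb_uri: mongodb://mongodb-node1:27017/
    # Overrides metrics.global.scrape_interval for this integration only.
    scrape_interval: 30s
    metric_relabel_configs:
      # Drop every series whose metric name matches the regex (illustrative choice).
      - source_labels: [__name__]
        regex: 'mongodb_mongod_wiredtiger_cache_.*'
        action: drop
```

Dropping series at this stage keeps them out of the WAL and out of remote_write, which can reduce storage on the receiving side if you do not chart those metrics.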