metricbeat

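Monitoring service performance metrics with a beat is a common use case for the Elastic Stack. In the 2.x era, users had to develop their own xxxbeat tool for each common type of service and compile each one separately. Elastic.co eventually unified all of this into a single tool, metricbeat.

metricbeat currently supports performance metrics for the following services: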

  • Apache
  • HAProxy
  • MongoDB
  • MySQL
  • Nginx
  • PostgreSQL
  • Redis
  • System
  • Zookeeper

Configuration example

  metricbeat.modules:
  - module: system
    metricsets:
      - cpu
      - filesystem
      - memory
      - network
      - process
    enabled: true
    period: 10s
    processes: ['.*']
    cpu_ticks: false
  - module: apache
    metricsets: ["status"]
    enabled: true
    period: 1s
    hosts: ["http://127.0.0.1"]

Apache

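The Apache module supports the 2.2 series from 2.2.31 onward and the 2.4 series from 2.4.16 onward.

The module requires that the mod_status extension be installed and configured on the monitored Apache server. A sample of the status data collected through that extension: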

  "apache": {
    "status": {
      "bytes_per_request": 1024,
      "bytes_per_sec": 0.201113,
      "connections": {
        "async": {
          "closing": 0,
          "keep_alive": 0,
          "writing": 0
        },
        "total": 0
      },
      "cpu": {
        "children_system": 0,
        "children_user": 0,
        "load": 0.00652482,
        "system": 1.46,
        "user": 1.53
      },
      "hostname": "apache",
      "load": {
        "1": 0.55,
        "15": 0.31,
        "5": 0.31
      },
      "requests_per_sec": 0.000196399,
      "scoreboard": {
        "closing_connection": 0,
        "dns_lookup": 0,
        "gracefully_finishing": 0,
        "idle_cleanup": 0,
        "keepalive": 0,
        "logging": 0,
        "open_slot": 325,
        "reading_request": 0,
        "sending_reply": 1,
        "starting_up": 0,
        "total": 400,
        "waiting_for_connection": 74
      },
      "total_accesses": 9,
      "total_kbytes": 9,
      "uptime": {
        "server_uptime": 45825,
        "uptime": 45825
      },
      "workers": {
        "busy": 1,
        "idle": 74
      }
    }
  }

The module ships with a predefined dashboard, which looks like this:
metricbeat - Figure 1

HAProxy

The HAProxy module supports HAProxy server version 1.6.

The module requires the following line in the global or default section of the HAProxy server configuration:

  stats socket 127.0.0.1:14567

The module collects two kinds of information: info and stat.
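
On the metricbeat side, a matching module entry might look like the following. This is a minimal sketch, not a definitive configuration: it assumes the stats socket address shown above, the period is an arbitrary illustrative value, and the tcp:// host form should be checked against the documentation for your metricbeat version.

```yaml
metricbeat.modules:
- module: haproxy
  # Collect both kinds of information the module offers.
  metricsets: ["info", "stat"]
  enabled: true
  # Illustrative polling interval, not a recommendation.
  period: 10s
  # Assumed to match the "stats socket" address in the HAProxy configuration.
  hosts: ["tcp://127.0.0.1:14567"]
```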

A sample of the data returned by info:

  "haproxy": {
    "info": {
      "compress_bps_in": 0,
      "compress_bps_out": 0,
      "compress_bps_rate_limit": 0,
      "conn_rate": 0,
      "conn_rate_limit": 0,
      "cum_conns": 67,
      "cum_req": 67,
      "cum_ssl_conns": 0,
      "curr_conns": 0,
      "curr_ssl_conns": 0,
      "hard_max_conn": 4000,
      "idle_pct": 100,
      "max_conn": 4000,
      "max_conn_rate": 5,
      "max_pipes": 0,
      "max_sess_rate": 5,
      "max_sock": 8033,
      "max_ssl_conns": 0,
      "max_ssl_rate": 0,
      "max_zlib_mem_usage": 0,
      "mem_max_mb": 0,
      "nb_proc": 1,
      "pid": 53858,
      "pipes_free": 0,
      "pipes_used": 0,
      "process_num": 1,
      "run_queue": 2,
      "sess_rate": 0,
      "sess_rate_limit": 0,
      "ssl_babckend_key_rate": 0,
      "ssl_backend_max_key_rate": 0,
      "ssl_cache_misses": 0,
      "ssl_cached_lookups": 0,
      "ssl_frontend_key_rate": 0,
      "ssl_frontend_max_key_rate": 0,
      "ssl_frontend_session_reuse_pct": 0,
      "ssl_rate": 0,
      "ssl_rate_limit": 0,
      "tasks": 7,
      "ulimit_n": 8033,
      "uptime_sec": 13700,
      "zlib_mem_usage": 0
    }
  },

A sample of the data returned by stat:

  "haproxy": {
    "stat": {
      "act": 1,
      "bck": 0,
      "bin": 0,
      "bout": 0,
      "check_duration": 0,
      "check_status": "L4CON",
      "chkdown": 1,
      "chkfail": 1,
      "cli_abrt": 0,
      "ctime": 0,
      "downtime": 13700,
      "dresp": 0,
      "econ": 0,
      "eresp": 0,
      "hanafail": 0,
      "hrsp_1xx": 0,
      "hrsp_2xx": 0,
      "hrsp_3xx": 0,
      "hrsp_4xx": 0,
      "hrsp_5xx": 0,
      "hrsp_other": 0,
      "iid": 3,
      "last_chk": "Connection refused",
      "lastchg": 13700,
      "lastsess": -1,
      "lbtot": 0,
      "pid": 1,
      "qcur": 0,
      "qmax": 0,
      "qtime": 0,
      "rate": 0,
      "rate_max": 0,
      "rtime": 0,
      "scur": 0,
      "sid": 1,
      "smax": 0,
      "srv_abrt": 0,
      "status": "DOWN",
      "stot": 0,
      "svname": "log1",
      "ttime": 0,
      "weight": 1,
      "wredis": 0,
      "wretr": 0
    }
  }

If any of these stat field names are unclear, consult the documentation at http://www.haproxy.org/download/1.6/doc/management.txt.

MongoDB

This module supports MongoDB 2.8 and later. A sample of the status data it collects:

  "mongodb": {
    "status": {
      "asserts": {
        "msg": 0,
        "regular": 0,
        "rollovers": 0,
        "user": 0,
        "warning": 0
      },
      "background_flushing": {
        "average": {
          "ms": 16
        },
        "flushes": 37,
        "last": {
          "ms": 18
        },
        "last_finished": "2016-09-06T07:32:58.228Z",
        "total": {
          "ms": 624
        }
      },
      "connections": {
        "available": 838859,
        "current": 1,
        "total_created": 10
      },
      "extra_info": {
        "heap_usage": {
          "bytes": 62895448
        },
        "page_faults": 71
      },
      "journaling": {
        "commits": 1,
        "commits_in_write_lock": 0,
        "compression": 0,
        "early_commits": 0,
        "journaled": {
          "mb": 0
        },
        "times": {
          "commits": {
            "ms": 0
          },
          "commits_in_write_lock": {
            "ms": 0
          },
          "dt": {
            "ms": 0
          },
          "prep_log_buffer": {
            "ms": 0
          },
          "remap_private_view": {
            "ms": 0
          },
          "write_to_data_files": {
            "ms": 0
          },
          "write_to_journal": {
            "ms": 0
          }
        },
        "write_to_data_files": {
          "mb": 0
        }
      },
      "local_time": "2016-09-06T07:33:15.546Z",
      "memory": {
        "bits": 64,
        "mapped": {
          "mb": 80
        },
        "mapped_with_journal": {
          "mb": 160
        },
        "resident": {
          "mb": 57
        },
        "virtual": {
          "mb": 356
        }
      },
      "network": {
        "in": {
          "bytes": 2258
        },
        "out": {
          "bytes": 93486
        },
        "requests": 39
      },
      "opcounters": {
        "command": 40,
        "delete": 0,
        "getmore": 0,
        "insert": 0,
        "query": 1,
        "update": 0
      },
      "opcounters_replicated": {
        "command": 0,
        "delete": 0,
        "getmore": 0,
        "insert": 0,
        "query": 0,
        "update": 0
      },
      "storage_engine": {
        "name": "mmapv1"
      },
      "uptime": {
        "ms": 45828938
      },
      "version": "3.0.12",
      "write_backs_queued": false
    }
  }
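
To produce status data like the sample above, the module has to be enabled in the metricbeat configuration. The following is a minimal sketch under stated assumptions: the host address is MongoDB's conventional default port on the local machine, and the period is an illustrative value. Neither comes from the original text.

```yaml
metricbeat.modules:
- module: mongodb
  # "status" is the metricset shown in the sample data.
  metricsets: ["status"]
  enabled: true
  # Illustrative polling interval.
  period: 10s
  # Assumed address: a local mongod on the default port.
  hosts: ["localhost:27017"]
```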

MySQL

This module supports MySQL 5.7.0 and later. A sample of the status data it collects:

  "mysql": {
    "status": {
      "aborted": {
        "clients": 13,
        "connects": 16
      },
      "binlog": {
        "cache": {
          "disk_use": 0,
          "use": 0
        }
      },
      "bytes": {
        "received": 2100,
        "sent": 92281
      },
      "connections": 33,
      "created": {
        "tmp": {
          "disk_tables": 0,
          "files": 6,
          "tables": 0
        }
      },
      "delayed": {
        "errors": 0,
        "insert_threads": 0,
        "writes": 0
      },
      "flush_commands": 1,
      "max_used_connections": 2,
      "open": {
        "files": 14,
        "streams": 0,
        "tables": 106
      },
      "opened_tables": 113
    }
  }
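
A module entry that could yield the status data above might look like the sketch below. The user name, password and address in the connection string are hypothetical placeholders, and the DSN form is an assumption to verify against the module documentation for your metricbeat version.

```yaml
metricbeat.modules:
- module: mysql
  # "status" is the metricset shown in the sample data.
  metricsets: ["status"]
  enabled: true
  # Illustrative polling interval.
  period: 10s
  # Hypothetical credentials and address, written as a DSN:
  # user:password@tcp(host:port)/
  hosts: ["root:secret@tcp(127.0.0.1:3306)/"]
```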

Nginx

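This module supports Nginx 1.9 and later, and requires Nginx to be built with the ngx_http_stub_status_module module. A sample of the stubstatus data it collects: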

  "nginx": {
    "stubstatus": {
      "accepts": 22,
      "active": 1,
      "current": 10,
      "dropped": 0,
      "handled": 22,
      "hostname": "nginx",
      "reading": 0,
      "requests": 10,
      "waiting": 0,
      "writing": 1
    }
  }
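
The stubstatus data above could be collected with a module entry along these lines. Treat it as a sketch: the host address is an assumed local server, and the server_status_path value assumes that Nginx exposes the stub status page at /server-status, which depends entirely on how the Nginx location block is written.

```yaml
metricbeat.modules:
- module: nginx
  # "stubstatus" is the metricset shown in the sample data.
  metricsets: ["stubstatus"]
  enabled: true
  # Illustrative polling interval.
  period: 10s
  # Assumed address of the monitored Nginx server.
  hosts: ["http://127.0.0.1"]
  # Assumed path of the stub status page; it must match the Nginx configuration.
  server_status_path: "server-status"
```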

PostgreSQL

This module supports PostgreSQL 9 and later. It can collect three kinds of data: activity, bgwriter and database.

Sample activity data:

  "postgresql": {
    "activity": {
      "application_name": "",
      "backend_start": "2016-09-06T07:33:18.323Z",
      "client": {
        "address": "172.17.0.14",
        "hostname": "",
        "port": 57436
      },
      "database": {
        "name": "postgres",
        "oid": 12379
      },
      "pid": 162,
      "query": "SELECT * FROM pg_stat_activity",
      "query_start": "2016-09-06T07:33:18.325Z",
      "state": "active",
      "state_change": "2016-09-06T07:33:18.325Z",
      "transaction_start": "2016-09-06T07:33:18.325Z",
      "user": {
        "id": 10,
        "name": "postgres"
      },
      "waiting": false
    },

Sample bgwriter data:

    "bgwriter": {
      "buffers": {
        "allocated": 191,
        "backend": 0,
        "backend_fsync": 0,
        "checkpoints": 0,
        "clean": 0,
        "clean_full": 0
      },
      "checkpoints": {
        "requested": 0,
        "scheduled": 7,
        "times": {
          "sync": {
            "ms": 0
          },
          "write": {
            "ms": 0
          }
        }
      },
      "stats_reset": "2016-09-05T18:49:53.575Z"
    },

Sample database data:

    "database": {
      "blocks": {
        "hit": 0,
        "read": 0,
        "time": {
          "read": {
            "ms": 0
          },
          "write": {
            "ms": 0
          }
        }
      },
      "conflicts": 0,
      "deadlocks": 0,
      "name": "template1",
      "number_of_backends": 0,
      "oid": 1,
      "rows": {
        "deleted": 0,
        "fetched": 0,
        "inserted": 0,
        "returned": 0,
        "updated": 0
      },
      "temporary": {
        "bytes": 0,
        "files": 0
      },
      "transactions": {
        "commit": 0,
        "rollback": 0
      }
    }
  }
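
Enabling all three kinds of data could look like the following sketch. The connection URL is an assumed local server on PostgreSQL's default port; authentication details are deliberately left out because they depend on the deployment.

```yaml
metricbeat.modules:
- module: postgresql
  # The three kinds of data described in this section.
  metricsets:
    - activity
    - bgwriter
    - database
  enabled: true
  # Illustrative polling interval.
  period: 10s
  # Assumed address: a local PostgreSQL server on the default port.
  hosts: ["postgres://localhost:5432"]
```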

Redis

This module supports Redis 3 and later. It can collect two kinds of data: info and keyspace.

Sample info data:

  "redis": {
    "info": {
      "clients": {
        "biggest_input_buf": 0,
        "blocked": 0,
        "connected": 2,
        "longest_output_list": 0
      },
      "cluster": {
        "enabled": false
      },
      "cpu": {
        "used": {
          "sys": 0.33,
          "sys_children": 0,
          "user": 0.39,
          "user_children": 0
        }
      },
      "memory": {
        "allocator": "jemalloc-4.0.3",
        "used": {
          "lua": 37888,
          "peak": 883992,
          "rss": 4030464,
          "value": 883032
        }
      },
      "persistence": {
        "aof": {
          "bgrewrite": {
            "last_status": "ok"
          },
          "enabled": false,
          "rewrite": {
            "current_time": {
              "sec": -1
            },
            "in_progress": false,
            "last_time": {
              "sec": -1
            },
            "scheduled": false
          },
          "write": {
            "last_status": "ok"
          }
        },
        "loading": false,
        "rdb": {
          "bgsave": {
            "current_time": {
              "sec": -1
            },
            "in_progress": false,
            "last_status": "ok",
            "last_time": {
              "sec": -1
            }
          },
          "last_save": {
            "changes_since": 2,
            "time": 1475698251
          }
        }
      },
      "replication": {
        "backlog": {
          "active": 0,
          "first_byte_offset": 0,
          "histlen": 0,
          "size": 1048576
        },
        "connected_slaves": 0,
        "master_offset": 0,
        "role": "master"
      },
      "server": {
        "arch_bits": "64",
        "build_id": "5575d747b4b3b12c",
        "config_file": "",
        "gcc_version": "4.9.2",
        "git_dirty": "0",
        "git_sha1": "00000000",
        "hz": 10,
        "lru_clock": 16080842,
        "mode": "standalone",
        "multiplexing_api": "epoll",
        "os": "Linux 4.4.22-moby x86_64",
        "process_id": 1,
        "run_id": "d37e5ebfe0ae6c4972dbe9f0174a1637bb8247f6",
        "tcp_port": 6379,
        "uptime": 383,
        "version": "3.2.4"
      },
      "stats": {
        "commands_processed": 70,
        "connections": {
          "received": 17,
          "rejected": 0
        },
        "instantaneous": {
          "input_kbps": 0.07,
          "ops_per_sec": 2,
          "output_kbps": 0.07
        },
        "keys": {
          "evicted": 0,
          "expired": 0
        },
        "keyspace": {
          "hits": 0,
          "misses": 0
        },
        "latest_fork_usec": 0,
        "migrate_cached_sockets": 0,
        "net": {
          "input": {
            "bytes": 1949
          },
          "output": {
            "bytes": 4956554
          }
        },
        "pubsub": {
          "channels": 0,
          "patterns": 0
        },
        "sync": {
          "full": 0,
          "partial": {
            "err": 0,
            "ok": 0
          }
        }
      }
    },

Sample keyspace data:

    "keyspace": {
      "avg_ttl": 0,
      "expires": 0,
      "id": "db0",
      "keys": 1
    }
  }
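
Both kinds of data can be requested from one module entry. The sketch below assumes a local Redis instance on the port that appears in the sample (6379); the period is an illustrative value.

```yaml
metricbeat.modules:
- module: redis
  # The two kinds of data described in this section.
  metricsets: ["info", "keyspace"]
  enabled: true
  # Illustrative polling interval.
  period: 10s
  # Assumed address: a local Redis server, same port as "tcp_port" in the sample.
  hosts: ["127.0.0.1:6379"]
```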

System

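System is the former TopBeat. It can collect core, cpu, diskio, filesystem, fsstat, load, memory, network and process metrics. These are the metrics operations staff know best, so the metric names and samples are not listed separately here.

The module ships with a predefined dashboard, shown below:

metricbeat - Figure 2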

Zookeeper

This module supports Zookeeper 3.4.0 and later. A sample of the mntr data it collects:

  "zookeeper": {
    "mntr": {
      "approximate_data_size": 27,
      "ephemerals_count": 0,
      "latency": {
        "avg": 0,
        "max": 0,
        "min": 0
      },
      "num_alive_connections": 1,
      "outstanding_requests": 0,
      "packets": {
        "received": 10,
        "sent": 9
      },
      "server_state": "standalone",
      "version": "3.4.8--1, built on 02/06/2016 03:18 GMT",
      "watch_count": 0,
      "znode_count": 4
    }
  }
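
A module entry for collecting the mntr data above might be sketched as follows. The host address assumes a local Zookeeper node on its conventional client port (2181), which the original text does not specify.

```yaml
metricbeat.modules:
- module: zookeeper
  # "mntr" is the metricset shown in the sample data.
  metricsets: ["mntr"]
  enabled: true
  # Illustrative polling interval.
  period: 10s
  # Assumed address: a local Zookeeper node on the conventional client port.
  hosts: ["localhost:2181"]
```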

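Collecting metrics in Docker

metricbeat collects most of its system data from /proc. Inside a Docker container, however, the container's own /proc does not give metricbeat the host's real data, so the host's paths have to be made available under a separate mount point, here /hostfs. To collect this data with a containerized metricbeat, mount the corresponding paths first and point metricbeat at the mount point: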

  $ sudo docker run \
      --volume=/proc:/hostfs/proc:ro \
      --volume=/sys/fs/cgroup:/hostfs/sys/fs/cgroup:ro \
      --volume=/:/hostfs:ro \
      --net=host \
      my/metricbeat:latest -system.hostfs=/hostfs
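
The same mounts can also be written as a Docker Compose file, which may be easier to keep under version control. This is only a sketch of an equivalent setup: the service name is hypothetical, and it assumes the my/metricbeat image defines an ENTRYPOINT that starts metricbeat, so that command supplies only the extra flag, as in the docker run line.

```yaml
version: "2"
services:
  # Hypothetical service name.
  metricbeat:
    image: my/metricbeat:latest
    # Equivalent of --net=host.
    network_mode: host
    volumes:
      # Read-only mounts of the host paths under /hostfs.
      - /proc:/hostfs/proc:ro
      - /sys/fs/cgroup:/hostfs/sys/fs/cgroup:ro
      - /:/hostfs:ro
    # Passed as arguments to the image's entrypoint (assumed to be metricbeat).
    command: ["-system.hostfs=/hostfs"]
```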