Nacos monitor guide

Nacos 0.8.0 improves the monitoring system: it exposes metrics data so that third-party monitoring systems can track the running status of Nacos. Prometheus, Elasticsearch, and InfluxDB are currently supported. This document describes how to monitor Nacos with Prometheus and Grafana; see the Nacos Grafana monitoring page for an example. Setup for Elasticsearch and InfluxDB is not covered here.

Deploy Nacos cluster to expose metrics data

Deploy the Nacos cluster according to the deployment document

Configure the application.properties file to expose metrics data

  management.endpoints.web.exposure.include=*

Access http://{ip}:8848/nacos/actuator/prometheus to check that metrics data is returned
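The check above can also be scripted. The following is a minimal sketch, assuming the endpoint returns the Prometheus text exposition format; the helper names metric_names and fetch_metrics are hypothetical and not part of Nacos.

```python
import urllib.request


def metric_names(exposition_text):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like "name{labels} value" or "name value".
        head = line.split("{", 1)[0]
        names.add(head.split()[0])
    return names


def fetch_metrics(ip, port=8848, timeout=5):
    """Fetch the raw metrics text from one Nacos node (hypothetical helper)."""
    url = f"http://{ip}:{port}/nacos/actuator/prometheus"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, checking that "nacos_monitor" is in metric_names(fetch_metrics("{ip}")) for each node would confirm that the Nacos gauges are exposed, not only the JVM ones.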

Deploy prometheus to collect Nacos metrics data

Download the Prometheus version you want to install from the Prometheus download page

linux & mac

Extract the Prometheus archive

  tar xvfz prometheus-*.tar.gz
  cd prometheus-*

Modify the configuration file prometheus.yml to collect Nacos metrics data

  metrics_path: '/nacos/actuator/prometheus'
  static_configs:
    - targets: ['{ip1}:8848', '{ip2}:8848', '{ip3}:8848']
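The fragment above belongs inside a job under scrape_configs in prometheus.yml. As a rough sketch of what the complete section could look like, the helper below renders it from a list of node addresses. The function name render_scrape_config and the job name 'nacos' are assumptions made for illustration; scrape_configs, job_name, metrics_path, static_configs, and targets are standard Prometheus configuration keys.

```python
def render_scrape_config(hosts, port=8848, job_name="nacos"):
    """Render a scrape_configs section for the given Nacos nodes.

    job_name is an illustrative default, not a value required by Nacos.
    """
    targets = ", ".join(f"'{host}:{port}'" for host in hosts)
    lines = [
        "scrape_configs:",
        f"  - job_name: '{job_name}'",
        "    metrics_path: '/nacos/actuator/prometheus'",
        "    static_configs:",
        f"      - targets: [{targets}]",
    ]
    return "\n".join(lines) + "\n"
```

Printing render_scrape_config(["{ip1}", "{ip2}", "{ip3}"]) gives a block that can be compared against, or merged into, the scrape_configs section of an existing prometheus.yml.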

Start the Prometheus service

  ./prometheus --config.file="prometheus.yml"

windows

Download the Prometheus build for Windows and extract it

Modify the configuration file prometheus.yml to collect Nacos metrics data

  metrics_path: '/nacos/actuator/prometheus'
  static_configs:
    - targets: ['{ip1}:8848', '{ip2}:8848', '{ip3}:8848']

Start the Prometheus service

  prometheus.exe --config.file=prometheus.yml

Open http://{ip}:9090/graph to see the data collected by Prometheus. Search for nacos_monitor in the search bar; if Nacos data is returned, data collection is working.
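The same check can be made without the browser through the Prometheus HTTP API, whose instant-query endpoint is /api/v1/query. This is a minimal sketch; the helper names query_url, series_count, and instant_query are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def query_url(ip, expr, port=9090):
    """Build the instant-query URL for a PromQL expression."""
    return f"http://{ip}:{port}/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def series_count(response_body):
    """Count the series in an instant-query response; 0 if the query failed."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        return 0
    return len(payload["data"]["result"])


def instant_query(ip, expr, port=9090, timeout=5):
    """Run the query against a Prometheus server and return the raw JSON body."""
    with urllib.request.urlopen(query_url(ip, expr, port), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

If series_count(instant_query("{ip}", "nacos_monitor")) is greater than zero, Prometheus has scraped at least one Nacos node.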

Deploy grafana to graphically display metrics data

Install Grafana on the same machine as Prometheus. The Linux steps below use yum.

mac

  brew install grafana
  brew services start grafana

linux

  sudo yum install https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-5.2.4-1.x86_64.rpm
  sudo service grafana-server start

windows

Reference document: http://docs.grafana.org/installation/windows/

Access grafana: http://{ip}:3000

Configure the Prometheus data source

Import the Nacos Grafana monitoring template

Nacos monitoring is divided into three modules:

  • nacos monitor shows the core monitoring items
  • nacos detail shows how each metric changes over time
  • nacos alert shows alerts about Nacos

configure grafana alert

When Nacos runs abnormally, Grafana can alert the person in charge. Grafana supports a variety of alert channels. Mail, DingTalk, and webhook are commonly used.

DingTalk alert

Configure a DingTalk robot

Configure the DingTalk robot URL

Test the alert
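Before relying on Grafana to deliver alerts, it can help to confirm that the robot URL itself accepts messages. The sketch below assumes the DingTalk custom robot accepts a JSON text message of the form shown; the helper names are hypothetical, and any security setting configured on the robot (for example a required keyword) still applies to the message content.

```python
import json
import urllib.request


def build_text_message(content):
    """Encode a plain-text robot message as a JSON request body."""
    payload = {"msgtype": "text", "text": {"content": content}}
    return json.dumps(payload).encode("utf-8")


def send_test_message(webhook_url, content, timeout=5):
    """POST a text message to the robot URL and return the response body."""
    request = urllib.request.Request(
        webhook_url,
        data=build_text_message(content),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling send_test_message with the robot URL configured in Grafana and a short message should make the message appear in the DingTalk group if the URL is valid.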

mail alert

Modify the defaults.ini configuration file to enable mail alerts

  #################################### SMTP / Emailing ##########################
  [smtp]
  enabled = true
  host = smtp.126.com:25
  user = xxxxxx
  password = xxxxx
  ;cert_file =
  ;key_file =
  skip_verify = true
  from_address = xxxxxx@126.com
  [emails]
  ;welcome_email_on_sign_up = false

Configure the notification mailbox
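If test mails from Grafana do not arrive, the SMTP account can be checked independently of Grafana. The following is a minimal sketch using Python's standard smtplib; the helper names are hypothetical, and it assumes the server accepts a login on the configured port without STARTTLS, which may not hold for every provider.

```python
import smtplib
from email.message import EmailMessage


def build_test_mail(from_address, to_address):
    """Create a short test message using the configured from_address."""
    msg = EmailMessage()
    msg["From"] = from_address
    msg["To"] = to_address
    msg["Subject"] = "Grafana SMTP settings test"
    msg.set_content("If you can read this, the SMTP account accepts mail.")
    return msg


def send_test_mail(host, port, user, password, msg, timeout=10):
    """Log in with the [smtp] credentials and send the message."""
    with smtplib.SMTP(host, port, timeout=timeout) as server:
        # Assumption: plain login works; some servers need server.starttls() first.
        server.login(user, password)
        server.send_message(msg)
```

The host, port, user, password, and from_address arguments correspond to the values in the [smtp] section above.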

meaning of Nacos metrics

jvm metrics

item                        meaning
system_cpu_usage            CPU usage
system_load_average_1m      system load average (1 minute)
jvm_memory_used_bytes       JVM memory used (bytes)
jvm_memory_max_bytes        JVM memory max (bytes)
jvm_gc_pause_seconds_count  GC count
jvm_gc_pause_seconds_sum    GC time (seconds)
jvm_threads_daemon          JVM daemon threads count
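Several of these metrics are more useful in combination than alone: a _sum and _count pair gives an average, and used and max memory give a usage ratio. The arithmetic can be sketched as follows; the function names are hypothetical, and the guard for non-positive values assumes a maximum can be reported as -1 when it is undefined.

```python
def average(sum_value, count_value):
    """Mean per event for a *_sum / *_count pair; None if nothing was recorded."""
    if count_value <= 0:
        return None
    return sum_value / count_value


def usage_ratio(used_bytes, max_bytes):
    """Fraction of the maximum in use; None if the maximum is undefined."""
    if max_bytes <= 0:
        return None
    return used_bytes / max_bytes
```

For example, average(jvm_gc_pause_seconds_sum, jvm_gc_pause_seconds_count) is the mean GC pause in seconds since the process started, and usage_ratio(jvm_memory_used_bytes, jvm_memory_max_bytes) is the share of memory in use.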

Nacos metrics

item                                    meaning
http_server_requests_seconds_count      http requests count
http_server_requests_seconds_sum        http requests time
nacos_timer_seconds_sum                 Nacos config notify time
nacos_timer_seconds_count               Nacos config notify count
nacos_monitor{name='longPolling'}       Nacos config connection count
nacos_monitor{name='configCount'}       Nacos configuration file count
nacos_monitor{name='dumpTask'}          Nacos config dump task count
nacos_monitor{name='notifyTask'}        Nacos config notify task count
nacos_monitor{name='getConfig'}         Nacos config read configuration count
nacos_monitor{name='publish'}           Nacos config update configuration count
nacos_monitor{name='ipCount'}           Nacos naming ip count
nacos_monitor{name='domCount'}          Nacos naming domain count
nacos_monitor{name='failedPush'}        Nacos naming push fail count
nacos_monitor{name='avgPushCost'}       Nacos naming push cost time (average)
nacos_monitor{name='leaderStatus'}      Nacos naming if node is leader
nacos_monitor{name='maxPushCost'}       Nacos naming push cost time (max)
nacos_monitor{name='mysqlhealthCheck'}  Nacos naming mysql health check count
nacos_monitor{name='httpHealthCheck'}   Nacos naming http health check count
nacos_monitor{name='tcpHealthCheck'}    Nacos naming tcp health check count
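All nacos_monitor items share one metric name and are distinguished by the name label. As an illustration of reading them from the raw endpoint output, the sketch below collects them into a dictionary keyed by that label. It assumes label values are double-quoted, as in the Prometheus text format; the function name is hypothetical, and if two series share a name value the later one overwrites the earlier.

```python
import re

# Matches sample lines such as: nacos_monitor{name="ipCount",} 4.0
SAMPLE_RE = re.compile(r'^nacos_monitor\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def nacos_monitor_values(exposition_text):
    """Map each nacos_monitor name label to its current value."""
    values = {}
    for line in exposition_text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match:
            continue  # comment line or a different metric
        labels = dict(LABEL_RE.findall(match.group("labels")))
        if "name" in labels:
            values[labels["name"]] = float(match.group("value"))
    return values
```

With the endpoint text as input, nacos_monitor_values(text).get("ipCount") returns the naming ip count, or None if that series is absent.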

nacos exception

item                                                meaning
nacos_exception_total{name='db'}                    database exception
nacos_exception_total{name='configNotify'}          Nacos config notify exception
nacos_exception_total{name='unhealth'}              Nacos config server health check exception
nacos_exception_total{name='disk'}                  Nacos naming write disk exception
nacos_exception_total{name='leaderSendBeatFailed'}  Nacos naming leader send heart beat fail count
nacos_exception_total{name='illegalArgument'}       request argument illegal count
nacos_exception_total{name='nacos'}                 Nacos inner exception
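The _total suffix indicates cumulative counters, so the raw value only grows until the process restarts; what usually matters is how much it grew between two observations. A minimal sketch of that calculation, with a hypothetical function name and a simple assumption that a lower value means the counter was reset:

```python
def counter_increase(previous, current):
    """Growth of a cumulative counter between two scrapes.

    If the current value is lower than the previous one, the counter is
    assumed to have restarted from zero, so the current value is the growth.
    """
    if current < previous:
        return current
    return current - previous
```

In a Grafana panel or alert rule, the PromQL increase() function applied to nacos_exception_total over a time window plays a similar role.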

client metrics

item                                    meaning
nacos_monitor{name='subServiceCount'}   subscribed services count
nacos_monitor{name='pubServiceCount'}   published services count
nacos_monitor{name='configListenSize'}  listened configuration file count
nacos_client_request_seconds_count      request count
nacos_client_request_seconds_sum        request time

Nacos-Sync monitor

With the release of Nacos 0.9, Nacos-Sync 0.3 supports metrics monitoring. The metrics data shows the running status of the Nacos-Sync service and improves the ability to monitor Nacos-Sync in a production environment. For how to build the overall monitoring system, refer to the Nacos monitoring manual.

grafana monitor Nacos-Sync

Like Nacos monitoring, Nacos-Sync provides a Grafana monitoring template that you can import.

Nacos-Sync monitoring is also divided into three modules:

  • nacos-sync monitor shows the core monitoring items
  • nacos-sync detail and alert show monitoring curves and alarms

Nacos-Sync metrics meaning

Nacos-Sync metrics are divided into a JVM layer and an application layer

jvm metrics

item                        meaning
system_cpu_usage            CPU usage
system_load_average_1m      system load average (1 minute)
jvm_memory_used_bytes       JVM memory used (bytes)
jvm_memory_max_bytes        JVM memory max (bytes)
jvm_gc_pause_seconds_count  GC count
jvm_gc_pause_seconds_sum    GC time (seconds)
jvm_threads_daemon          JVM daemon threads count

application metrics

item                       meaning
nacosSync_task_size        sync task count
nacosSync_cluster_size     cluster count
nacosSync_add_task_rt      add task time
nacosSync_delete_task_rt   delete task time
nacosSync_dispatcher_task  dispatcher task time
nacosSync_sync_task_error  sync task error count