prometheus

Description

The prometheus Plugin exports metrics in Prometheus exposition format.

Attributes

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prefer_name | boolean | False | false | When set to true, prints Route/Service name instead of ID in Prometheus metric. |

Specifying export_uri

You can change the default export URI by configuring the export_uri attribute under plugin_attr in your configuration file (conf/config.yaml).

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| export_uri | string | "/apisix/prometheus/metrics" | URI to export the Prometheus metrics. |

Here is a configuration example:

conf/config.yaml

    plugin_attr:
      prometheus:
        export_uri: /apisix/metrics

Specifying metrics

For HTTP request related metrics, you can specify extra labels whose values are taken from APISIX variables.

If you specify a label for a nonexistent APISIX variable, the label value will be an empty string ("").

Currently, only the following metrics are supported:

  • http_status
  • http_latency
  • bandwidth

Here is a configuration example:

conf/config.yaml

    plugin_attr:
      prometheus:
        metrics:
          http_status:
            extra_labels:
              - upstream_addr: $upstream_addr
              - upstream_status: $upstream_status
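To illustrate the empty-string fallback described above, here is a minimal Python sketch of how extra label values could be resolved. It is not APISIX's implementation; the function name `resolve_extra_labels`, the `request_vars` dictionary, and the `tenant` label are assumptions made purely for illustration.

```python
def resolve_extra_labels(extra_labels, variables):
    """Resolve each configured extra label to a string value.

    extra_labels mirrors the YAML list above: a list of single-key dicts
    mapping a label name to a "$variable" reference.
    variables is a hypothetical dict of variable values for one request.
    """
    resolved = {}
    for entry in extra_labels:
        for label, reference in entry.items():
            name = reference.lstrip("$")
            # A variable that does not exist yields an empty label value.
            resolved[label] = str(variables.get(name, ""))
    return resolved


extra_labels = [
    {"upstream_addr": "$upstream_addr"},
    {"upstream_status": "$upstream_status"},
    {"tenant": "$no_such_variable"},  # hypothetical label on a missing variable
]
request_vars = {"upstream_addr": "127.0.0.1:80", "upstream_status": 200}

labels = resolve_extra_labels(extra_labels, request_vars)
print(labels)
```

The `tenant` label refers to a variable that is not present, so it resolves to an empty string instead of raising an error.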

Specifying default_buckets

DEFAULT_BUCKETS is the default value for the bucket array of the http_latency metric.

You can change DEFAULT_BUCKETS by configuring the default_buckets attribute in your configuration file.

Here is a configuration example:

conf/config.yaml

    plugin_attr:
      prometheus:
        default_buckets:
          - 15
          - 55
          - 105
          - 205
          - 505
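Prometheus histogram buckets are cumulative: each bucket counts every observation less than or equal to its upper bound, and an implicit +Inf bucket counts all observations. The Python sketch below shows how a few latency observations would fall into the buckets configured above. The function name `cumulative_bucket_counts` and the sample latencies are assumptions for illustration, not part of APISIX.

```python
from bisect import bisect_left


def cumulative_bucket_counts(buckets, observations):
    """Return cumulative counts per upper bound, plus the +Inf bucket."""
    bounds = sorted(buckets)
    per_bucket = [0] * (len(bounds) + 1)  # last slot holds values above all bounds
    for value in observations:
        # bisect_left finds the first bound that is >= value,
        # which is the smallest bucket the observation belongs to.
        per_bucket[bisect_left(bounds, value)] += 1

    counts = {}
    running = 0
    for bound, count in zip(bounds, per_bucket):
        running += count
        counts[str(bound)] = running
    counts["+Inf"] = running + per_bucket[-1]
    return counts


latencies_ms = [3, 15, 40, 120, 900]  # hypothetical request latencies
counts = cumulative_bucket_counts([15, 55, 105, 205, 505], latencies_ms)
print(counts)
```

An observation equal to a bound (15 here) lands in that bound's bucket, and the 900 ms observation is only counted in +Inf, which is why the largest configured bound should sit above the latencies you care about.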

Metrics endpoint

This Plugin will add the metrics endpoint /apisix/prometheus/metrics or your custom export URI for exposing the metrics.

These metrics are exposed by a separate Prometheus server address. By default, the address is 127.0.0.1:9091. You can change it in your configuration file (conf/config.yaml):

conf/config.yaml

    plugin_attr:
      prometheus:
        export_addr:
          ip: ${{INTRANET_IP}}
          port: 9092

Now, if the environment variable INTRANET_IP is 172.1.1.1, APISIX will export the metrics via 172.1.1.1:9092.
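As a rough illustration of how the export address and the export URI combine into the URL that Prometheus scrapes, here is a small Python sketch. The helper name `metrics_url` is an assumption, and it assumes the endpoint is served over plain HTTP, as in the curl examples in this document.

```python
def metrics_url(ip="127.0.0.1", port=9091, export_uri="/apisix/prometheus/metrics"):
    """Build the scrape URL from export_addr (ip, port) and export_uri.

    The defaults mirror the documented defaults of the Plugin.
    """
    if not export_uri.startswith("/"):
        export_uri = "/" + export_uri
    return f"http://{ip}:{port}{export_uri}"


default_url = metrics_url()
custom_url = metrics_url(ip="172.1.1.1", port=9092)
print(default_url)
print(custom_url)
```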

If you still want to expose the metrics via the data plane port (default: 9080), you can configure it as shown below:

conf/config.yaml

    plugin_attr:
      prometheus:
        enable_export_server: false

You can then expose it by using the public-api Plugin.

IMPORTANT

If the prometheus Plugin collects too many metrics, calculating the metric data each time the URI is fetched consumes CPU resources, which may slow down the processing of normal requests in APISIX. To solve this problem, APISIX exposes the URI and calculates the metrics in the privileged agent. If the URI is exposed using the public-api Plugin instead, APISIX calculates the metric data in a normal worker process, which may still affect the processing of normal requests.

This feature requires APISIX to run on APISIX-Runtime.

Enable Plugin

The prometheus Plugin can be enabled with an empty table.

The example below shows how you can configure the Plugin on a specific Route:

note

You can fetch the admin_key from config.yaml and save it to a shell variable with the following command:

    admin_key=$(yq '.deployment.admin.admin_key[0].key' conf/config.yaml | sed 's/"//g')

    curl http://127.0.0.1:9180/apisix/admin/routes/1 -H "X-API-KEY: $admin_key" -X PUT -d '
    {
        "uri": "/hello",
        "plugins": {
            "prometheus":{}
        },
        "upstream": {
            "type": "roundrobin",
            "nodes": {
                "127.0.0.1:80": 1
            }
        }
    }'

note

When prefer_name is set to true, make sure that no two Routes/Services share the same name; otherwise their metrics are reported under the same label value, which can be misleading.
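Before enabling prefer_name, it can help to check for name collisions. The following Python sketch does this over a list of resource objects; the function name `duplicate_names` and the shape of the input (dictionaries with `id` and `name` keys, as you might assemble from your own Route or Service definitions) are assumptions for illustration.

```python
from collections import defaultdict


def duplicate_names(resources):
    """Return {name: [ids]} for every name used by more than one resource."""
    ids_by_name = defaultdict(list)
    for resource in resources:
        name = resource.get("name")
        if name:  # unnamed resources cannot collide by name
            ids_by_name[name].append(resource["id"])
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}


# Hypothetical Routes; two of them share the name "hello".
routes = [
    {"id": "1", "name": "hello"},
    {"id": "2", "name": "world"},
    {"id": "3", "name": "hello"},
    {"id": "4"},
]
collisions = duplicate_names(routes)
print(collisions)
```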

Fetching metrics

You can fetch the metrics from the specified export URI (default: /apisix/prometheus/metrics):

    curl -i http://127.0.0.1:9091/apisix/prometheus/metrics

You can add this address to Prometheus to fetch the data:

    scrape_configs:
      - job_name: "apisix"
        scrape_interval: 15s # This value will be related to the time range of the rate function in Prometheus QL. The time range in the rate function should be at least twice this value.
        metrics_path: "/apisix/prometheus/metrics"
        static_configs:
          - targets: ["127.0.0.1:9091"]

Now, you will be able to check the status in your Prometheus console:

[Figure: checking status on prometheus dashboard]

[Figure: prometheus apisix in-depth metric view]
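The comment on scrape_interval in the scrape configuration above says that the time range used in a rate function should be at least twice the scrape interval. The Python sketch below encodes that rule when building a query string. The helper names `min_rate_window_seconds` and `rate_expression` are assumptions for illustration.

```python
def min_rate_window_seconds(scrape_interval_seconds):
    """Smallest rate() window that satisfies the twice-the-interval rule."""
    return 2 * scrape_interval_seconds


def rate_expression(metric, scrape_interval_seconds, window_seconds=None):
    """Build a rate() query, rejecting windows that are too short."""
    minimum = min_rate_window_seconds(scrape_interval_seconds)
    if window_seconds is None:
        window_seconds = minimum
    if window_seconds < minimum:
        raise ValueError(
            f"window {window_seconds}s is shorter than twice the "
            f"scrape interval ({minimum}s)"
        )
    return f"rate({metric}[{window_seconds}s])"


# With the 15s scrape interval from the configuration above:
expression = rate_expression("apisix_http_status", 15)
print(expression)
```

A shorter window may contain fewer than two samples, in which case the rate function has nothing to compute a slope from.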

Using Grafana to graph the metrics

Metrics exported by the prometheus Plugin can be graphed in Grafana using a drop-in dashboard.

To set it up, download the Grafana dashboard metadata and import it into Grafana. Alternatively, you can get the dashboard metadata from the official Grafana site.

[Figures: Grafana chart-1, Grafana chart-2, Grafana chart-3, Grafana chart-4]

Available HTTP metrics

The following metrics are exported by the prometheus Plugin:

  • Status code: HTTP status code returned from Upstream services. They are available for a single service and across all services.

    The available attributes are:

    | Name | Description |
    | --- | --- |
    | code | HTTP status code returned by the upstream service. |
    | route | route_id of the Route matching the request. Defaults to an empty string if no Route matches. |
    | matched_uri | uri of the Route matching the request. Defaults to an empty string if no Route matches. |
    | matched_host | host of the Route matching the request. Defaults to an empty string if no Route matches. |
    | service | service_id of the Route matching the request. If the Route does not have a service_id configured, it defaults to $host. |
    | consumer | consumer_name of the Consumer matching the request. Defaults to an empty string if no Consumer matches. |
    | node | IP address of the Upstream node. |
  • Bandwidth: Total amount of traffic (ingress and egress) flowing through APISIX. Total bandwidth of a service can also be obtained.

    The available attributes are:

    | Name | Description |
    | --- | --- |
    | type | Type of traffic (egress/ingress). |
    | route | route_id of the Route matching the request. Defaults to an empty string if no Route matches. |
    | service | service_id of the Route matching the request. If the Route does not have a service_id configured, it defaults to $host. |
    | consumer | consumer_name of the Consumer matching the request. Defaults to an empty string if no Consumer matches. |
    | node | IP address of the Upstream node. |
  • etcd reachability: A gauge type representing whether etcd can be reached by APISIX. A value of 1 represents reachable and 0 represents unreachable.

  • Connections: Nginx connection metrics like active, reading, writing, and number of accepted connections.

  • Batch process entries: A gauge type useful when Plugins like syslog, http-logger, tcp-logger, udp-logger, and zipkin use a batch process to send data. Entries that have not yet been sent by the batch process are counted in this metric.

  • Latency: Histogram of the request time per service in different dimensions.

    The available attributes are:

    | Name | Description |
    | --- | --- |
    | type | Value can be one of apisix, upstream, or request. This translates to latency caused by APISIX, Upstream, or both (their sum). |
    | service | service_id of the Route matching the request. If the Route does not have a service_id configured, it defaults to $host. |
    | consumer | consumer_name of the Consumer matching the request. Defaults to an empty string if no Consumer matches. |
    | node | IP address of the Upstream node. |
  • Info: Information about the APISIX node.

  • Shared dict: The capacity and free space of all nginx.shared.DICT in APISIX.

  • apisix_upstream_status: Health check result status of upstream nodes. A value of 1 represents healthy and 0 represents unhealthy.

    The available attributes are:

    | Name | Description |
    | --- | --- |
    | name | Resource ID that the upstream node is attached to, e.g. /apisix/routes/1, /apisix/upstreams/1. |
    | ip | IP address of the node. |
    | port | Port number of the node. |

Here are the original metrics from APISIX:

    curl http://127.0.0.1:9091/apisix/prometheus/metrics

    # HELP apisix_bandwidth Total bandwidth in bytes consumed per service in Apisix
    # TYPE apisix_bandwidth counter
    apisix_bandwidth{type="egress",route="",service="",consumer="",node=""} 8417
    apisix_bandwidth{type="egress",route="1",service="",consumer="",node="127.0.0.1"} 1420
    apisix_bandwidth{type="egress",route="2",service="",consumer="",node="127.0.0.1"} 1420
    apisix_bandwidth{type="ingress",route="",service="",consumer="",node=""} 189
    apisix_bandwidth{type="ingress",route="1",service="",consumer="",node="127.0.0.1"} 332
    apisix_bandwidth{type="ingress",route="2",service="",consumer="",node="127.0.0.1"} 332
    # HELP apisix_etcd_modify_indexes Etcd modify index for APISIX keys
    # TYPE apisix_etcd_modify_indexes gauge
    apisix_etcd_modify_indexes{key="consumers"} 0
    apisix_etcd_modify_indexes{key="global_rules"} 0
    apisix_etcd_modify_indexes{key="max_modify_index"} 222
    apisix_etcd_modify_indexes{key="prev_index"} 35
    apisix_etcd_modify_indexes{key="protos"} 0
    apisix_etcd_modify_indexes{key="routes"} 222
    apisix_etcd_modify_indexes{key="services"} 0
    apisix_etcd_modify_indexes{key="ssls"} 0
    apisix_etcd_modify_indexes{key="stream_routes"} 0
    apisix_etcd_modify_indexes{key="upstreams"} 0
    apisix_etcd_modify_indexes{key="x_etcd_index"} 223
    # HELP apisix_batch_process_entries batch process remaining entries
    # TYPE apisix_batch_process_entries gauge
    apisix_batch_process_entries{name="http-logger",route_id="9",server_addr="127.0.0.1"} 1
    apisix_batch_process_entries{name="sls-logger",route_id="9",server_addr="127.0.0.1"} 1
    apisix_batch_process_entries{name="tcp-logger",route_id="9",server_addr="127.0.0.1"} 1
    apisix_batch_process_entries{name="udp-logger",route_id="9",server_addr="127.0.0.1"} 1
    apisix_batch_process_entries{name="sys-logger",route_id="9",server_addr="127.0.0.1"} 1
    apisix_batch_process_entries{name="zipkin_report",route_id="9",server_addr="127.0.0.1"} 1
    # HELP apisix_etcd_reachable Config server etcd reachable from Apisix, 0 is unreachable
    # TYPE apisix_etcd_reachable gauge
    apisix_etcd_reachable 1
    # HELP apisix_http_status HTTP status codes per service in Apisix
    # TYPE apisix_http_status counter
    apisix_http_status{code="200",route="1",matched_uri="/hello",matched_host="",service="",consumer="",node="127.0.0.1"} 4
    apisix_http_status{code="200",route="2",matched_uri="/world",matched_host="",service="",consumer="",node="127.0.0.1"} 4
    apisix_http_status{code="404",route="",matched_uri="",matched_host="",service="",consumer="",node=""} 1
    # HELP apisix_http_requests_total The total number of client requests
    # TYPE apisix_http_requests_total gauge
    apisix_http_requests_total 1191780
    # HELP apisix_nginx_http_current_connections Number of HTTP connections
    # TYPE apisix_nginx_http_current_connections gauge
    apisix_nginx_http_current_connections{state="accepted"} 11994
    apisix_nginx_http_current_connections{state="active"} 2
    apisix_nginx_http_current_connections{state="handled"} 11994
    apisix_nginx_http_current_connections{state="reading"} 0
    apisix_nginx_http_current_connections{state="waiting"} 1
    apisix_nginx_http_current_connections{state="writing"} 1
    # HELP apisix_nginx_metric_errors_total Number of nginx-lua-prometheus errors
    # TYPE apisix_nginx_metric_errors_total counter
    apisix_nginx_metric_errors_total 0
    # HELP apisix_http_latency HTTP request latency in milliseconds per service in APISIX
    # TYPE apisix_http_latency histogram
    apisix_http_latency_bucket{type="apisix",route="1",service="",consumer="",node="127.0.0.1",le="1"} 1
    apisix_http_latency_bucket{type="apisix",route="1",service="",consumer="",node="127.0.0.1",le="2"} 1
    apisix_http_latency_bucket{type="request",route="1",service="",consumer="",node="127.0.0.1",le="1"} 1
    apisix_http_latency_bucket{type="request",route="1",service="",consumer="",node="127.0.0.1",le="2"} 1
    apisix_http_latency_bucket{type="upstream",route="1",service="",consumer="",node="127.0.0.1",le="1"} 1
    apisix_http_latency_bucket{type="upstream",route="1",service="",consumer="",node="127.0.0.1",le="2"} 1
    ...
    # HELP apisix_node_info Info of APISIX node
    # TYPE apisix_node_info gauge
    apisix_node_info{hostname="desktop-2022q8f-wsl"} 1
    # HELP apisix_shared_dict_capacity_bytes The capacity of each nginx shared DICT since APISIX start
    # TYPE apisix_shared_dict_capacity_bytes gauge
    apisix_shared_dict_capacity_bytes{name="access-tokens"} 1048576
    apisix_shared_dict_capacity_bytes{name="balancer-ewma"} 10485760
    apisix_shared_dict_capacity_bytes{name="balancer-ewma-last-touched-at"} 10485760
    apisix_shared_dict_capacity_bytes{name="balancer-ewma-locks"} 10485760
    apisix_shared_dict_capacity_bytes{name="discovery"} 1048576
    apisix_shared_dict_capacity_bytes{name="etcd-cluster-health-check"} 10485760
    ...
    # HELP apisix_shared_dict_free_space_bytes The free space of each nginx shared DICT since APISIX start
    # TYPE apisix_shared_dict_free_space_bytes gauge
    apisix_shared_dict_free_space_bytes{name="access-tokens"} 1032192
    apisix_shared_dict_free_space_bytes{name="balancer-ewma"} 10412032
    apisix_shared_dict_free_space_bytes{name="balancer-ewma-last-touched-at"} 10412032
    apisix_shared_dict_free_space_bytes{name="balancer-ewma-locks"} 10412032
    apisix_shared_dict_free_space_bytes{name="discovery"} 1032192
    apisix_shared_dict_free_space_bytes{name="etcd-cluster-health-check"} 10412032
    ...
    # HELP apisix_upstream_status Upstream status from health check
    # TYPE apisix_upstream_status gauge
    apisix_upstream_status{name="/apisix/routes/1",ip="100.24.156.8",port="80"} 0
    apisix_upstream_status{name="/apisix/routes/1",ip="52.86.68.46",port="80"} 1
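If you want to inspect this output in a script rather than in Prometheus, the text format is simple enough to parse line by line. The following Python sketch is a simplified parser, not a complete implementation of the exposition format: it assumes no timestamps after the value and does not unescape label values. The names `parse_samples`, `SAMPLE_RE`, and `LABEL_RE` are assumptions for illustration.

```python
import re
from collections import Counter

# metric_name{label="value",...} value
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (metric name, labels dict, float value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # skip anything that is not a sample line
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# A few lines taken from the output above.
text = """
# TYPE apisix_http_status counter
apisix_http_status{code="200",route="1",matched_uri="/hello",matched_host="",service="",consumer="",node="127.0.0.1"} 4
apisix_http_status{code="200",route="2",matched_uri="/world",matched_host="",service="",consumer="",node="127.0.0.1"} 4
apisix_http_status{code="404",route="",matched_uri="",matched_host="",service="",consumer="",node=""} 1
apisix_etcd_reachable 1
apisix_upstream_status{name="/apisix/routes/1",ip="100.24.156.8",port="80"} 0
apisix_upstream_status{name="/apisix/routes/1",ip="52.86.68.46",port="80"} 1
"""
samples = parse_samples(text)

# Total responses per status code across all Routes.
by_code = Counter()
for name, labels, value in samples:
    if name == "apisix_http_status":
        by_code[labels["code"]] += value

# Upstream nodes that the health check reports as unhealthy (value 0).
unhealthy = [
    (labels["ip"], labels["port"])
    for name, labels, value in samples
    if name == "apisix_upstream_status" and value == 0
]
print(dict(by_code), unhealthy)
```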

Delete Plugin

To remove the prometheus Plugin, you can delete the corresponding JSON configuration from the Plugin configuration. APISIX will automatically reload and you do not have to restart for this to take effect.

    curl http://127.0.0.1:9180/apisix/admin/routes/1 -H "X-API-KEY: $admin_key" -X PUT -d '
    {
        "uri": "/hello",
        "plugins": {},
        "upstream": {
            "type": "roundrobin",
            "nodes": {
                "127.0.0.1:80": 1
            }
        }
    }'

How to enable it for TCP/UDP

IMPORTANT

This feature requires APISIX to run on APISIX-Runtime.

You can also enable the prometheus Plugin to collect metrics for TCP/UDP.

First, make sure the prometheus Plugin is listed under stream_plugins in your configuration file (conf/config.yaml):

conf/config.yaml

    stream_plugins:
      - ...
      - prometheus

Then you need to configure the prometheus plugin on the stream route:

    curl http://127.0.0.1:9180/apisix/admin/stream_routes/1 -H "X-API-KEY: $admin_key" -X PUT -d '
    {
        "plugins": {
            "prometheus":{}
        },
        "upstream": {
            "type": "roundrobin",
            "nodes": {
                "127.0.0.1:80": 1
            }
        }
    }'

Available TCP/UDP metrics

The following metrics are available when using APISIX as an L4 proxy.

  • Stream Connections: The number of processed connections at the route level.

    Attributes:

    | Name | Description |
    | --- | --- |
    | route | Matched stream route ID. |
  • Connections: Various Nginx connection metrics like active, reading, writing, and number of accepted connections.

  • Info: Information about the current APISIX node.

Here are examples of APISIX metrics:

    curl http://127.0.0.1:9091/apisix/prometheus/metrics

    ...
    # HELP apisix_node_info Info of APISIX node
    # TYPE apisix_node_info gauge
    apisix_node_info{hostname="desktop-2022q8f-wsl"} 1
    # HELP apisix_stream_connection_total Total number of connections handled per stream route in APISIX
    # TYPE apisix_stream_connection_total counter
    apisix_stream_connection_total{route="1"} 1