Knative Serving metrics

Knative Serving metrics

Administrators can monitor Serving control plane based on the metrics exposed by each Serving component. Metrics are listed next.

Activator

The following metrics allow the user to understand how application responds when traffic goes through the activator, for example, when scaling from zero. For example high request latency means that requests are taken too much time be fulfilled.

Metric Name	Description	Type	Tags	Unit	Status
request_concurrency	Concurrent requests that are routed to Activator These are requests reported by the concurrency reporter which may not be done yet. This is the average concurrency over a reporting period	Gauge	configuration_name container_name namespace_name pod_name revision_name service_name	Dimensionless	Stable
request_count	The number of requests that are routed to Activator. These are requests that have been fulfilled from the activator handler.	Counter	configuration_name container_name namespace_name pod_name response_code response_code_class revision_name service_name	Dimensionless	Stable
request_latencies	The response time in millisecond for the fulfilled routed requests	Histogram	configuration_name container_name namespace_name pod_name response_code response_code_class revision_name service_name	Milliseconds	Stable

Autoscaler

Autoscaler component exposes a number of metrics related to its decisions per revision. For example, at any given time, you can monitor the desired pods the Autoscaler wants to allocate for a Service, the average number of requests per second during the stable window, or whether autoscaler is in panic mode (KPA).

Metric Name	Description	Type	Tags	Unit	Status
desired_pods	Number of pods autoscaler wants to allocate	Gauge	configuration_name namespace_name revision_name service_name	Dimensionless	Stable
excess_burst_capacity	Excess burst capacity overserved over the stable window	Gauge	configuration_name namespace_name revision_name service_name	Dimensionless	Stable
stable_request_concurrency	Average of requests count per observed pod over the stable window	Gauge	configuration_name namespace_name revision_name service_name	Dimensionless	Stable
panic_request_concurrency	Average of requests count per observed pod over the panic window	Gauge	configuration_name namespace_name revision_name service_name	Dimensionless	Stable
target_concurrency_per_pod	The desired number of concurrent requests for each pod	Gauge	configuration_name namespace_name revision_name service_name	Dimensionless	Stable
stable_requests_per_second	Average requests-per-second per observed pod over the stable window	Gauge	configuration_name namespace_name revision_name service_name	Dimensionless	Stable
panic_requests_per_second	Average requests-per-second per observed pod over the panic window	Gauge	configuration_name namespace_name revision_name service_name	Dimensionless	Stable
target_requests_per_second	The desired requests-per-second for each pod	Gauge	configuration_name namespace_name revision_name service_name	Dimensionless	Stable
panic_mode	1 if autoscaler is in panic mode, 0 otherwise	Gauge	configuration_name namespace_name revision_name service_name	Dimensionless	Stable
requested_pods	Number of pods autoscaler requested from Kubernetes	Gauge	configuration_name namespace_name revision_name service_name	Dimensionless	Stable
actual_pods	Number of pods that are allocated currently in ready state	Gauge	configuration_name namespace_name revision_name service_name	Dimensionless	Stable
not_ready_pods	Number of pods that are not ready currently	Gauge	configuration_name= namespace_name= revision_name service_name	Dimensionless	Stable
pending_pods	Number of pods that are pending currently	Gauge	configuration_name namespace_name revision_name service_name	Dimensionless	Stable
terminating_pods	Number of pods that are terminating currently	Gauge	configuration_name namespace_name revision_name service_name	Dimensionless	Stable

Controller

The following metrics are emitted by any component that implements a controller logic. The metrics show details about the reconciliation operations and the workqueue behavior on which reconciliation requests are enqueued.

Metric Name	Description	Type	Tags	Unit	Status
work_queue_depth	Depth of the work queue	Gauge	reconciler	Dimensionless	Stable
reconcile_count	Number of reconcile operations	Counter	reconciler success	Dimensionless	Stable
reconcile_latency	Latency of reconcile operations	Histogram	reconciler success	Milliseconds	Stable
workqueue_adds_total	Total number of adds handled by workqueue	Counter	name	Dimensionless	Stable
workqueue_depth	Current depth of workqueue	Gauge	reconciler	Dimensionless	Stable
workqueue_queue_latency_seconds	How long in seconds an item stays in workqueue before being requested	Histogram	name	Seconds	Stable
workqueue_retries_total	Total number of retries handled by workqueue	Counter	name	Dimensionless	Stable
workqueue_work_duration_seconds	How long in seconds processing an item from a workqueue takes.	Histogram	name	Seconds	Stable
workqueue_unfinished_work_seconds	How long in seconds the outstanding workqueue items have been in flight (total).	Histogram	name	Seconds	Stable
workqueue_longest_running_processor_seconds	How long in seconds the longest outstanding workqueue item has been in flight	Histogram	name	Seconds	Stable

Webhook

Webhook metrics report useful info about operations. For example, if a large number of operations fail, this could indicate an issue with a user-created resource.

Metric Name	Description	Type	Tags	Unit	Status
request_count	The number of requests that are routed to webhook	Counter	admission_allowed kind_group kind_kind kind_version request_operation resource_group resource_namespace resource_resource resource_version	Dimensionless	Stable
request_latencies	The response time in milliseconds	Histogram	admission_allowed kind_group kind_kind kind_version request_operation resource_group resource_namespace resource_resource resource_version	Milliseconds	Stable

Go Runtime - memstats

Each Knative Serving control plane process emits a number of Go runtime memory statistics (shown next). As a baseline for monitoring purproses, user could start with a subset of the metrics: current allocations (go_alloc), total allocations (go_total_alloc), system memory (go_sys), mallocs (go_mallocs), frees (go_frees) and garbage collection total pause time (total_gc_pause_ns), next gc target heap size (go_next_gc) and number of garbage collection cycles (num_gc).

Metric Name	Description	Type	Tags	Unit	Status
go_alloc	The number of bytes of allocated heap objects (same as heap_alloc)	Gauge	name	Dimensionless	Stable
go_total_alloc	The cumulative bytes allocated for heap objects	Gauge	name	Dimensionless	Stable
go_sys	The total bytes of memory obtained from the OS	Gauge	name	Dimensionless	Stable
go_lookups	The number of pointer lookups performed by the runtime	Gauge	name	Dimensionless	Stable
go_mallocs	The cumulative count of heap objects allocated	Gauge	name	Dimensionless	Stable
go_frees	The cumulative count of heap objects freed	Gauge	name	Dimensionless	Stable
go_heap_alloc	The number of bytes of allocated heap objects	Gauge	name	Dimensionless	Stable
go_heap_sys	The number of bytes of heap memory obtained from the OS	Gauge	name	Dimensionless	Stable
go_heap_idle	The number of bytes in idle (unused) spans	Gauge	name	Dimensionless	Stable
go_heap_in_use	The number of bytes in in-use spans	Gauge	name	Dimensionless	Stable
go_heap_released	The number of bytes of physical memory returned to the OS	Gauge	name	Dimensionless	Stable
go_heap_objects	The number of allocated heap objects	Gauge	name	Dimensionless	Stable
go_stack_in_use	The number of bytes in stack spans	Gauge	name	Dimensionless	Stable
go_stack_sys	The number of bytes of stack memory obtained from the OS	Gauge	name	Dimensionless	Stable
go_mspan_in_use	The number of bytes of allocated mspan structures	Gauge	name	Dimensionless	Stable
go_mspan_sys	The number of bytes of memory obtained from the OS for mspan structures	Gauge	name	Dimensionless	Stable
go_mcache_in_use	The number of bytes of allocated mcache structures	Gauge	name	Dimensionless	Stable
go_mcache_sys	The number of bytes of memory obtained from the OS for mcache structures	Gauge	name	Dimensionless	Stable
go_bucket_hash_sys	The number of bytes of memory in profiling bucket hash tables.	Gauge	name	Dimensionless	Stable
go_gc_sys	The number of bytes of memory in garbage collection metadata	Gauge	name	Dimensionless	Stable
go_other_sys	The number of bytes of memory in miscellaneous off-heap runtime allocations	Gauge	name	Dimensionless	Stable
go_next_gc	The target heap size of the next GC cycle	Gauge	name	Dimensionless	Stable
go_last_gc	The time the last garbage collection finished, as nanoseconds since 1970 (the UNIX epoch)	Gauge	name	Nanoseconds	Stable
go_total_gc_pause_ns	The cumulative nanoseconds in GC stop-the-world pauses since the program started	Gauge	name	Nanoseconds	Stable
go_num_gc	The number of completed GC cycles.	Gauge	name	Dimensionless	Stable
go_num_forced_gc	The number of GC cycles that were forced by the application calling the GC function.	Gauge	name	Dimensionless	Stable
go_gc_cpu_fraction	The fraction of this program’s available CPU time used by the GC since the program started	Gauge	name	Dimensionless	Stable

Note

The name tag is empty.

Developer - User Services

Every Knative service has a proxy container that proxies the connections to the application container. A number of metrics are reported for the queue proxy performance. Using the following metrics application developers, devops and others, could measure if requests are queued at the proxy side (need for backpressure) and what is the actual delay in serving requests at the application side.

Queue proxy

Requests endpoint.

Metric Name	Description	Type	Tags	Unit	Status
revision_request_count	The number of requests that are routed to queue-proxy	Counter	configuration_name container_name namespace_name pod_name response_code response_code_class revision_name service_name	Dimensionless	Stable
revision_request_latencies	The response time in millisecond	Histogram	configuration_name container_name namespace_name pod_name response_code response_code_class revision_name service_name	Milliseconds	Stable
revision_app_request_count	The number of requests that are routed to user-container	Counter	configuration_name container_name namespace_name pod_name response_code response_code_class revision_name service_name	Dimensionless	Stable
revision_app_request_latencies	The response time in millisecond	Histogram	configuration_name namespace_name pod_name response_code response_code_class revision_name service_name	Milliseconds	Stable
revision_queue_depth	The current number of items in the serving and waiting queue, or not reported if unlimited concurrency	Gauge	configuration_name event-display container_name namespace_name pod_name response_code_class revision_name service_name	Dimensionless	Stable