Knative Serving metrics

Administrators can monitor the Serving control plane using the metrics exposed by each Serving component. The metrics are listed in the following sections.

Activator

The following metrics show how an application responds when traffic goes through the activator, for example when scaling from zero. High request latency, for instance, means that requests are taking too long to be fulfilled.

| Metric Name | Description | Type | Tags | Unit | Status |
| --- | --- | --- | --- | --- | --- |
| request_concurrency | Concurrent requests that are routed to Activator. These are requests reported by the concurrency reporter which may not be done yet. This is the average concurrency over a reporting period. | Gauge | configuration_name, container_name, namespace_name, pod_name, revision_name, service_name | Dimensionless | Stable |
| request_count | The number of requests that are routed to Activator. These are requests that have been fulfilled from the activator handler. | Counter | configuration_name, container_name, namespace_name, pod_name, response_code, response_code_class, revision_name, service_name | Dimensionless | Stable |
| request_latencies | The response time in milliseconds for the fulfilled routed requests | Histogram | configuration_name, container_name, namespace_name, pod_name, response_code, response_code_class, revision_name, service_name | Milliseconds | Stable |
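
To judge whether activator latency is "high", it often helps to reduce the request_latencies histogram to a percentile. The following is a minimal sketch, not part of Knative: it assumes you have already collected the cumulative bucket counts as (upper bound in milliseconds, cumulative count) pairs, and it uses linear interpolation inside a bucket, which is an approximation chosen for this example. The function name and input shape are hypothetical.

```python
from typing import List, Tuple


def estimate_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile (0..1) from cumulative histogram buckets.

    buckets: (upper_bound_ms, cumulative_count) pairs sorted by upper bound.
    Assumption of this sketch: values are spread evenly inside each bucket.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if not buckets or buckets[-1][1] <= 0:
        raise ValueError("histogram is empty")

    rank = q * buckets[-1][1]  # how many observations lie below the quantile
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, cumulative in buckets:
        if cumulative >= rank:
            in_bucket = cumulative - lower_count
            if in_bucket == 0:
                return upper_bound
            # Interpolate linearly between the bucket's lower and upper bound.
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, cumulative
    return buckets[-1][0]
```

For example, with hypothetical buckets `[(10, 50), (100, 90), (1000, 100)]`, `estimate_quantile(0.95, ...)` falls in the last bucket, which suggests a slow tail even though half of the requests finish within 10 ms.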

Autoscaler

The Autoscaler component exposes a number of metrics related to its decisions per revision. For example, at any given time you can monitor the number of pods the Autoscaler wants to allocate for a Service, the average number of requests per second during the stable window, or whether the Autoscaler is in panic mode (KPA).

| Metric Name | Description | Type | Tags | Unit | Status |
| --- | --- | --- | --- | --- | --- |
| desired_pods | Number of pods autoscaler wants to allocate | Gauge | configuration_name, namespace_name, revision_name, service_name | Dimensionless | Stable |
| excess_burst_capacity | Excess burst capacity observed over the stable window | Gauge | configuration_name, namespace_name, revision_name, service_name | Dimensionless | Stable |
| stable_request_concurrency | Average of requests count per observed pod over the stable window | Gauge | configuration_name, namespace_name, revision_name, service_name | Dimensionless | Stable |
| panic_request_concurrency | Average of requests count per observed pod over the panic window | Gauge | configuration_name, namespace_name, revision_name, service_name | Dimensionless | Stable |
| target_concurrency_per_pod | The desired number of concurrent requests for each pod | Gauge | configuration_name, namespace_name, revision_name, service_name | Dimensionless | Stable |
| stable_requests_per_second | Average requests-per-second per observed pod over the stable window | Gauge | configuration_name, namespace_name, revision_name, service_name | Dimensionless | Stable |
| panic_requests_per_second | Average requests-per-second per observed pod over the panic window | Gauge | configuration_name, namespace_name, revision_name, service_name | Dimensionless | Stable |
| target_requests_per_second | The desired requests-per-second for each pod | Gauge | configuration_name, namespace_name, revision_name, service_name | Dimensionless | Stable |
| panic_mode | 1 if autoscaler is in panic mode, 0 otherwise | Gauge | configuration_name, namespace_name, revision_name, service_name | Dimensionless | Stable |
| requested_pods | Number of pods autoscaler requested from Kubernetes | Gauge | configuration_name, namespace_name, revision_name, service_name | Dimensionless | Stable |
| actual_pods | Number of pods that are allocated currently in ready state | Gauge | configuration_name, namespace_name, revision_name, service_name | Dimensionless | Stable |
| not_ready_pods | Number of pods that are not ready currently | Gauge | configuration_name, namespace_name, revision_name, service_name | Dimensionless | Stable |
| pending_pods | Number of pods that are pending currently | Gauge | configuration_name, namespace_name, revision_name, service_name | Dimensionless | Stable |
| terminating_pods | Number of pods that are terminating currently | Gauge | configuration_name, namespace_name, revision_name, service_name | Dimensionless | Stable |
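
The Autoscaler gauges are easiest to read side by side for a single revision. The sketch below is illustrative only: it assumes the current gauge values for one revision have already been collected into a dictionary keyed by the metric names from the table, and the function name and the wording of the findings are hypothetical.

```python
from typing import Dict, List


def summarize_autoscaler(gauges: Dict[str, float]) -> List[str]:
    """Turn one revision's Autoscaler gauges into human-readable findings."""
    findings: List[str] = []

    # panic_mode is documented as 1 in panic mode, 0 otherwise.
    if gauges.get("panic_mode", 0) == 1:
        findings.append("autoscaler is in panic mode")

    # Pods the Autoscaler wants but that are not yet ready.
    shortfall = gauges.get("desired_pods", 0) - gauges.get("actual_pods", 0)
    if shortfall > 0:
        findings.append(f"{shortfall:g} desired pods are not ready yet")

    # Observed per-pod concurrency compared with the per-pod target.
    stable = gauges.get("stable_request_concurrency", 0)
    target = gauges.get("target_concurrency_per_pod")
    if target and stable > target:
        findings.append(
            f"stable concurrency {stable:g} exceeds per-pod target {target:g}"
        )
    return findings
```

An empty result means none of the three conditions hold; which conditions deserve an alert depends on your workload.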

Controller

The following metrics are emitted by any component that implements controller logic. They show details about reconciliation operations and about the behavior of the workqueue on which reconciliation requests are enqueued.

| Metric Name | Description | Type | Tags | Unit | Status |
| --- | --- | --- | --- | --- | --- |
| work_queue_depth | Depth of the work queue | Gauge | reconciler | Dimensionless | Stable |
| reconcile_count | Number of reconcile operations | Counter | reconciler, success | Dimensionless | Stable |
| reconcile_latency | Latency of reconcile operations | Histogram | reconciler, success | Milliseconds | Stable |
| workqueue_adds_total | Total number of adds handled by workqueue | Counter | name | Dimensionless | Stable |
| workqueue_depth | Current depth of workqueue | Gauge | reconciler | Dimensionless | Stable |
| workqueue_queue_latency_seconds | How long in seconds an item stays in workqueue before being requested | Histogram | name | Seconds | Stable |
| workqueue_retries_total | Total number of retries handled by workqueue | Counter | name | Dimensionless | Stable |
| workqueue_work_duration_seconds | How long in seconds processing an item from a workqueue takes | Histogram | name | Seconds | Stable |
| workqueue_unfinished_work_seconds | How long in seconds the outstanding workqueue items have been in flight (total) | Histogram | name | Seconds | Stable |
| workqueue_longest_running_processor_seconds | How long in seconds the longest outstanding workqueue item has been in flight | Histogram | name | Seconds | Stable |
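
One way to use reconcile_count is to compute the share of failed reconciliations per reconciler. The sketch below assumes the samples have been collected as (tags, value) pairs and that the success tag carries the string "true" for successful reconciliations; both the input shape and that tag value are assumptions of this example, so check them against what your metrics backend actually reports.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Sample = Tuple[Dict[str, str], float]


def reconcile_failure_ratio(samples: Iterable[Sample]) -> Dict[str, float]:
    """Return failed / total reconcile operations for each reconciler."""
    totals: Dict[str, float] = defaultdict(float)
    failures: Dict[str, float] = defaultdict(float)
    for tags, value in samples:
        reconciler = tags["reconciler"]
        totals[reconciler] += value
        # Assumption: any value other than "true" counts as a failure.
        if tags.get("success") != "true":
            failures[reconciler] += value
    return {
        name: failures[name] / total
        for name, total in totals.items()
        if total > 0
    }
```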

Webhook

Webhook metrics report useful information about operations. For example, a large number of failed operations could indicate an issue with a user-created resource.

| Metric Name | Description | Type | Tags | Unit | Status |
| --- | --- | --- | --- | --- | --- |
| request_count | The number of requests that are routed to webhook | Counter | admission_allowed, kind_group, kind_kind, kind_version, request_operation, resource_group, resource_namespace, resource_resource, resource_version | Dimensionless | Stable |
| request_latencies | The response time in milliseconds | Histogram | admission_allowed, kind_group, kind_kind, kind_version, request_operation, resource_group, resource_namespace, resource_resource, resource_version | Milliseconds | Stable |
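
Because request_count is a cumulative counter, "a large number of failed operations" is best judged over an interval rather than from the raw value. The following sketch compares two scrapes of the counter; it assumes you have already split the series into allowed and denied requests using the admission_allowed tag, and the function names are hypothetical.

```python
def counter_increase(previous: float, current: float) -> float:
    """Increase of a cumulative counter between two scrapes.

    A lower current value is treated as a process restart, in which case
    the counter started again from zero.
    """
    if current < previous:
        return current
    return current - previous


def denied_share(
    prev_allowed: float,
    cur_allowed: float,
    prev_denied: float,
    cur_denied: float,
) -> float:
    """Fraction of webhook requests denied during the interval."""
    allowed = counter_increase(prev_allowed, cur_allowed)
    denied = counter_increase(prev_denied, cur_denied)
    total = allowed + denied
    return denied / total if total > 0 else 0.0
```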

Go Runtime - memstats

Each Knative Serving control plane process emits a number of Go runtime memory statistics (shown next). As a baseline for monitoring purposes, you could start with a subset of the metrics: current allocations (go_alloc), total allocations (go_total_alloc), system memory (go_sys), mallocs (go_mallocs), frees (go_frees), total garbage collection pause time (go_total_gc_pause_ns), next GC target heap size (go_next_gc), and number of garbage collection cycles (go_num_gc).

| Metric Name | Description | Type | Tags | Unit | Status |
| --- | --- | --- | --- | --- | --- |
| go_alloc | The number of bytes of allocated heap objects (same as heap_alloc) | Gauge | name | Dimensionless | Stable |
| go_total_alloc | The cumulative bytes allocated for heap objects | Gauge | name | Dimensionless | Stable |
| go_sys | The total bytes of memory obtained from the OS | Gauge | name | Dimensionless | Stable |
| go_lookups | The number of pointer lookups performed by the runtime | Gauge | name | Dimensionless | Stable |
| go_mallocs | The cumulative count of heap objects allocated | Gauge | name | Dimensionless | Stable |
| go_frees | The cumulative count of heap objects freed | Gauge | name | Dimensionless | Stable |
| go_heap_alloc | The number of bytes of allocated heap objects | Gauge | name | Dimensionless | Stable |
| go_heap_sys | The number of bytes of heap memory obtained from the OS | Gauge | name | Dimensionless | Stable |
| go_heap_idle | The number of bytes in idle (unused) spans | Gauge | name | Dimensionless | Stable |
| go_heap_in_use | The number of bytes in in-use spans | Gauge | name | Dimensionless | Stable |
| go_heap_released | The number of bytes of physical memory returned to the OS | Gauge | name | Dimensionless | Stable |
| go_heap_objects | The number of allocated heap objects | Gauge | name | Dimensionless | Stable |
| go_stack_in_use | The number of bytes in stack spans | Gauge | name | Dimensionless | Stable |
| go_stack_sys | The number of bytes of stack memory obtained from the OS | Gauge | name | Dimensionless | Stable |
| go_mspan_in_use | The number of bytes of allocated mspan structures | Gauge | name | Dimensionless | Stable |
| go_mspan_sys | The number of bytes of memory obtained from the OS for mspan structures | Gauge | name | Dimensionless | Stable |
| go_mcache_in_use | The number of bytes of allocated mcache structures | Gauge | name | Dimensionless | Stable |
| go_mcache_sys | The number of bytes of memory obtained from the OS for mcache structures | Gauge | name | Dimensionless | Stable |
| go_bucket_hash_sys | The number of bytes of memory in profiling bucket hash tables | Gauge | name | Dimensionless | Stable |
| go_gc_sys | The number of bytes of memory in garbage collection metadata | Gauge | name | Dimensionless | Stable |
| go_other_sys | The number of bytes of memory in miscellaneous off-heap runtime allocations | Gauge | name | Dimensionless | Stable |
| go_next_gc | The target heap size of the next GC cycle | Gauge | name | Dimensionless | Stable |
| go_last_gc | The time the last garbage collection finished, as nanoseconds since 1970 (the UNIX epoch) | Gauge | name | Nanoseconds | Stable |
| go_total_gc_pause_ns | The cumulative nanoseconds in GC stop-the-world pauses since the program started | Gauge | name | Nanoseconds | Stable |
| go_num_gc | The number of completed GC cycles | Gauge | name | Dimensionless | Stable |
| go_num_forced_gc | The number of GC cycles that were forced by the application calling the GC function | Gauge | name | Dimensionless | Stable |
| go_gc_cpu_fraction | The fraction of this program’s available CPU time used by the GC since the program started | Gauge | name | Dimensionless | Stable |

Note

The name tag is empty.
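
Several of the baseline memstats are cumulative, so they become more informative when combined. The sketch below derives three values from one snapshot of the gauges: the number of live heap objects (mallocs minus frees), the mean stop-the-world pause per GC cycle, and the idle heap memory that has not been returned to the OS. The function name, the input dictionary, and the output keys are hypothetical; only the metric names come from the table.

```python
from typing import Dict


def derive_memstats(stats: Dict[str, float]) -> Dict[str, float]:
    """Derive a few summary values from one snapshot of Go memstats gauges."""
    gc_cycles = stats["go_num_gc"]
    return {
        # Objects allocated so far minus objects freed so far.
        "live_heap_objects": stats["go_mallocs"] - stats["go_frees"],
        # Cumulative pause time divided by completed cycles, in milliseconds.
        "mean_gc_pause_ms": (
            stats["go_total_gc_pause_ns"] / gc_cycles / 1e6 if gc_cycles else 0.0
        ),
        # Idle spans still held by the process rather than released to the OS.
        "idle_heap_not_released_bytes": (
            stats["go_heap_idle"] - stats["go_heap_released"]
        ),
    }
```

A steadily growing live object count across snapshots is a more useful signal than any single value of go_mallocs or go_frees.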

Developer - User Services

Every Knative service has a proxy container that proxies connections to the application container. A number of metrics report on the performance of this queue proxy. Using the following metrics, application developers, DevOps engineers, and others can determine whether requests are queued at the proxy side (indicating a need for backpressure) and what the actual delay is in serving requests at the application side.

Queue proxy

Requests endpoint.

| Metric Name | Description | Type | Tags | Unit | Status |
| --- | --- | --- | --- | --- | --- |
| revision_request_count | The number of requests that are routed to queue-proxy | Counter | configuration_name, container_name, namespace_name, pod_name, response_code, response_code_class, revision_name, service_name | Dimensionless | Stable |
| revision_request_latencies | The response time in milliseconds | Histogram | configuration_name, container_name, namespace_name, pod_name, response_code, response_code_class, revision_name, service_name | Milliseconds | Stable |
| revision_app_request_count | The number of requests that are routed to user-container | Counter | configuration_name, container_name, namespace_name, pod_name, response_code, response_code_class, revision_name, service_name | Dimensionless | Stable |
| revision_app_request_latencies | The response time in milliseconds | Histogram | configuration_name, namespace_name, pod_name, response_code, response_code_class, revision_name, service_name | Milliseconds | Stable |
| revision_queue_depth | The current number of items in the serving and waiting queue, or not reported if unlimited concurrency | Gauge | configuration_name, container_name, namespace_name, pod_name, response_code_class, revision_name, service_name | Dimensionless | Stable |
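
Comparing the two latency histograms gives a rough view of how much time requests spend at the proxy rather than in the application. The sketch below is an illustration under stated assumptions: it takes the total observed latency and the observation count of each histogram as plain numbers (how you obtain them depends on your metrics backend), and it compares means, which hides tail behavior. The function names are hypothetical.

```python
def mean_latency_ms(total_ms: float, count: float) -> float:
    """Mean latency from a histogram's summed latency and observation count."""
    return total_ms / count if count else 0.0


def queue_proxy_overhead_ms(
    revision_total_ms: float,
    revision_count: float,
    app_total_ms: float,
    app_count: float,
) -> float:
    """Mean time added at the queue proxy, in milliseconds.

    revision_* values come from revision_request_latencies (measured at the
    proxy), app_* values from revision_app_request_latencies (measured at
    the user container).
    """
    return mean_latency_ms(revision_total_ms, revision_count) - mean_latency_ms(
        app_total_ms, app_count
    )
```

A large positive difference together with a non-zero revision_queue_depth suggests that requests are waiting at the proxy, whereas a small difference points to the application itself as the source of the delay.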