Kubernetes Metrics Reference
Details of the metric data that Kubernetes components export.
Metrics (v1.32)
This page details the metrics that different Kubernetes components export. You can query the metrics endpoint for these components using an HTTP scrape, and fetch the current metrics data in Prometheus format.
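As an illustration of what a scrape returns, here is a minimal sketch of reading samples from the Prometheus text format. The sample payload and its values are invented, `parse_samples` is a hypothetical helper rather than part of any Kubernetes or Prometheus library, and the parser deliberately skips details such as unescaping label values and timestamps.

```python
import re

# Hypothetical scrape output; the numeric values are made up for illustration.
SAMPLE = '''\
# HELP apiserver_current_inflight_requests Maximal number of currently used inflight request limit of this apiserver per request kind in last second.
# TYPE apiserver_current_inflight_requests gauge
apiserver_current_inflight_requests{request_kind="mutating"} 3
apiserver_current_inflight_requests{request_kind="readOnly"} 12
'''

# metric_name, optional {labels}, then the sample value.
_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair: key="value" (escaped characters are kept as-is).
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the HELP/TYPE comment lines.
        if not line or line.startswith('#'):
            continue
        match = _LINE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ''))
        samples.append((name, labels, float(value)))
    return samples


for name, labels, value in parse_samples(SAMPLE):
    print(name, labels, value)
```

For production use, an established Prometheus client library parser is likely a better choice than a hand-written regular expression.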
List of Stable Kubernetes Metrics
Stable metrics observe strict API contracts and no labels can be added or removed from stable metrics during their lifetime.
apiserver_admission_controller_admission_duration_seconds
Admission controller latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit).
- Stability Level:STABLE
- Type: Histogram
- Labels: name, operation, rejected, type
apiserver_admission_step_admission_duration_seconds
Admission sub-step latency histogram in seconds, broken out for each operation and API resource and step type (validate or admit).
- Stability Level:STABLE
- Type: Histogram
- Labels: operation, rejected, type
apiserver_admission_webhook_admission_duration_seconds
Admission webhook latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit).
- Stability Level:STABLE
- Type: Histogram
- Labels: name, operation, rejected, type
apiserver_current_inflight_requests
Maximal number of currently used inflight request limit of this apiserver per request kind in last second.
- Stability Level:STABLE
- Type: Gauge
- Labels:request_kind
apiserver_longrunning_requests
Gauge of all active long-running apiserver requests broken out by verb, group, version, resource, scope and component. Not all requests are tracked this way.
- Stability Level:STABLE
- Type: Gauge
- Labels: component, group, resource, scope, subresource, verb, version
apiserver_request_duration_seconds
Response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope and component.
- Stability Level:STABLE
- Type: Histogram
- Labels: component, dry_run, group, resource, scope, subresource, verb, version
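Histogram metrics such as apiserver_request_duration_seconds are typically read as cumulative buckets. The following is a minimal sketch of estimating a quantile from such buckets, assuming Prometheus-style cumulative counts keyed by upper bound. `estimate_quantile` is a hypothetical helper, similar in spirit to PromQL's `histogram_quantile` but not a copy of it, and the bucket values in the usage lines are invented.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    `buckets` must be sorted by upper bound and end with (math.inf, total).
    Values are assumed to be spread evenly inside each bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The rank falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            in_bucket = count - prev_count
            if in_bucket == 0:
                return bound
            # Linear interpolation within the bucket that contains the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return prev_bound


# Invented bucket data: 100 requests, 90 of them at or under 0.5 seconds.
buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
p95 = estimate_quantile(0.95, buckets)
print(p95)
```

The estimate is only as precise as the bucket boundaries allow, so a quantile that lands in a wide bucket carries a correspondingly wide error.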
apiserver_request_total
Counter of apiserver requests broken out for each verb, dry run value, group, version, resource, scope, component, and HTTP response code.
- Stability Level:STABLE
- Type: Counter
- Labels: code, component, dry_run, group, resource, scope, subresource, verb, version
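Because apiserver_request_total is a counter with a code label, a common use is comparing two scrapes to see what share of recent requests failed. The sketch below assumes the counter values have already been summed per code label value. `error_ratio` is a hypothetical helper and the numbers are invented.

```python
def error_ratio(before, after):
    """Fraction of requests with a 5xx code between two counter snapshots.

    `before` and `after` map the HTTP `code` label value to the counter value.
    """
    total = 0.0
    errors = 0.0
    for code, value in after.items():
        delta = value - before.get(code, 0.0)
        if delta < 0:
            # The counter went backwards, for example after a restart,
            # so treat the new value as the increase since the reset.
            delta = value
        total += delta
        if code.startswith("5"):
            errors += delta
    return errors / total if total else 0.0


# Invented snapshots taken some interval apart.
earlier = {"200": 100, "500": 5}
later = {"200": 190, "500": 15}
print(error_ratio(earlier, later))
```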
apiserver_requested_deprecated_apis
Gauge of deprecated APIs that have been requested, broken out by API group, version, resource, subresource, and removed_release.
- Stability Level:STABLE
- Type: Gauge
- Labels: group, removed_release, resource, subresource, version
apiserver_response_sizes
Response size distribution in bytes for each group, version, verb, resource, subresource, scope and component.
- Stability Level:STABLE
- Type: Histogram
- Labels: component, group, resource, scope, subresource, verb, version
apiserver_storage_objects
Number of stored objects at the time of last check split by kind. In case of a fetching error, the value will be -1.
- Stability Level:STABLE
- Type: Gauge
- Labels:resource
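Since apiserver_storage_objects reports -1 when fetching fails, summing the raw values can silently undercount. A minimal sketch of handling the sentinel follows, assuming the samples have been collected into a mapping from the resource label to the gauge value. `total_stored_objects` is a hypothetical helper and the data is invented.

```python
def total_stored_objects(samples):
    """Sum object counts per resource, skipping the -1 error sentinel.

    Returns (total, resources_with_fetch_errors).
    """
    total = 0
    failed = []
    for resource, value in samples.items():
        if value == -1:
            # -1 signals a fetching error, not a real object count.
            failed.append(resource)
        else:
            total += value
    return total, sorted(failed)


# Invented values keyed by the `resource` label.
counts = {"pods": 120, "secrets": -1, "configmaps": 30}
print(total_stored_objects(counts))
```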
apiserver_storage_size_bytes
Size of the storage database file physically allocated in bytes.
- Stability Level:STABLE
- Type: Custom
- Labels:storage_cluster_id
container_cpu_usage_seconds_total
Cumulative cpu time consumed by the container in core-seconds
- Stability Level:STABLE
- Type: Custom
- Labels: container, pod, namespace
container_memory_working_set_bytes
Current working set of the container in bytes
- Stability Level:STABLE
- Type: Custom
- Labels: container, pod, namespace
container_start_time_seconds
Start time of the container since unix epoch in seconds
- Stability Level:STABLE
- Type: Custom
- Labels: container, pod, namespace
cronjob_controller_job_creation_skew_duration_seconds
Time between when a cronjob is scheduled to be run, and when the corresponding job is created
- Stability Level:STABLE
- Type: Histogram
job_controller_job_pods_finished_total
The number of finished Pods that are fully tracked
- Stability Level:STABLE
- Type: Counter
- Labels: completion_mode, result
job_controller_job_sync_duration_seconds
The time it took to sync a job
- Stability Level:STABLE
- Type: Histogram
- Labels: action, completion_mode, result
job_controller_job_syncs_total
The number of job syncs
- Stability Level:STABLE
- Type: Counter
- Labels: action, completion_mode, result
job_controller_jobs_finished_total
The number of finished jobs
- Stability Level:STABLE
- Type: Counter
- Labels: completion_mode, reason, result
kube_pod_resource_limit
Resources limit for workloads on the cluster, broken down by pod. This shows the resource usage the scheduler and kubelet expect per pod for resources along with the unit for the resource if any.
- Stability Level:STABLE
- Type: Custom
- Labels: namespace, pod, node, scheduler, priority, resource, unit
kube_pod_resource_request
Resources requested by workloads on the cluster, broken down by pod. This shows the resource usage the scheduler and kubelet expect per pod for resources along with the unit for the resource if any.
- Stability Level:STABLE
- Type: Custom
- Labels: namespace, pod, node, scheduler, priority, resource, unit
kubernetes_healthcheck
This metric records the result of a single healthcheck.
- Stability Level:STABLE
- Type: Gauge
- Labels: name, type
kubernetes_healthchecks_total
This metric records the results of all healthchecks.
- Stability Level:STABLE
- Type: Counter
- Labels: name, status, type
node_collector_evictions_total
Number of Node evictions that happened since current instance of NodeController started.
- Stability Level:STABLE
- Type: Counter
- Labels:zone
node_cpu_usage_seconds_total
Cumulative cpu time consumed by the node in core-seconds
- Stability Level:STABLE
- Type: Custom
node_memory_working_set_bytes
Current working set of the node in bytes
- Stability Level:STABLE
- Type: Custom
pod_cpu_usage_seconds_total
Cumulative cpu time consumed by the pod in core-seconds
- Stability Level:STABLE
- Type: Custom
- Labels: pod, namespace
pod_memory_working_set_bytes
Current working set of the pod in bytes
- Stability Level:STABLE
- Type: Custom
- Labels: pod, namespace
resource_scrape_error
1 if there was an error while getting container metrics, 0 otherwise
- Stability Level:STABLE
- Type: Custom
scheduler_framework_extension_point_duration_seconds
Latency for running all plugins of a specific extension point.
- Stability Level:STABLE
- Type: Histogram
- Labels: extension_point, profile, status
scheduler_pending_pods
Number of pending pods, by the queue type. ‘active’ means number of pods in activeQ; ‘backoff’ means number of pods in backoffQ; ‘unschedulable’ means number of pods in unschedulablePods that the scheduler attempted to schedule and failed; ‘gated’ is the number of unschedulable pods that the scheduler never attempted to schedule because they are gated.
- Stability Level:STABLE
- Type: Gauge
- Labels:queue
scheduler_pod_scheduling_attempts
Number of attempts to successfully schedule a pod.
- Stability Level:STABLE
- Type: Histogram
scheduler_pod_scheduling_duration_seconds
End-to-end latency for a pod being scheduled, which may include multiple scheduling attempts.
- Stability Level:STABLE
- Type: Histogram
- Labels:attempts
- Deprecated Versions:1.29.0
scheduler_preemption_attempts_total
Total preemption attempts in the cluster till now
- Stability Level:STABLE
- Type: Counter
scheduler_preemption_victims
Number of selected preemption victims
- Stability Level:STABLE
- Type: Histogram
scheduler_queue_incoming_pods_total
Number of pods added to scheduling queues by event and queue type.
- Stability Level:STABLE
- Type: Counter
- Labels: event, queue
scheduler_schedule_attempts_total
Number of attempts to schedule pods, by the result. ‘unschedulable’ means a pod could not be scheduled, while ‘error’ means an internal scheduler problem.
- Stability Level:STABLE
- Type: Counter
- Labels: profile, result
scheduler_scheduling_attempt_duration_seconds
Scheduling attempt latency in seconds (scheduling algorithm + binding)
- Stability Level:STABLE
- Type: Histogram
- Labels: profile, result
List of Beta Kubernetes Metrics
Beta metrics observe a looser API contract than their stable counterparts. No labels can be removed from beta metrics during their lifetime; however, labels can be added while the metric is in the beta stage. This offers the assurance that beta metrics will honor existing dashboards and alerts, while allowing for amendments in the future.
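The label contracts on this page (stable metrics keep a fixed label set, beta metrics may gain labels but not lose them, alpha metrics carry no guarantees) can be expressed as a small check. This is a minimal sketch for reasoning about dashboards. `label_change_allowed` is a hypothetical helper and does not reflect how Kubernetes itself enforces stability.

```python
def label_change_allowed(stability, old_labels, new_labels):
    """Return True if moving from old_labels to new_labels honors the contract."""
    old, new = set(old_labels), set(new_labels)
    removed = old - new
    added = new - old
    if stability == "STABLE":
        # Stable: no labels may be added or removed.
        return not removed and not added
    if stability == "BETA":
        # Beta: labels may be added, but never removed.
        return not removed
    if stability == "ALPHA":
        # Alpha: no API guarantees, so any change is permitted.
        return True
    raise ValueError(f"unknown stability level: {stability}")


print(label_change_allowed("BETA", ["attempts"], ["attempts", "profile"]))
print(label_change_allowed("STABLE", ["attempts"], ["attempts", "profile"]))
```

The `profile` label in the usage lines is invented purely to show an added label.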
apiserver_cel_compilation_duration_seconds
CEL compilation time in seconds.
- Stability Level:BETA
- Type: Histogram
apiserver_cel_evaluation_duration_seconds
CEL evaluation time in seconds.
- Stability Level:BETA
- Type: Histogram
apiserver_flowcontrol_current_executing_requests
Number of requests in initial (for a WATCH) or any (for a non-WATCH) execution stage in the API Priority and Fairness subsystem
- Stability Level:BETA
- Type: Gauge
- Labels: flow_schema, priority_level
apiserver_flowcontrol_current_executing_seats
Concurrency (number of seats) occupied by the currently executing (initial stage for a WATCH, any stage otherwise) requests in the API Priority and Fairness subsystem
- Stability Level:BETA
- Type: Gauge
- Labels: flow_schema, priority_level
apiserver_flowcontrol_current_inqueue_requests
Number of requests currently pending in queues of the API Priority and Fairness subsystem
- Stability Level:BETA
- Type: Gauge
- Labels: flow_schema, priority_level
apiserver_flowcontrol_dispatched_requests_total
Number of requests executed by API Priority and Fairness subsystem
- Stability Level:BETA
- Type: Counter
- Labels: flow_schema, priority_level
apiserver_flowcontrol_nominal_limit_seats
Nominal number of execution seats configured for each priority level
- Stability Level:BETA
- Type: Gauge
- Labels:priority_level
apiserver_flowcontrol_rejected_requests_total
Number of requests rejected by API Priority and Fairness subsystem
- Stability Level:BETA
- Type: Counter
- Labels: flow_schema, priority_level, reason
apiserver_flowcontrol_request_wait_duration_seconds
Length of time a request spent waiting in its queue
- Stability Level:BETA
- Type: Histogram
- Labels: execute, flow_schema, priority_level
apiserver_validating_admission_policy_check_duration_seconds
Validation admission latency for individual validation expressions in seconds, labeled by policy and further including binding and enforcement action taken.
- Stability Level:BETA
- Type: Histogram
- Labels: enforcement_action, error_type, policy, policy_binding
apiserver_validating_admission_policy_check_total
Validation admission policy check total, labeled by policy and further identified by binding and enforcement action taken.
- Stability Level:BETA
- Type: Counter
- Labels: enforcement_action, error_type, policy, policy_binding
disabled_metrics_total
The count of disabled metrics.
- Stability Level:BETA
- Type: Counter
hidden_metrics_total
The count of hidden metrics.
- Stability Level:BETA
- Type: Counter
kubernetes_feature_enabled
This metric records the data about the stage and enablement of a k8s feature.
- Stability Level:BETA
- Type: Gauge
- Labels: name, stage
registered_metrics_total
The count of registered metrics broken down by stability level and deprecation version.
- Stability Level:BETA
- Type: Counter
- Labels: deprecated_version, stability_level
scheduler_pod_scheduling_sli_duration_seconds
End-to-end latency for a pod being scheduled, measured from the time the pod enters the scheduling queue; it might involve multiple scheduling attempts.
- Stability Level:BETA
- Type: Histogram
- Labels:attempts
List of Alpha Kubernetes Metrics
Alpha metrics do not have any API guarantees. Use these metrics at your own risk: subsequent versions of Kubernetes may remove them altogether, or change the API in a way that breaks existing dashboards and alerts.
aggregator_discovery_aggregation_count_total
Counter of number of times discovery was aggregated
- Stability Level:ALPHA
- Type: Counter
aggregator_openapi_v2_regeneration_count
Counter of OpenAPI v2 spec regeneration count broken down by causing APIService name and reason.
- Stability Level:ALPHA
- Type: Counter
- Labels: apiservice, reason
aggregator_openapi_v2_regeneration_duration
Gauge of OpenAPI v2 spec regeneration duration in seconds.
- Stability Level:ALPHA
- Type: Gauge
- Labels:reason
aggregator_unavailable_apiservice
Gauge of APIServices which are marked as unavailable broken down by APIService name.
- Stability Level:ALPHA
- Type: Custom
- Labels:name
aggregator_unavailable_apiservice_total
Counter of APIServices which are marked as unavailable broken down by APIService name and reason.
- Stability Level:ALPHA
- Type: Counter
- Labels: name, reason
apiextensions_apiserver_validation_ratcheting_seconds
Time for comparison of old to new for the purposes of CRDValidationRatcheting during an UPDATE in seconds.
- Stability Level:ALPHA
- Type: Histogram
apiextensions_openapi_v2_regeneration_count
Counter of OpenAPI v2 spec regeneration count broken down by causing CRD name and reason.
- Stability Level:ALPHA
- Type: Counter
- Labels: crd, reason
apiextensions_openapi_v3_regeneration_count
Counter of OpenAPI v3 spec regeneration count broken down by group, version, causing CRD and reason.
- Stability Level:ALPHA
- Type: Counter
- Labels: crd, group, reason, version
apiserver_admission_match_condition_evaluation_errors_total
Admission match condition evaluation errors count, identified by name of resource containing the match condition and broken out for each kind containing matchConditions (webhook or policy), operation and admission type (validate or admit).
- Stability Level:ALPHA
- Type: Counter
- Labels: kind, name, operation, type
apiserver_admission_match_condition_evaluation_seconds
Admission match condition evaluation time in seconds, identified by name and broken out for each kind containing matchConditions (webhook or policy), operation and type (validate or admit).
- Stability Level:ALPHA
- Type: Histogram
- Labels: kind, name, operation, type
apiserver_admission_match_condition_exclusions_total
Admission match condition evaluation exclusions count, identified by name of resource containing the match condition and broken out for each kind containing matchConditions (webhook or policy), operation and admission type (validate or admit).
- Stability Level:ALPHA
- Type: Counter
- Labels: kind, name, operation, type
apiserver_admission_step_admission_duration_seconds_summary
Admission sub-step latency summary in seconds, broken out for each operation and API resource and step type (validate or admit).
- Stability Level:ALPHA
- Type: Summary
- Labels: operation, rejected, type
apiserver_admission_webhook_fail_open_count
Admission webhook fail open count, identified by name and broken out for each admission type (validating or admit).
- Stability Level:ALPHA
- Type: Counter
- Labels: name, type
apiserver_admission_webhook_rejection_count
Admission webhook rejection count, identified by name and broken out for each admission type (validating or admit) and operation. Additional labels specify an error type (calling_webhook_error or apiserver_internal_error if an error occurred; no_error otherwise) and optionally a non-zero rejection code if the webhook rejects the request with an HTTP status code (honored by the apiserver when the code is greater or equal to 400). Codes greater than 600 are truncated to 600, to keep the metrics cardinality bounded.
- Stability Level:ALPHA
- Type: Counter
- Labels: error_type, name, operation, rejection_code, type
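The truncation of codes above 600 described for this metric is a way of bounding label cardinality. A minimal sketch of the idea follows. `bounded_code_label` is a hypothetical helper written for illustration and is not the apiserver's actual implementation.

```python
def bounded_code_label(code):
    """Map a status code to a label value, capping anything above 600 at 600."""
    return str(min(code, 600))


# Invented codes returned by a webhook; the out-of-range ones collapse to "600".
observed = [400, 403, 500, 601, 999, 12345]
label_values = sorted({bounded_code_label(code) for code in observed})
print(label_values)
```

Without the cap, every distinct out-of-range code would create a new time series.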
apiserver_admission_webhook_request_total
Admission webhook request total, identified by name and broken out for each admission type (validating or admit) and operation. Additional labels specify whether the request was rejected or not and an HTTP status code. Codes greater than 600 are truncated to 600, to keep the metrics cardinality bounded.
- Stability Level:ALPHA
- Type: Counter
- Labels: code, name, operation, rejected, type
apiserver_audit_error_total
Counter of audit events that failed to be audited properly. Plugin identifies the plugin affected by the error.
- Stability Level:ALPHA
- Type: Counter
- Labels:plugin
apiserver_audit_event_total
Counter of audit events generated and sent to the audit backend.
- Stability Level:ALPHA
- Type: Counter
apiserver_audit_level_total
Counter of policy levels for audit events (1 per request).
- Stability Level:ALPHA
- Type: Counter
- Labels:level
apiserver_audit_requests_rejected_total
Counter of apiserver requests rejected due to an error in audit logging backend.
- Stability Level:ALPHA
- Type: Counter
apiserver_authentication_config_controller_automatic_reload_last_timestamp_seconds
Timestamp of the last automatic reload of authentication configuration split by status and apiserver identity.
- Stability Level:ALPHA
- Type: Gauge
- Labels: apiserver_id_hash, status
apiserver_authentication_config_controller_automatic_reloads_total
Total number of automatic reloads of authentication configuration split by status and apiserver identity.
- Stability Level:ALPHA
- Type: Counter
- Labels: apiserver_id_hash, status
apiserver_authentication_jwt_authenticator_latency_seconds
Latency of jwt authentication operations in seconds. This is the time spent authenticating a token for cache miss only (i.e. when the token is not found in the cache).
- Stability Level:ALPHA
- Type: Histogram
- Labels: jwt_issuer_hash, result
apiserver_authorization_config_controller_automatic_reload_last_timestamp_seconds
Timestamp of the last automatic reload of authorization configuration split by status and apiserver identity.
- Stability Level:ALPHA
- Type: Gauge
- Labels: apiserver_id_hash, status
apiserver_authorization_config_controller_automatic_reloads_total
Total number of automatic reloads of authorization configuration split by status and apiserver identity.
- Stability Level:ALPHA
- Type: Counter
- Labels: apiserver_id_hash, status
apiserver_authorization_decisions_total
Total number of terminal decisions made by an authorizer split by authorizer type, name, and decision.
- Stability Level:ALPHA
- Type: Counter
- Labels: decision, name, type
apiserver_authorization_match_condition_evaluation_errors_total
Total number of errors when an authorization webhook encounters a match condition error split by authorizer type and name.
- Stability Level:ALPHA
- Type: Counter
- Labels: name, type
apiserver_authorization_match_condition_evaluation_seconds
Authorization match condition evaluation time in seconds, split by authorizer type and name.
- Stability Level:ALPHA
- Type: Histogram
- Labels: name, type
apiserver_authorization_match_condition_exclusions_total
Total number of exclusions when an authorization webhook is skipped because match conditions exclude it.
- Stability Level:ALPHA
- Type: Counter
- Labels: name, type
apiserver_authorization_webhook_duration_seconds
Request latency in seconds.
- Stability Level:ALPHA
- Type: Histogram
- Labels: name, result
apiserver_authorization_webhook_evaluations_fail_open_total
NoOpinion results due to webhook timeout or error.
- Stability Level:ALPHA
- Type: Counter
- Labels: name, result
apiserver_authorization_webhook_evaluations_total
Round-trips to authorization webhooks.
- Stability Level:ALPHA
- Type: Counter
- Labels: name, result
apiserver_cache_list_fetched_objects_total
Number of objects read from watch cache in the course of serving a LIST request
- Stability Level:ALPHA
- Type: Counter
- Labels: index, resource_prefix
apiserver_cache_list_returned_objects_total
Number of objects returned for a LIST request from watch cache
- Stability Level:ALPHA
- Type: Counter
- Labels:resource_prefix
apiserver_cache_list_total
Number of LIST requests served from watch cache
- Stability Level:ALPHA
- Type: Counter
- Labels: index, resource_prefix
apiserver_certificates_registry_csr_honored_duration_total
Total number of issued CSRs with a requested duration that was honored, sliced by signer (only kubernetes.io signer names are specifically identified)
- Stability Level:ALPHA
- Type: Counter
- Labels:signerName
apiserver_certificates_registry_csr_requested_duration_total
Total number of issued CSRs with a requested duration, sliced by signer (only kubernetes.io signer names are specifically identified)
- Stability Level:ALPHA
- Type: Counter
- Labels:signerName
apiserver_client_certificate_expiration_seconds
Distribution of the remaining lifetime on the certificate used to authenticate a request.
- Stability Level:ALPHA
- Type: Histogram
apiserver_clusterip_repair_ip_errors_total
Number of errors detected on clusterips by the repair loop broken down by type of error: leak, repair, full, outOfRange, duplicate, unknown, invalid
- Stability Level:ALPHA
- Type: Counter
- Labels:type
apiserver_clusterip_repair_reconcile_errors_total
Number of reconciliation failures on the clusterip repair reconcile loop
- Stability Level:ALPHA
- Type: Counter
apiserver_conversion_webhook_duration_seconds
Conversion webhook request latency
- Stability Level:ALPHA
- Type: Histogram
- Labels: failure_type, result
apiserver_conversion_webhook_request_total
Counter for conversion webhook requests with success/failure and failure error type
- Stability Level:ALPHA
- Type: Counter
- Labels: failure_type, result
apiserver_crd_conversion_webhook_duration_seconds
CRD webhook conversion duration in seconds
- Stability Level:ALPHA
- Type: Histogram
- Labels: crd_name, from_version, succeeded, to_version
apiserver_current_inqueue_requests
Maximal number of queued requests in this apiserver per request kind in last second.
- Stability Level:ALPHA
- Type: Gauge
- Labels:request_kind
apiserver_delegated_authn_request_duration_seconds
Request latency in seconds. Broken down by status code.
- Stability Level:ALPHA
- Type: Histogram
- Labels:code
apiserver_delegated_authn_request_total
Number of HTTP requests partitioned by status code.
- Stability Level:ALPHA
- Type: Counter
- Labels:code
apiserver_delegated_authz_request_duration_seconds
Request latency in seconds. Broken down by status code.
- Stability Level:ALPHA
- Type: Histogram
- Labels:code
apiserver_delegated_authz_request_total
Number of HTTP requests partitioned by status code.
- Stability Level:ALPHA
- Type: Counter
- Labels:code
apiserver_egress_dialer_dial_duration_seconds
Dial latency histogram in seconds, labeled by the protocol (http-connect or grpc), transport (tcp or uds)
- Stability Level:ALPHA
- Type: Histogram
- Labels: protocol, transport
apiserver_egress_dialer_dial_failure_count
Dial failure count, labeled by the protocol (http-connect or grpc), transport (tcp or uds), and stage (connect or proxy). The stage indicates at which stage the dial failed
- Stability Level:ALPHA
- Type: Counter
- Labels: protocol, stage, transport
apiserver_egress_dialer_dial_start_total
Dial starts, labeled by the protocol (http-connect or grpc) and transport (tcp or uds).
- Stability Level:ALPHA
- Type: Counter
- Labels: protocol, transport
apiserver_encryption_config_controller_automatic_reload_failures_total
Total number of failed automatic reloads of encryption configuration split by apiserver identity.
- Stability Level:ALPHA
- Type: Counter
- Labels:apiserver_id_hash
- Deprecated Versions:1.30.0
apiserver_encryption_config_controller_automatic_reload_last_timestamp_seconds
Timestamp of the last successful or failed automatic reload of encryption configuration split by apiserver identity.
- Stability Level:ALPHA
- Type: Gauge
- Labels: apiserver_id_hash, status
apiserver_encryption_config_controller_automatic_reload_success_total
Total number of successful automatic reloads of encryption configuration split by apiserver identity.
- Stability Level:ALPHA
- Type: Counter
- Labels:apiserver_id_hash
- Deprecated Versions:1.30.0
apiserver_encryption_config_controller_automatic_reloads_total
Total number of reload successes and failures of encryption configuration split by apiserver identity.
- Stability Level:ALPHA
- Type: Counter
- Labels: apiserver_id_hash, status
apiserver_envelope_encryption_dek_cache_fill_percent
Percent of the cache slots currently occupied by cached DEKs.
- Stability Level:ALPHA
- Type: Gauge
apiserver_envelope_encryption_dek_cache_inter_arrival_time_seconds
Inter-arrival time (in seconds) of transformation requests.
- Stability Level:ALPHA
- Type: Histogram
- Labels:transformation_type
apiserver_envelope_encryption_dek_source_cache_size
Number of records in data encryption key (DEK) source cache. On a restart, this value is an approximation of the number of decrypt RPC calls the server will make to the KMS plugin.
- Stability Level:ALPHA
- Type: Gauge
- Labels:provider_name
apiserver_envelope_encryption_invalid_key_id_from_status_total
Number of times an invalid keyID is returned by the Status RPC call split by error.
- Stability Level:ALPHA
- Type: Counter
- Labels: error, provider_name
apiserver_envelope_encryption_key_id_hash_last_timestamp_seconds
The last time in seconds when a keyID was used.
- Stability Level:ALPHA
- Type: Gauge
- Labels: apiserver_id_hash, key_id_hash, provider_name, transformation_type
apiserver_envelope_encryption_key_id_hash_status_last_timestamp_seconds
The last time in seconds when a keyID was returned by the Status RPC call.
- Stability Level:ALPHA
- Type: Gauge
- Labels: apiserver_id_hash, key_id_hash, provider_name
apiserver_envelope_encryption_key_id_hash_total
Number of times a keyID is used split by transformation type, provider, and apiserver identity.
- Stability Level:ALPHA
- Type: Counter
- Labels: apiserver_id_hash, key_id_hash, provider_name, transformation_type
apiserver_envelope_encryption_kms_operations_latency_seconds
KMS operation duration with gRPC error code status total.
- Stability Level:ALPHA
- Type: Histogram
- Labels: grpc_status_code, method_name, provider_name
apiserver_externaljwt_fetch_keys_data_timestamp
Unix Timestamp in seconds of the last successful FetchKeys data_timestamp value returned by the external signer
- Stability Level:ALPHA
- Type: Gauge
apiserver_externaljwt_fetch_keys_request_total
Total attempts at syncing supported JWKs
- Stability Level:ALPHA
- Type: Counter
- Labels:code
apiserver_externaljwt_fetch_keys_success_timestamp
Unix Timestamp in seconds of the last successful FetchKeys request
- Stability Level:ALPHA
- Type: Gauge
apiserver_externaljwt_request_duration_seconds
Request duration and time for calls to external-jwt-signer
- Stability Level:ALPHA
- Type: Histogram
- Labels: code, method
apiserver_externaljwt_sign_request_total
Total attempts at signing JWT
- Stability Level:ALPHA
- Type: Counter
- Labels:code
apiserver_flowcontrol_current_inqueue_seats
Number of seats currently pending in queues of the API Priority and Fairness subsystem
- Stability Level:ALPHA
- Type: Gauge
- Labels: flow_schema, priority_level
apiserver_flowcontrol_current_limit_seats
Current derived number of execution seats available to each priority level
- Stability Level:ALPHA
- Type: Gauge
- Labels:priority_level
apiserver_flowcontrol_current_r
R(time of last change)
- Stability Level:ALPHA
- Type: Gauge
- Labels:priority_level
apiserver_flowcontrol_demand_seats
Observations, at the end of every nanosecond, of (the number of seats each priority level could use) / (nominal number of seats for that level)
- Stability Level:ALPHA
- Type: TimingRatioHistogram
- Labels:priority_level
apiserver_flowcontrol_demand_seats_average
Time-weighted average, over last adjustment period, of demand_seats
- Stability Level:ALPHA
- Type: Gauge
- Labels:priority_level
apiserver_flowcontrol_demand_seats_high_watermark
High watermark, over last adjustment period, of demand_seats
- Stability Level:ALPHA
- Type: Gauge
- Labels:priority_level
apiserver_flowcontrol_demand_seats_smoothed
Smoothed seat demands
- Stability Level:ALPHA
- Type: Gauge
- Labels:priority_level
apiserver_flowcontrol_demand_seats_stdev
Time-weighted standard deviation, over last adjustment period, of demand_seats
- Stability Level:ALPHA
- Type: Gauge
- Labels:priority_level
apiserver_flowcontrol_dispatch_r
R(time of last dispatch)
- Stability Level:ALPHA
- Type: Gauge
- Labels:priority_level
apiserver_flowcontrol_epoch_advance_total
Number of times the queueset’s progress meter jumped backward
- Stability Level:ALPHA
- Type: Counter
- Labels: priority_level, success
apiserver_flowcontrol_latest_s
S(most recently dispatched request)
- Stability Level:ALPHA
- Type: Gauge
- Labels:priority_level
apiserver_flowcontrol_lower_limit_seats
Configured lower bound on number of execution seats available to each priority level
- Stability Level:ALPHA
- Type: Gauge
- Labels:priority_level
apiserver_flowcontrol_next_discounted_s_bounds
min and max, over queues, of S(oldest waiting request in queue) - estimated work in progress
- Stability Level:ALPHA
- Type: Gauge
- Labels: bound, priority_level
apiserver_flowcontrol_next_s_bounds
min and max, over queues, of S(oldest waiting request in queue)
- Stability Level:ALPHA
- Type: Gauge
- Labels: bound, priority_level
apiserver_flowcontrol_priority_level_request_utilization
Observations, at the end of every nanosecond, of number of requests (as a fraction of the relevant limit) waiting or in any stage of execution (but only initial stage for WATCHes)
- Stability Level:ALPHA
- Type: TimingRatioHistogram
- Labels: phase, priority_level
apiserver_flowcontrol_priority_level_seat_utilization
Observations, at the end of every nanosecond, of utilization of seats for any stage of execution (but only initial stage for WATCHes)
- Stability Level:ALPHA
- Type: TimingRatioHistogram
- Labels:priority_level
- Const Labels:phase:executing
apiserver_flowcontrol_read_vs_write_current_requests
Observations, at the end of every nanosecond, of the number of requests (as a fraction of the relevant limit) waiting or in regular stage of execution
- Stability Level:ALPHA
- Type: TimingRatioHistogram
- Labels: phase, request_kind
apiserver_flowcontrol_request_concurrency_in_use
Concurrency (number of seats) occupied by the currently executing (initial stage for a WATCH, any stage otherwise) requests in the API Priority and Fairness subsystem
- Stability Level:ALPHA
- Type: Gauge
- Labels: flow_schema, priority_level
- Deprecated Versions:1.31.0
apiserver_flowcontrol_request_concurrency_limit
Nominal number of execution seats configured for each priority level
- Stability Level:ALPHA
- Type: Gauge
- Labels:priority_level
- Deprecated Versions:1.30.0
apiserver_flowcontrol_request_dispatch_no_accommodation_total
Number of times a dispatch attempt resulted in a non accommodation due to lack of available seats
- Stability Level:ALPHA
- Type: Counter
- Labels: flow_schema, priority_level
apiserver_flowcontrol_request_execution_seconds
Duration of initial stage (for a WATCH) or any (for a non-WATCH) stage of request execution in the API Priority and Fairness subsystem
- Stability Level:ALPHA
- Type: Histogram
- Labels: flow_schema, priority_level, type
apiserver_flowcontrol_request_queue_length_after_enqueue
Length of queue in the API Priority and Fairness subsystem, as seen by each request after it is enqueued
- Stability Level:ALPHA
- Type: Histogram
- Labels: flow_schema, priority_level
apiserver_flowcontrol_seat_fair_frac
Fair fraction of server’s concurrency to allocate to each priority level that can use it
- Stability Level:ALPHA
- Type: Gauge
apiserver_flowcontrol_target_seats
Seat allocation targets
- Stability Level:ALPHA
- Type: Gauge
- Labels:priority_level
apiserver_flowcontrol_upper_limit_seats
Configured upper bound on number of execution seats available to each priority level
- Stability Level:ALPHA
- Type: Gauge
- Labels:priority_level
apiserver_flowcontrol_watch_count_samples
Count of watchers for mutating requests in API Priority and Fairness
- Stability Level:ALPHA
- Type: Histogram
- Labels: flow_schema, priority_level
apiserver_flowcontrol_work_estimated_seats
Number of estimated seats (maximum of initial and final seats) associated with requests in API Priority and Fairness
- Stability Level:ALPHA
- Type: Histogram
- Labels: flow_schema, priority_level
apiserver_init_events_total
Counter of init events processed in watch cache broken down by resource type.
- Stability Level:ALPHA
- Type: Counter
- Labels:resource
apiserver_kube_aggregator_x509_insecure_sha1_total
Counts the number of requests to servers with insecure SHA1 signatures in their serving certificate OR the number of connection failures due to the insecure SHA1 signatures (either/or, based on the runtime environment)
- Stability Level:ALPHA
- Type: Counter
apiserver_kube_aggregator_x509_missing_san_total
Counts the number of requests to servers missing the SAN extension in their serving certificate OR the number of connection failures due to the missing x509 certificate SAN extension (either/or, based on the runtime environment)
- Stability Level:ALPHA
- Type: Counter
apiserver_nodeport_repair_port_errors_total
Number of errors detected on ports by the repair loop broken down by type of error: leak, repair, full, outOfRange, duplicate, unknown
- Stability Level:ALPHA
- Type: Counter
- Labels:type
apiserver_nodeport_repair_reconcile_errors_total
Number of reconciliation failures on the nodeport repair reconcile loop
- Stability Level:ALPHA
- Type: Counter
apiserver_request_aborts_total
Number of requests which apiserver aborted possibly due to a timeout, for each group, version, verb, resource, subresource and scope
- Stability Level:ALPHA
- Type: Counter
- Labels:group resource scope subresource verb version
apiserver_request_body_size_bytes
Apiserver request body size in bytes broken out by resource and verb.
- Stability Level:ALPHA
- Type: Histogram
- Labels:resource verb
apiserver_request_filter_duration_seconds
Request filter latency distribution in seconds, for each filter type
- Stability Level:ALPHA
- Type: Histogram
- Labels:filter
apiserver_request_post_timeout_total
Tracks the activity of the request handlers after the associated requests have been timed out by the apiserver
- Stability Level:ALPHA
- Type: Counter
- Labels:source status
apiserver_request_sli_duration_seconds
Response latency distribution (not counting webhook duration and priority & fairness queue wait times) in seconds for each verb, group, version, resource, subresource, scope and component.
- Stability Level:ALPHA
- Type: Histogram
- Labels:component group resource scope subresource verb version
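Histogram metrics such as this one are exposed as cumulative buckets, so a latency percentile has to be estimated from bucket counts. The following is a minimal sketch of that estimation, assuming linear interpolation inside a bucket (a simplified version of what PromQL's `histogram_quantile` does). The function name and the bucket values in the usage note are hypothetical and not taken from any real scrape.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs, sorted by
    upper bound and ending with the +Inf bucket, as in the `le` series of a
    Prometheus histogram. This is a sketch, not the exact PromQL algorithm.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations recorded yet
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The rank falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            if count == prev_count:
                return prev_bound  # empty bucket, nothing to interpolate
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound
```

For example, `estimate_quantile(0.95, [(0.1, 50), (0.5, 90), (1.0, 99), (math.inf, 100)])` interpolates inside the 0.5 to 1.0 second bucket. The accuracy of the estimate depends on how finely the buckets are spaced.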
apiserver_request_slo_duration_seconds
Response latency distribution (not counting webhook duration and priority & fairness queue wait times) in seconds for each verb, group, version, resource, subresource, scope and component.
- Stability Level:ALPHA
- Type: Histogram
- Labels:component group resource scope subresource verb version
- Deprecated Versions:1.27.0
apiserver_request_terminations_total
Number of requests which apiserver terminated in self-defense.
- Stability Level:ALPHA
- Type: Counter
- Labels:code component group resource scope subresource verb version
apiserver_request_timestamp_comparison_time
Time taken for comparison of old vs new objects in UPDATE or PATCH requests
- Stability Level:ALPHA
- Type: Histogram
- Labels:code_path
apiserver_rerouted_request_total
Total number of requests that were proxied to a peer kube apiserver because the local apiserver was not capable of serving it
- Stability Level:ALPHA
- Type: Counter
- Labels:code
apiserver_selfrequest_total
Counter of apiserver self-requests broken out for each verb, API resource and subresource.
- Stability Level:ALPHA
- Type: Counter
- Labels:resource subresource verb
apiserver_storage_data_key_generation_duration_seconds
Latencies in seconds of data encryption key (DEK) generation operations.
- Stability Level:ALPHA
- Type: Histogram
apiserver_storage_data_key_generation_failures_total
Total number of failed data encryption key (DEK) generation operations.
- Stability Level:ALPHA
- Type: Counter
apiserver_storage_db_total_size_in_bytes
Total size of the storage database file physically allocated in bytes.
- Stability Level:ALPHA
- Type: Gauge
- Labels:endpoint
- Deprecated Versions:1.28.0
apiserver_storage_decode_errors_total
Number of stored object decode errors split by object type
- Stability Level:ALPHA
- Type: Counter
- Labels:resource
apiserver_storage_envelope_transformation_cache_misses_total
Total number of cache misses while accessing key decryption key (KEK).
- Stability Level:ALPHA
- Type: Counter
apiserver_storage_events_received_total
Number of etcd events received split by kind.
- Stability Level:ALPHA
- Type: Counter
- Labels:resource
apiserver_storage_list_evaluated_objects_total
Number of objects tested in the course of serving a LIST request from storage
- Stability Level:ALPHA
- Type: Counter
- Labels:resource
apiserver_storage_list_fetched_objects_total
Number of objects read from storage in the course of serving a LIST request
- Stability Level:ALPHA
- Type: Counter
- Labels:resource
apiserver_storage_list_returned_objects_total
Number of objects returned for a LIST request from storage
- Stability Level:ALPHA
- Type: Counter
- Labels:resource
apiserver_storage_list_total
Number of LIST requests served from storage
- Stability Level:ALPHA
- Type: Counter
- Labels:resource
apiserver_storage_transformation_duration_seconds
Latencies in seconds of value transformation operations.
- Stability Level:ALPHA
- Type: Histogram
- Labels:transformation_type transformer_prefix
apiserver_storage_transformation_operations_total
Total number of transformations. A successful transformation has the status ‘OK’; a failed transformation has a varied status string. The status, resource, and transformation_type fields can be used for alerting purposes. For example, you can monitor for encryption/decryption failures using the transformation_type (e.g., from_storage for decryption and to_storage for encryption). Additionally, these fields can be used to ensure that the correct transformers are applied to each resource.
- Stability Level:ALPHA
- Type: Counter
- Labels:resource status transformation_type transformer_prefix
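As an illustration of the alerting idea described for this metric, the sketch below sums every sample whose status is not OK, grouped by transformation_type, from text in the Prometheus exposition format. The sample text, its label values (including the "Unknown" status and the transformer prefix) and the helper name are hypothetical and exist only to make the example runnable.

```python
import re
from collections import defaultdict

METRIC = "apiserver_storage_transformation_operations_total"

# Hypothetical scrape excerpt; label values and counts are invented for illustration.
SAMPLE = '''\
apiserver_storage_transformation_operations_total{resource="secrets",status="OK",transformation_type="from_storage",transformer_prefix="k8s:enc:kms:v2:"} 120
apiserver_storage_transformation_operations_total{resource="secrets",status="Unknown",transformation_type="from_storage",transformer_prefix="k8s:enc:kms:v2:"} 3
apiserver_storage_transformation_operations_total{resource="secrets",status="OK",transformation_type="to_storage",transformer_prefix="k8s:enc:kms:v2:"} 40
'''

# Matches `name{labels} value`; comment and label-free lines are ignored.
LINE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def failed_transformations(text):
    """Return {transformation_type: total count of samples whose status is not OK}."""
    failures = defaultdict(float)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match or match.group("name") != METRIC:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        if labels.get("status") != "OK":
            failures[labels.get("transformation_type", "")] += float(match.group("value"))
    return dict(failures)
```

Calling `failed_transformations(SAMPLE)` reports failures only for `from_storage`, which in the terms used above would point at decryption problems. Because the metric is a counter, a real alert would normally look at the increase over a time window rather than the raw total.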
apiserver_stream_translator_requests_total
Total number of requests that were handled by the StreamTranslatorProxy, which processes streaming RemoteCommand/V5
- Stability Level:ALPHA
- Type: Counter
- Labels:code
apiserver_stream_tunnel_requests_total
Total number of requests that were handled by the StreamTunnelProxy, which processes streaming PortForward/V2
- Stability Level:ALPHA
- Type: Counter
- Labels:code
apiserver_terminated_watchers_total
Counter of watchers closed due to unresponsiveness, broken down by resource type.
- Stability Level:ALPHA
- Type: Counter
- Labels:resource
apiserver_tls_handshake_errors_total
Number of requests dropped with ‘TLS handshake error from’ error
- Stability Level:ALPHA
- Type: Counter
apiserver_watch_cache_consistent_read_total
Counter for consistent reads from cache.
- Stability Level:ALPHA
- Type: Counter
- Labels:fallback resource success
apiserver_watch_cache_events_dispatched_total
Counter of events dispatched in watch cache, broken down by resource type.
- Stability Level:ALPHA
- Type: Counter
- Labels:resource
apiserver_watch_cache_events_received_total
Counter of events received in watch cache, broken down by resource type.
- Stability Level:ALPHA
- Type: Counter
- Labels:resource
apiserver_watch_cache_initializations_total
Counter of watch cache initializations, broken down by resource type.
- Stability Level:ALPHA
- Type: Counter
- Labels:resource
apiserver_watch_cache_read_wait_seconds
Histogram of time spent waiting for a watch cache to become fresh.
- Stability Level:ALPHA
- Type: Histogram
- Labels:resource
apiserver_watch_cache_resource_version
Current resource version of watch cache, broken down by resource type.
- Stability Level:ALPHA
- Type: Gauge
- Labels:resource
apiserver_watch_events_sizes
Watch event size distribution in bytes
- Stability Level:ALPHA
- Type: Histogram
- Labels:group kind version
apiserver_watch_events_total
Number of events sent to watch clients
- Stability Level:ALPHA
- Type: Counter
- Labels:group kind version
apiserver_watch_list_duration_seconds
Response latency distribution in seconds for watch list requests, broken down by group, version, resource and scope.
- Stability Level:ALPHA
- Type: Histogram
- Labels:group resource scope version
apiserver_webhooks_x509_insecure_sha1_total
Counts the number of requests to servers with insecure SHA1 signatures in their serving certificate OR the number of connection failures due to the insecure SHA1 signatures (either/or, based on the runtime environment)
- Stability Level:ALPHA
- Type: Counter
apiserver_webhooks_x509_missing_san_total
Counts the number of requests to servers missing the SAN extension in their serving certificate OR the number of connection failures due to the missing x509 certificate SAN extension (either/or, based on the runtime environment)
- Stability Level:ALPHA
- Type: Counter
attach_detach_controller_attachdetach_controller_forced_detaches
Number of times the A/D Controller performed a forced detach
- Stability Level:ALPHA
- Type: Counter
- Labels:reason
attachdetach_controller_total_volumes
Number of volumes in A/D Controller
- Stability Level:ALPHA
- Type: Custom
- Labels:plugin_name state
authenticated_user_requests
Counter of authenticated requests broken out by username.
- Stability Level:ALPHA
- Type: Counter
- Labels:username
authentication_attempts
Counter of authentication attempts.
- Stability Level:ALPHA
- Type: Counter
- Labels:result
authentication_duration_seconds
Authentication duration in seconds broken out by result.
- Stability Level:ALPHA
- Type: Histogram
- Labels:result
authentication_token_cache_active_fetch_count
- Stability Level:ALPHA
- Type: Gauge
- Labels:status
authentication_token_cache_fetch_total
- Stability Level:ALPHA
- Type: Counter
- Labels:status
authentication_token_cache_request_duration_seconds
- Stability Level:ALPHA
- Type: Histogram
- Labels:status
authentication_token_cache_request_total
- Stability Level:ALPHA
- Type: Counter
- Labels:status
authorization_attempts_total
Counter of authorization attempts broken down by result. It can be either ‘allowed’, ‘denied’, ‘no-opinion’ or ‘error’.
- Stability Level:ALPHA
- Type: Counter
- Labels:result
authorization_duration_seconds
Authorization duration in seconds broken out by result.
- Stability Level:ALPHA
- Type: Histogram
- Labels:result
cloud_provider_webhook_request_duration_seconds
Request latency in seconds. Broken down by status code.
- Stability Level:ALPHA
- Type: Histogram
- Labels:code webhook
cloud_provider_webhook_request_total
Number of HTTP requests partitioned by status code.
- Stability Level:ALPHA
- Type: Counter
- Labels:code webhook
clustertrustbundle_publisher_sync_duration_seconds
The time it took to sync a cluster trust bundle.
- Stability Level:ALPHA
- Type: Histogram
- Labels:code
clustertrustbundle_publisher_sync_total
Number of syncs that occurred in cluster trust bundle publisher.
- Stability Level:ALPHA
- Type: Counter
- Labels:code
container_swap_usage_bytes
Current amount of the container swap usage in bytes. Reported only on non-Windows systems
- Stability Level:ALPHA
- Type: Custom
- Labels:container pod namespace
csi_operations_seconds
Container Storage Interface operation duration with gRPC error code status total
- Stability Level:ALPHA
- Type: Histogram
- Labels:driver_name grpc_status_code method_name migrated
dra_grpc_operations_duration_seconds
Duration in seconds of the DRA gRPC operations
- Stability Level:ALPHA
- Type: Histogram
- Labels:driver_name grpc_status_code method_name
dra_operations_duration_seconds
Latency histogram in seconds for the duration of handling all ResourceClaims referenced by a pod when the pod starts or stops. Identified by the name of the operation (PrepareResources or UnprepareResources) and separated by the success of the operation. The number of failed operations is provided through the histogram’s overall count.
- Stability Level:ALPHA
- Type: Histogram
- Labels:is_error operation_name
endpoint_slice_controller_changes
Number of EndpointSlice changes
- Stability Level:ALPHA
- Type: Counter
- Labels:operation
endpoint_slice_controller_desired_endpoint_slices
Number of EndpointSlices that would exist with perfect endpoint allocation
- Stability Level:ALPHA
- Type: Gauge
endpoint_slice_controller_endpoints_added_per_sync
Number of endpoints added on each Service sync
- Stability Level:ALPHA
- Type: Histogram
endpoint_slice_controller_endpoints_desired
Number of endpoints desired
- Stability Level:ALPHA
- Type: Gauge
endpoint_slice_controller_endpoints_removed_per_sync
Number of endpoints removed on each Service sync
- Stability Level:ALPHA
- Type: Histogram
endpoint_slice_controller_endpointslices_changed_per_sync
Number of EndpointSlices changed on each Service sync
- Stability Level:ALPHA
- Type: Histogram
- Labels:topology traffic_distribution
endpoint_slice_controller_num_endpoint_slices
Number of EndpointSlices
- Stability Level:ALPHA
- Type: Gauge
endpoint_slice_controller_services_count_by_traffic_distribution
Number of Services using some specific trafficDistribution
- Stability Level:ALPHA
- Type: Gauge
- Labels:traffic_distribution
endpoint_slice_controller_syncs
Number of EndpointSlice syncs
- Stability Level:ALPHA
- Type: Counter
- Labels:result
endpoint_slice_mirroring_controller_addresses_skipped_per_sync
Number of addresses skipped on each Endpoints sync due to being invalid or exceeding MaxEndpointsPerSubset
- Stability Level:ALPHA
- Type: Histogram
endpoint_slice_mirroring_controller_changes
Number of EndpointSlice changes
- Stability Level:ALPHA
- Type: Counter
- Labels:operation
endpoint_slice_mirroring_controller_desired_endpoint_slices
Number of EndpointSlices that would exist with perfect endpoint allocation
- Stability Level:ALPHA
- Type: Gauge
endpoint_slice_mirroring_controller_endpoints_added_per_sync
Number of endpoints added on each Endpoints sync
- Stability Level:ALPHA
- Type: Histogram
endpoint_slice_mirroring_controller_endpoints_desired
Number of endpoints desired
- Stability Level:ALPHA
- Type: Gauge
endpoint_slice_mirroring_controller_endpoints_removed_per_sync
Number of endpoints removed on each Endpoints sync
- Stability Level:ALPHA
- Type: Histogram
endpoint_slice_mirroring_controller_endpoints_sync_duration
Duration of syncEndpoints() in seconds
- Stability Level:ALPHA
- Type: Histogram
endpoint_slice_mirroring_controller_endpoints_updated_per_sync
Number of endpoints updated on each Endpoints sync
- Stability Level:ALPHA
- Type: Histogram
endpoint_slice_mirroring_controller_num_endpoint_slices
Number of EndpointSlices
- Stability Level:ALPHA
- Type: Gauge
ephemeral_volume_controller_create_failures_total
Number of failed PersistentVolumeClaim creation requests
- Stability Level:ALPHA
- Type: Counter
ephemeral_volume_controller_create_total
Number of PersistentVolumeClaim creation requests
- Stability Level:ALPHA
- Type: Counter
etcd_bookmark_counts
Number of etcd bookmarks (progress notify events) split by kind.
- Stability Level:ALPHA
- Type: Gauge
- Labels:resource
etcd_lease_object_counts
Number of objects attached to a single etcd lease.
- Stability Level:ALPHA
- Type: Histogram
etcd_request_duration_seconds
Etcd request latency in seconds for each operation and object type.
- Stability Level:ALPHA
- Type: Histogram
- Labels:operation type
etcd_request_errors_total
Etcd failed request counts for each operation and object type.
- Stability Level:ALPHA
- Type: Counter
- Labels:operation type
etcd_requests_total
Etcd request counts for each operation and object type.
- Stability Level:ALPHA
- Type: Counter
- Labels:operation type
etcd_version_info
Etcd server’s binary version
- Stability Level:ALPHA
- Type: Gauge
- Labels:binary_version
field_validation_request_duration_seconds
Response latency distribution in seconds for each field validation value
- Stability Level:ALPHA
- Type: Histogram
- Labels:field_validation
force_cleaned_failed_volume_operation_errors_total
The number of volumes that failed force cleanup after their reconstruction failed during kubelet startup.
- Stability Level:ALPHA
- Type: Counter
force_cleaned_failed_volume_operations_total
The number of volumes that were force cleaned after their reconstruction failed during kubelet startup. This includes both successful and failed cleanups.
- Stability Level:ALPHA
- Type: Counter
garbagecollector_controller_resources_sync_error_total
Number of garbage collector resources sync errors
- Stability Level:ALPHA
- Type: Counter
horizontal_pod_autoscaler_controller_metric_computation_duration_seconds
The time (in seconds) that the HPA controller takes to calculate one metric. The label ‘action’ should be either ‘scale_down’, ‘scale_up’, or ‘none’. The label ‘error’ should be either ‘spec’, ‘internal’, or ‘none’. The label ‘metric_type’ corresponds to HPA.spec.metrics[*].type
- Stability Level:ALPHA
- Type: Histogram
- Labels:action error metric_type
horizontal_pod_autoscaler_controller_metric_computation_total
Number of metric computations. The label ‘action’ should be either ‘scale_down’, ‘scale_up’, or ‘none’. Also, the label ‘error’ should be either ‘spec’, ‘internal’, or ‘none’. The label ‘metric_type’ corresponds to HPA.spec.metrics[*].type
- Stability Level:ALPHA
- Type: Counter
- Labels:action error metric_type
horizontal_pod_autoscaler_controller_reconciliation_duration_seconds
The time (in seconds) that the HPA controller takes to reconcile once. The label ‘action’ should be either ‘scale_down’, ‘scale_up’, or ‘none’. Also, the label ‘error’ should be either ‘spec’, ‘internal’, or ‘none’. Note that if both spec and internal errors happen during a reconciliation, the first one to occur is reported in the `error` label.
- Stability Level:ALPHA
- Type: Histogram
- Labels:action error
horizontal_pod_autoscaler_controller_reconciliations_total
Number of reconciliations of the HPA controller. The label ‘action’ should be either ‘scale_down’, ‘scale_up’, or ‘none’. Also, the label ‘error’ should be either ‘spec’, ‘internal’, or ‘none’. Note that if both spec and internal errors happen during a reconciliation, the first one to occur is reported in the `error` label.
- Stability Level:ALPHA
- Type: Counter
- Labels:action error
job_controller_job_finished_indexes_total
The number of finished indexes. Possible values for the status label are: “succeeded”, “failed”. Possible values for the backoffLimit label are: “perIndex” and “global”.
- Stability Level:ALPHA
- Type: Counter
- Labels:backoffLimit status
job_controller_job_pods_creation_total
The number of Pods created by the Job controller, labelled with a reason for the Pod creation. This metric also distinguishes between Pods created using different PodReplacementPolicy settings. Possible values of the “reason” label are: “new”, “recreate_terminating_or_failed”, “recreate_failed”. Possible values of the “status” label are: “succeeded”, “failed”.
- Stability Level:ALPHA
- Type: Counter
- Labels:reason status
job_controller_jobs_by_external_controller_total
The number of Jobs managed by an external controller
- Stability Level:ALPHA
- Type: Counter
- Labels:controller_name
job_controller_pod_failures_handled_by_failure_policy_total
The number of failed Pods handled by failure policy with respect to the failure policy action applied based on the matched rule. Possible values of the action label correspond to the possible values for the failure policy rule action, which are: “FailJob”, “Ignore” and “Count”.
- Stability Level:ALPHA
- Type: Counter
- Labels:action
job_controller_terminated_pods_tracking_finalizer_total
The number of terminated pods (phase=Failed|Succeeded) that have the finalizer batch.kubernetes.io/job-tracking. The event label can be “add” or “delete”.
- Stability Level:ALPHA
- Type: Counter
- Labels:event
kube_apiserver_clusterip_allocator_allocated_ips
Gauge measuring the number of allocated IPs for Services
- Stability Level:ALPHA
- Type: Gauge
- Labels:cidr
kube_apiserver_clusterip_allocator_allocation_duration_seconds
Duration in seconds to allocate a Cluster IP by ServiceCIDR
- Stability Level:ALPHA
- Type: Histogram
- Labels:cidr
kube_apiserver_clusterip_allocator_allocation_errors_total
Number of errors trying to allocate Cluster IPs
- Stability Level:ALPHA
- Type: Counter
- Labels:cidr scope
kube_apiserver_clusterip_allocator_allocation_total
Number of Cluster IP allocations
- Stability Level:ALPHA
- Type: Counter
- Labels:cidr scope
kube_apiserver_clusterip_allocator_available_ips
Gauge measuring the number of available IPs for Services
- Stability Level:ALPHA
- Type: Gauge
- Labels:cidr
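The allocated and available gauges for a given cidr can be combined into a rough utilization figure. The sketch below assumes that the two gauges together account for the whole allocatable range of that CIDR, which is an assumption made for illustration and not something stated on this page. The function name is hypothetical.

```python
def clusterip_utilization(allocated_ips, available_ips):
    """Return the fraction of Service IPs in use for one CIDR, or None if unknown.

    Assumes allocated + available equals the allocatable range of the CIDR
    (an illustrative assumption, not a documented guarantee).
    """
    total = allocated_ips + available_ips
    if total <= 0:
        return None  # no data, or an empty range
    return allocated_ips / total
```

For example, `clusterip_utilization(64, 192)` describes a range that is one quarter used. A figure like this is only meaningful when both gauges are read from the same scrape and carry the same cidr label value.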
kube_apiserver_nodeport_allocator_allocated_ports
Gauge measuring the number of allocated NodePorts for Services
- Stability Level:ALPHA
- Type: Gauge
kube_apiserver_nodeport_allocator_allocation_errors_total
Number of errors trying to allocate NodePort
- Stability Level:ALPHA
- Type: Counter
- Labels:scope
kube_apiserver_nodeport_allocator_allocation_total
Number of NodePort allocations
- Stability Level:ALPHA
- Type: Counter
- Labels:scope
kube_apiserver_nodeport_allocator_available_ports
Gauge measuring the number of available NodePorts for Services
- Stability Level:ALPHA
- Type: Gauge
kube_apiserver_pod_logs_backend_tls_failure_total
Total number of requests for pods/logs that failed due to kubelet server TLS verification
- Stability Level:ALPHA
- Type: Counter
kube_apiserver_pod_logs_insecure_backend_total
Total number of requests for pods/logs sliced by usage type: enforce_tls, skip_tls_allowed, skip_tls_denied
- Stability Level:ALPHA
- Type: Counter
- Labels:usage
kube_apiserver_pod_logs_pods_logs_backend_tls_failure_total
Total number of requests for pods/logs that failed due to kubelet server TLS verification
- Stability Level:ALPHA
- Type: Counter
- Deprecated Versions:1.27.0
kube_apiserver_pod_logs_pods_logs_insecure_backend_total
Total number of requests for pods/logs sliced by usage type: enforce_tls, skip_tls_allowed, skip_tls_denied
- Stability Level:ALPHA
- Type: Counter
- Labels:usage
- Deprecated Versions:1.27.0
kubelet_active_pods
The number of pods the kubelet considers active and which are being considered when admitting new pods. static is true if the pod is not from the apiserver.
- Stability Level:ALPHA
- Type: Gauge
- Labels:static
kubelet_admission_rejections_total
Cumulative number of pod admission rejections by the Kubelet.
- Stability Level:ALPHA
- Type: Counter
- Labels:reason
kubelet_certificate_manager_client_expiration_renew_errors
Counter of certificate renewal errors.
- Stability Level:ALPHA
- Type: Counter
kubelet_certificate_manager_client_ttl_seconds
Gauge of the TTL (time-to-live) of the Kubelet’s client certificate. The value is in seconds until certificate expiry (negative if already expired). If client certificate is invalid or unused, the value will be +INF.
- Stability Level:ALPHA
- Type: Gauge
kubelet_certificate_manager_server_rotation_seconds
Histogram of the number of seconds the previous certificate lived before being rotated.
- Stability Level:ALPHA
- Type: Histogram
kubelet_certificate_manager_server_ttl_seconds
Gauge of the shortest TTL (time-to-live) of the Kubelet’s serving certificate. The value is in seconds until certificate expiry (negative if already expired). If serving certificate is invalid or unused, the value will be +INF.
- Stability Level:ALPHA
- Type: Gauge
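The client and serving certificate TTL gauges share the same value convention: seconds until expiry, a negative number once the certificate has expired, and +INF when the certificate is invalid or unused. A minimal sketch of interpreting such a value follows; the function name, the returned strings and the seven-day warning threshold are hypothetical choices for illustration.

```python
import math


def classify_cert_ttl(ttl_seconds, warn_below_seconds=7 * 24 * 3600):
    """Map a certificate TTL gauge value to a coarse state.

    The warning threshold is an arbitrary example value, not a Kubernetes default.
    """
    if math.isnan(ttl_seconds):
        return "unknown"
    if math.isinf(ttl_seconds) and ttl_seconds > 0:
        return "invalid-or-unused"  # the gauge reports +INF in this case
    if ttl_seconds < 0:
        return "expired"
    if ttl_seconds < warn_below_seconds:
        return "expiring-soon"
    return "ok"
```

A value scraped as the text `+Inf` can be passed through Python's `float("+Inf")` before calling the function. Checking for +INF first matters, because an infinite TTL would otherwise look like a certificate with plenty of lifetime left.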
kubelet_cgroup_manager_duration_seconds
Duration in seconds for cgroup manager operations. Broken down by method.
- Stability Level:ALPHA
- Type: Histogram
- Labels:operation_type
kubelet_cgroup_version
cgroup version on the hosts.
- Stability Level:ALPHA
- Type: Gauge
kubelet_container_aligned_compute_resources_count
Cumulative number of aligned compute resources allocated to containers by alignment type.
- Stability Level:ALPHA
- Type: Counter
- Labels:boundary scope
kubelet_container_log_filesystem_used_bytes
Bytes used by the container’s logs on the filesystem.
- Stability Level:ALPHA
- Type: Custom
- Labels:uid namespace pod container
kubelet_containers_per_pod_count
The number of containers per pod.
- Stability Level:ALPHA
- Type: Histogram
kubelet_cpu_manager_exclusive_cpu_allocation_count
The total number of CPUs exclusively allocated to containers running on this node
- Stability Level:ALPHA
- Type: Gauge
kubelet_cpu_manager_pinning_errors_total
The number of CPU core allocations which required pinning that failed.
- Stability Level:ALPHA
- Type: Counter
kubelet_cpu_manager_pinning_requests_total
The number of CPU core allocations which required pinning.
- Stability Level:ALPHA
- Type: Counter
kubelet_cpu_manager_shared_pool_size_millicores
The size of the shared CPU pool for non-guaranteed QoS pods, in millicores.
- Stability Level:ALPHA
- Type: Gauge
kubelet_credential_provider_plugin_duration
Duration of execution in seconds for credential provider plugin
- Stability Level:ALPHA
- Type: Histogram
- Labels:plugin_name
kubelet_credential_provider_plugin_errors
Number of errors from credential provider plugin
- Stability Level:ALPHA
- Type: Counter
- Labels:plugin_name
kubelet_desired_pods
The number of pods the kubelet is being instructed to run. static is true if the pod is not from the apiserver.
- Stability Level:ALPHA
- Type: Gauge
- Labels:static
kubelet_device_plugin_alloc_duration_seconds
Duration in seconds to serve a device plugin Allocation request. Broken down by resource name.
- Stability Level:ALPHA
- Type: Histogram
- Labels:resource_name
kubelet_device_plugin_registration_total
Cumulative number of device plugin registrations. Broken down by resource name.
- Stability Level:ALPHA
- Type: Counter
- Labels:resource_name
kubelet_evented_pleg_connection_error_count
The number of errors encountered during the establishment of streaming connection with the CRI runtime.
- Stability Level:ALPHA
- Type: Counter
kubelet_evented_pleg_connection_latency_seconds
The latency of streaming connection with the CRI runtime, measured in seconds.
- Stability Level:ALPHA
- Type: Histogram
kubelet_evented_pleg_connection_success_count
The number of times a streaming client was obtained to receive CRI Events.
- Stability Level:ALPHA
- Type: Counter
kubelet_eviction_stats_age_seconds
Time between when stats are collected, and when pod is evicted based on those stats by eviction signal
- Stability Level:ALPHA
- Type: Histogram
- Labels:eviction_signal
kubelet_evictions
Cumulative number of pod evictions by eviction signal
- Stability Level:ALPHA
- Type: Counter
- Labels:eviction_signal
kubelet_graceful_shutdown_end_time_seconds
Last graceful shutdown end time since unix epoch in seconds
- Stability Level:ALPHA
- Type: Gauge
kubelet_graceful_shutdown_start_time_seconds
Last graceful shutdown start time since unix epoch in seconds
- Stability Level:ALPHA
- Type: Gauge
kubelet_http_inflight_requests
Number of inflight HTTP requests
- Stability Level:ALPHA
- Type: Gauge
- Labels:long_running method path server_type
kubelet_http_requests_duration_seconds
Duration in seconds to serve HTTP requests
- Stability Level:ALPHA
- Type: Histogram
- Labels:long_running method path server_type
kubelet_http_requests_total
Number of HTTP requests received since the server started
- Stability Level:ALPHA
- Type: Counter
- Labels:long_running method path server_type
kubelet_image_garbage_collected_total
Total number of images garbage collected by the kubelet, whether through disk usage or image age.
- Stability Level:ALPHA
- Type: Counter
- Labels:reason
kubelet_image_pull_duration_seconds
Duration in seconds to pull an image.
- Stability Level:ALPHA
- Type: Histogram
- Labels:image_size_in_bytes
kubelet_lifecycle_handler_http_fallbacks_total
The number of times lifecycle handlers successfully fell back to HTTP from HTTPS.
- Stability Level:ALPHA
- Type: Counter
kubelet_managed_ephemeral_containers
Current number of ephemeral containers in pods managed by this kubelet.
- Stability Level:ALPHA
- Type: Gauge
kubelet_memory_manager_pinning_errors_total
The number of memory page allocations which required pinning that failed.
- Stability Level:ALPHA
- Type: Counter
kubelet_memory_manager_pinning_requests_total
The number of memory page allocations which required pinning.
- Stability Level:ALPHA
- Type: Counter
kubelet_mirror_pods
The number of mirror pods the kubelet will try to create (one per admitted static pod)
- Stability Level:ALPHA
- Type: Gauge
kubelet_node_name
The node’s name. The count is always 1.
- Stability Level:ALPHA
- Type: Gauge
- Labels:node
kubelet_node_startup_duration_seconds
Duration in seconds of node startup in total.
- Stability Level:ALPHA
- Type: Gauge
kubelet_node_startup_post_registration_duration_seconds
Duration in seconds of node startup after registration.
- Stability Level:ALPHA
- Type: Gauge
kubelet_node_startup_pre_kubelet_duration_seconds
Duration in seconds of node startup before kubelet starts.
- Stability Level:ALPHA
- Type: Gauge
kubelet_node_startup_pre_registration_duration_seconds
Duration in seconds of node startup before registration.
- Stability Level:ALPHA
- Type: Gauge
kubelet_node_startup_registration_duration_seconds
Duration in seconds of node startup during registration.
- Stability Level:ALPHA
- Type: Gauge
kubelet_orphan_pod_cleaned_volumes
The total number of orphaned Pods whose volumes were cleaned in the last periodic sweep.
- Stability Level:ALPHA
- Type: Gauge
kubelet_orphan_pod_cleaned_volumes_errors
The number of orphaned Pods whose volumes failed to be cleaned in the last periodic sweep.
- Stability Level:ALPHA
- Type: Gauge
kubelet_orphaned_runtime_pods_total
Number of pods that have been detected in the container runtime without being already known to the pod worker. This typically indicates the kubelet was restarted while a pod was force deleted in the API or in the local configuration, which is unusual.
- Stability Level:ALPHA
- Type: Counter
kubelet_pleg_discard_events
The number of discard events in PLEG.
- Stability Level:ALPHA
- Type: Counter
kubelet_pleg_last_seen_seconds
Timestamp in seconds when PLEG was last seen active.
- Stability Level:ALPHA
- Type: Gauge
kubelet_pleg_relist_duration_seconds
Duration in seconds for relisting pods in PLEG.
- Stability Level:ALPHA
- Type: Histogram
kubelet_pleg_relist_interval_seconds
Interval in seconds between relisting in PLEG.
- Stability Level:ALPHA
- Type: Histogram
kubelet_pod_resources_endpoint_errors_get
Number of requests to the PodResource Get endpoint which returned an error. Broken down by server API version.
- Stability Level:ALPHA
- Type: Counter
- Labels:server_api_version
kubelet_pod_resources_endpoint_errors_get_allocatable
Number of requests to the PodResource GetAllocatableResources endpoint which returned an error. Broken down by server API version.
- Stability Level:ALPHA
- Type: Counter
- Labels:server_api_version
kubelet_pod_resources_endpoint_errors_list
Number of requests to the PodResource List endpoint which returned an error. Broken down by server API version.
- Stability Level:ALPHA
- Type: Counter
- Labels:server_api_version
kubelet_pod_resources_endpoint_requests_get
Number of requests to the PodResource Get endpoint. Broken down by server api version.
- Stability Level:ALPHA
- Type: Counter
- Labels:server_api_version
kubelet_pod_resources_endpoint_requests_get_allocatable
Number of requests to the PodResource GetAllocatableResources endpoint. Broken down by server api version.
- Stability Level:ALPHA
- Type: Counter
- Labels:server_api_version
kubelet_pod_resources_endpoint_requests_list
Number of requests to the PodResource List endpoint. Broken down by server api version.
- Stability Level:ALPHA
- Type: Counter
- Labels:server_api_version
kubelet_pod_resources_endpoint_requests_total
Cumulative number of requests to the PodResource endpoint. Broken down by server api version.
- Stability Level:ALPHA
- Type: Counter
- Labels:server_api_version
kubelet_pod_start_duration_seconds
Duration in seconds from kubelet seeing a pod for the first time to the pod starting to run
- Stability Level:ALPHA
- Type: Histogram
kubelet_pod_start_sli_duration_seconds
Duration in seconds to start a pod, excluding time to pull images and run init containers, measured from pod creation timestamp to when all its containers are reported as started and observed via watch
- Stability Level:ALPHA
- Type: Histogram
kubelet_pod_start_total_duration_seconds
Duration in seconds to start a pod since creation, including time to pull images and run init containers, measured from pod creation timestamp to when all its containers are reported as started and observed via watch
- Stability Level:ALPHA
- Type: Histogram
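Histogram metrics such as the pod start durations above are exposed in Prometheus format as cumulative buckets with an upper bound. As an illustration only, the sketch below estimates a quantile from such buckets by linear interpolation, in the spirit of what query tools do. The function name `estimate_quantile`, the bucket bounds, and the counts are assumptions and are not taken from the kubelet.

```python
import math

def estimate_quantile(q, buckets):
    """Estimate quantile q (0..1) from cumulative (upper_bound, count) buckets."""
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Rank falls in the +Inf bucket: fall back to the highest finite bound.
                return prev_bound
            if count == prev_count:
                return bound
            # Linear interpolation inside the bucket that contains the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return prev_bound

# Hypothetical cumulative bucket counts (upper bound in seconds, observations so far).
pod_start_buckets = [(0.5, 10), (1.0, 60), (2.0, 90), (math.inf, 100)]
print(estimate_quantile(0.5, pod_start_buckets))
```

The estimate is only as precise as the bucket layout: a quantile that falls into the unbounded last bucket can only be reported as the highest finite bound.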
kubelet_pod_status_sync_duration_seconds
Duration in seconds to sync a pod status update. Measures time from detection of a change to pod status until the API is successfully updated for that pod, even if multiple intervening changes to pod status occur.
- Stability Level:ALPHA
- Type: Histogram
kubelet_pod_worker_duration_seconds
Duration in seconds to sync a single pod. Broken down by operation type: create, update, or sync
- Stability Level:ALPHA
- Type: Histogram
- Labels:operation_type
kubelet_pod_worker_start_duration_seconds
Duration in seconds from kubelet seeing a pod to starting a worker.
- Stability Level:ALPHA
- Type: Histogram
kubelet_preemptions
Cumulative number of pod preemptions by preemption resource
- Stability Level:ALPHA
- Type: Counter
- Labels:preemption_signal
kubelet_restarted_pods_total
Number of pods that have been restarted because they were deleted and recreated with the same UID while the kubelet was watching them (common for static pods, extremely uncommon for API pods)
- Stability Level:ALPHA
- Type: Counter
- Labels:static
kubelet_run_podsandbox_duration_seconds
Duration in seconds of the run_podsandbox operations. Broken down by RuntimeClass.Handler.
- Stability Level:ALPHA
- Type: Histogram
- Labels:runtime_handler
kubelet_run_podsandbox_errors_total
Cumulative number of the run_podsandbox operation errors by RuntimeClass.Handler.
- Stability Level:ALPHA
- Type: Counter
- Labels:runtime_handler
kubelet_running_containers
Number of containers currently running
- Stability Level:ALPHA
- Type: Gauge
- Labels:container_state
kubelet_running_pods
Number of pods that have a running pod sandbox
- Stability Level:ALPHA
- Type: Gauge
kubelet_runtime_operations_duration_seconds
Duration in seconds of runtime operations. Broken down by operation type.
- Stability Level:ALPHA
- Type: Histogram
- Labels:operation_type
kubelet_runtime_operations_errors_total
Cumulative number of runtime operation errors by operation type.
- Stability Level:ALPHA
- Type: Counter
- Labels:operation_type
kubelet_runtime_operations_total
Cumulative number of runtime operations by operation type.
- Stability Level:ALPHA
- Type: Counter
- Labels:operation_type
kubelet_server_expiration_renew_errors
Counter of certificate renewal errors.
- Stability Level:ALPHA
- Type: Counter
kubelet_sleep_action_terminated_early_total
The number of times the lifecycle sleep handler was terminated before it finished
- Stability Level:ALPHA
- Type: Counter
kubelet_started_containers_errors_total
Cumulative number of errors when starting containers
- Stability Level:ALPHA
- Type: Counter
- Labels: code, container_type
kubelet_started_containers_total
Cumulative number of containers started
- Stability Level:ALPHA
- Type: Counter
- Labels:container_type
kubelet_started_host_process_containers_errors_total
Cumulative number of errors when starting hostprocess containers. This metric will only be collected on Windows.
- Stability Level:ALPHA
- Type: Counter
- Labels: code, container_type
kubelet_started_host_process_containers_total
Cumulative number of hostprocess containers started. This metric will only be collected on Windows.
- Stability Level:ALPHA
- Type: Counter
- Labels:container_type
kubelet_started_pods_errors_total
Cumulative number of errors when starting pods
- Stability Level:ALPHA
- Type: Counter
kubelet_started_pods_total
Cumulative number of pods started
- Stability Level:ALPHA
- Type: Counter
kubelet_topology_manager_admission_duration_ms
Duration in milliseconds to serve a pod admission request.
- Stability Level:ALPHA
- Type: Histogram
kubelet_topology_manager_admission_errors_total
The number of admission request failures where resources could not be aligned.
- Stability Level:ALPHA
- Type: Counter
kubelet_topology_manager_admission_requests_total
The number of admission requests where resources have to be aligned.
- Stability Level:ALPHA
- Type: Counter
kubelet_volume_metric_collection_duration_seconds
Duration in seconds to calculate volume stats
- Stability Level:ALPHA
- Type: Histogram
- Labels:metric_source
kubelet_volume_stats_available_bytes
Number of available bytes in the volume
- Stability Level:ALPHA
- Type: Custom
- Labels: namespace, persistentvolumeclaim
kubelet_volume_stats_capacity_bytes
Capacity in bytes of the volume
- Stability Level:ALPHA
- Type: Custom
- Labels: namespace, persistentvolumeclaim
kubelet_volume_stats_health_status_abnormal
Abnormal volume health status. The count is either 1 or 0. 1 indicates the volume is unhealthy, 0 indicates volume is healthy
- Stability Level:ALPHA
- Type: Custom
- Labels: namespace, persistentvolumeclaim
kubelet_volume_stats_inodes
Maximum number of inodes in the volume
- Stability Level:ALPHA
- Type: Custom
- Labels: namespace, persistentvolumeclaim
kubelet_volume_stats_inodes_free
Number of free inodes in the volume
- Stability Level:ALPHA
- Type: Custom
- Labels: namespace, persistentvolumeclaim
kubelet_volume_stats_inodes_used
Number of used inodes in the volume
- Stability Level:ALPHA
- Type: Custom
- Labels: namespace, persistentvolumeclaim
kubelet_volume_stats_used_bytes
Number of used bytes in the volume
- Stability Level:ALPHA
- Type: Custom
- Labels: namespace, persistentvolumeclaim
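The volume stats metrics above share the same namespace and PersistentVolumeClaim labels, so they can be joined per claim. The sketch below derives a usage percentage from `kubelet_volume_stats_used_bytes` and `kubelet_volume_stats_capacity_bytes`; the helper name, claim names, and byte values are hypothetical.

```python
def volume_usage_percent(used_bytes, capacity_bytes):
    """Return used/capacity as a percentage per (namespace, persistentvolumeclaim) key."""
    usage = {}
    for claim, capacity in capacity_bytes.items():
        # Skip claims with no capacity sample or no matching used-bytes sample.
        if capacity > 0 and claim in used_bytes:
            usage[claim] = 100.0 * used_bytes[claim] / capacity
    return usage

GIB = 2 ** 30
# Hypothetical samples keyed by (namespace, persistentvolumeclaim).
used = {("default", "data"): 25 * GIB}
capacity = {("default", "data"): 100 * GIB, ("default", "logs"): 10 * GIB}

print(volume_usage_percent(used, capacity))
```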
kubelet_working_pods
Number of pods the kubelet is actually running, broken down by lifecycle phase, whether the pod is desired, orphaned, or runtime only (also orphaned), and whether the pod is static. An orphaned pod has been removed from local configuration or force deleted in the API and consumes resources that are not otherwise visible.
- Stability Level:ALPHA
- Type: Gauge
- Labels: config, lifecycle, static
kubeproxy_iptables_ct_state_invalid_dropped_packets_total
Packets dropped by iptables to work around conntrack problems
- Stability Level:ALPHA
- Type: Custom
kubeproxy_iptables_localhost_nodeports_accepted_packets_total
Number of packets accepted on nodeports of loopback interface
- Stability Level:ALPHA
- Type: Custom
kubeproxy_network_programming_duration_seconds
In Cluster Network Programming Latency in seconds
- Stability Level:ALPHA
- Type: Histogram
kubeproxy_proxy_healthz_total
Cumulative proxy healthz HTTP status
- Stability Level:ALPHA
- Type: Counter
- Labels:code
kubeproxy_proxy_livez_total
Cumulative proxy livez HTTP status
- Stability Level:ALPHA
- Type: Counter
- Labels:code
kubeproxy_sync_full_proxy_rules_duration_seconds
SyncProxyRules latency in seconds for full resyncs
- Stability Level:ALPHA
- Type: Histogram
kubeproxy_sync_partial_proxy_rules_duration_seconds
SyncProxyRules latency in seconds for partial resyncs
- Stability Level:ALPHA
- Type: Histogram
kubeproxy_sync_proxy_rules_duration_seconds
SyncProxyRules latency in seconds
- Stability Level:ALPHA
- Type: Histogram
kubeproxy_sync_proxy_rules_endpoint_changes_pending
Pending proxy rules Endpoint changes
- Stability Level:ALPHA
- Type: Gauge
kubeproxy_sync_proxy_rules_endpoint_changes_total
Cumulative proxy rules Endpoint changes
- Stability Level:ALPHA
- Type: Counter
kubeproxy_sync_proxy_rules_iptables_last
Number of iptables rules written by kube-proxy in last sync
- Stability Level:ALPHA
- Type: Gauge
- Labels:table
kubeproxy_sync_proxy_rules_iptables_partial_restore_failures_total
Cumulative proxy iptables partial restore failures
- Stability Level:ALPHA
- Type: Counter
kubeproxy_sync_proxy_rules_iptables_restore_failures_total
Cumulative proxy iptables restore failures
- Stability Level:ALPHA
- Type: Counter
kubeproxy_sync_proxy_rules_iptables_total
Total number of iptables rules owned by kube-proxy
- Stability Level:ALPHA
- Type: Gauge
- Labels:table
kubeproxy_sync_proxy_rules_last_queued_timestamp_seconds
The last time a sync of proxy rules was queued
- Stability Level:ALPHA
- Type: Gauge
kubeproxy_sync_proxy_rules_last_timestamp_seconds
The last time proxy rules were successfully synced
- Stability Level:ALPHA
- Type: Gauge
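The two timestamp gauges above can be compared to judge how current the proxy rules are. The sketch below assumes that a queued timestamp later than the last successful sync means a sync is still outstanding; that interpretation, the helper name `sync_status`, and the sample timestamps are assumptions rather than documented kube-proxy behavior.

```python
def sync_status(last_queued_ts, last_synced_ts, now_ts):
    """Summarize kube-proxy sync state from the two timestamp gauges (Unix seconds)."""
    pending = last_queued_ts > last_synced_ts
    return {
        "pending": pending,
        "seconds_since_sync": now_ts - last_synced_ts,
        "seconds_queued": (now_ts - last_queued_ts) if pending else 0.0,
    }

# Hypothetical gauge values:
#   kubeproxy_sync_proxy_rules_last_queued_timestamp_seconds = 1700000120
#   kubeproxy_sync_proxy_rules_last_timestamp_seconds        = 1700000100
print(sync_status(1700000120.0, 1700000100.0, 1700000150.0))
```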
kubeproxy_sync_proxy_rules_nftables_cleanup_failures_total
Cumulative proxy nftables cleanup failures
- Stability Level:ALPHA
- Type: Counter
kubeproxy_sync_proxy_rules_nftables_sync_failures_total
Cumulative proxy nftables sync failures
- Stability Level:ALPHA
- Type: Counter
kubeproxy_sync_proxy_rules_no_local_endpoints_total
Number of services with a Local traffic policy and no endpoints
- Stability Level:ALPHA
- Type: Gauge
- Labels:traffic_policy
kubeproxy_sync_proxy_rules_service_changes_pending
Pending proxy rules Service changes
- Stability Level:ALPHA
- Type: Gauge
kubeproxy_sync_proxy_rules_service_changes_total
Cumulative proxy rules Service changes
- Stability Level:ALPHA
- Type: Counter
kubernetes_build_info
A metric with a constant ‘1’ value labeled by major, minor, git version, git commit, git tree state, build date, Go version, and compiler from which Kubernetes was built, and platform on which it is running.
- Stability Level:ALPHA
- Type: Gauge
- Labels: build_date, compiler, git_commit, git_tree_state, git_version, go_version, major, minor, platform
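Because `kubernetes_build_info` always has the value 1, its information lives entirely in its labels. As a rough sketch, the snippet below pulls the labels out of one sample line in Prometheus text format. The sample line and its label values are made up for illustration, and the regular expression handles only simple quoted label values.

```python
import re

# Matches name="value" pairs, allowing backslash-escaped characters inside the value.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

def parse_build_info(line):
    """Return the label set of a single sample line as a dict."""
    start, end = line.index("{"), line.rindex("}")
    return dict(LABEL_RE.findall(line[start + 1:end]))

# Hypothetical sample line; real output carries all nine labels.
sample = 'kubernetes_build_info{git_version="v1.32.0",major="1",minor="32",platform="linux/amd64"} 1'
labels = parse_build_info(sample)
print(labels["git_version"], labels["platform"])
```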
leader_election_master_status
Gauge indicating whether the reporting system is master of the relevant lease: 0 indicates backup, 1 indicates master. ‘name’ is the string used to identify the lease. Please make sure to group by name.
- Stability Level:ALPHA
- Type: Gauge
- Labels:name
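The advice to group by name can be illustrated with a short sketch: summing the gauge per lease name across all reporting instances should give exactly 1 when a single leader holds the lease. The function name, instance identifiers, and sample values below are hypothetical.

```python
from collections import defaultdict

def leaders_per_lease(samples):
    """Sum leader_election_master_status per lease name across reporting instances."""
    counts = defaultdict(int)
    for name, _instance, value in samples:
        counts[name] += int(value)
    return dict(counts)

# Hypothetical samples: (name label, reporting instance, gauge value).
samples = [
    ("kube-scheduler", "node-a", 1),
    ("kube-scheduler", "node-b", 0),
    ("kube-controller-manager", "node-a", 0),
    ("kube-controller-manager", "node-b", 0),
]

counts = leaders_per_lease(samples)
# Leases whose sum is not 1 have either no leader or more than one.
print({name: n for name, n in counts.items() if n != 1})
```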
leader_election_slowpath_total
Total number of times the slow path was exercised in renewing leader leases. ‘name’ is the string used to identify the lease. Please make sure to group by name.
- Stability Level:ALPHA
- Type: Counter
- Labels:name
node_authorizer_graph_actions_duration_seconds
Histogram of duration of graph actions in node authorizer.
- Stability Level:ALPHA
- Type: Histogram
- Labels:operation
node_collector_unhealthy_nodes_in_zone
Gauge measuring number of not Ready Nodes per zone.
- Stability Level:ALPHA
- Type: Gauge
- Labels:zone
node_collector_update_all_nodes_health_duration_seconds
Duration in seconds for NodeController to update the health of all nodes.
- Stability Level:ALPHA
- Type: Histogram
node_collector_update_node_health_duration_seconds
Duration in seconds for NodeController to update the health of a single node.
- Stability Level:ALPHA
- Type: Histogram
node_collector_zone_health
Gauge measuring percentage of healthy nodes per zone.
- Stability Level:ALPHA
- Type: Gauge
- Labels:zone
node_collector_zone_size
Gauge measuring number of registered Nodes per zone.
- Stability Level:ALPHA
- Type: Gauge
- Labels:zone
node_controller_cloud_provider_taint_removal_delay_seconds
Number of seconds after node creation when NodeController removed the cloud-provider taint of a single node.
- Stability Level:ALPHA
- Type: Histogram
node_controller_initial_node_sync_delay_seconds
Number of seconds after node creation when NodeController finished the initial synchronization of a single node.
- Stability Level:ALPHA
- Type: Histogram
node_ipam_controller_cidrset_allocation_tries_per_request
Number of CIDR allocation tries per request.
- Stability Level:ALPHA
- Type: Histogram
- Labels:clusterCIDR
node_ipam_controller_cidrset_cidrs_allocations_total
Counter measuring total number of CIDR allocations.
- Stability Level:ALPHA
- Type: Counter
- Labels:clusterCIDR
node_ipam_controller_cidrset_cidrs_releases_total
Counter measuring total number of CIDR releases.
- Stability Level:ALPHA
- Type: Counter
- Labels:clusterCIDR
node_ipam_controller_cidrset_usage_cidrs
Gauge measuring percentage of allocated CIDRs.
- Stability Level:ALPHA
- Type: Gauge
- Labels:clusterCIDR
node_ipam_controller_cirdset_max_cidrs
Maximum number of CIDRs that can be allocated.
- Stability Level:ALPHA
- Type: Gauge
- Labels:clusterCIDR
node_swap_usage_bytes
Current swap usage of the node in bytes. Reported only on non-Windows systems
- Stability Level:ALPHA
- Type: Custom
plugin_manager_total_plugins
Number of plugins in Plugin Manager
- Stability Level:ALPHA
- Type: Custom
- Labels: socket_path, state
pod_gc_collector_force_delete_pod_errors_total
Number of errors encountered when forcefully deleting the pods since the Pod GC Controller started.
- Stability Level:ALPHA
- Type: Counter
- Labels: namespace, reason
pod_gc_collector_force_delete_pods_total
Number of pods that have been forcefully deleted since the Pod GC Controller started.
- Stability Level:ALPHA
- Type: Counter
- Labels: namespace, reason
pod_security_errors_total
Number of errors preventing normal evaluation. Non-fatal errors may result in the latest restricted profile being used for evaluation.
- Stability Level:ALPHA
- Type: Counter
- Labels: fatal, request_operation, resource, subresource
pod_security_evaluations_total
Number of policy evaluations that occurred, not counting ignored or exempt requests.
- Stability Level:ALPHA
- Type: Counter
- Labels: decision, mode, policy_level, policy_version, request_operation, resource, subresource
pod_security_exemptions_total
Number of exempt requests, not counting ignored or out of scope requests.
- Stability Level:ALPHA
- Type: Counter
- Labels: request_operation, resource, subresource
pod_swap_usage_bytes
Current swap usage of the pod in bytes. Reported only on non-Windows systems
- Stability Level:ALPHA
- Type: Custom
- Labels: pod, namespace
prober_probe_duration_seconds
Duration in seconds for a probe response.
- Stability Level:ALPHA
- Type: Histogram
- Labels: container, namespace, pod, probe_type
prober_probe_total
Cumulative number of liveness, readiness, or startup probes for a container, by result.
- Stability Level:ALPHA
- Type: Counter
- Labels: container, namespace, pod, pod_uid, probe_type, result
pv_collector_bound_pv_count
Gauge measuring number of persistent volumes currently bound
- Stability Level:ALPHA
- Type: Custom
- Labels: storage_class
pv_collector_bound_pvc_count
Gauge measuring number of persistent volume claims currently bound
- Stability Level:ALPHA
- Type: Custom
- Labels: namespace, storage_class, volume_attributes_class
pv_collector_total_pv_count
Gauge measuring total number of persistent volumes
- Stability Level:ALPHA
- Type: Custom
- Labels: plugin_name, volume_mode
pv_collector_unbound_pv_count
Gauge measuring number of persistent volumes currently unbound
- Stability Level:ALPHA
- Type: Custom
- Labels:storage_class
pv_collector_unbound_pvc_count
Gauge measuring number of persistent volume claims currently unbound
- Stability Level:ALPHA
- Type: Custom
- Labels: namespace, storage_class, volume_attributes_class
reconstruct_volume_operations_errors_total
The number of volumes that failed reconstruction from the operating system during kubelet startup.
- Stability Level:ALPHA
- Type: Counter
reconstruct_volume_operations_total
The number of volumes that were attempted to be reconstructed from the operating system during kubelet startup. This includes both successful and failed reconstruction.
- Stability Level:ALPHA
- Type: Counter
replicaset_controller_sorting_deletion_age_ratio
The ratio of chosen deleted pod’s ages to the current youngest pod’s age (at the time). Should be <2. The intent of this metric is to measure the rough efficacy of the LogarithmicScaleDown feature gate’s effect on the sorting (and deletion) of pods when a replicaset scales down. This only considers Ready pods when calculating and reporting.
- Stability Level:ALPHA
- Type: Histogram
resourceclaim_controller_allocated_resource_claims
Number of allocated ResourceClaims
- Stability Level:ALPHA
- Type: Gauge
resourceclaim_controller_create_attempts_total
Number of ResourceClaims creation requests
- Stability Level:ALPHA
- Type: Counter
resourceclaim_controller_create_failures_total
Number of ResourceClaims creation request failures
- Stability Level:ALPHA
- Type: Counter
resourceclaim_controller_resource_claims
Number of ResourceClaims
- Stability Level:ALPHA
- Type: Gauge
rest_client_dns_resolution_duration_seconds
DNS resolver latency in seconds. Broken down by host.
- Stability Level:ALPHA
- Type: Histogram
- Labels:host
rest_client_exec_plugin_call_total
Number of calls to an exec plugin, partitioned by the type of event encountered (no_error, plugin_execution_error, plugin_not_found_error, client_internal_error) and an optional exit code. The exit code will be set to 0 if and only if the plugin call was successful.
- Stability Level:ALPHA
- Type: Counter
- Labels: call_status, code
rest_client_exec_plugin_certificate_rotation_age
Histogram of the number of seconds the last auth exec plugin client certificate lived before being rotated. If auth exec plugin client certificates are unused, histogram will contain no data.
- Stability Level:ALPHA
- Type: Histogram
rest_client_exec_plugin_ttl_seconds
Gauge of the shortest TTL (time-to-live) of the client certificate(s) managed by the auth exec plugin. The value is in seconds until certificate expiry (negative if already expired). If auth exec plugins are unused or manage no TLS certificates, the value will be +INF.
- Stability Level:ALPHA
- Type: Gauge
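The TTL gauge above has three distinct cases: positive infinity when no certificates are managed, a negative value after expiry, and a positive number of seconds otherwise. A minimal sketch of handling a scraped value follows; the function name `describe_cert_ttl` and the returned messages are hypothetical.

```python
import math

def describe_cert_ttl(raw_value):
    """Interpret a scraped rest_client_exec_plugin_ttl_seconds sample value."""
    ttl = float(raw_value)  # float() accepts spellings such as "+Inf" and "+INF"
    if ttl == math.inf:
        return "no managed certificates"
    if ttl < 0:
        return "expired"
    return f"expires in {ttl / 3600:.1f} hours"

for value in ("+Inf", "-30", "7200"):
    print(value, "->", describe_cert_ttl(value))
```

Treating the infinite value separately matters for alerting: a threshold such as "TTL below one day" should not fire, and should not silently pass, when the plugin manages no certificates at all.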
rest_client_rate_limiter_duration_seconds
Client side rate limiter latency in seconds. Broken down by verb, and host.
- Stability Level:ALPHA
- Type: Histogram
- Labels: host, verb
rest_client_request_duration_seconds
Request latency in seconds. Broken down by verb, and host.
- Stability Level:ALPHA
- Type: Histogram
- Labels: host, verb
rest_client_request_retries_total
Number of request retries, partitioned by status code, verb, and host.
- Stability Level:ALPHA
- Type: Counter
- Labels: code, host, verb
rest_client_request_size_bytes
Request size in bytes. Broken down by verb and host.
- Stability Level:ALPHA
- Type: Histogram
- Labels: host, verb
rest_client_requests_total
Number of HTTP requests, partitioned by status code, method, and host.
- Stability Level:ALPHA
- Type: Counter
- Labels: code, host, method
rest_client_response_size_bytes
Response size in bytes. Broken down by verb and host.
- Stability Level:ALPHA
- Type: Histogram
- Labels: host, verb
rest_client_transport_cache_entries
Number of transport entries in the internal cache.
- Stability Level:ALPHA
- Type: Gauge
rest_client_transport_create_calls_total
Number of calls to get a new transport, partitioned by the result of the operation (hit: obtained from the cache; miss: created and added to the cache; uncacheable: created and not cached).
- Stability Level:ALPHA
- Type: Counter
- Labels:result
retroactive_storageclass_errors_total
Total number of failed retroactive StorageClass assignments to persistent volume claims
- Stability Level:ALPHA
- Type: Counter
retroactive_storageclass_total
Total number of retroactive StorageClass assignments to persistent volume claims
- Stability Level:ALPHA
- Type: Counter
root_ca_cert_publisher_sync_duration_seconds
Duration in seconds of namespace syncs in root CA cert publisher.
- Stability Level:ALPHA
- Type: Histogram
- Labels:code
root_ca_cert_publisher_sync_total
Number of namespace syncs that happened in root CA cert publisher.
- Stability Level:ALPHA
- Type: Counter
- Labels:code
running_managed_controllers
Indicates where instances of a controller are currently running
- Stability Level:ALPHA
- Type: Gauge
- Labels: manager, name
scheduler_event_handling_duration_seconds
Event handling latency in seconds.
- Stability Level:ALPHA
- Type: Histogram
- Labels:event
scheduler_goroutines
Number of running goroutines split by the work they do such as binding.
- Stability Level:ALPHA
- Type: Gauge
- Labels:operation
scheduler_inflight_events
Number of events currently tracked in the scheduling queue.
- Stability Level:ALPHA
- Type: Gauge
- Labels:event
scheduler_permit_wait_duration_seconds
Duration of waiting on permit.
- Stability Level:ALPHA
- Type: Histogram
- Labels:result
scheduler_plugin_evaluation_total
Number of attempts to schedule pods by each plugin and the extension point (available only in PreFilter, Filter, PreScore, and Score).
- Stability Level:ALPHA
- Type: Counter
- Labels: extension_point, plugin, profile
scheduler_plugin_execution_duration_seconds
Duration for running a plugin at a specific extension point.
- Stability Level:ALPHA
- Type: Histogram
- Labels: extension_point, plugin, status
scheduler_preemption_goroutines_duration_seconds
Duration in seconds for running goroutines for the preemption.
- Stability Level:ALPHA
- Type: Histogram
- Labels:result
scheduler_preemption_goroutines_execution_total
Number of preemption goroutines executed.
- Stability Level:ALPHA
- Type: Counter
- Labels:result
scheduler_queueing_hint_execution_duration_seconds
Duration for running a queueing hint function of a plugin.
- Stability Level:ALPHA
- Type: Histogram
- Labels: event, hint, plugin
scheduler_scheduler_cache_size
Number of nodes, pods, and assumed (bound) pods in the scheduler cache.
- Stability Level:ALPHA
- Type: Gauge
- Labels:type
scheduler_scheduling_algorithm_duration_seconds
Scheduling algorithm latency in seconds
- Stability Level:ALPHA
- Type: Histogram
scheduler_unschedulable_pods
The number of unschedulable pods broken down by plugin name. A pod will increment the gauge for all plugins that caused it to not schedule, so this metric has meaning only when broken down by plugin.
- Stability Level:ALPHA
- Type: Gauge
- Labels: plugin, profile
scheduler_volume_binder_cache_requests_total
Total number of requests to the volume binding cache
- Stability Level:ALPHA
- Type: Counter
- Labels:operation
scheduler_volume_scheduling_stage_error_total
Volume scheduling stage error count
- Stability Level:ALPHA
- Type: Counter
- Labels:operation
scrape_error
1 if there was an error while getting container metrics, 0 otherwise
- Stability Level:ALPHA
- Type: Custom
- Deprecated Versions:1.29.0
selinux_warning_controller_selinux_volume_conflict
Conflict between two Pods using the same volume
- Stability Level:ALPHA
- Type: Custom
- Labels: property, pod1_namespace, pod1_name, pod1_value, pod2_namespace, pod2_name, pod2_value
service_controller_loadbalancer_sync_total
A metric counting the number of times any load balancer has been configured, as an effect of service/node changes on the cluster
- Stability Level:ALPHA
- Type: Counter
service_controller_nodesync_error_total
A metric counting the number of times any load balancer has been configured and errored, as an effect of node changes on the cluster
- Stability Level:ALPHA
- Type: Counter
service_controller_nodesync_latency_seconds
A metric measuring the latency for nodesync which updates loadbalancer hosts on cluster node updates.
- Stability Level:ALPHA
- Type: Histogram
service_controller_update_loadbalancer_host_latency_seconds
A metric measuring the latency for updating each load balancer's hosts.
- Stability Level:ALPHA
- Type: Histogram
serviceaccount_invalid_legacy_auto_token_uses_total
Cumulative invalid auto-generated legacy tokens used
- Stability Level:ALPHA
- Type: Counter
serviceaccount_legacy_auto_token_uses_total
Cumulative auto-generated legacy tokens used
- Stability Level:ALPHA
- Type: Counter
serviceaccount_legacy_manual_token_uses_total
Cumulative manually created legacy tokens used
- Stability Level:ALPHA
- Type: Counter
serviceaccount_legacy_tokens_total
Cumulative legacy service account tokens used
- Stability Level:ALPHA
- Type: Counter
serviceaccount_stale_tokens_total
Cumulative stale projected service account tokens used
- Stability Level:ALPHA
- Type: Counter
serviceaccount_valid_tokens_total
Cumulative valid projected service account tokens used
- Stability Level:ALPHA
- Type: Counter
storage_count_attachable_volumes_in_use
Measure number of volumes in use
- Stability Level:ALPHA
- Type: Custom
- Labels: node, volume_plugin
storage_operation_duration_seconds
Storage operation duration
- Stability Level:ALPHA
- Type: Histogram
- Labels: migrated, operation_name, status, volume_plugin
taint_eviction_controller_pod_deletion_duration_seconds
Latency, in seconds, between the time when a taint effect has been activated for the Pod and its deletion via TaintEvictionController.
- Stability Level:ALPHA
- Type: Histogram
taint_eviction_controller_pod_deletions_total
Total number of Pods deleted by TaintEvictionController since its start.
- Stability Level:ALPHA
- Type: Counter
ttl_after_finished_controller_job_deletion_duration_seconds
The time it took to delete the job since it became eligible for deletion
- Stability Level:ALPHA
- Type: Histogram
volume_manager_selinux_container_errors_total
Number of errors when kubelet cannot compute SELinux context for a container. Kubelet cannot start such a Pod and will retry, so the value of this metric may not represent the actual number of containers.
- Stability Level:ALPHA
- Type: Gauge
- Labels:access_mode
volume_manager_selinux_container_warnings_total
Number of errors when kubelet cannot compute SELinux context for a container that are ignored. They will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.
- Stability Level:ALPHA
- Type: Gauge
- Labels:access_mode
volume_manager_selinux_pod_context_mismatch_errors_total
Number of errors when a Pod defines different SELinux contexts for its containers that use the same volume. Kubelet cannot start such a Pod and will retry, so the value of this metric may not represent the actual number of Pods.
- Stability Level:ALPHA
- Type: Gauge
- Labels:access_mode
volume_manager_selinux_pod_context_mismatch_warnings_total
Number of errors when a Pod defines different SELinux contexts for its containers that use the same volume. They are not errors yet, but they will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.
- Stability Level:ALPHA
- Type: Gauge
- Labels:access_mode
volume_manager_selinux_volume_context_mismatch_errors_total
Number of errors when a Pod uses a volume that is already mounted with a different SELinux context than the Pod needs. Kubelet cannot start such a Pod and will retry, so the value of this metric may not represent the actual number of Pods.
- Stability Level:ALPHA
- Type: Gauge
- Labels: access_mode, volume_plugin
volume_manager_selinux_volume_context_mismatch_warnings_total
Number of errors when a Pod uses a volume that is already mounted with a different SELinux context than the Pod needs. They are not errors yet, but they will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.
- Stability Level:ALPHA
- Type: Gauge
- Labels: access_mode, volume_plugin
volume_manager_selinux_volumes_admitted_total
Number of volumes whose SELinux context was fine and will be mounted with mount -o context option.
- Stability Level:ALPHA
- Type: Gauge
- Labels: access_mode, volume_plugin
volume_manager_total_volumes
Number of volumes in Volume Manager
- Stability Level:ALPHA
- Type: Custom
- Labels: plugin_name, state
volume_operation_total_errors
Total volume operation errors
- Stability Level:ALPHA
- Type: Counter
- Labels: operation_name, plugin_name
volume_operation_total_seconds
Storage operation end to end duration in seconds
- Stability Level:ALPHA
- Type: Histogram
- Labels: operation_name, plugin_name
watch_cache_capacity
Total capacity of watch cache broken down by resource type.
- Stability Level:ALPHA
- Type: Gauge
- Labels:resource
watch_cache_capacity_decrease_total
Total number of watch cache capacity decrease events broken down by resource type.
- Stability Level:ALPHA
- Type: Counter
- Labels:resource
watch_cache_capacity_increase_total
Total number of watch cache capacity increase events broken down by resource type.
- Stability Level:ALPHA
- Type: Counter
- Labels:resource
workqueue_adds_total
Total number of adds handled by workqueue
- Stability Level:ALPHA
- Type: Counter
- Labels:name
workqueue_depth
Current depth of workqueue
- Stability Level:ALPHA
- Type: Gauge
- Labels:name
workqueue_longest_running_processor_seconds
How many seconds has the longest running processor for workqueue been running.
- Stability Level:ALPHA
- Type: Gauge
- Labels:name
workqueue_queue_duration_seconds
How long in seconds an item stays in workqueue before being requested.
- Stability Level:ALPHA
- Type: Histogram
- Labels:name
workqueue_retries_total
Total number of retries handled by workqueue
- Stability Level:ALPHA
- Type: Counter
- Labels:name
workqueue_unfinished_work_seconds
How many seconds of work has been done that is in progress and hasn’t been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
- Stability Level:ALPHA
- Type: Gauge
- Labels:name
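The hint about deducing stuck threads from the rate of increase can be made concrete: each stuck worker adds roughly one second of unfinished work per second of wall-clock time, so the slope of the gauge approximates the number of stuck threads. The sketch below assumes two or more samples of `workqueue_unfinished_work_seconds` for one queue; the function name and sample values are hypothetical.

```python
def estimate_stuck_threads(samples):
    """Estimate stuck workers from (timestamp_seconds, gauge_value) samples, oldest first."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need two samples with increasing timestamps")
    # Slope in seconds of unfinished work per second; a shrinking gauge means nothing is stuck.
    return max(0.0, (v1 - v0) / (t1 - t0))

# Hypothetical samples taken 60 seconds apart for a single workqueue name.
print(estimate_stuck_threads([(0.0, 100.0), (60.0, 220.0)]))
```

The estimate is only meaningful over a window in which no long-running item completes, since a completion drops the gauge and distorts the slope.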
workqueue_work_duration_seconds
How long in seconds processing an item from workqueue takes.
- Stability Level:ALPHA
- Type: Histogram
- Labels:name