Monitoring virtual machine health
A virtual machine instance (VMI) can become unhealthy due to transient issues such as connectivity loss, deadlocks, or problems with external dependencies. A health check periodically performs diagnostics on a VMI by using any combination of the readiness and liveness probes.
About readiness and liveness probes
Use readiness and liveness probes to detect and handle unhealthy virtual machine instances (VMIs). You can include one or more probes in the specification of the VMI to ensure that traffic does not reach a VMI that is not ready for it and that a new instance is created when a VMI becomes unresponsive.
A readiness probe determines whether a VMI is ready to accept service requests. If the probe fails, the VMI is removed from the list of available endpoints until the VMI is ready.
A liveness probe determines whether a VMI is responsive. If the probe fails, the VMI is deleted and a new instance is created to restore responsiveness.
You can configure readiness and liveness probes by setting the spec.readinessProbe and spec.livenessProbe fields of the VirtualMachineInstance object. These fields support the following tests:
HTTP GET
The probe determines the health of the VMI by using a web hook. The test is successful if the HTTP response code is between 200 and 399. You can use an HTTP GET test with applications that return HTTP status codes when they are completely initialized.
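As an illustration of that success rule, the following shell sketch (with assumed status codes; it is not part of the probe configuration) shows which responses count as healthy:

```shell
# Sketch of the HTTP GET success rule: a response code from 200
# through 399 counts as healthy; anything else is a probe failure.
probe_result() {
  if [ "$1" -ge 200 ] && [ "$1" -le 399 ]; then
    echo "healthy"
  else
    echo "unhealthy"
  fi
}

probe_result 200   # healthy
probe_result 302   # healthy: redirects still pass
probe_result 500   # unhealthy
```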
TCP socket
The probe attempts to open a socket to the VMI. The VMI is only considered healthy if the probe can establish a connection. You can use a TCP socket test with applications that do not start listening until initialization is complete.
Guest agent ping
The probe uses the guest-ping command to determine if the QEMU guest agent is running on the virtual machine.
Defining an HTTP readiness probe
Define an HTTP readiness probe by setting the spec.readinessProbe.httpGet field of the virtual machine instance (VMI) configuration.
Procedure
Include details of the readiness probe in the VMI configuration file.
Sample readiness probe with an HTTP GET test
# ...
spec:
  readinessProbe:
    httpGet: (1)
      port: 1500 (2)
      path: /healthz (3)
      httpHeaders:
      - name: Custom-Header
        value: Awesome
    initialDelaySeconds: 120 (4)
    periodSeconds: 20 (5)
    timeoutSeconds: 10 (6)
    failureThreshold: 3 (7)
    successThreshold: 3 (8)
# ...
1 The HTTP GET request to perform to connect to the VMI.
2 The port of the VMI that the probe queries. In the above example, the probe queries port 1500.
3 The path to access on the HTTP server. In the above example, if the handler for the server’s /healthz path returns a success code, the VMI is considered to be healthy. If the handler returns a failure code, the VMI is removed from the list of available endpoints.
4 The time, in seconds, after the VMI starts before the readiness probe is initiated.
5 The delay, in seconds, between performing probes. The default delay is 10 seconds. This value must be greater than timeoutSeconds.
6 The number of seconds of inactivity after which the probe times out and the VMI is assumed to have failed. The default value is 1. This value must be lower than periodSeconds.
7 The number of times that the probe is allowed to fail. The default is 3. After the specified number of attempts, the pod is marked Unready.
8 The number of times that the probe must report success, after a failure, to be considered successful. The default is 1.
Create the VMI by running the following command:
$ oc create -f <file_name>.yaml
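The failureThreshold and successThreshold settings track consecutive outcomes: repeated failures mark the VMI Unready, and it must then report the configured number of consecutive successes to become Ready again. The following shell sketch models that accounting; it is illustrative only, not KubeVirt code:

```shell
# Illustrative model of how failureThreshold and successThreshold
# govern the Ready state; a sketch, not KubeVirt's implementation.
failureThreshold=3
successThreshold=3
state="Ready"
fails=0
successes=0

# Simulated probe outcomes: three failures, then three successes.
for outcome in fail fail fail ok ok ok; do
  if [ "$outcome" = "fail" ]; then
    fails=$((fails + 1))
    successes=0
    if [ "$fails" -ge "$failureThreshold" ]; then
      state="Unready"
    fi
  else
    successes=$((successes + 1))
    fails=0
    if [ "$successes" -ge "$successThreshold" ]; then
      state="Ready"
    fi
  fi
done
echo "$state"   # prints Ready: three consecutive successes restored readiness
```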
Defining a TCP readiness probe
Define a TCP readiness probe by setting the spec.readinessProbe.tcpSocket field of the virtual machine instance (VMI) configuration.
Procedure
Include details of the TCP readiness probe in the VMI configuration file.
Sample readiness probe with a TCP socket test
# ...
spec:
  readinessProbe:
    initialDelaySeconds: 120 (1)
    periodSeconds: 20 (2)
    tcpSocket: (3)
      port: 1500 (4)
    timeoutSeconds: 10 (5)
# ...
1 The time, in seconds, after the VMI starts before the readiness probe is initiated.
2 The delay, in seconds, between performing probes. The default delay is 10 seconds. This value must be greater than timeoutSeconds.
3 The TCP action to perform.
4 The port of the VMI that the probe queries.
5 The number of seconds of inactivity after which the probe times out and the VMI is assumed to have failed. The default value is 1. This value must be lower than periodSeconds.
Create the VMI by running the following command:
$ oc create -f <file_name>.yaml
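The TCP test's semantics can be sketched locally: the probe succeeds only if a connection to the target port can be established. This sketch assumes a bash shell with /dev/tcp support and uses python3 as a stand-in for a guest service that has finished initializing and started listening on the example port 1500; it is not part of the VMI configuration:

```shell
# Sketch of the TCP socket test: the probe succeeds only if a TCP
# connection to the target port can be established.
tcp_probe() {
  if timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "success"
  else
    echo "failure"
  fi
}

# Stand-in for a guest service that is listening after initialization
# (assumes python3 is available and port 1500 is free).
python3 -m http.server 1500 >/dev/null 2>&1 &
server_pid=$!
sleep 1

tcp_probe 127.0.0.1 1500
kill "$server_pid" 2>/dev/null
```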
Defining an HTTP liveness probe
Define an HTTP liveness probe by setting the spec.livenessProbe.httpGet field of the virtual machine instance (VMI) configuration. You can define both HTTP and TCP tests for liveness probes in the same way as readiness probes. This procedure configures a sample liveness probe with an HTTP GET test.
Procedure
Include details of the HTTP liveness probe in the VMI configuration file.
Sample liveness probe with an HTTP GET test
# ...
spec:
  livenessProbe:
    initialDelaySeconds: 120 (1)
    periodSeconds: 20 (2)
    httpGet: (3)
      port: 1500 (4)
      path: /healthz (5)
      httpHeaders:
      - name: Custom-Header
        value: Awesome
    timeoutSeconds: 10 (6)
# ...
1 The time, in seconds, after the VMI starts before the liveness probe is initiated.
2 The delay, in seconds, between performing probes. The default delay is 10 seconds. This value must be greater than timeoutSeconds.
3 The HTTP GET request to perform to connect to the VMI.
4 The port of the VMI that the probe queries. In the above example, the probe queries port 1500. The VMI installs and runs a minimal HTTP server on port 1500 via cloud-init.
5 The path to access on the HTTP server. In the above example, if the handler for the server’s /healthz path returns a success code, the VMI is considered to be healthy. If the handler returns a failure code, the VMI is deleted and a new instance is created.
6 The number of seconds of inactivity after which the probe times out and the VMI is assumed to have failed. The default value is 1. This value must be lower than periodSeconds.
Create the VMI by running the following command:
$ oc create -f <file_name>.yaml
Defining a guest agent ping probe
Define a guest agent ping probe by setting the spec.readinessProbe.guestAgentPing field of the virtual machine instance (VMI) configuration.
The guest agent ping probe is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process. For more information about the support scope of Red Hat Technology Preview features, see https://access.redhat.com/support/offerings/techpreview/.
Prerequisites
- The QEMU guest agent must be installed and enabled on the virtual machine.
Procedure
Include details of the guest agent ping probe in the VMI configuration file. For example:
Sample guest agent ping probe
# ...
spec:
  readinessProbe:
    guestAgentPing: {} (1)
    initialDelaySeconds: 120 (2)
    periodSeconds: 20 (3)
    timeoutSeconds: 10 (4)
    failureThreshold: 3 (5)
    successThreshold: 3 (6)
# ...
1 The guest agent ping probe to connect to the VMI.
2 Optional: The time, in seconds, after the VMI starts before the guest agent probe is initiated.
3 Optional: The delay, in seconds, between performing probes. The default delay is 10 seconds. This value must be greater than timeoutSeconds.
4 Optional: The number of seconds of inactivity after which the probe times out and the VMI is assumed to have failed. The default value is 1. This value must be lower than periodSeconds.
5 Optional: The number of times that the probe is allowed to fail. The default is 3. After the specified number of attempts, the pod is marked Unready.
6 Optional: The number of times that the probe must report success, after a failure, to be considered successful. The default is 1.
Create the VMI by running the following command:
$ oc create -f <file_name>.yaml
Template: Virtual machine configuration file for defining health checks
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    special: vm-fedora
  name: vm-fedora
spec:
  template:
    metadata:
      labels:
        special: vm-fedora
    spec:
      domain:
        devices:
          disks:
          - disk:
              bus: virtio
            name: containerdisk
          - disk:
              bus: virtio
            name: cloudinitdisk
        resources:
          requests:
            memory: 1024M
      readinessProbe:
        httpGet:
          port: 1500
        initialDelaySeconds: 120
        periodSeconds: 20
        timeoutSeconds: 10
        failureThreshold: 3
        successThreshold: 3
      terminationGracePeriodSeconds: 180
      volumes:
      - name: containerdisk
        containerDisk:
          image: kubevirt/fedora-cloud-registry-disk-demo
      - cloudInitNoCloud:
          userData: |-
            #cloud-config
            password: fedora
            chpasswd: { expire: False }
            bootcmd:
              - setenforce 0
              - dnf install -y nmap-ncat
              - systemd-run --unit=httpserver nc -klp 1500 -e '/usr/bin/echo -e HTTP/1.1 200 OK\\n\\nHello World!'
        name: cloudinitdisk