Application Health

Overview

In software systems, components can become unhealthy due to transient issues (such as temporary connectivity loss), configuration errors, or problems with external dependencies. OKD applications have a number of options to detect and handle unhealthy containers.

Container Health Checks Using Probes

A probe is a Kubernetes action that periodically performs diagnostics on a running container. Currently, two types of probes exist, each serving a different purpose:

Liveness Probe

A liveness probe checks if the container in which it is configured is still running. If the liveness probe fails, the kubelet kills the container, which is then subject to its restart policy. Set a liveness check by configuring the template.spec.containers.livenessProbe stanza of a pod configuration.

Readiness Probe

A readiness probe determines if a container is ready to service requests. If a container fails its readiness probe, the endpoints controller removes the container's IP address from the endpoints of all services. A readiness probe can therefore signal to the endpoints controller that a container, even though it is running, should not receive any traffic from a proxy. Set a readiness check by configuring the template.spec.containers.readinessProbe stanza of a pod configuration.

The exact timing of a probe is controlled by two fields, both expressed in units of seconds:

Field                  Description
initialDelaySeconds    How long to wait after the container starts before beginning the probe.
timeoutSeconds         How long to wait for the probe to finish (default: 1). If this time is exceeded, OKD considers the probe to have failed.

Both probes can be configured in three ways:

HTTP Checks

The kubelet uses a web hook to determine the healthiness of the container. The check is deemed successful if the HTTP response code is between 200 and 399. The following is an example of a readiness check using the HTTP checks method:

Example 1. Readiness HTTP Check

    ...
    readinessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 15
      timeoutSeconds: 1
    ...

An HTTP check is ideal for applications that return HTTP status codes when completely initialized.

Container Execution Checks

The kubelet executes a command inside the container. Exiting the check with status 0 is considered a success. The following is an example of a liveness check using the container execution method:

Example 2. Liveness Container Execution Check

    ...
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/health
      initialDelaySeconds: 15
    ...

The timeoutSeconds parameter has no effect on the readiness and liveness probes for container execution checks, because OKD cannot time out on an exec call into the container. You can instead implement a timeout inside the probe itself. One way to do so is to run your liveness or readiness probe through the timeout command:

    […]
    livenessProbe:
      exec:
        command:
        - /bin/bash
        - '-c'
        - timeout 60 /opt/eap/bin/livenessProbe.sh # (1)
      timeoutSeconds: 1
      periodSeconds: 10
      successThreshold: 1
      failureThreshold: 3
    […]

(1) Timeout value and path to the probe script.
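
The same approach can be applied to a readiness probe. The following is a minimal sketch, not a tested configuration: the script path and the 5-second limit are hypothetical placeholders, and it assumes the container image provides bash and the GNU coreutils timeout command. When the time limit is reached, timeout exits with a non-zero status (124), so the check is treated as failed.

```yaml
readinessProbe:
  exec:
    command:
    - /bin/bash
    - '-c'
    # Hypothetical script path; the 5-second limit is an arbitrary choice.
    - timeout 5 /opt/app/bin/readinessProbe.sh
  initialDelaySeconds: 15
```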

TCP Socket Checks

The kubelet attempts to open a socket to the container. The container is considered healthy only if the check can establish a connection. The following is an example of a liveness check using the TCP socket check method:

Example 3. Liveness TCP Socket Check

    ...
    livenessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 15
      timeoutSeconds: 1
    ...

A TCP socket check is ideal for applications that do not start listening until initialization is complete.
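
To show where the two probe stanzas sit under template.spec.containers, the following sketch combines a TCP socket liveness check with an HTTP readiness check in one object. It assumes a Kubernetes Deployment; the object name, labels, image, and port are illustrative placeholders and do not come from a real application.

```yaml
# Hypothetical Deployment; the names, image, and port are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-app
        image: example.com/example-app:latest   # placeholder image
        ports:
        - containerPort: 8080
        # Restart the container if nothing accepts connections on the port.
        livenessProbe:
          tcpSocket:
            port: 8080
          initialDelaySeconds: 15
          timeoutSeconds: 1
        # Keep the container out of service endpoints until /healthz
        # answers with a status code between 200 and 399.
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 15
          timeoutSeconds: 1
```

Using separate checks for the two probes lets a container that is running but still initializing stay out of rotation without being restarted.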

For more information on health checks, see the Kubernetes documentation.