Triggering virtual machine failover by resolving a failed node
If a node fails and machine health checks are not deployed on your cluster, virtual machines (VMs) with RunStrategy: Always
configured are not automatically relocated to healthy nodes. To trigger VM failover, you must manually delete the Node
object.
If you installed your cluster by using installer-provisioned infrastructure and you properly configured machine health checks:
|
Prerequisites
A node where a virtual machine was running has the
NotReady
condition.The virtual machine that was running on the failed node has
RunStrategy
set toAlways
.You have installed the OpenShift CLI (
oc
).
Deleting nodes from a bare metal cluster
When you delete a node using the CLI, the node object is deleted in Kubernetes, but the pods that exist on the node are not deleted. Any bare pods not backed by a replication controller become inaccessible to OKD. Pods backed by replication controllers are rescheduled to other available nodes. You must delete local manifest pods.
Procedure
Delete a node from an OKD cluster running on bare metal by completing the following steps:
Mark the node as unschedulable:
$ oc adm cordon <node_name>
Drain all pods on the node:
$ oc adm drain <node_name> --force=true
This step might fail if the node is offline or unresponsive. Even if the node does not respond, it might still be running a workload that writes to shared storage. To avoid data corruption, power down the physical hardware before you proceed.
Delete the node from the cluster:
$ oc delete node <node_name>
Although the node object is now deleted from the cluster, it can still rejoin the cluster after reboot or if the kubelet service is restarted. To permanently delete the node and all its data, you must decommission the node.
If you powered down the physical hardware, turn it back on so that the node can rejoin the cluster.
Verifying virtual machine failover
After all resources are terminated on the unhealthy node, a new virtual machine instance (VMI) is automatically created on a healthy node for each relocated VM. To confirm that the VMI was created, view all VMIs by using the oc
CLI.
Listing all virtual machine instances using the CLI
You can list all virtual machine instances (VMIs) in your cluster, including standalone VMIs and those owned by virtual machines, by using the oc
command-line interface (CLI).
Procedure
List all VMIs by running the following command:
$ oc get vmis