Control plane resiliency and recovery
You can use the control plane machine set to improve the resiliency of the control plane for your OKD cluster.
High availability and fault tolerance with failure domains
When possible, the control plane machine set spreads the control plane machines across multiple failure domains. This configuration provides high availability and fault tolerance within the control plane. This strategy can help protect the control plane when issues arise within the infrastructure provider.
Failure domain platform support and configuration
The control plane machine set concept of a failure domain is analogous to existing concepts on cloud providers. Not all platforms support the use of failure domains.
Cloud provider | Support for failure domains | Provider nomenclature |
---|---|---|
Amazon Web Services (AWS) | X | |
Google Cloud Platform (GCP) | X | |
Nutanix | Not applicable [1] | |
Microsoft Azure | X | |
VMware vSphere | Not applicable | |
OpenStack | X | OpenStack Nova availability zones and OpenStack Cinder availability zones |
[1] Nutanix has a failure domain concept, but OKD 4 does not include support for this feature.
The failure domain configuration in the control plane machine set custom resource (CR) is platform-specific. For more information about failure domain parameters in the CR, see the sample failure domain configuration for your provider.
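As an illustration of what a platform-specific failure domain configuration can look like, the following sketch shows the failure domain portion of a control plane machine set CR for AWS. Treat it as an assumption-laden example rather than a definitive configuration: the angle-bracket values are placeholders, the elided sections are left out deliberately, and the exact fields differ by provider.

```yaml
apiVersion: machine.openshift.io/v1
kind: ControlPlaneMachineSet
metadata:
  name: cluster
  namespace: openshift-machine-api
spec:
  # ... other spec fields omitted
  template:
    # ... other template fields omitted
    machines_v1beta1_machine_openshift_io:
      failureDomains:
        platform: AWS
        aws:
        # One entry per failure domain (an AWS availability zone).
        - placement:
            availabilityZone: <aws_zone_a>   # placeholder
          subnet:
            type: Filters
            filters:
            - name: tag:Name
              values:
              - <cluster_id>-private-<aws_zone_a>   # placeholder
        - placement:
            availabilityZone: <aws_zone_b>   # placeholder
          subnet:
            type: Filters
            filters:
            - name: tag:Name
              values:
              - <cluster_id>-private-<aws_zone_b>   # placeholder
```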
Balancing control plane machines
The control plane machine set balances control plane machines across the failure domains that are specified in the custom resource (CR).
When possible, the control plane machine set uses each failure domain equally to ensure appropriate fault tolerance. If there are fewer failure domains than control plane machines, failure domains are selected for reuse alphabetically by name. For clusters with no failure domains specified, all control plane machines are placed within a single failure domain.
Some changes to the failure domain configuration cause the control plane machine set to rebalance the control plane machines. For example, if you add failure domains to a cluster with fewer failure domains than control plane machines, the control plane machine set rebalances the machines across all available failure domains.
Recovery of failed control plane machines
The Control Plane Machine Set Operator automates the recovery of control plane machines. When a control plane machine is deleted, the Operator creates a replacement with the configuration that is specified in the ControlPlaneMachineSet custom resource (CR).
For clusters that use control plane machine sets, you can configure a machine health check. The machine health check deletes unhealthy control plane machines so that they are replaced.
If you configure a MachineHealthCheck resource for the control plane, set the value of maxUnhealthy to 1. This configuration ensures that the machine health check takes no action when multiple control plane machines appear to be unhealthy. Multiple unhealthy control plane machines can indicate that the etcd cluster is degraded or that a scaling operation to replace a failed machine is in progress. If the etcd cluster is degraded, manual intervention might be required. If a scaling operation is in progress, the machine health check should allow it to finish.
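A machine health check for control plane machines might look like the following sketch. The resource name, the selector labels, and the timeout values are assumptions chosen for illustration; adjust them to match your cluster. The `maxUnhealthy: 1` setting is the part that keeps the health check from acting when more than one control plane machine appears to be unhealthy.

```yaml
apiVersion: machine.openshift.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: control-plane-health          # hypothetical name
  namespace: openshift-machine-api
spec:
  # Assumed labels: select only control plane machines.
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-machine-role: master
      machine.openshift.io/cluster-api-machine-type: master
  # Illustrative conditions and timeouts for treating a node as unhealthy.
  unhealthyConditions:
  - type: Ready
    status: "False"
    timeout: 300s
  - type: Ready
    status: Unknown
    timeout: 300s
  # Take no remediation action if more than one machine is unhealthy.
  maxUnhealthy: 1
```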
Quorum protection with machine lifecycle hooks
For OKD clusters that use the Machine API Operator, the etcd Operator uses lifecycle hooks for the machine deletion phase to implement a quorum protection mechanism.
By using a preDrain lifecycle hook, the etcd Operator can control when the pods on a control plane machine are drained and removed. To protect etcd quorum, the etcd Operator prevents the removal of an etcd member until it migrates that member onto a new node within the cluster.
This mechanism allows the etcd Operator precise control over the members of the etcd quorum and allows the Machine API Operator to safely create and remove control plane machines without specific operational knowledge of the etcd cluster.
Control plane deletion with quorum protection processing order
When a control plane machine is replaced on a cluster that uses a control plane machine set, the cluster temporarily has four control plane machines. When the fourth control plane node joins the cluster, the etcd Operator starts a new etcd member on the replacement node. When the etcd Operator observes that the old control plane machine is marked for deletion, it stops the etcd member on the old node and promotes the replacement etcd member to join the quorum of the cluster.
The control plane machine Deleting phase proceeds in the following order:
1. A control plane machine is slated for deletion.
2. The control plane machine enters the Deleting phase.
3. To satisfy the preDrain lifecycle hook, the etcd Operator takes the following actions:
   a. The etcd Operator waits until a fourth control plane machine is added to the cluster as an etcd member. This new etcd member has a state of Running but not ready until it receives the full database update from the etcd leader.
   b. When the new etcd member receives the full database update, the etcd Operator promotes the new etcd member to a voting member and removes the old etcd member from the cluster.
   After this transition is complete, it is safe for the old etcd pod and its data to be removed, so the preDrain lifecycle hook is removed.
4. The control plane machine status condition Drainable is set to True.
5. The machine controller attempts to drain the node that is backed by the control plane machine.
   - If draining fails, Drained is set to False and the machine controller attempts to drain the node again.
   - If draining succeeds, Drained is set to True.
6. The control plane machine status condition Drained is set to True.
7. If no other Operators have added a preTerminate lifecycle hook, the control plane machine status condition Terminable is set to True.
8. The machine controller removes the instance from the infrastructure provider.
9. The machine controller deletes the Node object.
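To make the status conditions in this sequence more concrete, the following fragment sketches what the status of a control plane machine might show after the preDrain lifecycle hook is removed, the node drains successfully, and no preTerminate hook is present. It is an illustrative assumption, not captured output: the machine name is a placeholder, and real condition entries carry additional fields, such as timestamps, that are omitted here.

```yaml
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  name: <control_plane_machine_name>   # placeholder
  namespace: openshift-machine-api
status:
  phase: Deleting
  conditions:
  # Set once the preDrain lifecycle hook is removed.
  - type: Drainable
    status: "True"
  # Set once the machine controller drains the node successfully.
  - type: Drained
    status: "True"
  # Set when no preTerminate lifecycle hook remains on the machine.
  - type: Terminable
    status: "True"
```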
YAML snippet demonstrating the etcd quorum protection preDrain lifecycle hook
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  ...
spec:
  lifecycleHooks:
    preDrain:
    - name: EtcdQuorumOperator # (1)
      owner: clusteroperator/etcd # (2)
  ...
(1) The name of the preDrain lifecycle hook.
(2) The hook-implementing controller that manages the preDrain lifecycle hook.