High-availability or single-node cluster detection and support
An OKD cluster can be configured in high-availability (HA) mode, which uses multiple nodes, or in non-HA mode, which uses a single node. A single-node cluster, also known as single-node OpenShift, is likely to have more conservative resource constraints. Therefore, it is important that Operators installed on a single-node cluster can adjust accordingly and still run well.
By accessing the cluster high-availability mode API provided in OKD, Operator authors can use the Operator SDK to enable their Operator to detect a cluster’s infrastructure topology, either HA or non-HA mode. Custom Operator logic can be developed that uses the detected cluster topology to automatically switch the resource requirements, both for the Operator and for any Operands or workloads it manages, to a profile that best fits the topology.
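As a minimal sketch of the profile switching described above, the following Go program maps a detected topology to a set of resource requests. The `TopologyMode` type is a local stand-in for the values reported by the Infrastructure API, and the `ResourceProfile` type, the `profileFor` function, and all CPU and memory figures are hypothetical names and numbers chosen for illustration only.

```go
package main

import "fmt"

// TopologyMode is a local stand-in for the topology values reported by the
// Infrastructure API. It is defined here so the sketch has no dependencies.
type TopologyMode string

const (
	HighlyAvailable TopologyMode = "HighlyAvailable"
	SingleReplica   TopologyMode = "SingleReplica"
)

// ResourceProfile is a hypothetical bundle of resource requests that an
// Operator could apply to itself or to its Operands.
type ResourceProfile struct {
	CPURequest    string
	MemoryRequest string
}

// profileFor returns a smaller profile for single-node clusters. Any other
// value, including unknown ones, falls back to the HA profile. The figures
// are illustrative assumptions, not recommended settings.
func profileFor(mode TopologyMode) ResourceProfile {
	if mode == SingleReplica {
		return ResourceProfile{CPURequest: "50m", MemoryRequest: "64Mi"}
	}
	return ResourceProfile{CPURequest: "200m", MemoryRequest: "256Mi"}
}

func main() {
	for _, mode := range []TopologyMode{HighlyAvailable, SingleReplica} {
		p := profileFor(mode)
		fmt.Printf("%s: cpu=%s memory=%s\n", mode, p.CPURequest, p.MemoryRequest)
	}
}
```

In a real Operator, the mode passed to `profileFor` would come from the Infrastructure API status rather than from a hard-coded list.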
About the cluster high-availability mode API
OKD provides a cluster high-availability mode API that can be used by Operators to help detect infrastructure topology. The Infrastructure API holds cluster-wide information regarding infrastructure. Operators managed by Operator Lifecycle Manager (OLM) can use the Infrastructure API if they need to configure an Operand or managed workload differently based on the high-availability mode.
In the Infrastructure API, the infrastructureTopology status expresses the expectations for infrastructure services that do not run on control plane nodes, usually indicated by a node selector for a role value other than master. The controlPlaneTopology status expresses the expectations for Operands that normally run on control plane nodes.
The default setting for either status is HighlyAvailable, which represents the behavior Operators have in multiple node clusters. The SingleReplica setting is used in single-node clusters, also known as single-node OpenShift, and indicates that Operators should not configure their Operands for high-availability operation.
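The two status fields can be combined into a replica decision along the following lines. This is a sketch under stated assumptions: `TopologyMode` is a local stand-in for the API values, and `relevantTopology`, `replicasFor`, and the `haReplicas` parameter are hypothetical helpers, not part of any library.

```go
package main

import "fmt"

// TopologyMode is a local stand-in for the values of the controlPlaneTopology
// and infrastructureTopology status fields.
type TopologyMode string

const (
	HighlyAvailable TopologyMode = "HighlyAvailable"
	SingleReplica   TopologyMode = "SingleReplica"
)

// relevantTopology picks the status that applies to an Operand: the control
// plane topology for Operands that run on control plane nodes, and the
// infrastructure topology for everything else.
func relevantTopology(controlPlane, infrastructure TopologyMode, runsOnControlPlane bool) TopologyMode {
	if runsOnControlPlane {
		return controlPlane
	}
	return infrastructure
}

// replicasFor returns 1 when the topology is SingleReplica and otherwise the
// replica count the Operator would normally use (haReplicas, an assumption).
func replicasFor(mode TopologyMode, haReplicas int32) int32 {
	if mode == SingleReplica {
		return 1
	}
	return haReplicas
}

func main() {
	// Example: an HA control plane combined with a single worker node.
	controlPlane, infrastructure := HighlyAvailable, SingleReplica

	onControlPlane := relevantTopology(controlPlane, infrastructure, true)
	onWorkers := relevantTopology(controlPlane, infrastructure, false)

	fmt.Println("control plane Operand replicas:", replicasFor(onControlPlane, 3))
	fmt.Println("worker Operand replicas:", replicasFor(onWorkers, 3))
}
```

Keeping the choice of status field separate from the replica calculation lets the same Operator manage Operands that run on different node roles.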
The OKD installer sets the controlPlaneTopology and infrastructureTopology status fields based on the replica counts for the cluster when it is created, according to the following rules:
When the control plane replica count is less than 3, the controlPlaneTopology status is set to SingleReplica. Otherwise, it is set to HighlyAvailable.
When the worker replica count is 0, the control plane nodes are also configured as workers. Therefore, the infrastructureTopology status will be the same as the controlPlaneTopology status.
When the worker replica count is 1, the infrastructureTopology is set to SingleReplica. Otherwise, it is set to HighlyAvailable.
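The rules above can be restated as a small function. This is only an illustration of the rules as written in this section, not the installer's actual code. The `topologiesFor` function and the local `TopologyMode` type are hypothetical.

```go
package main

import "fmt"

// TopologyMode is a local stand-in for the topology status values.
type TopologyMode string

const (
	HighlyAvailable TopologyMode = "HighlyAvailable"
	SingleReplica   TopologyMode = "SingleReplica"
)

// topologiesFor applies the documented rules to a pair of replica counts.
func topologiesFor(controlPlaneReplicas, workerReplicas int) (controlPlane, infrastructure TopologyMode) {
	// Rule 1: fewer than 3 control plane replicas means SingleReplica.
	controlPlane = HighlyAvailable
	if controlPlaneReplicas < 3 {
		controlPlane = SingleReplica
	}

	switch workerReplicas {
	case 0:
		// Rule 2: control plane nodes double as workers, so both match.
		infrastructure = controlPlane
	case 1:
		// Rule 3: a single worker means SingleReplica.
		infrastructure = SingleReplica
	default:
		infrastructure = HighlyAvailable
	}
	return controlPlane, infrastructure
}

func main() {
	cases := [][2]int{{3, 3}, {3, 1}, {3, 0}, {1, 0}}
	for _, c := range cases {
		cp, infra := topologiesFor(c[0], c[1])
		fmt.Printf("controlPlane=%d workers=%d -> %s / %s\n", c[0], c[1], cp, infra)
	}
}
```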
Example API usage in Operator projects
As an Operator author, you can update your Operator project to access the Infrastructure API by using normal Kubernetes constructs and the controller-runtime library, as shown in the following examples:
controller-runtime library example
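// Simple query: the Infrastructure object is cluster-scoped and named "cluster",
// so the key carries no namespace. crClient is a controller-runtime client whose
// scheme must include the config.openshift.io/v1 types.
nn := types.NamespacedName{
	Name: "cluster",
}
infraConfig := &configv1.Infrastructure{}
if err := crClient.Get(context.Background(), nn, infraConfig); err != nil {
	return err
}
fmt.Printf("using crclient: %v\n", infraConfig.Status.ControlPlaneTopology)
fmt.Printf("using crclient: %v\n", infraConfig.Status.InfrastructureTopology)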
Kubernetes constructs example
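// Optional: build a shared informer factory and a lister for cached reads.
// infrastructureLister is assumed to be declared elsewhere. The factory must be
// started and its cache synced before the lister returns data.
operatorConfigInformer := configinformer.NewSharedInformerFactoryWithOptions(configClient, 2*time.Second)
infrastructureLister = operatorConfigInformer.Config().V1().Infrastructures().Lister()

// Direct query: fetch the cluster-scoped Infrastructure object named "cluster".
infraConfig, err := configClient.ConfigV1().Infrastructures().Get(context.Background(), "cluster", metav1.GetOptions{})
if err != nil {
	return err
}
fmt.Printf("%v\n", infraConfig.Status.ControlPlaneTopology)
fmt.Printf("%v\n", infraConfig.Status.InfrastructureTopology)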