Rolling back to the OpenShift SDN network provider
As a cluster administrator, you can roll back to the OpenShift SDN Container Network Interface (CNI) cluster network provider from the OVN-Kubernetes CNI cluster network provider if the migration to OVN-Kubernetes is unsuccessful.
Rolling back the default CNI network provider to OpenShift SDN
As a cluster administrator, you can roll back your cluster to the OpenShift SDN default Container Network Interface (CNI) network provider. During the rollback, you must reboot every node in your cluster.
Roll back to OpenShift SDN only if the migration to OVN-Kubernetes fails.
Prerequisites
Install the OpenShift CLI (oc).
Access to the cluster as a user with the cluster-admin role.
A cluster installed on infrastructure configured with the OVN-Kubernetes default CNI network provider.
Procedure
To enable the migration, set an annotation on the Cluster Network Operator configuration object by entering the following command:
$ oc annotate Network.operator.openshift.io cluster \
'networkoperator.openshift.io/network-migration'=""
Stop all of the machine configuration pools managed by the Machine Config Operator (MCO):
Stop the master configuration pool:
$ oc patch MachineConfigPool master --type='merge' --patch \
'{ "spec": { "paused": true } }'
Stop the worker configuration pool:
$ oc patch MachineConfigPool worker --type='merge' --patch \
'{ "spec": { "paused": true } }'
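The pause commands above, and the matching resume commands later in this procedure, differ only in the pool name and the value of paused. As a minimal sketch, the following helper prints one oc patch command per pool so that you can review the commands before running them. The function name mcp_pause_cmds is hypothetical and is not part of oc.

```shell
# mcp_pause_cmds: hypothetical helper, not part of oc.
# Prints (does not run) one `oc patch` command per machine config pool.
#   $1     true to pause the pools, false to resume them
#   $2...  pool names, for example: master worker
mcp_pause_cmds() {
  local paused=$1 pool
  shift
  for pool in "$@"; do
    printf "oc patch MachineConfigPool %s --type=merge --patch '{\"spec\":{\"paused\":%s}}'\n" \
      "$pool" "$paused"
  done
}

# Example: print the commands that pause both pools.
#   mcp_pause_cmds true master worker
```

Printing the commands instead of running them keeps the helper safe to try on any machine, including one without cluster access.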
To configure the OpenShift SDN cluster network provider, enter the following command:
$ oc patch Network.config.openshift.io cluster \
--type='merge' --patch '{ "spec": { "networkType": "OpenShiftSDN" } }'
Optional: You can customize the following settings for OpenShift SDN to meet your network infrastructure requirements:
Maximum transmission unit (MTU)
VXLAN port
To customize either or both of the previously noted settings, customize and enter the following command. If you do not need to change the default value, omit the key from the patch.
$ oc patch Network.operator.openshift.io cluster --type=merge \
--patch '{
"spec":{
"defaultNetwork":{
"openshiftSDNConfig":{
"mtu":<mtu>,
"vxlanPort":<port>
}}}}'
mtu
The MTU for the VXLAN overlay network. This value is normally configured automatically, but if the nodes in your cluster do not all use the same MTU, then you must set this explicitly to 50 less than the smallest node MTU value.
port
The UDP port for the VXLAN overlay network. If a value is not specified, the default is 4789. The port cannot be the same as the Geneve port that is used by OVN-Kubernetes. The default value for the Geneve port is 6081.
Example patch command
$ oc patch Network.operator.openshift.io cluster --type=merge \
--patch '{
"spec":{
"defaultNetwork":{
"openshiftSDNConfig":{
"mtu":1200
}}}}'
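The two rules above (set the MTU to 50 less than the smallest node MTU, and omit any key that you do not need to change) can be sketched as a pair of shell helpers. The function names sdn_mtu and sdn_patch are hypothetical, and the node MTU values are ones that you supply yourself. This is an illustration under those assumptions, not part of the documented procedure.

```shell
# sdn_mtu: hypothetical helper, not part of oc.
# Prints the smallest MTU passed as an argument, minus 50.
sdn_mtu() {
  local min=$1 mtu
  for mtu in "$@"; do
    if [ "$mtu" -lt "$min" ]; then
      min=$mtu
    fi
  done
  echo $((min - 50))
}

# sdn_patch: hypothetical helper, not part of oc.
# Prints the JSON patch body and includes only the keys that have a value.
#   $1  MTU (pass "" to omit the key)
#   $2  VXLAN port (pass "" or nothing to omit the key)
sdn_patch() {
  local mtu=${1:-} port=${2:-} fields=""
  if [ -n "$mtu" ]; then
    fields="\"mtu\":$mtu"
  fi
  if [ -n "$port" ]; then
    if [ -n "$fields" ]; then
      fields="$fields,"
    fi
    fields="$fields\"vxlanPort\":$port"
  fi
  printf '{"spec":{"defaultNetwork":{"openshiftSDNConfig":{%s}}}}\n' "$fields"
}

# Example: nodes that report MTUs of 1500 and 1450, default VXLAN port.
#   oc patch Network.operator.openshift.io cluster --type=merge \
#     --patch "$(sdn_patch "$(sdn_mtu 1500 1450)")"
```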
Wait until the Multus daemon set rollout completes.
$ oc -n openshift-multus rollout status daemonset/multus
The names of the Multus pods are in the form multus-<xxxxx>, where <xxxxx> is a random sequence of letters. It might take several moments for the pods to restart.
Example output
Waiting for daemon set "multus" rollout to finish: 1 out of 6 new pods have been updated...
...
Waiting for daemon set "multus" rollout to finish: 5 of 6 updated pods are available...
daemon set "multus" successfully rolled out
To complete the rollback, reboot each node in your cluster. For example, you could use a bash script similar to the following. The script assumes that you can connect to each host by using ssh and that you have configured sudo to not prompt for a password.
#!/bin/bash
# Reboot every node, addressed by its InternalIP, over ssh.
for ip in $(oc get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}')
do
  echo "reboot node $ip"
  ssh -o StrictHostKeyChecking=no "core@$ip" sudo shutdown -r -t 3
done
If ssh access is not available, you might be able to reboot each node through the management portal for your infrastructure provider.
After the nodes in your cluster have rebooted, start all of the machine configuration pools:
Start the master configuration pool:
$ oc patch MachineConfigPool master --type='merge' --patch \
'{ "spec": { "paused": false } }'
Start the worker configuration pool:
$ oc patch MachineConfigPool worker --type='merge' --patch \
'{ "spec": { "paused": false } }'
As the MCO updates machines in each config pool, it reboots each node.
By default, the MCO updates a single machine per pool at a time, so the time that the migration requires to complete grows with the size of the cluster.
Confirm the status of the new machine configuration on the hosts:
To list the machine configuration state and the name of the applied machine configuration, enter the following command:
$ oc describe node | egrep "hostname|machineconfig"
Example output
kubernetes.io/hostname=master-0
machineconfiguration.openshift.io/currentConfig: rendered-master-c53e221d9d24e1c8bb6ee89dd3d8ad7b
machineconfiguration.openshift.io/desiredConfig: rendered-master-c53e221d9d24e1c8bb6ee89dd3d8ad7b
machineconfiguration.openshift.io/reason:
machineconfiguration.openshift.io/state: Done
Verify that the following statements are true:
The value of the machineconfiguration.openshift.io/state field is Done.
The value of the machineconfiguration.openshift.io/currentConfig field is equal to the value of the machineconfiguration.openshift.io/desiredConfig field.
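On a cluster with many nodes, the two checks above can be tedious to do by eye. The following sketch reads the filtered oc describe node output on standard input and reports only the nodes that fail either check. The function name check_mc_state is hypothetical, and the parsing assumes the line layout shown in the example output above.

```shell
# check_mc_state: hypothetical helper, not part of oc.
# Reads `oc describe node | egrep "hostname|machineconfig"` output on stdin.
# Prints one line for each node whose state is not Done or whose
# currentConfig differs from its desiredConfig.
# Returns non-zero if any such node is found.
check_mc_state() {
  awk '
    function report() {
      if (host != "" && (state != "Done" || current != desired)) {
        print host ": state=" state " current=" current " desired=" desired
        bad = 1
      }
    }
    BEGIN { bad = 0 }
    /kubernetes\.io\/hostname=/ {
      report()                      # finish the previous node, if any
      split($0, kv, "=")
      host = kv[2]; state = ""; current = ""; desired = ""
    }
    /machineconfiguration\.openshift\.io\/currentConfig:/ { current = $NF }
    /machineconfiguration\.openshift\.io\/desiredConfig:/ { desired = $NF }
    /machineconfiguration\.openshift\.io\/state:/         { state = $NF }
    END { report(); exit bad }
  '
}

# Example (requires a logged-in oc session):
#   oc describe node | egrep "hostname|machineconfig" | check_mc_state
```

No output and a zero exit status mean that every node passed both checks.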
To confirm that the machine config is correct, enter the following command:
$ oc get machineconfig <config_name> -o yaml
where <config_name> is the name of the machine config from the machineconfiguration.openshift.io/currentConfig field.
Confirm that the migration succeeded:
To confirm that the default CNI network provider is OpenShift SDN, enter the following command. The value of status.networkType must be OpenShiftSDN.
$ oc get network.config/cluster -o jsonpath='{.status.networkType}{"\n"}'
To confirm that the cluster nodes are in the Ready state, enter the following command:
$ oc get nodes
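To pick out only the nodes that need attention, you can filter the node list. The following sketch assumes the usual column layout of oc get nodes, with the node name in the first column and the status in the second. The function name not_ready_nodes is hypothetical.

```shell
# not_ready_nodes: hypothetical helper, not part of oc.
# Reads `oc get nodes --no-headers` output on stdin and prints the name of
# every node whose status does not start with Ready. A status such as
# "Ready,SchedulingDisabled" still counts as Ready.
not_ready_nodes() {
  awk '{ split($2, s, ","); if (s[1] != "Ready") print $1 }'
}

# Example (requires a logged-in oc session):
#   oc get nodes --no-headers | not_ready_nodes
```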
If a node is stuck in the NotReady state, investigate the machine config daemon pod logs and resolve any errors.
To list the pods, enter the following command:
$ oc get pod -n openshift-machine-config-operator
Example output
NAME READY STATUS RESTARTS AGE
machine-config-controller-75f756f89d-sjp8b 1/1 Running 0 37m
machine-config-daemon-5cf4b 2/2 Running 0 43h
machine-config-daemon-7wzcd 2/2 Running 0 43h
machine-config-daemon-fc946 2/2 Running 0 43h
machine-config-daemon-g2v28 2/2 Running 0 43h
machine-config-daemon-gcl4f 2/2 Running 0 43h
machine-config-daemon-l5tnv 2/2 Running 0 43h
machine-config-operator-79d9c55d5-hth92 1/1 Running 0 37m
machine-config-server-bsc8h 1/1 Running 0 43h
machine-config-server-hklrm 1/1 Running 0 43h
machine-config-server-k9rtx 1/1 Running 0 43h
The names for the config daemon pods are in the following format: machine-config-daemon-<seq>. The <seq> value is a random five character alphanumeric sequence.
To display the pod log for each machine config daemon pod shown in the previous output, enter the following command:
$ oc logs <pod> -n openshift-machine-config-operator
where <pod> is the name of a machine config daemon pod.
Resolve any errors in the logs shown by the output from the previous command.
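Because the cluster runs one machine config daemon pod per node, you might want to collect the logs in a loop rather than one pod at a time. The following sketch extracts the daemon pod names from the pod listing shown earlier. The function name mcd_pods is hypothetical.

```shell
# mcd_pods: hypothetical helper, not part of oc.
# Reads `oc get pod -n openshift-machine-config-operator` output on stdin
# and prints only the names of the machine config daemon pods.
mcd_pods() {
  awk '$1 ~ /^machine-config-daemon-/ { print $1 }'
}

# Example (requires a logged-in oc session):
#   for pod in $(oc get pod -n openshift-machine-config-operator | mcd_pods); do
#     echo "=== $pod ==="
#     oc logs "$pod" -n openshift-machine-config-operator
#   done
```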
To confirm that your pods are not in an error state, enter the following command:
$ oc get pods --all-namespaces -o wide --sort-by='{.spec.nodeName}'
If pods on a node are in an error state, reboot that node.
Complete the following steps only if the migration succeeds and your cluster is in a good state:
To remove the migration annotation from the Cluster Network Operator configuration object, enter the following command:
$ oc annotate Network.operator.openshift.io cluster \
networkoperator.openshift.io/network-migration-
To remove the OVN-Kubernetes network provider namespace, enter the following command:
$ oc delete namespace openshift-ovn-kubernetes