Rolling back to the OVN-Kubernetes network plugin

As a cluster administrator, you can roll back to the OVN-Kubernetes network plugin from the OpenShift SDN network plugin if the migration to OpenShift SDN is unsuccessful.

To learn more about OVN-Kubernetes, read About the OVN-Kubernetes network plugin.

Migrating to the OVN-Kubernetes network plugin

As a cluster administrator, you can change the network plugin for your cluster to OVN-Kubernetes. During the migration, you must reboot every node in your cluster.

While performing the migration, your cluster is unavailable and workloads might be interrupted. Perform the migration only when an interruption in service is acceptable.

Prerequisites

  • A cluster configured with the OpenShift SDN CNI network plugin in the network policy isolation mode.

  • The OpenShift CLI (oc) is installed.

  • Access to the cluster as a user with the cluster-admin role.

  • A recent backup of the etcd database is available.

  • A reboot can be triggered manually for each node.

  • The cluster is in a known good state, without any errors.

Procedure

  1. To back up the configuration for the cluster network, enter the following command:

    $ oc get Network.config.openshift.io cluster -o yaml > cluster-openshift-sdn.yaml
  2. To prepare all the nodes for the migration, set the migration field on the Cluster Network Operator configuration object by entering the following command:

    $ oc patch Network.operator.openshift.io cluster --type='merge' \
      --patch '{ "spec": { "migration": { "networkType": "OVNKubernetes" } } }'

    This step does not deploy OVN-Kubernetes immediately. Instead, specifying the migration field triggers the Machine Config Operator (MCO) to apply new machine configs to all the nodes in the cluster in preparation for the OVN-Kubernetes deployment.

  3. Optional: You can disable automatic migration of several OpenShift SDN capabilities to the OVN-Kubernetes equivalents:

    • Egress IPs

    • Egress firewall

    • Multicast

    To disable automatic migration of the configuration for any of the previously noted OpenShift SDN features, specify the following keys:

    $ oc patch Network.operator.openshift.io cluster --type='merge' \
      --patch '{
        "spec": {
          "migration": {
            "networkType": "OVNKubernetes",
            "features": {
              "egressIP": <bool>,
              "egressFirewall": <bool>,
              "multicast": <bool>
            }
          }
        }
      }'

    where:

    bool: Specifies whether to enable migration of the feature. The default is true.

  4. Optional: You can customize the following settings for OVN-Kubernetes to meet your network infrastructure requirements:

    • Maximum transmission unit (MTU)

    • Geneve (Generic Network Virtualization Encapsulation) overlay network port

    • OVN-Kubernetes IPv4 internal subnet

    • OVN-Kubernetes IPv6 internal subnet

    To customize any of the previously noted settings, enter and customize the following command. If you do not need to change the default value of a setting, omit its key from the patch.

    $ oc patch Network.operator.openshift.io cluster --type=merge \
      --patch '{
        "spec":{
          "defaultNetwork":{
            "ovnKubernetesConfig":{
              "mtu":<mtu>,
              "genevePort":<port>,
              "v4InternalSubnet":"<ipv4_subnet>",
              "v6InternalSubnet":"<ipv6_subnet>"
        }}}}'

    where:

    mtu

    The MTU for the Geneve overlay network. This value is normally configured automatically, but if the nodes in your cluster do not all use the same MTU, then you must set this explicitly to 100 less than the smallest node MTU value.

    port

    The UDP port for the Geneve overlay network. If a value is not specified, the default is 6081. The port cannot be the same as the VXLAN port that is used by OpenShift SDN. The default value for the VXLAN port is 4789.

    ipv4_subnet

    An IPv4 address range for internal use by OVN-Kubernetes. You must ensure that the IP address range does not overlap with any other subnet used by your OKD installation. The IP address range must be larger than the maximum number of nodes that can be added to the cluster. The default value is 100.64.0.0/16.

    ipv6_subnet

    An IPv6 address range for internal use by OVN-Kubernetes. You must ensure that the IP address range does not overlap with any other subnet used by your OKD installation. The IP address range must be larger than the maximum number of nodes that can be added to the cluster. The default value is fd98::/48.
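The size requirement on the internal subnet can be checked with simple arithmetic before you apply the patch. The following is a minimal sketch, not part of the product tooling: subnet_fits_nodes is a hypothetical helper name, and it assumes an IPv4 range in CIDR notation and a maximum node count that you supply yourself.

```shell
#!/bin/bash
# Hypothetical helper (not an oc command): succeed if an IPv4 CIDR range
# contains more addresses than the maximum number of nodes you plan for.
subnet_fits_nodes() {
  local len=${1#*/}      # prefix length, for example 16 from 100.64.0.0/16
  local max_nodes=$2
  (( (1 << (32 - len)) > max_nodes ))
}
```

For example, `subnet_fits_nodes 100.64.0.0/16 2000 && echo "large enough"` compares the 65536 addresses of the default range against a planned maximum of 2000 nodes.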

    Example patch command to update the mtu field

    $ oc patch Network.operator.openshift.io cluster --type=merge \
      --patch '{
        "spec":{
          "defaultNetwork":{
            "ovnKubernetesConfig":{
              "mtu":1200
        }}}}'
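When node MTUs differ, the explicit mtu value is 100 less than the smallest node MTU. The following sketch shows only that arithmetic. The function name overlay_mtu is hypothetical, and how you collect the per-node MTU values is left to you; the function simply reads one value per line.

```shell
#!/bin/bash
# Hypothetical helper: read one node MTU per line on stdin and print
# the overlay MTU, which is the smallest node MTU minus 100.
overlay_mtu() {
  local min="" mtu
  while read -r mtu; do
    [[ "$mtu" =~ ^[0-9]+$ ]] || continue   # skip blank or non-numeric lines
    if [[ -z "$min" || "$mtu" -lt "$min" ]]; then
      min="$mtu"
    fi
  done
  [[ -n "$min" ]] || return 1              # no usable input
  echo $(( min - 100 ))
}
```

For example, `printf '1500\n9000\n1300\n' | overlay_mtu` prints 1200, the value used in the example patch command.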
  5. As the MCO updates machines in each machine config pool, it reboots each node one by one. You must wait until all the nodes are updated. Check the machine config pool status by entering the following command:

    $ oc get mcp

    When all nodes in a machine config pool are successfully updated, the pool has the following status: UPDATED=true, UPDATING=false, DEGRADED=false.

    By default, the MCO updates one machine per pool at a time, causing the total time the migration takes to increase with the size of the cluster.
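Because the rollout can take a long time, you might want to script the status check. The following is a sketch under stated assumptions, not a supported tool: mcp_all_updated is a hypothetical function name, it reads the table printed by `oc get mcp` on stdin, it finds the UPDATED, UPDATING, and DEGRADED columns by their header names, and it assumes that no cell in the table is empty. Values are compared case-insensitively.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if every machine config pool in the
# `oc get mcp` table on stdin is updated, not updating, and not degraded.
mcp_all_updated() {
  awk '
    NR == 1 {
      # Map header names to column numbers.
      for (i = 1; i <= NF; i++) col[$i] = i
      u = col["UPDATED"]; p = col["UPDATING"]; d = col["DEGRADED"]
      if (!u || !p || !d) { bad = 1; exit }   # unexpected header
      next
    }
    {
      rows++
      if (tolower($u) != "true" || tolower($p) != "false" || tolower($d) != "false") {
        print "pool not ready: " $1
        bad = 1
      }
    }
    END { if (rows == 0 || bad) exit 1 }
  '
}
```

A polling loop such as `until oc get mcp | mcp_all_updated; do sleep 30; done` then waits until every pool reports the expected status.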

  6. Confirm the status of the new machine configuration on the hosts:

    1. To list the machine configuration state and the name of the applied machine configuration, enter the following command:

      $ oc describe node | grep -E "hostname|machineconfig"

      Example output

      kubernetes.io/hostname=master-0
      machineconfiguration.openshift.io/currentConfig: rendered-master-c53e221d9d24e1c8bb6ee89dd3d8ad7b
      machineconfiguration.openshift.io/desiredConfig: rendered-master-c53e221d9d24e1c8bb6ee89dd3d8ad7b
      machineconfiguration.openshift.io/reason:
      machineconfiguration.openshift.io/state: Done

      Verify that the following statements are true:

      • The value of the machineconfiguration.openshift.io/state field is Done.

      • The value of the machineconfiguration.openshift.io/currentConfig field is equal to the value of the machineconfiguration.openshift.io/desiredConfig field.
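On a cluster with many nodes, the two statements above can be checked mechanically. The following sketch assumes the filtered output has the same shape as the example output shown earlier, with a hostname line followed by that node's machineconfiguration annotations. The function name nodes_config_done is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read the filtered `oc describe node` output on stdin
# and succeed only if, for every node, state is Done and currentConfig
# equals desiredConfig.
nodes_config_done() {
  awk '
    function check() {
      if (node == "") return
      if (state != "Done" || cur == "" || cur != des) {
        print "node not ready: " node
        bad = 1
      }
    }
    /kubernetes\.io\/hostname=/ {
      check()                              # evaluate the previous node
      node = $NF; sub(/^.*hostname=/, "", node)
      cur = des = state = ""
      seen++
    }
    /machineconfiguration\.openshift\.io\/currentConfig:/ { cur = $NF }
    /machineconfiguration\.openshift\.io\/desiredConfig:/ { des = $NF }
    /machineconfiguration\.openshift\.io\/state:/         { state = $NF }
    END { check(); if (seen == 0 || bad) exit 1 }
  '
}
```

You can pipe the output of the previous command into the function; it prints the name of each node that does not yet meet both conditions.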

    2. To confirm that the machine config is correct, enter the following command:

      $ oc get machineconfig <config_name> -o yaml | grep ExecStart

      where <config_name> is the name of the machine config from the machineconfiguration.openshift.io/currentConfig field.

      The machine config must include the following update to the systemd configuration:

      ExecStart=/usr/local/bin/configure-ovs.sh OVNKubernetes
    3. If a node is stuck in the NotReady state, investigate the machine config daemon pod logs and resolve any errors.

      1. To list the pods, enter the following command:

        $ oc get pod -n openshift-machine-config-operator

        Example output

        NAME                                         READY   STATUS    RESTARTS   AGE
        machine-config-controller-75f756f89d-sjp8b   1/1     Running   0          37m
        machine-config-daemon-5cf4b                  2/2     Running   0          43h
        machine-config-daemon-7wzcd                  2/2     Running   0          43h
        machine-config-daemon-fc946                  2/2     Running   0          43h
        machine-config-daemon-g2v28                  2/2     Running   0          43h
        machine-config-daemon-gcl4f                  2/2     Running   0          43h
        machine-config-daemon-l5tnv                  2/2     Running   0          43h
        machine-config-operator-79d9c55d5-hth92      1/1     Running   0          37m
        machine-config-server-bsc8h                  1/1     Running   0          43h
        machine-config-server-hklrm                  1/1     Running   0          43h
        machine-config-server-k9rtx                  1/1     Running   0          43h

        The names of the machine config daemon pods are in the following format: machine-config-daemon-<seq>. The <seq> value is a random five-character alphanumeric sequence.

      2. Display the pod log for the first machine config daemon pod shown in the previous output by entering the following command:

        $ oc logs <pod> -n openshift-machine-config-operator

        where <pod> is the name of a machine config daemon pod.

      3. Resolve any errors in the logs shown by the output from the previous command.

  7. To start the migration, configure the OVN-Kubernetes network plugin by using one of the following commands:

    • To specify the network plugin without changing the cluster network IP address block, enter the following command:

      $ oc patch Network.config.openshift.io cluster \
        --type='merge' --patch '{ "spec": { "networkType": "OVNKubernetes" } }'
    • To specify a different cluster network IP address block, enter the following command:

      $ oc patch Network.config.openshift.io cluster \
        --type='merge' --patch '{
          "spec": {
            "clusterNetwork": [
              {
                "cidr": "<cidr>",
                "hostPrefix": <prefix>
              }
            ],
            "networkType": "OVNKubernetes"
          }
        }'

      where <cidr> is a CIDR block and <prefix> is the prefix length of the subnet that is apportioned from that block to each node in your cluster. You cannot use any CIDR block that overlaps with the 100.64.0.0/16 CIDR block because the OVN-Kubernetes network plugin uses this block internally.

      You cannot change the service network address block during the migration.
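The overlap restriction can be tested before you run the patch command. The following is an illustrative sketch for IPv4 only, in plain bash: the function names ip_to_int and cidr_overlaps are hypothetical. It relies on the fact that two CIDR blocks overlap exactly when their network addresses agree on the shorter of the two prefixes.

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Hypothetical helper: succeed (exit 0) if two IPv4 CIDR blocks overlap.
cidr_overlaps() {
  local ip1=${1%/*} len1=${1#*/} ip2=${2%/*} len2=${2#*/}
  local len=$(( len1 < len2 ? len1 : len2 ))      # compare on the shorter prefix
  local mask=$(( len == 0 ? 0 : (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  (( ( $(ip_to_int "$ip1") & mask ) == ( $(ip_to_int "$ip2") & mask ) ))
}
```

For example, `cidr_overlaps "<cidr>" 100.64.0.0/16 && echo "choose a different block"` flags a cluster network block that collides with the default internal range. If you customized v4InternalSubnet, compare against that range instead.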

  8. Verify that the Multus daemon set rollout is complete before continuing with subsequent steps:

    $ oc -n openshift-multus rollout status daemonset/multus

    The names of the Multus pods are in the form multus-<xxxxx>, where <xxxxx> is a random sequence of letters. It might take several moments for the pods to restart.

    Example output

    Waiting for daemon set "multus" rollout to finish: 1 out of 6 new pods have been updated...
    ...
    Waiting for daemon set "multus" rollout to finish: 5 of 6 updated pods are available...
    daemon set "multus" successfully rolled out
  9. To complete changing the network plugin, reboot each node in your cluster. You can reboot the nodes in your cluster with either of the following approaches:

    • With the oc rsh command, you can use a bash script similar to the following:

      #!/bin/bash
      # Collect "<pod> <node>" pairs for the machine config daemon pods.
      readarray -t POD_NODES <<< "$(oc get pod -n openshift-machine-config-operator -o wide | grep daemon | awk '{print $1" "$7}')"
      for i in "${POD_NODES[@]}"
      do
        read -r POD NODE <<< "$i"
        # Schedule a reboot in one minute; retry until the command is accepted.
        until oc rsh -n openshift-machine-config-operator "$POD" chroot /rootfs shutdown -r +1
        do
          echo "cannot reboot node $NODE, retry" && sleep 3
        done
      done
    • With the ssh command, you can use a bash script similar to the following. The script assumes that you have configured sudo to not prompt for a password.

      #!/bin/bash
      # Reboot every node over SSH, addressed by its InternalIP.
      for ip in $(oc get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}')
      do
        echo "reboot node $ip"
        ssh -o StrictHostKeyChecking=no "core@$ip" sudo shutdown -r -t 3
      done
  10. Confirm that the migration succeeded:

    1. To confirm that the network plugin is OVN-Kubernetes, enter the following command. The value of status.networkType must be OVNKubernetes.

      $ oc get network.config/cluster -o jsonpath='{.status.networkType}{"\n"}'
    2. To confirm that the cluster nodes are in the Ready state, enter the following command:

      $ oc get nodes
    3. To confirm that your pods are not in an error state, enter the following command:

      $ oc get pods --all-namespaces -o wide --sort-by='{.spec.nodeName}'

      If pods on a node are in an error state, reboot that node.

    4. To confirm that no cluster Operator is in an abnormal state, enter the following command:

      $ oc get co

      The status of every cluster Operator must be the following: AVAILABLE="True", PROGRESSING="False", DEGRADED="False". If a cluster Operator is unavailable or degraded, check the logs for the cluster Operator for more information.

  11. Complete the following steps only if the migration succeeds and your cluster is in a good state:

    1. To remove the migration configuration from the CNO configuration object, enter the following command:

      $ oc patch Network.operator.openshift.io cluster --type='merge' \
        --patch '{ "spec": { "migration": null } }'
    2. To remove custom configuration for the OpenShift SDN network plugin, enter the following command:

      $ oc patch Network.operator.openshift.io cluster --type='merge' \
        --patch '{ "spec": { "defaultNetwork": { "openshiftSDNConfig": null } } }'
    3. To remove the OpenShift SDN network plugin namespace, enter the following command:

      $ oc delete namespace openshift-sdn