Harvester Network Best Practice

Replace Ethernet NICs

You may want to replace the Ethernet NICs of a bare-metal node in a Harvester cluster for various reasons, including the following:

  • Malfunction or damage

  • Insufficient hardware capacity

  • Missing features

Follow the steps below and perform them on each node, one node at a time.

Pre-Replacement Checks

  1. Verify that the installed Harvester version supports the new NICs.

  2. Test the new NICs in a non-production environment.

  3. On the Virtual Machines screen of the Harvester UI, verify that the status of all VMs is either Running or Stopped.

  4. On the embedded Longhorn dashboard, verify that the status of all Longhorn volumes is Healthy. (A CLI alternative to checks 3 and 4 is sketched after this list.)

  5. (Optional) On the Harvester Support screen, generate a support bundle for comparison purposes.
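
If you prefer the command line, checks 3 and 4 can also be approximated with kubectl. This is a minimal sketch; it assumes kubectl access to the cluster and the default longhorn-system namespace and Longhorn CRDs shipped with Harvester.

  $ kubectl get vm -A                                    # VirtualMachine objects should report Running or Stopped
  $ kubectl get vmi -A                                   # running VMs have a VirtualMachineInstance in the Running phase
  $ kubectl get volumes.longhorn.io -n longhorn-system   # attached volumes should report healthy in the ROBUSTNESS column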

Collect Information

Before taking any action, collect the current network configuration and status from the sources below. A few additional CLI checks are sketched after this list.

  • Harvester network configuration: By default, Harvester creates a bond interface named mgmt-bo for the management network and one new bond interface for each cluster network. Harvester saves network configuration details in the file /oem/90_custom.yaml.

    Example: A NIC named ens3 was added to the mgmt-bo bond interface.

    - path: /etc/sysconfig/network/ifcfg-mgmt-bo
      permissions: 384
      owner: 0
      group: 0
      content: |+
        STARTMODE='onboot'
        BONDING_MASTER='yes'
        BOOTPROTO='none'
        POST_UP_SCRIPT="wicked:setup_bond.sh"
        BONDING_SLAVE_0='ens3'
        BONDING_MODULE_OPTS='miimon=100 mode=active-backup '
        DHCLIENT_SET_DEFAULT_ROUTE='no'
      encoding: ""
      ownerstring: ""
    - path: /etc/sysconfig/network/ifcfg-ens3
      permissions: 384
      owner: 0
      group: 0
      content: |
        STARTMODE='hotplug'
        BOOTPROTO='none'
      encoding: ""
      ownerstring: ""
  • Physical NICs: You can use the command ip link to retrieve related information, including the state of each NIC and the corresponding master (if applicable).

    Example:

    $ ip link | grep master -1
    2: ens3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master mgmt-bo state UP mode DEFAULT group default qlen 1000
        link/ether 52:54:00:03:3a:e4 brd ff:ff:ff:ff:ff:ff
    --
    4: mgmt-bo: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master mgmt-br state UP mode DEFAULT group default qlen 1000
        link/ether 52:54:00:03:3a:e4 brd ff:ff:ff:ff:ff:ff
  • PCI devices: You can use the command lspci to retrieve a list of devices, which allows you to quickly identify the network NICs. To retrieve detailed information about each device, use the command lspci -v.

    Example (lspci):

    $ lspci
    00:03.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 03)

    Example (lspci -v):

    $ lspci -v
    00:03.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 03)
        Subsystem: Red Hat, Inc. QEMU Virtual Machine
        Physical Slot: 3
        Flags: bus master, fast devsel, latency 0, IRQ 11
        Memory at fc080000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at c000 [size=64]
        Expansion ROM at fc000000 [disabled] [size=512K]
        Kernel driver in use: e1000
        Kernel modules: e1000
  • Linux kernel log: You can use the command dmesg to display kernel messages, which include most of the required information. If you save the messages to kernel.log, you can check the driver and link status.

    Harvester enslaves the member NICs to the bond interfaces. In the following example, an additional bond interface named data-bo is created in the cluster.

    $ grep "(slave" kernel.log        # or: dmesg | grep "(slave"
    Jan 08 00:35:00 localhost kernel: mgmt-bo: (slave eno5): Enslaving as a backup interface with an up link
    Jan 08 00:35:00 localhost kernel: mgmt-bo: (slave ens4f0): Enslaving as a backup interface with an up link
    Jan 08 00:37:34 localhost kernel: data-bo: (slave eno6): Enslaving as a backup interface with an up link
    Jan 08 00:37:35 localhost kernel: data-bo: (slave ens4f1): Enslaving as a backup interface with an up link

    The NICs are renamed.

    $ grep "renamed" kernel.log
    Jan 08 00:34:48 localhost kernel: tg3 0000:02:00.0 eno1: renamed from eth2   // eth2 / eno1 is not used by Harvester
    Jan 08 00:34:48 localhost kernel: tg3 0000:02:00.3 eno4: renamed from eth6   // eth6 / eno4 is not used by Harvester
    Jan 08 00:34:48 localhost kernel: tg3 0000:02:00.2 eno3: renamed from eth5   // eth5 / eno3 is not used by Harvester
    Jan 08 00:34:48 localhost kernel: tg3 0000:02:00.1 eno2: renamed from eth3   // eth3 / eno2 is not used by Harvester
    Jan 08 00:34:49 localhost kernel: i40e 0000:5d:00.0 eno5: renamed from eth0
    Jan 08 00:34:49 localhost kernel: i40e 0000:af:00.0 ens4f0: renamed from eth4
    Jan 08 00:34:49 localhost kernel: i40e 0000:5d:00.1 eno6: renamed from eth1
    Jan 08 00:34:49 localhost kernel: i40e 0000:af:00.1 ens4f1: renamed from eth2

    The NIC eno5 (0000:5d:00.0) uses the Intel i40e driver and links at 10 Gbps full duplex.

    $ grep "0000:5d:00.0" kernel.log
    Jan 08 00:34:47 localhost kernel: i40e 0000:5d:00.0: fw 8.71.63306 api 1.11 nvm 10.54.7 [8086:1572] [103c:22fc]
    Jan 08 00:34:47 localhost kernel: i40e 0000:5d:00.0: MAC address: 48:df:37:24:c2:00
    Jan 08 00:34:47 localhost kernel: i40e 0000:5d:00.0: FW LLDP is enabled
    Jan 08 00:34:47 localhost kernel: i40e 0000:5d:00.0 eth0: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
    Jan 08 00:34:47 localhost kernel: i40e 0000:5d:00.0: PCI-Express: Speed 8.0GT/s Width x8
    Jan 08 00:34:47 localhost kernel: i40e 0000:5d:00.0: Features: PF-id[0] VFs: 64 VSIs: 66 QP: 112 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
    Jan 08 00:34:49 localhost kernel: i40e 0000:5d:00.0 eno5: renamed from eth0

    The NICs with an active link are detected.

    $ grep "is Up" kernel.log
    Jan 08 00:34:47 localhost kernel: i40e 0000:5d:00.0 eth0: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
    Jan 08 00:34:48 localhost kernel: i40e 0000:5d:00.1 eth1: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
    Jan 08 00:34:48 localhost kernel: i40e 0000:af:00.0 eth4: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
    Jan 08 00:34:49 localhost kernel: i40e 0000:af:00.1 eth2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
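
Beyond the sources above, the bonding driver and ethtool can confirm the state of each bond and its member NICs. This is a hedged sketch; it assumes the mgmt-bo bond and eno5 NIC names from the examples above and that ethtool is available on the node.

  $ cat /proc/net/bonding/mgmt-bo   # bonding mode, MII status, and the list of member (slave) NICs
  $ ethtool eno5                    # link speed, duplex, and link detection for one member NIC
  $ ethtool -i eno5                 # driver name, version, and firmware of the NIC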

Enable Maintenance Mode

  1. (Optional) Stop VMs that cannot or must not be migrated.

  2. Enable maintenance mode on the target node to automatically migrate all VMs to other nodes.

  • Wait for everything to become ready, and then repeat the steps in the Pre-Replacement Checks section.

  • Manually stop a VM in the following situations (see the virtctl sketch after this list):

    • The VM fails to migrate.

    • The VM has selectors that prevent it from migrating to other nodes.

    • The VM has special hardware (for example, PCI passthrough or vGPUs) that prevents it from migrating to other nodes.
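
The following sketch shows how a VM can be migrated or stopped from the command line, assuming the KubeVirt virtctl client is available and that the VM is named vm1 in the default namespace (both names are illustrative):

  $ virtctl migrate vm1 -n default   # trigger a live migration of the VM to another node
  $ virtctl stop vm1 -n default      # gracefully stop a VM that cannot be migrated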

(Optional) Update the Network Config

Every Cluster Network in Harvester has one or more Network Configs, and each Network Config is backed by a VlanConfig CRD object.

Important:

Updating the Network Config is required if the new NICs will be placed in different physical slots or will have different uplink parameters.

  1. Check the node.

    When a Harvester cluster node belongs to a Network Config, the Node object has a label with the key network.harvesterhci.io/vlanconfig.

    Example:

    apiVersion: v1
    kind: Node
    metadata:
      labels:
        ...
        network.harvesterhci.io/vlanconfig: vlan123
  2. Remove this node from the Network Config.

    When the new NICs are placed in different slots, you must change the Network Config to exclude this node. If the Network Config selects only this node through its nodeSelector, you can delete the VlanConfig instead (a kubectl sketch follows this procedure).

    Example:

    apiVersion: network.harvesterhci.io/v1beta1
    kind: VlanConfig
    spec:
      clusterNetwork: data
      nodeSelector:
        kubernetes.io/hostname: node123   # selects one or more nodes
      uplink:
        bondOptions:
          miimon: 100
          mode: 802.3ad
        linkAttributes:
          mtu: 1500
          txQLen: -1
        nics:
        - enp0s1
        - enp0s2

    When VMs are still running on an affected node, the network webhook returns an error.

  3. Check the Node object.

    Depending on the situation, either the label network.harvesterhci.io/vlanconfig changes or is removed.

  4. Check the VlanStatus object.

    Depending on the situation, either the status of the VlanStatus object’s ready condition changes to "True" or the object is deleted.

    Example:

    apiVersion: network.harvesterhci.io/v1beta1
    kind: VlanStatus
    metadata:
      ...
    status:
      clusterNetwork: data
      conditions:
      - lastUpdateTime: "2024-02-03T18:32:41Z"
        status: "True"
        type: ready
      linkMonitor: public
      localAreas:
      - cidr: 10.190.186.0/24
        vlanID: 2013
      node: node123
      vlanConfig: vlan123
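
The steps above can also be performed and verified with kubectl. This is a minimal sketch that assumes the vlan123 and node123 names used in the examples:

  $ kubectl edit vlanconfig vlan123                            # change spec.nodeSelector so it no longer matches this node
  $ kubectl delete vlanconfig vlan123                          # or delete it if it selects only this node
  $ kubectl get node node123 --show-labels | grep vlanconfig   # the network.harvesterhci.io/vlanconfig label changes or disappears
  $ kubectl get vlanstatuses.network.harvesterhci.io           # the matching VlanStatus object is updated or deleted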

(Optional) Drain the Node

You may find that some Longhorn replicas remain active on the node even after completing the previously outlined procedures.

  1. Drain the node. (This is optional in Harvester.)

    • Scenario 1: The numReplicas value of all volumes is 3, which means that each Longhorn volume has three active replicas.

      The Longhorn Engine recognizes that it can no longer communicate with the replica on the drained node, and then marks that replica as failed. None of the replicas holds any special significance to Longhorn, so the volume continues to function as long as the engine can communicate with at least one replica.

    • Scenario 2: Some Longhorn volumes have fewer than three active replicas, or you manually attached volumes using the Harvester UI or Longhorn UI.

      You must manually detach the replicas or move them to other nodes, and then drain the node using the command kubectl drain --ignore-daemonsets <node name>. The option --ignore-daemonsets is required because Longhorn deploys daemonsets such as Longhorn Manager, Longhorn CSI plugin, and Longhorn Engine image.

      Replicas running on the node are stopped and marked as Failed. Longhorn Engine processes running on the node are migrated with the pod to other nodes. Once the node is fully drained, no replicas and engine processes should remain running on the node.

  2. Replenish replicas.

    After a node is shut down, Longhorn does not start rebuilding the replicas on other nodes until the replica-replenishment-wait-interval (default value: 600 seconds) is reached. If the node comes back online before the wait interval value is reached, Longhorn reuses the replicas. Otherwise, Longhorn rebuilds the replicas on another node.

    During system maintenance, you can modify the replica-replenishment-wait-interval value using the embedded Longhorn UI (or kubectl, as sketched after this list) to enable faster replica rebuilding.

    Harvester v1.3.0 uses Longhorn v1.6.0, while Harvester v1.2.1 uses Longhorn v1.4.3.
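
A hedged kubectl sketch of the replica check and the setting change described above. It assumes the default longhorn-system namespace and that the Longhorn Volume CRD exposes the replica count as spec.numberOfReplicas:

  # List volumes with their configured replica counts (fewer than 3 indicates Scenario 2).
  $ kubectl get volumes.longhorn.io -n longhorn-system \
      -o custom-columns=NAME:.metadata.name,REPLICAS:.spec.numberOfReplicas,ROBUSTNESS:.status.robustness

  # Lower the replenishment wait interval (in seconds) so replicas are rebuilt sooner.
  $ kubectl edit settings.longhorn.io replica-replenishment-wait-interval -n longhorn-system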

Replace the NICs

  1. Shut the node down.

  2. Replace the NICs.

  3. Restart the node.

  4. Collect information about the current network configuration and status.

If you observe any abnormalities, generate a support bundle for troubleshooting purposes.

(Optional) Update the Network Config Again

Important:

Updating the Network Config is required if the new NICs will be placed in different physical slots.

  1. Add the node to the Network Config.

    You must create a new Network Config or change an existing one to include this node (see the sketch after this list).

  2. Check the Node object.

    The label network.harvesterhci.io/vlanconfig reflects the specific Network Config used.

  3. Check the VlanStatus object.

    The status of the VlanStatus object’s ready condition changes to "True".
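
A minimal sketch of re-adding the node with kubectl, assuming the vlan123 and node123 names from the earlier examples; you can also do this from the Harvester UI:

  # Point the nodeSelector back at this node (or recreate the VlanConfig if it was deleted).
  $ kubectl patch vlanconfig vlan123 --type merge \
      -p '{"spec":{"nodeSelector":{"kubernetes.io/hostname":"node123"}}}'

  # Verify the label and the VlanStatus ready condition.
  $ kubectl get node node123 --show-labels | grep vlanconfig
  $ kubectl get vlanstatuses.network.harvesterhci.io -o yaml   # the ready condition should report status "True"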

Disable Maintenance Mode

  1. Wait for the node to rejoin the cluster and become ready.

  2. Disable maintenance mode.

  3. (Optional) Start the VMs that you manually stopped.

  4. (Optional) Manually migrate VMs to this node.

Troubleshooting

Harvester uses multiple network-related pods and CRDs. When troubleshooting, check the pod logs and the status of the CRD objects (example commands follow the lists below).

Pods:

  $ kubectl get pods -n harvester-system
  NAME                                                    READY   STATUS    RESTARTS      AGE
  harvester-network-controller-cnf22                      1/1     Running   2 (60m ago)   3d22h   // network controller agent DaemonSet, deployed on each node
  harvester-network-controller-manager-859c4bd874-xcllf   1/1     Running   2 (60m ago)   3d22h   // network controller
  harvester-network-webhook-56b877d5d5-z42dp              1/1     Running   2 (60m ago)   3d22h

CRDs:

  clusternetworks.network.harvesterhci.io
  linkmonitors.network.harvesterhci.io
  vlanconfigs.network.harvesterhci.io
  vlanstatuses.network.harvesterhci.io
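
For example, the following hedged sketch collects the logs and object status; it assumes the pod names shown above (pod names vary per cluster) and the default harvester-system namespace:

  # Logs from the network controller agent and manager.
  $ kubectl logs -n harvester-system harvester-network-controller-cnf22
  $ kubectl logs -n harvester-system harvester-network-controller-manager-859c4bd874-xcllf

  # Status of the network CRD objects (all cluster-scoped).
  $ kubectl get clusternetworks,vlanconfigs,vlanstatuses,linkmonitors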