Configuring an SR-IOV network device for virtual machines

You can configure a Single Root I/O Virtualization (SR-IOV) device for virtual machines in your cluster. This process is similar, but not identical, to configuring an SR-IOV device for OKD.

Prerequisites

Automated discovery of SR-IOV network devices

The SR-IOV Network Operator searches your cluster for SR-IOV capable network devices on worker nodes. The Operator creates and updates a SriovNetworkNodeState custom resource (CR) for each worker node that provides a compatible SR-IOV network device.

The CR is assigned the same name as the worker node. The status.interfaces list provides information about the network devices on a node.

Do not modify a SriovNetworkNodeState object. The Operator creates and manages these resources automatically.

Example SriovNetworkNodeState object

The following YAML is an example of a SriovNetworkNodeState object created by the SR-IOV Network Operator:

An SriovNetworkNodeState object

    apiVersion: sriovnetwork.openshift.io/v1
    kind: SriovNetworkNodeState
    metadata:
      name: node-25 # (1)
      namespace: openshift-sriov-network-operator
      ownerReferences:
      - apiVersion: sriovnetwork.openshift.io/v1
        blockOwnerDeletion: true
        controller: true
        kind: SriovNetworkNodePolicy
        name: default
    spec:
      dpConfigVersion: "39824"
    status:
      interfaces: # (2)
      - deviceID: "1017"
        driver: mlx5_core
        mtu: 1500
        name: ens785f0
        pciAddress: "0000:18:00.0"
        totalvfs: 8
        vendor: 15b3
      - deviceID: "1017"
        driver: mlx5_core
        mtu: 1500
        name: ens785f1
        pciAddress: "0000:18:00.1"
        totalvfs: 8
        vendor: 15b3
      - deviceID: 158b
        driver: i40e
        mtu: 1500
        name: ens817f0
        pciAddress: 0000:81:00.0
        totalvfs: 64
        vendor: "8086"
      - deviceID: 158b
        driver: i40e
        mtu: 1500
        name: ens817f1
        pciAddress: 0000:81:00.1
        totalvfs: 64
        vendor: "8086"
      - deviceID: 158b
        driver: i40e
        mtu: 1500
        name: ens803f0
        pciAddress: 0000:86:00.0
        totalvfs: 64
        vendor: "8086"
      syncStatus: Succeeded

(1) The value of the name field is the same as the name of the worker node.
(2) The interfaces stanza includes a list of all of the SR-IOV devices discovered by the Operator on the worker node.
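
As an illustration of how the status.interfaces list can be read, the following is a minimal Python sketch. It assumes the SriovNetworkNodeState object has already been loaded into a dictionary (for example, from the JSON output of oc get with -o json). The summarize_interfaces helper and the example_state variable are hypothetical names used only for this sketch, and the sample data is a trimmed copy of the example object above.

```python
# Hypothetical helper, not part of the SR-IOV Network Operator:
# summarise status.interfaces from a SriovNetworkNodeState held in a dict.

def summarize_interfaces(node_state):
    """Return (name, vendor, deviceID, totalvfs) for each discovered device."""
    interfaces = node_state.get("status", {}).get("interfaces", [])
    return [
        (i["name"], str(i["vendor"]), str(i["deviceID"]), i["totalvfs"])
        for i in interfaces
    ]


# Trimmed copy of the example object (two of the five interfaces).
example_state = {
    "metadata": {"name": "node-25"},
    "status": {
        "interfaces": [
            {"name": "ens785f0", "vendor": "15b3", "deviceID": "1017",
             "pciAddress": "0000:18:00.0", "totalvfs": 8},
            {"name": "ens817f0", "vendor": "8086", "deviceID": "158b",
             "pciAddress": "0000:81:00.0", "totalvfs": 64},
        ],
        "syncStatus": "Succeeded",
    },
}

for row in summarize_interfaces(example_state):
    print(row)
```

The vendor, deviceID, and totalvfs values reported here are the ones you would typically reuse when filling in the nicSelector and numVfs fields of a SriovNetworkNodePolicy object.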

Configuring SR-IOV network devices

The SR-IOV Network Operator adds the SriovNetworkNodePolicy.sriovnetwork.openshift.io CustomResourceDefinition to OKD. You can configure an SR-IOV network device by creating a SriovNetworkNodePolicy custom resource (CR).

When applying the configuration specified in a SriovNetworkNodePolicy object, the SR-IOV Network Operator might drain the nodes and, in some cases, reboot them.

It might take several minutes for a configuration change to apply.
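
Because a change can take several minutes to apply, one option is to poll the syncStatus field of the node's SriovNetworkNodeState object until it reports Succeeded. The following Python sketch shows one way to do that. The wait_for_sync and oc_sync_status functions, the 600-second timeout, and the 10-second interval are assumptions made for illustration. Any value other than Succeeded is treated as "not finished yet".

```python
import subprocess
import time


def oc_sync_status(node_name):
    """Return the syncStatus string for one node by calling the oc CLI."""
    result = subprocess.run(
        ["oc", "get", "sriovnetworknodestates",
         "-n", "openshift-sriov-network-operator", node_name,
         "-o", "jsonpath={.status.syncStatus}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def wait_for_sync(get_status, timeout_s=600, interval_s=10, sleep=time.sleep):
    """Poll get_status() until it returns "Succeeded" or the timeout elapses.

    get_status is any zero-argument callable that returns the current
    syncStatus string. Returns True on success and False on timeout.
    """
    waited = 0
    while True:
        if get_status() == "Succeeded":
            return True
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s
```

A call such as wait_for_sync(lambda: oc_sync_status("node-25")) would block until the node from the earlier example reports Succeeded or the timeout is reached. Passing the status function as an argument keeps the polling logic testable without a cluster.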

Prerequisites

  • You installed the OpenShift CLI (oc).

  • You have access to the cluster as a user with the cluster-admin role.

  • You have installed the SR-IOV Network Operator.

  • You have enough available nodes in your cluster to handle the evicted workload from drained nodes.

  • You have not selected any control plane nodes for SR-IOV network device configuration.

Procedure

  1. Create an SriovNetworkNodePolicy object, and then save the YAML in the <name>-sriov-node-network.yaml file. Replace <name> with the name for this configuration.
     apiVersion: sriovnetwork.openshift.io/v1
     kind: SriovNetworkNodePolicy
     metadata:
       name: <name> # (1)
       namespace: openshift-sriov-network-operator # (2)
     spec:
       resourceName: <sriov_resource_name> # (3)
       nodeSelector:
         feature.node.kubernetes.io/network-sriov.capable: "true" # (4)
       priority: <priority> # (5)
       mtu: <mtu> # (6)
       numVfs: <num> # (7)
       nicSelector: # (8)
         vendor: "<vendor_code>" # (9)
         deviceID: "<device_id>" # (10)
         pfNames: ["<pf_name>", ...] # (11)
         rootDevices: ["<pci_bus_id>", "..."] # (12)
       deviceType: vfio-pci # (13)
       isRdma: false # (14)

     (1) Specify a name for the CR object.
     (2) Specify the namespace where the SR-IOV Network Operator is installed.
     (3) Specify the resource name of the SR-IOV device plug-in. You can create multiple SriovNetworkNodePolicy objects for a resource name.
     (4) Specify the node selector to select which nodes are configured. Only SR-IOV network devices on selected nodes are configured. The SR-IOV Container Network Interface (CNI) plug-in and device plug-in are deployed only on selected nodes.
     (5) Optional: Specify an integer value between 0 and 99. A smaller number gets higher priority, so a priority of 10 is higher than a priority of 99. The default value is 99.
     (6) Optional: Specify a value for the maximum transmission unit (MTU) of the virtual function. The maximum MTU value can vary for different NIC models.
     (7) Specify the number of virtual functions (VFs) to create for the SR-IOV physical network device. For an Intel network interface controller (NIC), the number of VFs cannot be larger than the total VFs supported by the device. For a Mellanox NIC, the number of VFs cannot be larger than 128.
     (8) The nicSelector mapping selects the Ethernet device for the Operator to configure. You do not need to specify values for all the parameters. Identify the Ethernet adapter with enough precision to avoid selecting an Ethernet device unintentionally. If you specify rootDevices, you must also specify a value for vendor, deviceID, or pfNames. If you specify both pfNames and rootDevices, ensure that they refer to the same device.
     (9) Optional: Specify the vendor hex code of the SR-IOV network device. The only allowed values are 8086 and 15b3.
     (10) Optional: Specify the device hex code of the SR-IOV network device. The only allowed values are 158b, 1015, and 1017.
     (11) Optional: The parameter accepts an array of one or more physical function (PF) names for the Ethernet device.
     (12) The parameter accepts an array of one or more PCI bus addresses for the physical function of the Ethernet device. Provide the address in the following format: 0000:02:00.1.
     (13) The vfio-pci driver type is required for virtual functions in OKD Virtualization.
     (14) Optional: Specify whether to enable remote direct memory access (RDMA) mode. For a Mellanox card, set isRdma to false. The default value is false.

     If the isRdma flag is set to true, you can continue to use the RDMA-enabled VF as a normal network device. A device can be used in either mode.

  2. Create the SriovNetworkNodePolicy object:

     $ oc create -f <name>-sriov-node-network.yaml

     where <name> specifies the name for this configuration.

     After applying the configuration update, all the pods in the openshift-sriov-network-operator namespace transition to the Running status.

  3. To verify that the SR-IOV network device is configured, enter the following command. Replace <node_name> with the name of a node with the SR-IOV network device that you just configured.

     $ oc get sriovnetworknodestates -n openshift-sriov-network-operator <node_name> -o jsonpath='{.status.syncStatus}'
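
The field constraints described for the SriovNetworkNodePolicy object in this procedure can be checked before you create the object. The following Python sketch is one possible pre-flight check, not something the Operator provides. The check_policy_spec function and its total_vfs argument are hypothetical. It assumes the spec section of the policy has been loaded into a dictionary and that total_vfs, when given, is the totalvfs value reported for the device in the SriovNetworkNodeState object.

```python
# Values taken from the field descriptions in this procedure.
ALLOWED_VENDORS = {"8086", "15b3"}
ALLOWED_DEVICE_IDS = {"158b", "1015", "1017"}
INTEL_VENDOR = "8086"
MELLANOX_VENDOR = "15b3"
MELLANOX_MAX_VFS = 128


def check_policy_spec(spec, total_vfs=None):
    """Return a list of problems found in a SriovNetworkNodePolicy spec dict."""
    problems = []
    nic = spec.get("nicSelector", {})
    vendor = nic.get("vendor")
    device_id = nic.get("deviceID")

    # priority is optional and defaults to 99; allowed range is 0 to 99.
    if not 0 <= spec.get("priority", 99) <= 99:
        problems.append("priority must be between 0 and 99")

    if vendor is not None and vendor not in ALLOWED_VENDORS:
        problems.append("vendor must be 8086 or 15b3")
    if device_id is not None and device_id not in ALLOWED_DEVICE_IDS:
        problems.append("deviceID must be 158b, 1015, or 1017")

    # rootDevices needs at least one of vendor, deviceID, or pfNames.
    if nic.get("rootDevices") and not (vendor or device_id or nic.get("pfNames")):
        problems.append("rootDevices requires vendor, deviceID, or pfNames")

    num_vfs = spec.get("numVfs")
    if num_vfs is not None:
        if vendor == MELLANOX_VENDOR and num_vfs > MELLANOX_MAX_VFS:
            problems.append("numVfs cannot exceed 128 for a Mellanox NIC")
        elif vendor == INTEL_VENDOR and total_vfs is not None and num_vfs > total_vfs:
            problems.append("numVfs cannot exceed the total VFs of the device")

    # Virtual machines require the vfio-pci driver type.
    if spec.get("deviceType") != "vfio-pci":
        problems.append("deviceType must be vfio-pci for virtual machines")

    return problems
```

An empty list means that none of these particular checks failed. It does not guarantee that the Operator accepts the policy, because the sketch covers only the constraints stated in this procedure.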

Next steps