Managing symmetric routing with MetalLB

As a cluster administrator, you can manage traffic for pods behind a MetalLB load-balancer service on hosts with multiple interfaces by combining features from MetalLB, NMState, and OVN-Kubernetes. Together, these features provide symmetric routing and traffic segregation, and support clients on different networks that have overlapping CIDR addresses.

To achieve this functionality, learn how to implement virtual routing and forwarding (VRF) instances with MetalLB, and configure egress services.

Configuring symmetric traffic by using a VRF instance with MetalLB and an egress service is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

Challenges of managing symmetric routing with MetalLB

When you use MetalLB with multiple host interfaces, MetalLB exposes and announces a service through all available interfaces on the host. This can present challenges relating to network isolation, asymmetric return traffic, and overlapping CIDR addresses.

One option to ensure that return traffic reaches the correct client is to use static routes. However, with this solution, MetalLB cannot isolate the services and then announce each service through a different interface. Additionally, static routes require manual configuration, and you must update them whenever remote sites are added.

A further challenge arises when external systems expect the source and destination IP address for an application to be the same. By default, OKD assigns the IP address of the host network interface as the source IP address for traffic that originates from pods. This behavior is problematic when the host has multiple interfaces.

You can overcome these challenges by implementing a configuration that combines features from MetalLB, NMState, and OVN-Kubernetes.

Overview of managing symmetric routing by using VRFs with MetalLB

You can overcome the challenges of implementing symmetric routing by using NMState to configure a VRF instance on a host, associating the VRF instance with a MetalLB BGPPeer resource, and configuring an egress service for egress traffic with OVN-Kubernetes.

Figure 1. Network overview of managing symmetric routing by using VRFs with MetalLB

The configuration process involves three stages:

1. Define a VRF and routing rules

  • Configure a NodeNetworkConfigurationPolicy custom resource (CR) to associate a VRF instance with a network interface.

  • Use the VRF routing table to direct ingress and egress traffic.

2. Link the VRF to a MetalLB BGPPeer

  • Configure a MetalLB BGPPeer resource to use the VRF instance on a network interface.

  • When you associate the BGPPeer resource with the VRF instance, the designated network interface becomes the primary interface for the BGP session, and MetalLB advertises the services through this interface.

3. Configure an egress service

  • Configure an egress service to choose the network associated with the VRF instance for egress traffic.

  • Optional: Configure an egress service to use the IP address of the MetalLB load-balancer service as the source IP for egress traffic.
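To make the first stage more concrete, the following dry-run sketch prints iproute2 commands that roughly approximate the host state that the NodeNetworkConfigurationPolicy example later in this document describes. It is an illustration only: NMState applies the real configuration, the function name vrf_commands is hypothetical, and the interface names, gateway, and CIDR values are taken from that later example. The script prints the commands and does not run them.

```shell
# Dry run: print (do not execute) iproute2 commands that approximate the
# VRF, route, and route rules from the NodeNetworkConfigurationPolicy example.
# Arguments: <vrf_name> <port_interface> <table_id> <gateway_ip>
vrf_commands() {
  # Create the VRF device and bind it to its routing table.
  echo "ip link add $1 type vrf table $3"
  # Attach the node interface to the VRF and bring the VRF up.
  echo "ip link set $2 master $1"
  echo "ip link set $1 up"
  # Default route inside the VRF routing table.
  echo "ip route add default via $4 dev $2 table $3 metric 150"
  # Send service network and cluster network traffic to the main table (254).
  echo "ip rule add to 172.30.0.0/16 priority 998 table 254"
  echo "ip rule add to 10.132.0.0/14 priority 998 table 254"
}

vrf_commands ens4vrf ens4 2 192.168.130.1
```

Printing the commands instead of running them keeps the sketch safe to try on any machine. On a cluster node, let the NMState policy manage this state.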

Configuring symmetric routing by using VRFs with MetalLB

You can configure symmetric network routing for applications behind a MetalLB service that require the same ingress and egress network paths.

This example associates a VRF routing table with MetalLB and an egress service to enable symmetric routing for ingress and egress traffic for pods behind a LoadBalancer service.

  • If you use the sourceIPBy: "LoadBalancerIP" setting in the EgressService CR, you must specify the load-balancer node in the BGPAdvertisement custom resource (CR).

  • You can use the sourceIPBy: "Network" setting only on clusters that use OVN-Kubernetes with the gatewayConfig.routingViaHost specification set to true. Additionally, if you use the sourceIPBy: "Network" setting, you must schedule the application workload on nodes configured with the network VRF instance.
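Before you choose the sourceIPBy: "Network" setting, you might want to confirm the gateway mode of the cluster. The following sketch assumes that the OVN-Kubernetes settings are exposed on the network.operator.openshift.io object named cluster, under spec.defaultNetwork.ovnKubernetesConfig.gatewayConfig. The function name check_routing_via_host is hypothetical, and the sketch treats an unset field as false, which is assumed to be the default.

```shell
# Hypothetical helper: print the routingViaHost value for the cluster.
# Prints "false" when the field is not set on the operator configuration.
check_routing_via_host() {
  value="$(oc get network.operator.openshift.io cluster \
    -o jsonpath='{.spec.defaultNetwork.ovnKubernetesConfig.gatewayConfig.routingViaHost}')"
  # An empty result means the field is unset; report it as false.
  echo "${value:-false}"
}
```

If the function prints true, the cluster meets the gateway requirement for the sourceIPBy: "Network" setting.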

Prerequisites

  • Install the OpenShift CLI (oc).

  • Log in as a user with cluster-admin privileges.

Procedure

  1. Create a NodeNetworkConfigurationPolicy CR to define the VRF instance:

    1. Create a file, such as node-network-vrf.yaml, with content like the following example:

      apiVersion: nmstate.io/v1
      kind: NodeNetworkConfigurationPolicy
      metadata:
        name: vrfpolicy # (1)
      spec:
        nodeSelector:
          vrf: "true" # (2)
        maxUnavailable: 3
        desiredState:
          interfaces:
          - name: ens4vrf # (3)
            type: vrf # (4)
            state: up
            vrf:
              port:
              - ens4 # (5)
              route-table-id: 2 # (6)
          routes: # (7)
            config:
            - destination: 0.0.0.0/0
              metric: 150
              next-hop-address: 192.168.130.1
              next-hop-interface: ens4
              table-id: 2
          route-rules: # (8)
            config:
            - ip-to: 172.30.0.0/16
              priority: 998
              route-table: 254
            - ip-to: 10.132.0.0/14
              priority: 998
              route-table: 254
      (1) The name of the policy.
      (2) This example applies the policy to all nodes with the label vrf: "true".
      (3) The name of the interface.
      (4) The type of interface. This example creates a VRF instance.
      (5) The node interface that the VRF attaches to.
      (6) The ID of the route table for the VRF.
      (7) Defines the configuration for network routes. The next-hop-address field defines the IP address of the next hop for the route. The next-hop-interface field defines the outgoing interface for the route. In this example, the VRF routing table ID is 2, which is the same ID that you specify in the network field of the EgressService CR.
      (8) Defines additional route rules. The ip-to fields must match the Cluster Network CIDR and Service Network CIDR. You can view the values for these CIDR address specifications by running the following command: oc describe network.config/cluster.
    2. Apply the policy by running the following command:

      $ oc apply -f node-network-vrf.yaml
  2. Create a BGPPeer custom resource (CR):

    1. Create a file, such as frr-via-vrf.yaml, with content like the following example:

      apiVersion: metallb.io/v1beta2
      kind: BGPPeer
      metadata:
        name: frrviavrf
        namespace: metallb-system
      spec:
        myASN: 100
        peerASN: 200
        peerAddress: 192.168.130.1
        vrf: ens4vrf # (1)
      (1) Specifies the VRF instance to associate with the BGP peer. MetalLB can advertise services and make routing decisions based on the routing information in the VRF.
    2. Apply the configuration for the BGP peer by running the following command:

      $ oc apply -f frr-via-vrf.yaml
  3. Create an IPAddressPool CR:

    1. Create a file, such as first-pool.yaml, with content like the following example:

      apiVersion: metallb.io/v1beta1
      kind: IPAddressPool
      metadata:
        name: first-pool
        namespace: metallb-system
      spec:
        addresses:
        - 192.169.10.0/32
    2. Apply the configuration for the IP address pool by running the following command:

      $ oc apply -f first-pool.yaml
  4. Create a BGPAdvertisement CR:

    1. Create a file, such as first-adv.yaml, with content like the following example:

      apiVersion: metallb.io/v1beta1
      kind: BGPAdvertisement
      metadata:
        name: first-adv
        namespace: metallb-system
      spec:
        ipAddressPools:
        - first-pool
        peers:
        - frrviavrf # (1)
        nodeSelectors:
        - matchLabels:
            egress-service.k8s.ovn.org/test-server1: "" # (2)
      (1) In this example, MetalLB advertises a range of IP addresses from the first-pool IP address pool to the frrviavrf BGP peer.
      (2) In this example, the EgressService CR configures the source IP address for egress traffic to use the load-balancer service IP address. Therefore, you must specify the load-balancer node for return traffic to use the same return path for the traffic originating from the pod.
    2. Apply the configuration for the BGP advertisement by running the following command:

      $ oc apply -f first-adv.yaml
  5. Create an EgressService CR:

    1. Create a file, such as egress-service.yaml, with content like the following example:

      apiVersion: k8s.ovn.org/v1
      kind: EgressService
      metadata:
        name: server1 # (1)
        namespace: test # (2)
      spec:
        sourceIPBy: "LoadBalancerIP" # (3)
        nodeSelector:
          matchLabels:
            vrf: "true" # (4)
        network: "2" # (5)
      (1) Specify the name for the egress service. The name of the EgressService resource must match the name of the load-balancer service that you want to modify.
      (2) Specify the namespace for the egress service. The namespace for the EgressService must match the namespace of the load-balancer service that you want to modify. The egress service is namespace-scoped.
      (3) This example assigns the LoadBalancer service ingress IP address as the source IP address for egress traffic.
      (4) If you specify LoadBalancerIP for the sourceIPBy specification, a single node handles the LoadBalancer service traffic. In this example, only a node with the label vrf: "true" can handle the service traffic. If you do not specify a node, OVN-Kubernetes selects a worker node to handle the service traffic. When a node is selected, OVN-Kubernetes labels the node in the following format: egress-service.k8s.ovn.org/<svc_namespace>-<svc_name>: "".
      (5) Specify the routing table ID for egress traffic. In this example, the value matches the route-table-id of the VRF instance.
    2. Apply the configuration for the egress service by running the following command:

      $ oc apply -f egress-service.yaml
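After you apply the resources, a few read-only checks can help confirm that the pieces line up. The following sketch is one possible approach, not a required step. The function names are hypothetical, and the jsonpath expressions assume that the cluster and service network CIDRs are reported in the status of the network.config/cluster object. The label format follows the egress-service.k8s.ovn.org/<svc_namespace>-<svc_name> convention described in the EgressService example.

```shell
# Hypothetical helpers; they only read cluster state.

# Print the cluster network CIDRs and the service network CIDRs, one group
# per line, for comparison with the ip-to values in the route-rules section.
cluster_cidrs() {
  oc get network.config/cluster \
    -o jsonpath='{.status.clusterNetwork[*].cidr}{"\n"}{.status.serviceNetwork[*]}{"\n"}'
}

# Build the node label key that OVN-Kubernetes applies for an egress service.
# Arguments: <svc_namespace> <svc_name>
egress_service_label() {
  echo "egress-service.k8s.ovn.org/$1-$2"
}

# List the node that carries the label for the given egress service.
# Arguments: <svc_namespace> <svc_name>
egress_service_node() {
  oc get nodes -l "$(egress_service_label "$1" "$2")" -o name
}
```

For the example in this procedure, running egress_service_node test server1 lists the node that handles traffic for the server1 service. That node carries the same label that the BGPAdvertisement CR uses in its nodeSelectors field.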

Verification

  1. Verify that you can access the application endpoint of the pods running behind the MetalLB service by running the following command:

    $ curl <external_ip_address>:<port_number> # (1)
    (1) Update the external IP address and port number to suit your application endpoint.
  2. Optional: If you assigned the LoadBalancer service ingress IP address as the source IP address for egress traffic, verify this configuration by using tools such as tcpdump to analyze packets received at the external client.
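For the optional packet check, the following sketch shows one way to run tcpdump on the external client so that only packets whose source address is the load-balancer service IP are displayed. It assumes that tcpdump is installed on the client and that you have permission to capture on the interface. The function name capture_from_lb_ip and the interface name eth0 are hypothetical, and 192.169.10.0 is the address from the first-pool example.

```shell
# Hypothetical helper: capture packets on the external client that arrive
# with the load-balancer service IP as their source address.
# Arguments: <client_interface> <load_balancer_ip>
capture_from_lb_ip() {
  # -n disables name resolution so that the raw source IP is shown.
  tcpdump -n -i "$1" "src host $2"
}

# Example invocation on the external client (not run here):
#   capture_from_lb_ip eth0 192.169.10.0
```

If egress traffic from the pods appears in the capture while the application sends traffic to the client, the source IP address matches the load-balancer service IP as configured.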

Additional resources