Load balancing on OpenStack

Load balancing on OpenStack

Limitations of load balancer services

OKD clusters on OpenStack use Octavia to handle load balancer services. As a result of this choice, such clusters have a number of functional limitations.

OpenStack Octavia has two supported providers: Amphora and OVN. These providers differ in terms of available features as well as implementation details. These distinctions affect load balancer services that are created on your cluster.

Local external traffic policies

You can set the external traffic policy (ETP) parameter, .spec.externalTrafficPolicy, on a load balancer service to preserve the source IP address of incoming traffic when it reaches service endpoint pods. However, if your cluster uses the Amphora Octavia provider, the source IP of the traffic is replaced with the IP address of the Amphora VM. This behavior does not occur if your cluster uses the OVN Octavia provider.

Having the ETP option set to Local requires that health monitors be created for the load balancer. Without health monitors, traffic can be routed to a node that doesn’t have a functional endpoint, which causes the connection to drop. To force Cloud Provider OpenStack to create health monitors, you must set the value of the create-monitor option in the cloud provider configuration to true.

In OpenStack 16.2, the OVN Octavia provider does not support health monitors. Therefore, setting the ETP to local is unsupported.

In OpenStack 16.2, the Amphora Octavia provider does not support HTTP monitors on UDP pools. As a result, UDP load balancer services have UDP-CONNECT monitors created instead. Due to implementation details, this configuration only functions properly with the OVN-Kubernetes CNI plugin. When the OpenShift SDN CNI plugin is used, the UDP services alive nodes are detected unreliably. This issue also affects the OVN Octavia provider in any OpenStack version because the driver does not support HTTP health monitors.

Scaling clusters for application traffic by using Octavia

OKD clusters that run on OpenStack can use the Octavia load balancing service to distribute traffic across multiple virtual machines (VMs) or floating IP addresses. This feature mitigates the bottleneck that single machines or addresses create.

You must create your own Octavia load balancer to use it for application network scaling.

Scaling clusters by using Octavia

If you want to use multiple API load balancers, create an Octavia load balancer and then configure your cluster to use it.

Prerequisites

Octavia is available on your OpenStack deployment.

Procedure

From a command line, create an Octavia load balancer that uses the Amphora driver:
```
$ openstack loadbalancer create --name API_OCP_CLUSTER --vip-subnet-id <id_of_worker_vms_subnet>
```
You can use a name of your choice instead of API_OCP_CLUSTER.
After the load balancer becomes active, create listeners:
```
$ openstack loadbalancer listener create --name API_OCP_CLUSTER_6443 --protocol HTTPS--protocol-port 6443 API_OCP_CLUSTER
```
To view the status of the load balancer, enter openstack loadbalancer list.

Create a pool that uses the round robin algorithm and has session persistence enabled:

$ openstack loadbalancer pool create --name API_OCP_CLUSTER_pool_6443 --lb-algorithm ROUND_ROBIN --session-persistence type=<source_IP_address> --listener API_OCP_CLUSTER_6443 --protocol HTTPS

To ensure that control plane machines are available, create a health monitor:

$ openstack loadbalancer healthmonitor create --delay 5 --max-retries 4 --timeout 10 --type TCP API_OCP_CLUSTER_pool_6443

Add the control plane machines as members of the load balancer pool:

$ for SERVER in $(MASTER-0-IP MASTER-1-IP MASTER-2-IP)
do
  openstack loadbalancer member create --address $SERVER  --protocol-port 6443 API_OCP_CLUSTER_pool_6443
done

Optional: To reuse the cluster API floating IP address, unset it:
```
$ openstack floating ip unset $API_FIP
```

Add either the unset API_FIP or a new address to the created load balancer VIP:

$ openstack floating ip set  --port $(openstack loadbalancer show -c <vip_port_id> -f value API_OCP_CLUSTER) $API_FIP

Your cluster now uses Octavia for load balancing.

Configuring an external load balancer

You can configure an OKD cluster on OpenStack to use an external load balancer in place of the default load balancer.

Configuring an external load balancer depends on your vendor’s load balancer.

The information and examples in this section are for guideline purposes only. Consult the vendor documentation for more specific information about the vendor’s load balancer.

Red Hat supports the following services for an external load balancer:

Ingress Controller
OpenShift API
OpenShift MachineConfig API

You can choose whether you want to configure one or all of these services for an external load balancer. Configuring only the Ingress Controller service is a common configuration option. To better understand each service, view the following diagrams:

Figure 1. Example network workflow that shows an Ingress Controller operating in an OKD environment

Figure 2. Example network workflow that shows an OpenShift API operating in an OKD environment

Figure 3. Example network workflow that shows an OpenShift MachineConfig API operating in an OKD environment

The following configuration options are supported for external load balancers:

Use a node selector to map the Ingress Controller to a specific set of nodes. You must assign a static IP address to each node in this set, or configure each node to receive the same IP address from the Dynamic Host Configuration Protocol (DHCP). Infrastructure nodes commonly receive this type of configuration.
Target all IP addresses on a subnet. This configuration can reduce maintenance overhead, because you can create and destroy nodes within those networks without reconfiguring the load balancer targets. If you deploy your ingress pods by using a machine set on a smaller network, such as a /27 or /28, you can simplify your load balancer targets.

You can list all IP addresses that exist in a network by checking the machine config pool’s resources.

Considerations

For a front-end IP address, you can use the same IP address for the front-end IP address, the Ingress Controller’s load balancer, and API load balancer. Check the vendor’s documentation for this capability.
For a back-end IP address, ensure that an IP address for an OKD control plane node does not change during the lifetime of the external load balancer. You can achieve this by completing one of the following actions:
- Assign a static IP address to each control plane node.
- Configure each node to receive the same IP address from the DHCP every time the node requests a DHCP lease. Depending on the vendor, the DHCP lease might be in the form of an IP reservation or a static DHCP assignment.
Manually define each node that runs the Ingress Controller in the external load balancer for the Ingress Controller back-end service. For example, if the Ingress Controller moves to an undefined node, a connection outage can occur.

OpenShift API prerequisites

You defined a front-end IP address.
TCP ports 6443 and 22623 are exposed on the front-end IP address of your load balancer. Check the following items:
- Port 6443 provides access to the OpenShift API service.
- Port 22623 can provide ignition startup configurations to nodes.
The front-end IP address and port 6443 are reachable by all users of your system with a location external to your OKD cluster.
The front-end IP address and port 22623 are reachable only by OKD nodes.
The load balancer backend can communicate with OKD control plane nodes on port 6443 and 22623.

Ingress Controller prerequisites

You defined a front-end IP address.
TCP ports 443 and 80 are exposed on the front-end IP address of your load balancer.
The front-end IP address, port 80 and port 443 are be reachable by all users of your system with a location external to your OKD cluster.
The front-end IP address, port 80 and port 443 are reachable to all nodes that operate in your OKD cluster.
The load balancer backend can communicate with OKD nodes that run the Ingress Controller on ports 80, 443, and 1936.

Prerequisite for health check URL specifications

You can configure most load balancers by setting health check URLs that determine if a service is available or unavailable. OKD provides these health checks for the OpenShift API, Machine Configuration API, and Ingress Controller backend services.

The following examples demonstrate health check specifications for the previously listed backend services:

Example of a Kubernetes API health check specification

Path: HTTPS:6443/readyz
Healthy threshold: 2
Unhealthy threshold: 2
Timeout: 10
Interval: 10

Example of a Machine Config API health check specification

Path: HTTPS:22623/healthz
Healthy threshold: 2
Unhealthy threshold: 2
Timeout: 10
Interval: 10

Example of an Ingress Controller health check specification

Path: HTTP:1936/healthz/ready
Healthy threshold: 2
Unhealthy threshold: 2
Timeout: 5
Interval: 10

Procedure

Configure the HAProxy Ingress Controller, so that you can enable access to the cluster from your load balancer on ports 6443, 443, and 80:

Example HAProxy configuration

#...
listen my-cluster-api-6443
    bind 192.168.1.100:6443
    mode tcp
    balance roundrobin
  option httpchk
  http-check connect
  http-check send meth GET uri /readyz
  http-check expect status 200
    server my-cluster-master-2 192.168.1.101:6443 check inter 10s rise 2 fall 2
    server my-cluster-master-0 192.168.1.102:6443 check inter 10s rise 2 fall 2
    server my-cluster-master-1 192.168.1.103:6443 check inter 10s rise 2 fall 2
listen my-cluster-machine-config-api-22623
    bind 192.168.1.1000.0.0.0:22623
    mode tcp
    balance roundrobin
  option httpchk
  http-check connect
  http-check send meth GET uri /healthz
  http-check expect status 200
    server my-cluster-master-2 192.0168.21.2101:22623 check inter 10s rise 2 fall 2
    server my-cluster-master-0 192.168.1.1020.2.3:22623 check inter 10s rise 2 fall 2
    server my-cluster-master-1 192.168.1.1030.2.1:22623 check inter 10s rise 2 fall 2
listen my-cluster-apps-443
        bind 192.168.1.100:443
        mode tcp
        balance roundrobin
    option httpchk
    http-check connect
    http-check send meth GET uri /healthz/ready
    http-check expect status 200
        server my-cluster-worker-0 192.168.1.111:443 check port 1936 inter 10s rise 2 fall 2
        server my-cluster-worker-1 192.168.1.112:443 check port 1936 inter 10s rise 2 fall 2
        server my-cluster-worker-2 192.168.1.113:443 check port 1936 inter 10s rise 2 fall 2
listen my-cluster-apps-80
        bind 192.168.1.100:80
        mode tcp
        balance roundrobin
    option httpchk
    http-check connect
    http-check send meth GET uri /healthz/ready
    http-check expect status 200
        server my-cluster-worker-0 192.168.1.111:80 check port 1936 inter 10s rise 2 fall 2
        server my-cluster-worker-1 192.168.1.112:80 check port 1936 inter 10s rise 2 fall 2
        server my-cluster-worker-2 192.168.1.113:80 check port 1936 inter 10s rise 2 fall 2
# ...

Use the curl CLI command to verify that the external load balancer and its resources are operational:

Verify that the cluster machine configuration API is accessible to the Kubernetes API server resource, by running the following command and observing the response:

$ curl https://<loadbalancer_ip_address>:6443/version --insecure

If the configuration is correct, you receive a JSON object in response:

{
  "major": "1",
  "minor": "11+",
  "gitVersion": "v1.11.0+ad103ed",
  "gitCommit": "ad103ed",
  "gitTreeState": "clean",
  "buildDate": "2019-01-09T06:44:10Z",
  "goVersion": "go1.10.3",
  "compiler": "gc",
  "platform": "linux/amd64"
}

Verify that the cluster machine configuration API is accessible to the Machine config server resource, by running the following command and observing the output:
```
$ curl -v https://<loadbalancer_ip_address>:22623/healthz --insecure
```
If the configuration is correct, the output from the command shows the following response:
```
HTTP/1.1 200 OK
Content-Length: 0
```

Verify that the controller is accessible to the Ingress Controller resource on port 80, by running the following command and observing the output:

$ curl -I -L -H "Host: console-openshift-console.apps.<cluster_name>.<base_domain>" http://<load_balancer_front_end_IP_address>

If the configuration is correct, the output from the command shows the following response:

HTTP/1.1 302 Found
content-length: 0
location: https://console-openshift-console.apps.ocp4.private.opequon.net/
cache-control: no-cache

Verify that the controller is accessible to the Ingress Controller resource on port 443, by running the following command and observing the output:

$ curl -I -L --insecure --resolve console-openshift-console.apps.<cluster_name>.<base_domain>:443:<Load Balancer Front End IP Address> https://console-openshift-console.apps.<cluster_name>.<base_domain>

If the configuration is correct, the output from the command shows the following response:

HTTP/1.1 200 OK
referrer-policy: strict-origin-when-cross-origin
set-cookie: csrf-token=UlYWOyQ62LWjw2h003xtYSKlh1a0Py2hhctw0WmV2YEdhJjFyQwWcGBsja261dGLgaYO0nxzVErhiXt6QepA7g==; Path=/; Secure; SameSite=Lax
x-content-type-options: nosniff
x-dns-prefetch-control: off
x-frame-options: DENY
x-xss-protection: 1; mode=block
date: Wed, 04 Oct 2023 16:29:38 GMT
content-type: text/html; charset=utf-8
set-cookie: 1e2670d92730b515ce3a1bb65da45062=1bf5e9573c9a2760c964ed1659cc1673; path=/; HttpOnly; Secure; SameSite=None
cache-control: private

Configure the DNS records for your cluster to target the front-end IP addresses of the external load balancer. You must update records to your DNS server for the cluster API and applications over the load balancer.

Examples of modified DNS records
```
<load_balancer_ip_address>  A  api.<cluster_name>.<base_domain>
A record pointing to Load Balancer Front End
```
```
<load_balancer_ip_address>   A apps.<cluster_name>.<base_domain>
A record pointing to Load Balancer Front End
```
DNS propagation might take some time for each DNS record to become available. Ensure that each DNS record propagates before validating each record.

Use the curl CLI command to verify that the external load balancer and DNS record configuration are operational:

Verify that you can access the cluster API, by running the following command and observing the output:

$ curl https://api.<cluster_name>.<base_domain>:6443/version --insecure

If the configuration is correct, you receive a JSON object in response:

{
  "major": "1",
  "minor": "11+",
  "gitVersion": "v1.11.0+ad103ed",
  "gitCommit": "ad103ed",
  "gitTreeState": "clean",
  "buildDate": "2019-01-09T06:44:10Z",
  "goVersion": "go1.10.3",
  "compiler": "gc",
  "platform": "linux/amd64"
  }

Verify that you can access the cluster machine configuration, by running the following command and observing the output:
```
$ curl -v https://api.<cluster_name>.<base_domain>:22623/healthz --insecure
```
If the configuration is correct, the output from the command shows the following response:
```
HTTP/1.1 200 OK
Content-Length: 0
```

Verify that you can access each cluster application on port, by running the following command and observing the output:

$ curl http://console-openshift-console.apps.<cluster_name>.<base_domain> -I -L --insecure

If the configuration is correct, the output from the command shows the following response:

HTTP/1.1 302 Found
content-length: 0
location: https://console-openshift-console.apps.<cluster-name>.<base domain>/
cache-control: no-cacheHTTP/1.1 200 OK
referrer-policy: strict-origin-when-cross-origin
set-cookie: csrf-token=39HoZgztDnzjJkq/JuLJMeoKNXlfiVv2YgZc09c3TBOBU4NI6kDXaJH1LdicNhN1UsQWzon4Dor9GWGfopaTEQ==; Path=/; Secure
x-content-type-options: nosniff
x-dns-prefetch-control: off
x-frame-options: DENY
x-xss-protection: 1; mode=block
date: Tue, 17 Nov 2020 08:42:10 GMT
content-type: text/html; charset=utf-8
set-cookie: 1e2670d92730b515ce3a1bb65da45062=9b714eb87e93cf34853e87a92d6894be; path=/; HttpOnly; Secure; SameSite=None
cache-control: private

Verify that you can access each cluster application on port 443, by running the following command and observing the output:

$ curl https://console-openshift-console.apps.<cluster_name>.<base_domain> -I -L --insecure

If the configuration is correct, the output from the command shows the following response:

HTTP/1.1 200 OK
referrer-policy: strict-origin-when-cross-origin
set-cookie: csrf-token=UlYWOyQ62LWjw2h003xtYSKlh1a0Py2hhctw0WmV2YEdhJjFyQwWcGBsja261dGLgaYO0nxzVErhiXt6QepA7g==; Path=/; Secure; SameSite=Lax
x-content-type-options: nosniff
x-dns-prefetch-control: off
x-frame-options: DENY
x-xss-protection: 1; mode=block
date: Wed, 04 Oct 2023 16:29:38 GMT
content-type: text/html; charset=utf-8
set-cookie: 1e2670d92730b515ce3a1bb65da45062=1bf5e9573c9a2760c964ed1659cc1673; path=/; HttpOnly; Secure; SameSite=None
cache-control: private