Setting up the environment for an OKD installation

Preparing the provisioner node for OKD installation on IBM Cloud

Perform the following steps to prepare the provisioner node.

Procedure

  1. Log in to the provisioner node via ssh.

  2. Create a non-root user (kni) and provide that user with sudo privileges:

    # useradd kni
    # passwd kni
    # echo "kni ALL=(root) NOPASSWD:ALL" | tee -a /etc/sudoers.d/kni
    # chmod 0440 /etc/sudoers.d/kni

  3. Create an ssh key for the new user:

    # su - kni -c "ssh-keygen -f /home/kni/.ssh/id_rsa -N ''"

  4. Log in as the new user on the provisioner node:

    # su - kni

  5. Use Red Hat Subscription Manager to register the provisioner node:

    $ sudo subscription-manager register --username=<user> --password=<pass> --auto-attach
    $ sudo subscription-manager repos --enable=rhel-8-for-x86_64-appstream-rpms \
        --enable=rhel-8-for-x86_64-baseos-rpms

    For more information about Red Hat Subscription Manager, see Using and Configuring Red Hat Subscription Manager.

  6. Install the following packages:

    $ sudo dnf install -y libvirt qemu-kvm mkisofs python3-devel jq ipmitool

  7. Add the newly created user to the libvirt group:

    $ sudo usermod --append --groups libvirt kni

  8. Start firewalld:

    $ sudo systemctl start firewalld

  9. Enable firewalld:

    $ sudo systemctl enable firewalld

  10. Allow the http service through the public zone of the firewall:

    $ sudo firewall-cmd --zone=public --add-service=http --permanent
    $ sudo firewall-cmd --reload

  11. Start and enable the libvirtd service:

    $ sudo systemctl enable libvirtd --now
  12. Set the ID of the provisioner node:

    $ PRVN_HOST_ID=<ID>

    You can view the ID with the following ibmcloud command:

    $ ibmcloud sl hardware list

  13. Set the ID of the public subnet:

    $ PUBLICSUBNETID=<ID>

    You can view the ID with the following ibmcloud command:

    $ ibmcloud sl subnet list

  14. Set the ID of the private subnet:

    $ PRIVSUBNETID=<ID>

    You can view the ID with the following ibmcloud command:

    $ ibmcloud sl subnet list

  15. Set the provisioner node public IP address:

    $ PRVN_PUB_IP=$(ibmcloud sl hardware detail $PRVN_HOST_ID --output JSON | jq .primaryIpAddress -r)

  16. Set the CIDR for the public network:

    $ PUBLICCIDR=$(ibmcloud sl subnet detail $PUBLICSUBNETID --output JSON | jq .cidr)

  17. Set the IP address and CIDR for the public network:

    $ PUB_IP_CIDR=$PRVN_PUB_IP/$PUBLICCIDR

  18. Set the gateway for the public network:

    $ PUB_GATEWAY=$(ibmcloud sl subnet detail $PUBLICSUBNETID --output JSON | jq .gateway -r)

  19. Set the private IP address of the provisioner node:

    $ PRVN_PRIV_IP=$(ibmcloud sl hardware detail $PRVN_HOST_ID --output JSON | \
        jq .primaryBackendIpAddress -r)

  20. Set the CIDR for the private network:

    $ PRIVCIDR=$(ibmcloud sl subnet detail $PRIVSUBNETID --output JSON | jq .cidr)

  21. Set the IP address and CIDR for the private network:

    $ PRIV_IP_CIDR=$PRVN_PRIV_IP/$PRIVCIDR

  22. Set the gateway for the private network:

    $ PRIV_GATEWAY=$(ibmcloud sl subnet detail $PRIVSUBNETID --output JSON | jq .gateway -r)

  23. Set up the bridges for the baremetal and provisioning networks:

    $ sudo nohup bash -c "
        nmcli --get-values UUID con show | xargs -n 1 nmcli con delete
        nmcli connection add ifname provisioning type bridge con-name provisioning
        nmcli con add type bridge-slave ifname eth1 master provisioning
        nmcli connection add ifname baremetal type bridge con-name baremetal
        nmcli con add type bridge-slave ifname eth2 master baremetal
        nmcli connection modify baremetal ipv4.addresses $PUB_IP_CIDR ipv4.method manual ipv4.gateway $PUB_GATEWAY
        nmcli connection modify provisioning ipv4.addresses 172.22.0.1/24,$PRIV_IP_CIDR ipv4.method manual
        nmcli connection modify provisioning +ipv4.routes \"10.0.0.0/8 $PRIV_GATEWAY\"
        nmcli con down baremetal
        nmcli con up baremetal
        nmcli con down provisioning
        nmcli con up provisioning
        init 6
    "

    For eth1 and eth2, substitute the appropriate interface name, as needed.

  24. If required, SSH back into the provisioner node:

    # ssh kni@provisioner.<cluster-name>.<domain>

  25. Verify the connection bridges have been properly created:

    $ sudo nmcli con show

    Example output

    NAME               UUID                                  TYPE      DEVICE
    baremetal          4d5133a5-8351-4bb9-bfd4-3af264801530  bridge    baremetal
    provisioning       43942805-017f-4d7d-a2c2-7cb3324482ed  bridge    provisioning
    virbr0             d9bca40f-eee1-410b-8879-a2d4bb0465e7  bridge    virbr0
    bridge-slave-eth1  76a8ed50-c7e5-4999-b4f6-6d9014dd0812  ethernet  eth1
    bridge-slave-eth2  f31c3353-54b7-48de-893a-02d2b34c4736  ethernet  eth2

  26. Create a pull-secret.txt file:

    $ vim pull-secret.txt

    In a web browser, navigate to Install on Bare Metal with user-provisioned infrastructure. In step 1, click Download pull secret. Paste the contents into the pull-secret.txt file and save the contents in the kni user’s home directory.

Configuring the public subnet

All of the OKD cluster nodes must be on the public subnet. IBM Cloud® does not provide a DHCP server on the subnet. Set it up separately on the provisioner node.

Rebooting the provisioner node after preparing it deletes the Bash variables that you set during preparation. You must set those variables again before you continue.
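
One way to avoid retyping the variables is to write them to a file before the reboot and read them back afterwards. The following is a minimal sketch, not part of the documented procedure. The file location `~/provisioner-env.sh` and the function names are assumptions chosen for illustration; the variable names are the ones set while preparing the provisioner node.

```shell
#!/usr/bin/env bash
# Sketch: persist the provisioner variables so they survive a reboot.
# ENV_FILE is a hypothetical location; any file readable by the kni user works.
ENV_FILE="${ENV_FILE:-$HOME/provisioner-env.sh}"

# Variable names set while preparing the provisioner node.
PROVISIONER_VARS=(PRVN_HOST_ID PUBLICSUBNETID PRIVSUBNETID PRVN_PUB_IP PUBLICCIDR
                  PUB_IP_CIDR PUB_GATEWAY PRVN_PRIV_IP PRIVCIDR PRIV_IP_CIDR PRIV_GATEWAY)

# Write every variable that is currently set to ENV_FILE as a shell assignment.
save_provisioner_vars() {
  local name
  : > "$ENV_FILE"
  for name in "${PROVISIONER_VARS[@]}"; do
    if [ -n "${!name:-}" ]; then
      printf '%s=%q\n' "$name" "${!name}" >> "$ENV_FILE"
    fi
  done
  chmod 0600 "$ENV_FILE"
}

# Read the saved assignments back into the current shell.
load_provisioner_vars() {
  [ -r "$ENV_FILE" ] && . "$ENV_FILE"
}
```

Under these assumptions, you would call `save_provisioner_vars` before setting up the bridges (which reboots the node) and `load_provisioner_vars` after logging back in as the kni user.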

Procedure

  1. Install dnsmasq:

    $ sudo dnf install dnsmasq

  2. Open the dnsmasq configuration file:

    $ sudo vi /etc/dnsmasq.conf

  3. Add the following configuration to the dnsmasq configuration file:

    interface=baremetal
    except-interface=lo
    bind-dynamic
    log-dhcp
    dhcp-range=<ip_addr>,<ip_addr>,<pub_cidr> (1)
    dhcp-option=baremetal,121,0.0.0.0/0,<pub_gateway>,<prvn_priv_ip>,<prvn_pub_ip> (2)
    dhcp-hostsfile=/var/lib/dnsmasq/dnsmasq.hostsfile

    (1) Set the DHCP range. Replace both instances of <ip_addr> with one unused IP address from the public subnet so that the dhcp-range for the baremetal network begins and ends with the same IP address. Replace <pub_cidr> with the CIDR of the public subnet.
    (2) Set the DHCP option. Replace <pub_gateway> with the IP address of the gateway for the baremetal network. Replace <prvn_priv_ip> with the provisioner node’s private IP address on the provisioning network. Replace <prvn_pub_ip> with the provisioner node’s public IP address on the baremetal network.

    To retrieve the value for <pub_cidr>, execute:

    $ ibmcloud sl subnet detail <publicsubnetid> --output JSON | jq .cidr

    Replace <publicsubnetid> with the ID of the public subnet.

    To retrieve the value for <pub_gateway>, execute:

    $ ibmcloud sl subnet detail <publicsubnetid> --output JSON | jq .gateway -r

    Replace <publicsubnetid> with the ID of the public subnet.

    To retrieve the value for <prvn_priv_ip>, execute:

    $ ibmcloud sl hardware detail <id> --output JSON | \
        jq .primaryBackendIpAddress -r

    Replace <id> with the ID of the provisioner node.

    To retrieve the value for <prvn_pub_ip>, execute:

    $ ibmcloud sl hardware detail <id> --output JSON | jq .primaryIpAddress -r

    Replace <id> with the ID of the provisioner node.

  4. Obtain the list of hardware for the cluster:

    $ ibmcloud sl hardware list

  5. Obtain the MAC addresses and IP addresses for each node:

    $ ibmcloud sl hardware detail <id> --output JSON | \
        jq '.networkComponents[] |
        "\(.primaryIpAddress) \(.macAddress)"' | grep -v null

    Replace <id> with the ID of the node.

    Example output

    "10.196.130.144 00:e0:ed:6a:ca:b4"
    "141.125.65.215 00:e0:ed:6a:ca:b5"

    Make a note of the MAC address and IP address of the public network. Make a separate note of the MAC address of the private network, which you will use later in the install-config.yaml file. Repeat this procedure for each node until you have all the public MAC and IP addresses for the public baremetal network, and the MAC addresses of the private provisioning network.

  6. Add the MAC and IP address pair of the public baremetal network for each node into the dnsmasq.hostsfile file:

    $ sudo vim /var/lib/dnsmasq/dnsmasq.hostsfile

    Example input

    00:e0:ed:6a:ca:b5,141.125.65.215,master-0
    <mac>,<ip>,master-1
    <mac>,<ip>,master-2
    <mac>,<ip>,worker-0
    <mac>,<ip>,worker-1
    ...

    Replace <mac>,<ip> with the public MAC address and public IP address of the corresponding node name.

  7. Start dnsmasq:

    $ sudo systemctl start dnsmasq

  8. Enable dnsmasq so that it starts when booting the node:

    $ sudo systemctl enable dnsmasq

  9. Verify dnsmasq is running:

    $ sudo systemctl status dnsmasq

    Example output

    dnsmasq.service - DNS caching server.
      Loaded: loaded (/usr/lib/systemd/system/dnsmasq.service; enabled; vendor preset: disabled)
      Active: active (running) since Tue 2021-10-05 05:04:14 CDT; 49s ago
      Main PID: 3101 (dnsmasq)
      Tasks: 1 (limit: 204038)
      Memory: 732.0K
      CGroup: /system.slice/dnsmasq.service
              └─3101 /usr/sbin/dnsmasq -k

  10. Open ports 53 and 67 with UDP protocol:

    $ sudo firewall-cmd --add-port 53/udp --permanent
    $ sudo firewall-cmd --add-port 67/udp --permanent

  11. Add provisioning to the external zone with masquerade:

    $ sudo firewall-cmd --change-zone=provisioning --zone=external --permanent

    This step ensures network address translation for IPMI calls to the management subnet.

  12. Reload the firewalld configuration:

    $ sudo firewall-cmd --reload
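
The conversion from the "IP MAC" pairs printed when you obtain the addresses of a node to a dnsmasq.hostsfile line can be scripted. The following is a minimal sketch, not part of the documented procedure. The function name `hostsfile_entry` is hypothetical, and the sketch assumes that the private address of a node is in 10.0.0.0/8, so that the remaining pair is the public one.

```shell
#!/usr/bin/env bash
# Sketch: turn "IP MAC" pairs (as printed by the jq command) into a
# dnsmasq.hostsfile line of the form <mac>,<ip>,<name>.
# Assumption: the private (backend) address is in 10.0.0.0/8.
hostsfile_entry() {
  local node_name="$1" line ip mac
  while IFS= read -r line; do
    line="${line//\"/}"          # drop the quotes that jq prints without -r
    read -r ip mac <<< "$line"
    case "$ip" in
      ""|10.*) continue ;;       # skip empty lines and private addresses
    esac
    printf '%s,%s,%s\n' "$mac" "$ip" "$node_name"
  done
}

# Example with the pairs from the example output in this procedure.
printf '%s\n' '"10.196.130.144 00:e0:ed:6a:ca:b4"' \
              '"141.125.65.215 00:e0:ed:6a:ca:b5"' |
  hostsfile_entry master-0
# prints: 00:e0:ed:6a:ca:b5,141.125.65.215,master-0
```

In practice, you could pipe the output of the `ibmcloud sl hardware detail` command for each node into the function and append the result to the hosts file after checking it by eye.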

Retrieving the OKD installer

Use the stable-4.x version of the installation program and your selected architecture to deploy the generally available stable version of OKD:

    $ export VERSION=stable-4.11
    $ export RELEASE_ARCH=<architecture>
    $ export RELEASE_IMAGE=$(curl -s https://mirror.openshift.com/pub/openshift-v4/$RELEASE_ARCH/clients/ocp/$VERSION/release.txt | grep 'Pull From: quay.io' | awk -F ' ' '{print $3}')
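
If the download fails or the file has an unexpected layout, the pipeline above leaves RELEASE_IMAGE empty without any error. The following is a minimal sketch of the same extraction with an explicit check. The function name `release_image_from` is hypothetical, and the sample input uses placeholders rather than a real repository or digest.

```shell
#!/usr/bin/env bash
# Sketch: extract the release image from release.txt content on stdin
# and fail loudly when no matching line is present.
release_image_from() {
  local image
  image="$(grep 'Pull From: quay.io' | awk '{print $3}')"
  if [ -z "$image" ]; then
    echo "no 'Pull From: quay.io' line found" >&2
    return 1
  fi
  printf '%s\n' "$image"
}

# Illustrative input; the repository path and digest are placeholders.
sample='Name:      <version>
Pull From: quay.io/<repository>@sha256:<digest>'
printf '%s\n' "$sample" | release_image_from
```

Under these assumptions, you could pipe the `curl -s` output into `release_image_from` and stop if it returns a non-zero status.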

Extracting the OKD installer

After retrieving the installer, the next step is to extract it.

Procedure

  1. Set the environment variables:

    $ export cmd=openshift-baremetal-install
    $ export pullsecret_file=~/pull-secret.txt
    $ export extract_dir=$(pwd)

  2. Get the oc binary:

    $ curl -s https://mirror.openshift.com/pub/openshift-v4/clients/ocp/$VERSION/openshift-client-linux.tar.gz | tar zxvf - oc

  3. Extract the installer:

    $ sudo cp oc /usr/local/bin
    $ oc adm release extract --registry-config "${pullsecret_file}" --command=$cmd --to "${extract_dir}" ${RELEASE_IMAGE}
    $ sudo cp openshift-baremetal-install /usr/local/bin

Configuring the install-config.yaml file

The install-config.yaml file requires some additional details. Most of this information tells the installer and the resulting cluster enough about the available IBM Cloud® hardware to manage it fully. The material difference between installing on bare metal and installing on IBM Cloud is that you must explicitly set the privilege level for IPMI in the BMC section of the install-config.yaml file.

Procedure

  1. Configure install-config.yaml. Change the appropriate variables to match the environment, including pullSecret and sshKey.

    apiVersion: v1
    baseDomain: <domain>
    metadata:
      name: <cluster_name>
    networking:
      machineNetwork:
      - cidr: <public-cidr>
      networkType: OVNKubernetes
    compute:
    - name: worker
      replicas: 2
    controlPlane:
      name: master
      replicas: 3
      platform:
        baremetal: {}
    platform:
      baremetal:
        apiVIP: <api_ip>
        ingressVIP: <wildcard_ip>
        provisioningNetworkInterface: <NIC1>
        provisioningNetworkCIDR: <CIDR>
        hosts:
          - name: openshift-master-0
            role: master
            bmc:
              address: ipmi://10.196.130.145?privilegelevel=OPERATOR (1)
              username: root
              password: <password>
            bootMACAddress: 00:e0:ed:6a:ca:b4 (2)
            rootDeviceHints:
              deviceName: "/dev/sda"
          - name: openshift-worker-0
            role: worker
            bmc:
              address: ipmi://<out-of-band-ip>?privilegelevel=OPERATOR (1)
              username: <user>
              password: <password>
            bootMACAddress: <NIC1_mac_address> (2)
            rootDeviceHints:
              deviceName: "/dev/sda"
    pullSecret: '<pull_secret>'
    sshKey: '<ssh_pub_key>'

    (1) The bmc.address provides a privilegelevel configuration setting with the value set to OPERATOR. This is required for IBM Cloud.
    (2) Add the MAC address of the private provisioning network NIC for the corresponding node.

    You can use the ibmcloud command-line utility to retrieve the password.

    $ ibmcloud sl hardware detail <id> --output JSON | \
        jq '"\(.networkManagementIpAddress) \(.remoteManagementAccounts[0].password)"'

    Replace <id> with the ID of the node.

  2. Create a directory to store the cluster configuration:

    $ mkdir ~/clusterconfigs

  3. Copy the install-config.yaml file into the directory:

    $ cp install-config.yaml ~/clusterconfigs

  4. Ensure all bare metal nodes are powered off prior to installing the OKD cluster:

    $ ipmitool -I lanplus -U <user> -P <password> -H <management_server_ip> power off

  5. Remove old bootstrap resources if any are left over from a previous deployment attempt:

    for i in $(sudo virsh list | tail -n +3 | grep bootstrap | awk '{print $2}');
    do
      sudo virsh destroy $i;
      sudo virsh undefine $i;
      sudo virsh vol-delete $i --pool $i;
      sudo virsh vol-delete $i.ign --pool $i;
      sudo virsh pool-destroy $i;
      sudo virsh pool-undefine $i;
    done
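
Because the IPMI privilege level must be set explicitly for every host on IBM Cloud, it can help to check the file before running the installer. The following is a minimal sketch, not part of the documented procedure. The function name `check_ipmi_privilegelevel` is hypothetical, and the check is a plain text match rather than a YAML parser.

```shell
#!/usr/bin/env bash
# Sketch: list ipmi:// BMC addresses that do not set a privilege level.
# Prints the offending lines (with line numbers) to stderr and returns 1
# if any are found; returns 0 otherwise.
check_ipmi_privilegelevel() {
  local file="$1" missing
  missing="$(grep -n 'address:[[:space:]]*ipmi://' "$file" | grep -v 'privilegelevel=')"
  if [ -n "$missing" ]; then
    printf 'missing privilegelevel:\n%s\n' "$missing" >&2
    return 1
  fi
}
```

For example, `check_ipmi_privilegelevel ~/clusterconfigs/install-config.yaml` would report any host whose bmc.address lacks the `privilegelevel` query parameter.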

Additional install-config parameters

See the following tables for the required parameters, the hosts parameter, and the bmc parameter for the install-config.yaml file.

Table 1. Required parameters
Parameters | Default | Description

baseDomain

The domain name for the cluster. For example, example.com.

bootMode

UEFI

The boot mode for a node. Options are legacy, UEFI, and UEFISecureBoot. If bootMode is not set, Ironic sets it while inspecting the node.

bootstrapExternalStaticIP

The static IP address for the bootstrap VM. You must set this value when deploying a cluster with static IP addresses when there is no DHCP server on the baremetal network.

bootstrapExternalStaticGateway

The static IP address of the gateway for the bootstrap VM. You must set this value when deploying a cluster with static IP addresses when there is no DHCP server on the baremetal network.

sshKey

The sshKey configuration setting contains the key in the ~/.ssh/id_rsa.pub file required to access the control plane nodes and worker nodes. Typically, this key is from the provisioner node.

pullSecret

The pullSecret configuration setting contains a copy of the pull secret downloaded from the Install OpenShift on Bare Metal page when preparing the provisioner node.

metadata:
  name:

The name to be given to the OKD cluster. For example, openshift.

networking:
  machineNetwork:
  - cidr:

The public CIDR (Classless Inter-Domain Routing) of the external network. For example, 10.0.0.0/24.

compute:
  - name: worker

The OKD cluster requires a name be provided for worker (or compute) nodes even if there are zero nodes.

compute:
  replicas: 2

Replicas sets the number of worker (or compute) nodes in the OKD cluster.

controlPlane:
  name: master

The OKD cluster requires a name for control plane (master) nodes.

controlPlane:
  replicas: 3

Replicas sets the number of control plane (master) nodes included as part of the OKD cluster.

provisioningNetworkInterface

The name of the network interface on nodes connected to the provisioning network. For OKD 4.9 and later releases, use the bootMACAddress configuration setting to enable Ironic to identify the IP address of the NIC instead of using the provisioningNetworkInterface configuration setting to identify the name of the NIC.

defaultMachinePlatform

The default configuration used for machine pools without a platform configuration.

apiVIP

(Optional) The virtual IP address for Kubernetes API communication.

This setting must either be provided in the install-config.yaml file as a reserved IP from the MachineNetwork or pre-configured in the DNS so that the default name resolves correctly. Use the virtual IP address and not the FQDN when adding a value to the apiVIP configuration setting in the install-config.yaml file. The IP address must be from the primary IPv4 network when using dual stack networking. If not set, the installer uses api.<cluster_name>.<base_domain> to derive the IP address from the DNS.

disableCertificateVerification

False

redfish and redfish-virtualmedia need this parameter to manage BMC addresses. The value should be True when using a self-signed certificate for BMC addresses.

ingressVIP

(Optional) The virtual IP address for ingress traffic.

This setting must either be provided in the install-config.yaml file as a reserved IP from the MachineNetwork or pre-configured in the DNS so that the default name resolves correctly. Use the virtual IP address and not the FQDN when adding a value to the ingressVIP configuration setting in the install-config.yaml file. The IP address must be from the primary IPv4 network when using dual stack networking. If not set, the installer uses test.apps.<cluster_name>.<base_domain> to derive the IP address from the DNS.

Table 2. Optional Parameters
Parameters | Default | Description

provisioningDHCPRange

172.22.0.10,172.22.0.100

Defines the IP range for nodes on the provisioning network.

provisioningNetworkCIDR

172.22.0.0/24

The CIDR for the network to use for provisioning. This option is required when not using the default address range on the provisioning network.

clusterProvisioningIP

The third IP address of the provisioningNetworkCIDR.

The IP address within the cluster where the provisioning services run. Defaults to the third IP address of the provisioning subnet. For example, 172.22.0.3.

bootstrapProvisioningIP

The second IP address of the provisioningNetworkCIDR.

The IP address on the bootstrap VM where the provisioning services run while the installer is deploying the control plane (master) nodes. Defaults to the second IP address of the provisioning subnet. For example, 172.22.0.2 or 2620:52:0:1307::2.

externalBridge

baremetal

The name of the baremetal bridge of the hypervisor attached to the baremetal network.

provisioningBridge

provisioning

The name of the provisioning bridge on the provisioner host attached to the provisioning network.

architecture

Defines the host architecture for your cluster. Valid values are amd64 or arm64.

defaultMachinePlatform

The default configuration used for machine pools without a platform configuration.

bootstrapOSImage

A URL to override the default operating system image for the bootstrap node. The URL must contain a SHA-256 hash of the image. For example: https://mirror.openshift.com/rhcos-<version>-qemu.qcow2.gz?sha256=<uncompressed_sha256>.

provisioningNetwork

The provisioningNetwork configuration setting determines whether the cluster uses the provisioning network. If it does, the configuration setting also determines if the cluster manages the network.

Disabled: Set this parameter to Disabled to disable the requirement for a provisioning network. When set to Disabled, you must only use virtual media based provisioning, or bring up the cluster using the assisted installer. If Disabled and using power management, BMCs must be accessible from the baremetal network. If Disabled, you must provide two IP addresses on the baremetal network that are used for the provisioning services.

Managed: Set this parameter to Managed, which is the default, to fully manage the provisioning network, including DHCP, TFTP, and so on.

Unmanaged: Set this parameter to Unmanaged to enable the provisioning network but take care of manual configuration of DHCP. Virtual media provisioning is recommended but PXE is still available if required.

httpProxy

Set this parameter to the appropriate HTTP proxy used within your environment.

httpsProxy

Set this parameter to the appropriate HTTPS proxy used within your environment.

noProxy

Set this parameter to the appropriate list of exclusions for proxy usage within your environment.
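
The defaults for bootstrapProvisioningIP and clusterProvisioningIP are described as the second and third IP addresses of the provisioningNetworkCIDR. The following is a minimal sketch of that arithmetic for IPv4, counting from the network address as in the examples in the table. The function name `nth_ip` is hypothetical, and this is not the installer's own code.

```shell
#!/usr/bin/env bash
# Sketch: print the address at a given offset from the network address
# of an IPv4 CIDR. Offset 2 is the "second" address, offset 3 the "third".
nth_ip() {
  local cidr="$1" offset="$2" base prefix a b c d addr mask result
  base="${cidr%/*}"
  prefix="${cidr#*/}"
  IFS=. read -r a b c d <<< "$base"
  addr=$(( (a << 24) | (b << 16) | (c << 8) | d ))
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  result=$(( (addr & mask) + offset ))
  printf '%d.%d.%d.%d\n' $(( (result >> 24) & 255 )) $(( (result >> 16) & 255 )) \
                         $(( (result >> 8) & 255 )) $(( result & 255 ))
}

nth_ip 172.22.0.0/24 2   # prints 172.22.0.2 (bootstrapProvisioningIP default)
nth_ip 172.22.0.0/24 3   # prints 172.22.0.3 (clusterProvisioningIP default)
```

The same function can be used to preview the defaults when you set a non-default provisioningNetworkCIDR.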

Hosts

The hosts parameter is a list of separate bare metal assets used to build the cluster.

Table 3. Hosts
Name | Default | Description

name

The name of the BareMetalHost resource to associate with the details. For example, openshift-master-0.

role

The role of the bare metal node. Either master or worker.

bmc

Connection details for the baseboard management controller. See the BMC addressing section for additional details.

bootMACAddress

The MAC address of the NIC that the host uses for the provisioning network. Ironic retrieves the IP address using the bootMACAddress configuration setting. Then, it binds to the host.

You must provide a valid MAC address from the host if you disabled the provisioning network.

networkConfig

Set this optional parameter to configure the network interface of a host. See “(Optional) Configuring host network interfaces” for additional details.

Root device hints

The rootDeviceHints parameter enables the installer to provision the Fedora CoreOS (FCOS) image to a particular device. The installer examines the devices in the order it discovers them, and compares the discovered values with the hint values. The installer uses the first discovered device that matches the hint value. The configuration can combine multiple hints, but a device must match all hints for the installer to select it.

Table 4. Subfields
Subfield | Description

deviceName

A string containing a Linux device name like /dev/vda. The hint must match the actual value exactly.

hctl

A string containing a SCSI bus address like 0:0:0:0. The hint must match the actual value exactly.

model

A string containing a vendor-specific device identifier. The hint can be a substring of the actual value.

vendor

A string containing the name of the vendor or manufacturer of the device. The hint can be a substring of the actual value.

serialNumber

A string containing the device serial number. The hint must match the actual value exactly.

minSizeGigabytes

An integer representing the minimum size of the device in gigabytes.

wwn

A string containing the unique storage identifier. The hint must match the actual value exactly.

wwnWithExtension

A string containing the unique storage identifier with the vendor extension appended. The hint must match the actual value exactly.

wwnVendorExtension

A string containing the unique vendor storage identifier. The hint must match the actual value exactly.

rotational

A boolean indicating whether the device should be a rotating disk (true) or not (false).

Example usage

  - name: master-0
    role: master
    bmc:
      address: ipmi://10.10.0.3:6203
      username: admin
      password: redhat
    bootMACAddress: de:ad:be:ef:00:40
    rootDeviceHints:
      deviceName: "/dev/sda"
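
The matching rules for root device hints can be illustrated with a short script. The following is a minimal sketch of the rules as the subfields table describes them (exact match, substring match for model and vendor, minimum size, and all hints must match). The function names `hint_matches` and `device_matches` are hypothetical, and this is not the installer's own implementation.

```shell
#!/usr/bin/env bash
# Sketch of the hint matching rules; not the installer's actual code.
# hint_matches SUBFIELD HINT ACTUAL -> exit status 0 when the hint matches.
hint_matches() {
  local subfield="$1" hint="$2" actual="$3"
  case "$subfield" in
    model|vendor)     [[ "$actual" == *"$hint"* ]] ;;  # hint may be a substring
    minSizeGigabytes) (( actual >= hint )) ;;          # device is at least this large
    *)                [[ "$actual" == "$hint" ]] ;;    # all other hints match exactly
  esac
}

# A device qualifies only when every supplied hint matches.
# Arguments are triples of SUBFIELD HINT ACTUAL.
device_matches() {
  while [ "$#" -ge 3 ]; do
    hint_matches "$1" "$2" "$3" || return 1
    shift 3
  done
}

# Example: the name matches, but the device is smaller than the minimum size.
if device_matches deviceName /dev/sda /dev/sda minSizeGigabytes 500 480; then
  echo "device selected"
else
  echo "device skipped"
fi
# prints: device skipped
```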

Creating the OKD manifests

  1. Create the OKD manifests.

    $ ./openshift-baremetal-install --dir ~/clusterconfigs create manifests

    Example output

    INFO Consuming Install Config from target directory
    WARNING Making control-plane schedulable by setting MastersSchedulable to true for Scheduler cluster settings
    WARNING Discarding the OpenShift Manifest that was provided in the target directory because its dependencies are dirty and it needs to be regenerated

Deploying the cluster via the OKD installer

Run the OKD installer:

  $ ./openshift-baremetal-install --dir ~/clusterconfigs --log-level debug create cluster

Following the installation

During the deployment process, you can check the installation’s overall status by running the tail command on the .openshift_install.log log file in the installation directory:

  $ tail -f /path/to/install-dir/.openshift_install.log
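
With debug logging enabled, the log can be noisy. The following is a minimal sketch for showing only the more serious entries. It assumes that each log entry carries a `level=<name>` field, and the function name `show_install_problems` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: show only warning, error, and fatal entries from the installer log.
# Assumption: entries contain a level=<name> field.
show_install_problems() {
  local log_file="$1"
  grep -E 'level=(warning|error|fatal)' "$log_file"
}
```

For example, `show_install_problems /path/to/install-dir/.openshift_install.log` prints the matching entries. To filter while following the log, a similar pattern can be applied to the output of `tail -f` with `grep --line-buffered -E`.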