- Deploying distributed units at scale in a disconnected environment
- Provisioning edge sites at scale
- The GitOps approach
- About ZTP and distributed units on single nodes
- Zero touch provisioning building blocks
- Single node clusters
- Site planning considerations for distributed unit deployments
- Low latency for distributed units (DUs)
- Configuring BIOS for distributed unit bare-metal hosts
- Preparing the disconnected environment
- Installing Red Hat Advanced Cluster Management in a disconnected environment
- Enabling assisted installer service on bare metal
- ZTP custom resources
- Creating custom resources to install a single managed cluster
- Configuring static IP addresses for managed clusters
- Automated Discovery image ISO process for provisioning clusters
- Checking the managed cluster status
- Configuring a managed cluster for a disconnected environment
- Configuring IPv6 addresses for a disconnected environment
- Troubleshooting the managed cluster
- Applying the RAN policies for monitoring cluster activity
- Cluster provisioning
- Creating ZTP custom resources for multiple managed clusters
- Troubleshooting GitOps ZTP
Deploying distributed units at scale in a disconnected environment
Use zero touch provisioning (ZTP) to provision distributed units at new edge sites in a disconnected environment. The workflow starts when the site is connected to the network and ends with the CNF workload deployed and running on the site nodes.
ZTP for RAN deployments is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process. For more information about the support scope of Red Hat Technology Preview features, see https://access.redhat.com/support/offerings/techpreview/.
Provisioning edge sites at scale
Telco edge computing presents extraordinary challenges with managing hundreds to tens of thousands of clusters in hundreds of thousands of locations. These challenges require fully automated management solutions with, as closely as possible, zero human interaction.
Zero touch provisioning (ZTP) allows you to provision new edge sites with declarative configurations of bare-metal equipment at remote sites. Template or overlay configurations install OKD features that are required for CNF workloads. End-to-end functional test suites are used to verify CNF related features. All configurations are declarative in nature.
You start the workflow by creating declarative configurations for ISO images that are delivered to the edge nodes to begin the installation process. The images are used to repeatedly provision large numbers of nodes efficiently and quickly, allowing you to keep up with requirements from the field for far edge nodes.
Service providers are deploying a more distributed mobile network architecture allowed by the modular functional framework defined for 5G. This allows service providers to move from appliance-based radio access networks (RAN) to open cloud RAN architecture, gaining flexibility and agility in delivering services to end users.
The following diagram shows how ZTP works within a far edge framework.
The GitOps approach
ZTP uses the GitOps deployment set of practices for infrastructure deployment that allows developers to perform tasks that would otherwise fall under the purview of IT operations. GitOps achieves these tasks using declarative specifications stored in Git repositories, such as YAML files and other defined patterns, that provide a framework for deploying the infrastructure. The declarative output is leveraged by the Open Cluster Manager for multisite deployment.
One of the motivators for a GitOps approach is the requirement for reliability at scale. This is a significant challenge that GitOps helps solve.
GitOps addresses the reliability issue by providing traceability, RBAC, and a single source of truth for the desired state of each site. Scale issues are addressed by GitOps providing structure, tooling, and event driven operations through webhooks.
About ZTP and distributed units on single nodes
You can install a distributed unit (DU) on a single node at scale with Red Hat Advanced Cluster Management (RHACM, abbreviated as ACM in this section) using the assisted installer (AI) and the policy generator with core-reduction technology enabled. The DU installation is done using zero touch provisioning (ZTP) in a disconnected environment.
ACM manages clusters in a hub and spoke architecture, where a single hub cluster manages many spoke clusters. ACM applies radio access network (RAN) policies from predefined custom resources (CRs). Hub clusters running ACM provision and deploy the spoke clusters using ZTP and AI. DU installation follows the AI installation of OKD on a single node.
The AI service handles provisioning of OKD on single nodes running on bare metal. ACM ships with and deploys the assisted installer when the MultiClusterHub custom resource is installed.
With ZTP and AI, you can provision OKD single nodes to run your DUs at scale. A high level overview of ZTP for distributed units in a disconnected environment is as follows:
A hub cluster running ACM manages a disconnected internal registry that mirrors the OKD release images. The internal registry is used to provision the spoke single nodes.
You manage the bare-metal host machines for your DUs in an inventory file that uses YAML for formatting. You store the inventory file in a Git repository.
You install the DU bare-metal host machines on site, and make the hosts ready for provisioning. To be ready for provisioning, the following is required for each bare-metal host:
- Network connectivity, including DNS for your network. Hosts should be reachable through the hub and managed spoke clusters. Ensure there is layer 3 connectivity between the hub and the host where you want to install your spoke cluster.
- Baseboard Management Controller (BMC) details for each host. ZTP uses the BMC details, that is, the URL and credentials, to connect to and access the BMC.
Create spoke cluster definition CRs. These define the relevant elements for the managed clusters. Required CRs are as follows:
| Custom Resource | Description |
| --- | --- |
| Namespace | Namespace for the managed single node cluster. |
| BMCSecret CR | Credentials for the host BMC. |
| Image Pull Secret CR | Pull secret for the disconnected registry. |
| AgentClusterInstall | Specifies the single node cluster’s configuration such as networking, number of supervisor (control plane) nodes, and so on. |
| ClusterDeployment | Defines the cluster name, domain, and other details. |
| KlusterletAddonConfig | Manages installation and termination of add-ons on the ManagedCluster for ACM. |
| ManagedCluster | Describes the managed cluster for ACM. |
| InfraEnv | Describes the installation ISO to be mounted on the destination node that the assisted installer service creates. This is the final step of the manifest creation phase. |
| BareMetalHost | Describes the details of the bare-metal host, including BMC and credentials details. |
When a change is detected in the host inventory repository, a host management event is triggered to provision the new or updated host.
The host is provisioned. When the host is provisioned and successfully rebooted, the host agent reports Ready status to the hub cluster.
Zero touch provisioning building blocks
ACM deploys single node OpenShift (SNO), which is OKD installed on single nodes, leveraging zero touch provisioning (ZTP). The initial site plan is broken down into smaller components and initial configuration data is stored in a Git repository. Zero touch provisioning uses a declarative GitOps approach to deploy these nodes. The deployment of the nodes includes:
Installing the host operating system (RHCOS) on a blank server.
Deploying OKD on single nodes.
Creating cluster policies and site subscriptions.
Leveraging a GitOps deployment topology for a develop once, deploy anywhere model.
Making the necessary network configurations to the server operating system.
Deploying profile Operators and performing any needed software-related configuration, such as performance profile, PTP, and SR-IOV.
Downloading images needed to run workloads (CNFs).
Single node clusters
You use zero touch provisioning (ZTP) to deploy single node clusters to run distributed units (DUs) on small hardware footprints at disconnected far edge sites. A single node cluster runs OKD on top of one bare-metal host, hence the single node. Edge servers contain a single node with supervisor functions and worker functions on the same host that are deployed at low bandwidth or disconnected edge sites.
OKD is configured on the single node to use workload partitioning. Workload partitioning separates cluster management workloads from user workloads and can run the cluster management workloads on a reserved set of CPUs. Workload partitioning is useful for resource-constrained environments, such as single-node production deployments, where you want to reserve most of the CPU resources for user workloads and configure OKD to use fewer CPU resources within the host.
A single node cluster hosting a DU application on a node is divided into the following configuration categories:
Common - Values are the same for all single node cluster sites managed by a hub cluster.
Pools of sites - Common across a pool of sites where a pool size can be 1 to n.
Site specific - Likely specific to a site with no overlap with other sites, for example, a VLAN.
Site planning considerations for distributed unit deployments
Site planning for distributed units (DU) deployments is complex. The following is an overview of the tasks that you complete before the DU hosts are brought online in the production environment.
Develop a network model. The network model depends on various factors such as the size of the area of coverage, number of hosts, projected traffic load, DNS, and DHCP requirements.
Decide how many DU radio nodes are required to provide sufficient coverage and redundancy for your network.
Develop mechanical and electrical specifications for the DU host hardware.
Develop a construction plan for individual DU site installations.
Tune host BIOS settings for production, and deploy the BIOS configuration to the hosts.
Install the equipment on-site, connect hosts to the network, and apply power.
Configure on-site switches and routers.
Perform basic connectivity tests for the host machines.
Establish production network connectivity, and verify host connections to the network.
Provision and deploy on-site DU hosts at scale.
Test and verify on-site operations, performing load and scale testing of the DU hosts before finally bringing the DU infrastructure online in the live production environment.
Low latency for distributed units (DUs)
Low latency is an integral part of the development of 5G networks. Telecommunications networks require as little signal delay as possible to ensure quality of service in a variety of critical use cases.
Low latency processing is essential for any communication with timing constraints that affect functionality and security. For example, 5G Telco applications require a guaranteed one millisecond one-way latency to meet Internet of Things (IoT) requirements. Low latency is also critical for the future development of autonomous vehicles, smart factories, and online gaming. Networks in these environments require almost a real-time flow of data.
Low latency systems are about guarantees with regard to response and processing times. This includes keeping a communication protocol running smoothly, ensuring device security with fast responses to error conditions, or just making sure a system is not lagging behind when receiving a lot of data. Low latency is key for optimal synchronization of radio transmissions.
OKD enables low latency processing for DUs running on COTS hardware by using a number of technologies and specialized hardware devices:
Real-time kernel for RHCOS
Ensures workloads are handled with a high degree of process determinism.
CPU isolation
Avoids CPU scheduling delays and ensures CPU capacity is available consistently.
NUMA awareness
Aligns memory and huge pages with CPU and PCI devices to pin guaranteed container memory and huge pages to the NUMA node. This decreases latency and improves performance of the node.
Huge pages memory management
Using huge page sizes improves system performance by reducing the amount of system resources required to access page tables.
Precision timing synchronization using PTP
Allows synchronization between nodes in the network with sub-microsecond accuracy.
Configuring BIOS for distributed unit bare-metal hosts
Distributed unit (DU) hosts require the BIOS to be configured before the host can be provisioned. The BIOS configuration is dependent on the specific hardware that runs your DUs and the particular requirements of your installation.
In this Developer Preview release, configuration and tuning of BIOS for DU bare-metal host machines is the responsibility of the customer. Automatic setting of BIOS is not handled by the zero touch provisioning workflow.
Procedure
Set the UEFI/BIOS Boot Mode to UEFI.
In the host boot sequence order, set Hard drive first.
Apply the specific BIOS configuration for your hardware. The following table describes a representative BIOS configuration for an Intel Xeon Skylake or Intel Cascade Lake server, based on the Intel FlexRAN 4G and 5G baseband PHY reference design.
The exact BIOS configuration depends on your specific hardware and network requirements. The following sample configuration is for illustrative purposes only.
Table 1. Sample BIOS configuration for an Intel Xeon Skylake or Cascade Lake server

| BIOS Setting | Configuration |
| --- | --- |
| CPU Power and Performance Policy | Performance |
| Uncore Frequency Scaling | Disabled |
| Performance P-limit | Disabled |
| Enhanced Intel SpeedStep® Tech | Enabled |
| Intel Configurable TDP | Enabled |
| Configurable TDP Level | Level 2 |
| Intel® Turbo Boost Technology | Enabled |
| Energy Efficient Turbo | Disabled |
| Hardware P-States | Disabled |
| Package C-State | C0/C1 state |
| C1E | Disabled |
| Processor C6 | Disabled |
Enable global SR-IOV and VT-d settings in the BIOS for the host. These settings are relevant to bare-metal environments.
Preparing the disconnected environment
Before you can provision distributed units (DU) at scale, you must install Red Hat Advanced Cluster Management (RHACM), which handles the provisioning of the DUs.
RHACM is deployed as an Operator on the OKD hub cluster. It controls clusters and applications from a single console with built-in security policies. RHACM provisions and manages your DU hosts. To install RHACM in a disconnected environment, you create a mirror registry that mirrors the Operator Lifecycle Manager (OLM) catalog that contains the required Operator images. OLM manages, installs, and upgrades Operators and their dependencies in the cluster.
You also use a disconnected mirror host to serve the FCOS ISO and RootFS disk images that provision the DU bare-metal host operating system.
Before you install a cluster on infrastructure that you provision in a restricted network, you must mirror the required container images into that environment. You can also use this procedure in unrestricted networks to ensure your clusters only use container images that have satisfied your organizational controls on external content.
You must have access to the internet to obtain the necessary container images. In this procedure, you place the mirror registry on a mirror host that has access to both your network and the internet. If you do not have access to a mirror host, use the disconnected procedure to copy images to a device that you can move across network boundaries.
Disconnected environment prerequisites
You must have a container image registry that supports Docker v2-2 in the location that will host the OKD cluster, such as one of the following registries:
If you have an entitlement to Red Hat Quay, see the documentation on deploying Red Hat Quay for proof-of-concept purposes or by using the Quay Operator. If you need additional assistance selecting and installing a registry, contact your sales representative or Red Hat support.
Red Hat does not test third party registries with OKD.
About the mirror registry
You can mirror the images that are required for OKD installation and subsequent product updates to a container mirror registry such as Red Hat Quay, JFrog Artifactory, Sonatype Nexus Repository, or Harbor. If you do not have access to a large-scale container registry, you can use the mirror registry for Red Hat OpenShift, a small-scale container registry included with OKD subscriptions.
You can use any container registry that supports Docker v2-2, such as Red Hat Quay, the mirror registry for Red Hat OpenShift, Artifactory, Sonatype Nexus Repository, or Harbor. Regardless of your chosen registry, the procedure to mirror content from Red Hat hosted sites on the internet to an isolated image registry is the same. After you mirror the content, you configure each cluster to retrieve this content from your mirror registry.
The internal registry of the OKD cluster cannot be used as the target registry because it does not support pushing without a tag, which is required during the mirroring process.
If choosing a container registry that is not the mirror registry for Red Hat OpenShift, it must be reachable by every machine in the clusters that you provision. If the registry is unreachable, installation, updating, or normal operations such as workload relocation might fail. For that reason, you must run mirror registries in a highly available way, and the mirror registries must at least match the production availability of your OKD clusters.
When you populate your mirror registry with OKD images, you can follow two scenarios. If you have a host that can access both the internet and your mirror registry, but not your cluster nodes, you can directly mirror the content from that machine. This process is referred to as connected mirroring. If you have no such host, you must mirror the images to a file system and then bring that host or removable media into your restricted environment. This process is referred to as disconnected mirroring.
For mirrored registries, to view the source of pulled images, you must review the Trying to access log entry in the CRI-O logs. Other methods to view the image pull source, such as using the crictl images command on a node, show the non-mirrored image name, even though the image is pulled from the mirrored location.
Red Hat does not test third party registries with OKD.
Additional resources
- For information on viewing the CRI-O logs to view the image source, see Viewing the image pull source.
Preparing your mirror host
Before you perform the mirror procedure, you must prepare the host to retrieve content and push it to the remote location.
Installing the OpenShift CLI by downloading the binary
You can install the OpenShift CLI (oc) to interact with OKD from a command-line interface. You can install oc on Linux, Windows, or macOS.
If you installed an earlier version of |
Installing the OpenShift CLI on Linux
You can install the OpenShift CLI (oc) binary on Linux by using the following procedure.
Procedure
Navigate to https://mirror.openshift.com/pub/openshift-v4/clients/oc/latest/ and choose the folder for your operating system and architecture.
Download oc.tar.gz.
Unpack the archive:
$ tar xvzf <file>
Place the oc binary in a directory that is on your PATH.
To check your PATH, execute the following command:
$ echo $PATH
After you install the OpenShift CLI, it is available using the oc command:
$ oc <command>
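Reading the output of echo $PATH by eye is error-prone on hosts with long PATH values. As a minimal sketch, the following shell function reports whether a given directory is listed on PATH. The function name dir_on_path and the sample directory /usr/local/bin are illustrative assumptions and are not part of the OpenShift CLI.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the directory given in $1 is an entry
# in PATH, and 1 otherwise. The surrounding colons make the match exact,
# so /usr/local/bin does not match /usr/local/bin2.
dir_on_path() {
  case ":${PATH}:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: check an assumed install directory for the oc binary.
if dir_on_path "/usr/local/bin"; then
  echo "/usr/local/bin is on PATH"
else
  echo "/usr/local/bin is not on PATH"
fi
```

If the directory is not on PATH, either move the oc binary to a directory that is, or add the directory to PATH in your shell profile.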
Installing the OpenShift CLI on Windows
You can install the OpenShift CLI (oc) binary on Windows by using the following procedure.
Procedure
Navigate to https://mirror.openshift.com/pub/openshift-v4/clients/oc/latest/ and choose the folder for your operating system and architecture.
Download oc.zip.
Unzip the archive with a ZIP program.
Move the oc binary to a directory that is on your PATH.
To check your PATH, open the command prompt and execute the following command:
C:\> path
After you install the OpenShift CLI, it is available using the oc command:
C:\> oc <command>
Installing the OpenShift CLI on macOS
You can install the OpenShift CLI (oc) binary on macOS by using the following procedure.
Procedure
Navigate to https://mirror.openshift.com/pub/openshift-v4/clients/oc/latest/ and choose the folder for your operating system and architecture.
Download oc.tar.gz.
Unpack and unzip the archive.
Move the oc binary to a directory on your PATH.
To check your PATH, open a terminal and execute the following command:
$ echo $PATH
After you install the OpenShift CLI, it is available using the oc command:
$ oc <command>
Configuring credentials that allow images to be mirrored
Create a container image registry credentials file that allows mirroring images from Red Hat to your mirror.
Prerequisites
- You configured a mirror registry to use in your restricted network.
Procedure
Complete the following steps on the installation host:
Generate the base64-encoded user name and password or token for your mirror registry:
$ echo -n '<user_name>:<password>' | base64 -w0 (1)
BGVtbYk3ZHAtqXs=
1 For <user_name> and <password>, specify the user name and password that you configured for your registry.
Create a .json file and add a section that describes your registry to it:
{
  "auths": {
    "<mirror_registry>": { (1)
      "auth": "<credentials>", (2)
      "email": "you@example.com"
    }
  }
}
1 For <mirror_registry>, specify the registry domain name, and optionally the port, that your mirror registry uses to serve content. For example, registry.example.com or registry.example.com:5000
2 For <credentials>, specify the base64-encoded user name and password for the mirror registry.
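The two steps above can be combined in a short script. The following is a minimal sketch, assuming a registry at registry.example.com:5000 (the example value above), placeholder credentials myuser and mypassword, and an output file named pull-secret.json. All of these values are illustrative and must be replaced with your own.

```shell
#!/bin/sh
set -eu

# Assumed example values; replace with the details of your mirror registry.
MIRROR_REGISTRY="registry.example.com:5000"
REGISTRY_USER="myuser"
REGISTRY_PASSWORD="mypassword"
SECRET_FILE="pull-secret.json"   # hypothetical output file name

# Encode "<user_name>:<password>" as a single base64 line.
# tr removes any line wrapping, which has the same effect as base64 -w0.
CREDENTIALS="$(printf '%s' "${REGISTRY_USER}:${REGISTRY_PASSWORD}" | base64 | tr -d '\n')"

# Write the registry credentials file in the "auths" layout shown above.
cat > "${SECRET_FILE}" <<EOF
{
  "auths": {
    "${MIRROR_REGISTRY}": {
      "auth": "${CREDENTIALS}",
      "email": "you@example.com"
    }
  }
}
EOF

# The file contains credentials, so restrict access to the current user.
chmod 600 "${SECRET_FILE}"
echo "Wrote ${SECRET_FILE}"
```

Using printf instead of echo -n avoids differences in how shells handle the -n option, so no trailing newline ends up inside the encoded credentials.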
Mirroring the OKD image repository
Mirror the OKD image repository to your registry to use during cluster installation or upgrade.
Prerequisites
Your mirror host has access to the internet.
You configured a mirror registry to use in your restricted network and can access the certificate and credentials that you configured.
You have created a pull secret for your mirror repository.
If you use self-signed certificates, you have specified a Subject Alternative Name in the certificates.
Procedure
Complete the following steps on the mirror host:
Review the OKD downloads page to determine the version of OKD that you want to install and determine the corresponding tag on the Repository Tags page.
Set the required environment variables:
Export the release version:
$ OCP_RELEASE=<release_version>
For <release_version>, specify the tag that corresponds to the version of OKD to install, such as 4.5.4.
Export the local registry name and host port:
$ LOCAL_REGISTRY='<local_registry_host_name>:<local_registry_host_port>'
For <local_registry_host_name>, specify the registry domain name for your mirror repository, and for <local_registry_host_port>, specify the port that it serves content on.
Export the local repository name:
$ LOCAL_REPOSITORY='<local_repository_name>'
For <local_repository_name>, specify the name of the repository to create in your registry, such as ocp4/openshift4.
Export the name of the repository to mirror:
$ PRODUCT_REPO='openshift'
Export the path to your registry pull secret:
$ LOCAL_SECRET_JSON='<path_to_pull_secret>'
For <path_to_pull_secret>, specify the absolute path to and file name of the pull secret for your mirror registry that you created.
Export the release mirror:
$ RELEASE_NAME="okd"
Export the path to the directory to host the mirrored images:
$ REMOVABLE_MEDIA_PATH=<path> (1)
1 Specify the full path, including the initial forward slash (/) character.
Mirror the version images to the mirror registry:
If your mirror host does not have internet access, take the following actions:
Connect the removable media to a system that is connected to the internet.
Review the images and configuration manifests to mirror:
$ oc adm release mirror -a ${LOCAL_SECRET_JSON} \
--from=quay.io/${PRODUCT_REPO}/${RELEASE_NAME}:${OCP_RELEASE} \
--to=${LOCAL_REGISTRY}/${LOCAL_REPOSITORY} \
--to-release-image=${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE} --dry-run
Record the entire imageContentSources section from the output of the previous command. The information about your mirrors is unique to your mirrored repository, and you must add the imageContentSources section to the install-config.yaml file during installation.
Mirror the images to a directory on the removable media:
$ oc adm release mirror -a ${LOCAL_SECRET_JSON} --to-dir=${REMOVABLE_MEDIA_PATH}/mirror quay.io/${PRODUCT_REPO}/${RELEASE_NAME}:${OCP_RELEASE}
Take the media to the restricted network environment and upload the images to the local container registry.
$ oc image mirror -a ${LOCAL_SECRET_JSON} --from-dir=${REMOVABLE_MEDIA_PATH}/mirror "file://openshift/release:${OCP_RELEASE}*" ${LOCAL_REGISTRY}/${LOCAL_REPOSITORY} (1)
1 For REMOVABLE_MEDIA_PATH, you must use the same path that you specified when you mirrored the images.
If the local container registry is connected to the mirror host, take the following actions:
Directly push the release images to the local registry by using the following command:
$ oc adm release mirror -a ${LOCAL_SECRET_JSON} \
--from=quay.io/${PRODUCT_REPO}/${RELEASE_NAME}:${OCP_RELEASE} \
--to=${LOCAL_REGISTRY}/${LOCAL_REPOSITORY} \
--to-release-image=${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE}
This command pulls the release information as a digest, and its output includes the imageContentSources data that you require when you install your cluster.
Record the entire imageContentSources section from the output of the previous command. The information about your mirrors is unique to your mirrored repository, and you must add the imageContentSources section to the install-config.yaml file during installation.
The image name gets patched to Quay.io during the mirroring process, and the podman images will show Quay.io in the registry on the bootstrap virtual machine.
To create the installation program that is based on the content that you mirrored, extract it and pin it to the release:
If your mirror host does not have internet access, run the following command:
$ oc adm release extract -a ${LOCAL_SECRET_JSON} --command=openshift-install "${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE}"
If the local container registry is connected to the mirror host, run the following command:
$ oc adm release extract -a ${LOCAL_SECRET_JSON} --command=openshift-install "${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE}"
To ensure that you use the correct images for the version of OKD that you selected, you must extract the installation program from the mirrored content.
You must perform this step on a machine with an active internet connection.
If you are in a disconnected environment, use the --image flag as part of must-gather and point to the payload image.
For clusters using installer-provisioned infrastructure, run the following command:
$ openshift-install
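To make the variable substitution in the mirroring procedure above concrete, the following sketch sets the variables to example values and prints the references that the --from, --to, and --to-release-image options would receive. It does not contact any registry. The registry host registry.example.com:5000 is an illustrative assumption; 4.5.4 and ocp4/openshift4 are the example values from the procedure.

```shell
#!/bin/sh
# Assumed example values; only the shape of the resulting strings matters.
OCP_RELEASE="4.5.4"
LOCAL_REGISTRY="registry.example.com:5000"
LOCAL_REPOSITORY="ocp4/openshift4"
PRODUCT_REPO="openshift"
RELEASE_NAME="okd"

# Source release image, as passed to --from.
FROM_IMAGE="quay.io/${PRODUCT_REPO}/${RELEASE_NAME}:${OCP_RELEASE}"
# Destination repository, as passed to --to.
TO_REPO="${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}"
# Destination release image, as passed to --to-release-image.
TO_RELEASE_IMAGE="${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE}"

echo "--from=${FROM_IMAGE}"
echo "--to=${TO_REPO}"
echo "--to-release-image=${TO_RELEASE_IMAGE}"
# Prints:
#   --from=quay.io/openshift/okd:4.5.4
#   --to=registry.example.com:5000/ocp4/openshift4
#   --to-release-image=registry.example.com:5000/ocp4/openshift4:4.5.4
```

Printing the resolved values before running the mirror commands is a cheap way to catch an unset or misspelled variable, which would otherwise produce a malformed image reference.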
Adding FCOS ISO and RootFS images to a disconnected mirror host
Before you install a cluster on infrastructure that you provision, you must create Fedora CoreOS (FCOS) machines for it to use. Use a disconnected mirror to host the FCOS images you require to provision your distributed unit (DU) bare-metal hosts.
Prerequisites
- Deploy and configure an HTTP server to host the FCOS image resources on the network. You must be able to access the HTTP server from your computer, and from the machines that you create.
The FCOS images might not change with every release of OKD. You must download images with the highest version that is less than or equal to the OKD version that you install. Use the image versions that match your OKD version if they are available. You require ISO and RootFS images to install FCOS on the DU hosts. FCOS qcow2 images are not supported for this installation type.
Procedure
Log in to the mirror host.
Obtain the FCOS ISO and RootFS images from mirror.openshift.com, for example:
Export the required image names and OKD version as environment variables:
$ export ISO_IMAGE_NAME=<iso_image_name> (1)
$ export ROOTFS_IMAGE_NAME=<rootfs_image_name> (2)
$ export OCP_VERSION=<ocp_version> (3)
1 ISO image name, for example, rhcos-4.10.0-fc.1-x86_64-live.x86_64.iso
2 RootFS image name, for example, rhcos-4.10.0-fc.1-x86_64-live-rootfs.x86_64.img
3 OKD version, for example, latest-4.10
Download the required images:
$ sudo wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/${OCP_VERSION}/${ISO_IMAGE_NAME} -O /var/www/html/${ISO_IMAGE_NAME}
$ sudo wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/${OCP_VERSION}/${ROOTFS_IMAGE_NAME} -O /var/www/html/${ROOTFS_IMAGE_NAME}
Verification steps
Verify that the images downloaded successfully and are being served on the disconnected mirror host, for example:
$ wget http://$(hostname)/${ISO_IMAGE_NAME}
Expected output
...
Saving to: rhcos-4.10.0-fc.1-x86_64-live.x86_64.iso
rhcos-4.10.0-fc.1-x86_64- 11%[====> ] 10.01M 4.71MB/s
...
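In addition to fetching an image over HTTP, a quick local check can confirm that both files landed in the web root and are not empty. The following is a minimal sketch; the function name images_present is hypothetical, and /var/www/html is the download target used in the procedure above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if every named image exists under the
# given web root directory and is not empty.
images_present() {
  web_root="$1"
  shift
  for image in "$@"; do
    if [ ! -s "${web_root}/${image}" ]; then
      echo "missing or empty: ${web_root}/${image}" >&2
      return 1
    fi
  done
  return 0
}

# Example usage with the variables exported in the procedure. The check is
# skipped when the variables are not set in the current shell.
if [ -n "${ISO_IMAGE_NAME:-}" ] && [ -n "${ROOTFS_IMAGE_NAME:-}" ]; then
  images_present /var/www/html "${ISO_IMAGE_NAME}" "${ROOTFS_IMAGE_NAME}" \
    && echo "FCOS images are in place"
fi
```

This check only covers the files on disk. The wget request shown above is still needed to confirm that the HTTP server actually serves them.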
Installing Red Hat Advanced Cluster Management in a disconnected environment
You use Red Hat Advanced Cluster Management (RHACM) on a hub cluster in the disconnected environment to manage the deployment of distributed unit (DU) profiles on multiple managed spoke clusters.
Prerequisites
Install the OKD CLI (oc).
Log in as a user with cluster-admin privileges.
Configure a disconnected mirror registry for use in the cluster.
If you want to deploy Operators to the spoke clusters, you must also add them to this registry. See Mirroring an Operator catalog for more information.
Procedure
- Install RHACM on the hub cluster in the disconnected environment. See Installing RHACM in disconnected networks.
Enabling assisted installer service on bare metal
The Assisted Installer Service (AIS) deploys OKD clusters. Red Hat Advanced Cluster Management (RHACM) ships with AIS. AIS is deployed when you enable the MultiClusterHub Operator on the RHACM hub cluster.
For distributed units (DUs), RHACM supports OKD deployments that run on a single bare-metal host. The single node cluster acts as both a control plane and a worker node.
Prerequisites
Install OKD 4.10 on a hub cluster.
Install RHACM and create the MultiClusterHub resource.
Create persistent volume custom resources (CR) for database and file system storage.
You have installed the OpenShift CLI (oc).
Procedure
Modify the HiveConfig resource to enable the feature gate for Assisted Installer:
$ oc patch hiveconfig hive --type merge -p '{"spec":{"targetNamespace":"hive","logLevel":"debug","featureGates":{"custom":{"enabled":["AlphaAgentInstallStrategy"]},"featureSet":"Custom"}}}'
Modify the Provisioning resource to allow the Bare Metal Operator to watch all namespaces:
$ oc patch provisioning provisioning-configuration --type merge -p '{"spec":{"watchAllNamespaces": true }}'
Create the AgentServiceConfig CR.
Save the following YAML in the agent_service_config.yaml file:
apiVersion: agent-install.openshift.io/v1beta1
kind: AgentServiceConfig
metadata:
  name: agent
spec:
  databaseStorage:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: <db_volume_size> (1)
  filesystemStorage:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: <fs_volume_size> (2)
  osImages: (3)
    - openshiftVersion: "<ocp_version>" (4)
      version: "<ocp_release_version>" (5)
      url: "<iso_url>" (6)
      rootFSUrl: "<root_fs_url>" (7)
      cpuArchitecture: "x86_64"
1 Volume size for the databaseStorage field, for example 10Gi.
2 Volume size for the filesystemStorage field, for example 20Gi.
3 List of OS image details. Example describes a single OKD OS version.
4 OKD version to install, for example, 4.8.
5 Specific install version, for example, 47.83.202103251640-0.
6 ISO URL, for example, https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.7/4.7.7/rhcos-4.7.7-x86_64-live.x86_64.iso
7 Root FS image URL, for example, https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.7/4.7.7/rhcos-live-rootfs.x86_64.img
Create the AgentServiceConfig CR by running the following command:
$ oc create -f agent_service_config.yaml
Example output
agentserviceconfig.agent-install.openshift.io/agent created
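The JSON arguments passed to oc patch with --type merge in this procedure are easy to mistype by hand. As a minimal sketch, assuming you prefer to build the payloads programmatically, the following Python snippet constructs the same two merge patches with the standard json module. The function names are hypothetical; only the field names and values come from the commands in this procedure.

```python
import json


def hive_feature_gate_patch():
    """Merge patch that enables the Assisted Installer feature gate in HiveConfig."""
    return {
        "spec": {
            "targetNamespace": "hive",
            "logLevel": "debug",
            "featureGates": {
                "custom": {"enabled": ["AlphaAgentInstallStrategy"]},
                "featureSet": "Custom",
            },
        }
    }


def provisioning_patch():
    """Merge patch that lets the Bare Metal Operator watch all namespaces."""
    return {"spec": {"watchAllNamespaces": True}}


def as_patch_argument(patch):
    """Serialize a patch compactly so it can be passed as the -p argument."""
    return json.dumps(patch, separators=(",", ":"))


if __name__ == "__main__":
    print(as_patch_argument(hive_feature_gate_patch()))
    print(as_patch_argument(provisioning_patch()))
```

Serializing from a dictionary guarantees valid JSON and correct quoting, which is the main source of errors when editing the one-line patch strings manually.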
ZTP custom resources
Zero touch provisioning (ZTP) uses custom resource (CR) objects to extend the Kubernetes API or introduce your own API into a project or a cluster. These CRs contain the site-specific data required to install and configure a cluster for RAN applications.
A custom resource definition (CRD) file defines your own object kinds. Deploying a CRD into the managed cluster causes the Kubernetes API server to begin serving the specified CR for the entire lifecycle.
For each CR in the <site>.yaml file on the managed cluster, ZTP uses the data to create installation CRs in a directory named for the cluster.
ZTP provides two ways for defining and installing CRs on managed clusters: a manual approach when you are provisioning a single cluster and an automated approach when provisioning multiple clusters.
Manual CR creation for single clusters
Use this method when you are creating CRs for a single cluster. This is a good way to test your CRs before deploying on a larger scale.
Automated CR creation for multiple managed clusters
Use the automated SiteConfig method when you are installing multiple managed clusters, for example, in batches of up to 100 clusters. SiteConfig uses ArgoCD as the engine for the GitOps method of site deployment. After completing a site plan that contains all of the required parameters for deployment, a policy generator creates the manifests and applies them to the hub cluster.
Both methods create the CRs shown in the following table. On the cluster site, an automated Discovery image ISO file creates a directory with the site name and a file with the cluster name. Every cluster has its own namespace, and all of the CRs are under that namespace. The namespace and the CR names match the cluster name.
Resource | Description | Usage |
---|---|---|
| Contains the connection information for the Baseboard Management Controller (BMC) of the target bare-metal host. | Provides access to the BMC in order to load and boot the Discovery image ISO on the target server by using the Redfish protocol. |
| Contains information for pulling OKD onto the target bare-metal host. | Used with ClusterDeployment to generate the Discovery ISO for the managed cluster. |
| Specifies the managed cluster’s configuration such as networking and the number of supervisor (control plane) nodes. Shows the | Specifies the managed cluster configuration information and provides status during the installation of the cluster. |
| References the | Used with |
| Provides network configuration information such as | Sets up a static IP address for the managed cluster’s Kube API server. |
| Contains hardware information about the target bare-metal host. | Created automatically on the hub when the target machine’s Discovery image ISO boots. |
| When a cluster is managed by the hub, it must be imported and known. This Kubernetes object provides that interface. | The hub uses this resource to manage and show the status of managed clusters. |
| Contains the list of services provided by the hub to be deployed to a | Tells the hub which addon services to deploy to a |
| Logical space for | Propagates resources to the |
| Two custom resources are created: |
|
| Contains OKD image information such as the repository and image name. | Passed into resources to provide OKD images. |
Creating custom resources to install a single managed cluster
This procedure describes how to manually create and deploy a single managed cluster. If you are creating multiple clusters, perhaps hundreds, use the SiteConfig method described in “Creating ZTP custom resources for multiple managed clusters”.
Prerequisites
Enable Assisted Installer Service.
Ensure network connectivity:
The container within the hub must be able to reach the Baseboard Management Controller (BMC) address of the target bare-metal host.
The managed cluster must be able to resolve and reach the hub’s API hostname and *.app hostname. Example of the hub’s API and *.app hostname:
console-openshift-console.apps.hub-cluster.internal.domain.com
api.hub-cluster.internal.domain.com
The hub must be able to resolve and reach the API and *.app hostname of the managed cluster. Example of the managed cluster’s API and *.app hostname:
console-openshift-console.apps.sno-managed-cluster-1.internal.domain.com
api.sno-managed-cluster-1.internal.domain.com
A DNS Server that is IP reachable from the target bare-metal host.
A target bare-metal host for the managed cluster with the following hardware minimums:
4 CPU or 8 vCPU
32 GiB RAM
120 GiB Disk for root filesystem
When working in a disconnected environment, the release image needs to be mirrored. Use this command to mirror the release image:
$ oc adm release mirror -a <pull_secret.json> \
  --from=quay.io/openshift-release-dev/ocp-release:{{ mirror_version_spoke_release }} \
  --to={{ provisioner_cluster_registry }}/ocp4 \
  --to-release-image={{ provisioner_cluster_registry }}/ocp4:{{ mirror_version_spoke_release }}
You mirrored the ISO and rootfs used to generate the spoke cluster ISO to an HTTP server and configured the settings to pull images from there.
The images must match the version of the ClusterImageSet. To deploy a 4.10.0 version, the rootfs and ISO need to be set at 4.10.0.
Procedure
Create a ClusterImageSet for each specific cluster version that needs to be deployed. A ClusterImageSet has the following format:
apiVersion: hive.openshift.io/v1
kind: ClusterImageSet
metadata:
  name: openshift-4.10.0-rc.0 (1)
spec:
  releaseImage: quay.io/openshift-release-dev/ocp-release:4.10.0-x86_64 (2)
1 The descriptive version that you want to deploy.
2 Points to the specific release image to deploy.
Create the Namespace definition for the managed cluster:
apiVersion: v1
kind: Namespace
metadata:
  name: <cluster_name> (1)
  labels:
    name: <cluster_name> (1)
1 The name of the managed cluster to provision.
Create the BMC Secret custom resource:
apiVersion: v1
data:
  password: <bmc_password> (1)
  username: <bmc_username> (2)
kind: Secret
metadata:
  name: <cluster_name>-bmc-secret
  namespace: <cluster_name>
type: Opaque
1 The password to the target bare-metal host. Must be base-64 encoded.
2 The username to the target bare-metal host. Must be base-64 encoded.
Create the Image Pull Secret custom resource:
apiVersion: v1
data:
  .dockerconfigjson: <pull_secret> (1)
kind: Secret
metadata:
  name: assisted-deployment-pull-secret
  namespace: <cluster_name>
type: kubernetes.io/dockerconfigjson
1 The OKD pull secret. Must be base-64 encoded.
Create the AgentClusterInstall custom resource:
apiVersion: extensions.hive.openshift.io/v1beta1
kind: AgentClusterInstall
metadata:
  # Only include the annotation if using OVN, otherwise omit the annotation
  annotations:
    agent-install.openshift.io/install-config-overrides: '{"networking":{"networkType":"OVNKubernetes"}}'
  name: <cluster_name>
  namespace: <cluster_name>
spec:
  clusterDeploymentRef:
    name: <cluster_name>
  imageSetRef:
    name: <cluster_image_set> (1)
  networking:
    clusterNetwork:
    - cidr: <cluster_network_cidr> (2)
      hostPrefix: 23
    machineNetwork:
    - cidr: <machine_network_cidr> (3)
    serviceNetwork:
    - <service_network_cidr> (4)
  provisionRequirements:
    controlPlaneAgents: 1
    workerAgents: 0
  sshPublicKey: <public_key> (5)
1 The name of the ClusterImageSet custom resource used to install OKD on the bare-metal host.
2 A block of IPv4 or IPv6 addresses in CIDR notation used for communication among cluster nodes.
3 A block of IPv4 or IPv6 addresses in CIDR notation used for the target bare-metal host external communication. Also used to determine the API and Ingress VIP addresses when provisioning DU single node clusters.
4 A block of IPv4 or IPv6 addresses in CIDR notation used for cluster services internal communication.
5 Entered as plain text. You can use the public key to SSH into the node after it has finished installing.
If you want to configure a static IP for the managed cluster at this point, see the procedure in this document for configuring static IP addresses for managed clusters.
Create the ClusterDeployment custom resource:
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  name: <cluster_name>
  namespace: <cluster_name>
spec:
  baseDomain: <base_domain> (1)
  clusterInstallRef:
    group: extensions.hive.openshift.io
    kind: AgentClusterInstall
    name: <cluster_name>
    version: v1beta1
  clusterName: <cluster_name>
  platform:
    agentBareMetal:
      agentSelector:
        matchLabels:
          cluster-name: <cluster_name>
  pullSecretRef:
    name: assisted-deployment-pull-secret
1 The managed cluster’s base domain.
Create the KlusterletAddonConfig custom resource:
apiVersion: agent.open-cluster-management.io/v1
kind: KlusterletAddonConfig
metadata:
  name: <cluster_name>
  namespace: <cluster_name>
spec:
  clusterName: <cluster_name>
  clusterNamespace: <cluster_name>
  clusterLabels:
    cloud: auto-detect
    vendor: auto-detect
  applicationManager:
    enabled: true
  certPolicyController:
    enabled: false
  iamPolicyController:
    enabled: false
  policyController:
    enabled: true
  searchCollector:
    enabled: false (1)
1 Set to true to enable KlusterletAddonConfig or false to disable the KlusterletAddonConfig. Keep searchCollector disabled.
Create the ManagedCluster custom resource:
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
  name: <cluster_name>
spec:
  hubAcceptsClient: true
Create the InfraEnv custom resource:
apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
  name: <cluster_name>
  namespace: <cluster_name>
spec:
  clusterRef:
    name: <cluster_name>
    namespace: <cluster_name>
  sshAuthorizedKey: <public_key> (1)
  agentLabels: (2)
    location: "<label-name>"
  pullSecretRef:
    name: assisted-deployment-pull-secret
1 Entered as plain text. You can use the public key to SSH into the target bare-metal host when it boots from the ISO.
2 Sets a label to match. The labels apply when the agents boot.
Create the BareMetalHost custom resource:
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: <cluster_name>
  namespace: <cluster_name>
  annotations:
    inspect.metal3.io: disabled
  labels:
    infraenvs.agent-install.openshift.io: "<cluster_name>"
spec:
  bootMode: "UEFI"
  bmc:
    address: <bmc_address> (1)
    disableCertificateVerification: true
    credentialsName: <cluster_name>-bmc-secret
  bootMACAddress: <mac_address> (2)
  automatedCleaningMode: disabled
  online: true
1 The Baseboard Management Controller (BMC) address of the target bare-metal host, used to load the installation ISO.
2 The MAC address of the target bare-metal host.
Optionally, you can add bmac.agent-install.openshift.io/hostname: <host-name> as an annotation to set the managed cluster’s hostname. If you don’t add the annotation, the hostname will default to either a hostname from the DHCP server or local host.
After you have created the custom resources, push the entire directory of generated custom resources to the Git repository you created for storing the custom resources.
Next step
To provision additional clusters, repeat this procedure for each cluster.
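The BMC Secret and Image Pull Secret resources in this procedure require base-64 encoded values. The following is a minimal sketch, assuming Python 3 is available on your workstation, of how such values could be produced and checked. The function names and the sample credential are hypothetical and used for illustration only.

```python
import base64


def encode_secret_value(plain_text: str) -> str:
    """Return the base-64 encoding of a UTF-8 string as a single unwrapped line."""
    return base64.b64encode(plain_text.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse encode_secret_value, for verifying what a Secret contains."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    # Hypothetical BMC user name, not a value from this document.
    encoded = encode_secret_value("admin")
    print(encoded)
    print(decode_secret_value(encoded))
```

Encoding in one step without line wrapping matters for long values such as the pull secret, because wrapped output would not be a valid single-line value in the Secret data field.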
Configuring static IP addresses for managed clusters
Optionally, after creating the AgentClusterInstall custom resource, you can configure static IP addresses for the managed clusters.
You must create this custom resource before creating the |
Prerequisites
- Deploy and configure the AgentClusterInstall custom resource.
Procedure
Create a NMStateConfig custom resource:
apiVersion: agent-install.openshift.io/v1beta1
kind: NMStateConfig
metadata:
  name: <cluster_name>
  namespace: <cluster_name>
  labels:
    sno-cluster-<cluster-name>: <cluster_name>
spec:
  config:
    interfaces:
      - name: eth0
        type: ethernet
        state: up
        ipv4:
          enabled: true
          address:
            - ip: <ip_address> (1)
              prefix-length: <public_network_prefix> (2)
          dhcp: false
    dns-resolver:
      config:
        server:
          - <dns_resolver> (3)
    routes:
      config:
        - destination: 0.0.0.0/0
          next-hop-address: <gateway> (4)
          next-hop-interface: eth0
          table-id: 254
  interfaces:
    - name: "eth0" (5)
      macAddress: <mac_address> (6)
1 The static IP address of the target bare-metal host.
2 The static IP address’s subnet prefix for the target bare-metal host.
3 The DNS server for the target bare-metal host.
4 The gateway for the target bare-metal host.
5 Must match the name specified in the interfaces section.
6 The MAC address of the interface.
When creating the BareMetalHost custom resource, ensure that one of its MAC addresses matches a MAC address in the NMStateConfig target bare-metal host.
When creating the InfraEnv custom resource, reference the label from the NMStateConfig custom resource in the InfraEnv custom resource:
apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
  name: <cluster_name>
  namespace: <cluster_name>
spec:
  clusterRef:
    name: <cluster_name>
    namespace: <cluster_name>
  sshAuthorizedKey: <public_key>
  agentLabels: (1)
    location: "<label-name>"
  pullSecretRef:
    name: assisted-deployment-pull-secret
  nmStateConfigLabelSelector:
    matchLabels:
      sno-cluster-<cluster-name>: <cluster_name> # Match this label
1 Sets a label to match. The labels apply when the agents boot.
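Before applying the NMStateConfig custom resource, it can help to confirm that the static values are consistent with each other. The following sketch uses the Python ipaddress module to check two things: that the host address falls inside the machine network, and that the gateway is on the same subnet as the host. Both checks are assumptions about a typical single node layout rather than rules stated in this procedure, and the function name and sample addresses are hypothetical.

```python
import ipaddress


def check_static_network(ip, prefix_length, gateway, machine_cidr):
    """Return a list of problems; an empty list means the values are consistent."""
    problems = []
    interface = ipaddress.ip_interface(f"{ip}/{prefix_length}")
    machine_network = ipaddress.ip_network(machine_cidr)

    # Assumption: the node address should belong to the machineNetwork CIDR.
    if interface.ip not in machine_network:
        problems.append(f"{interface.ip} is outside machine network {machine_network}")

    # Assumption: the default gateway should be reachable on the host subnet.
    if ipaddress.ip_address(gateway) not in interface.network:
        problems.append(f"gateway {gateway} is not on host subnet {interface.network}")

    return problems


if __name__ == "__main__":
    # Hypothetical values for <ip_address>, <public_network_prefix>, <gateway>,
    # and <machine_network_cidr>.
    for problem in check_static_network("192.168.10.20", 24, "192.168.11.1", "192.168.10.0/24"):
        print(problem)
```

The same function accepts IPv6 values, because ip_interface, ip_network, and ip_address detect the address family from the input.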
Automated Discovery image ISO process for provisioning clusters
After you create the custom resources, the following actions happen automatically:
A Discovery image ISO file is generated and booted on the target machine.
When the ISO file successfully boots on the target machine, it reports the hardware information of the target machine.
After all hosts are discovered, OKD is installed.
When OKD finishes installing, the hub installs the klusterlet service on the target cluster.
The requested add-on services are installed on the target cluster.
The Discovery image ISO process finishes when the Agent custom resource is created on the hub for the managed cluster.
Checking the managed cluster status
Ensure that cluster provisioning was successful by checking the cluster status.
Prerequisites
- All of the custom resources have been configured and provisioned, and the Agent custom resource is created on the hub for the managed cluster.
Procedure
Check the status of the managed cluster:
$ oc get managedcluster
True indicates the managed cluster is ready.
Check the agent status:
$ oc get agent -n <cluster_name>
Use the describe command to provide an in-depth description of the agent’s condition. Statuses to be aware of include BackendError, InputError, ValidationsFailing, InstallationFailed, and AgentIsConnected. These statuses are relevant to the Agent and AgentClusterInstall custom resources.
$ oc describe agent -n <cluster_name>
Check the cluster provisioning status:
$ oc get agentclusterinstall -n <cluster_name>
Use the describe command to provide an in-depth description of the cluster provisioning status:
$ oc describe agentclusterinstall -n <cluster_name>
Check the status of the managed cluster’s add-on services:
$ oc get managedclusteraddon -n <cluster_name>
Retrieve the authentication information of the kubeconfig file for the managed cluster:
$ oc get secret -n <cluster_name> <cluster_name>-admin-kubeconfig -o jsonpath={.data.kubeconfig} | base64 -d > <directory>/<cluster_name>-kubeconfig
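The last command in this procedure extracts the kubeconfig field from the Secret and decodes it. As a sketch of the same two steps, assuming the Secret was saved in JSON form (for example with the -o json output option), the following Python function reads the data.kubeconfig field and base-64 decodes it. The function name and the sample Secret content are hypothetical.

```python
import base64
import json


def kubeconfig_from_secret(secret_json: str) -> str:
    """Decode the kubeconfig stored under data.kubeconfig in a Secret JSON document."""
    secret = json.loads(secret_json)
    encoded = secret["data"]["kubeconfig"]
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    # Hypothetical, heavily shortened Secret used only to exercise the function.
    sample_kubeconfig = "apiVersion: v1\nkind: Config\n"
    sample_secret = json.dumps({
        "kind": "Secret",
        "metadata": {"name": "example-admin-kubeconfig"},
        "data": {
            "kubeconfig": base64.b64encode(sample_kubeconfig.encode("utf-8")).decode("ascii")
        },
    })
    print(kubeconfig_from_secret(sample_secret))
```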
Configuring a managed cluster for a disconnected environment
After you have completed the preceding procedure, follow these steps to configure the managed cluster for a disconnected environment.
Prerequisites
A disconnected installation of Red Hat Advanced Cluster Management (RHACM) 2.3.
Host the rootfs and iso images on an HTTPD server.
Procedure
Create a ConfigMap containing the mirror registry config:
apiVersion: v1
kind: ConfigMap
metadata:
  name: assisted-installer-mirror-config
  namespace: assisted-installer
  labels:
    app: assisted-service
data:
  ca-bundle.crt: <certificate> (1)
  registries.conf: | (2)
    unqualified-search-registries = ["registry.access.redhat.com", "docker.io"]
    [[registry]]
      location = <mirror_registry_url> (3)
      insecure = false
      mirror-by-digest-only = true
1 The mirror registry’s certificate used when creating the mirror registry.
2 The configuration for the mirror registry.
3 The URL of the mirror registry.
This updates mirrorRegistryRef in the AgentServiceConfig custom resource, as shown below:
Example output
apiVersion: agent-install.openshift.io/v1beta1
kind: AgentServiceConfig
metadata:
  name: agent
  namespace: assisted-installer
spec:
  databaseStorage:
    volumeName: <db_pv_name>
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: <db_storage_size>
  filesystemStorage:
    volumeName: <fs_pv_name>
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: <fs_storage_size>
  mirrorRegistryRef:
    name: 'assisted-installer-mirror-config'
  osImages:
  - openshiftVersion: <ocp_version>
    rootFSUrl: <rootfs_url> (1)
    url: <iso_url> (1)
1 Must match the URLs of the HTTPD server.
For disconnected installations, you must deploy an NTP clock that is reachable through the disconnected network. You can do this by configuring chrony to act as a server, editing the /etc/chrony.conf file, and adding the following allowed IPv6 range:
# Allow NTP client access from local network.
#allow 192.168.0.0/16
local stratum 10
bindcmdaddress ::
allow 2620:52:0:1310::/64
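The allow directive above limits NTP clients to one IPv6 range. As a small sketch, assuming you want to confirm ahead of time that a managed cluster address falls inside that range, the following Python function performs the membership test with the ipaddress module. The function name and the client addresses are hypothetical; only the range comes from the chrony configuration above.

```python
import ipaddress

# Range taken from the allow directive in the chrony.conf example.
ALLOWED_RANGE = ipaddress.ip_network("2620:52:0:1310::/64")


def ntp_client_allowed(address, allowed=ALLOWED_RANGE):
    """Return True when the client address is inside the allowed range."""
    return ipaddress.ip_address(address) in allowed


if __name__ == "__main__":
    # Hypothetical client addresses.
    for client in ("2620:52:0:1310::10", "2620:52:0:1311::10"):
        print(client, ntp_client_allowed(client))
```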
Configuring IPv6 addresses for a disconnected environment
Optionally, when you are creating the AgentClusterInstall custom resource, you can configure IPv6 addresses for the managed clusters.
Procedure
In the AgentClusterInstall custom resource, modify the IP addresses in clusterNetwork and serviceNetwork for IPv6 addresses:
apiVersion: extensions.hive.openshift.io/v1beta1
kind: AgentClusterInstall
metadata:
  # Only include the annotation if using OVN, otherwise omit the annotation
  annotations:
    agent-install.openshift.io/install-config-overrides: '{"networking":{"networkType":"OVNKubernetes"}}'
  name: <cluster_name>
  namespace: <cluster_name>
spec:
  clusterDeploymentRef:
    name: <cluster_name>
  imageSetRef:
    name: <cluster_image_set>
  networking:
    clusterNetwork:
    - cidr: "fd01::/48"
      hostPrefix: 64
    machineNetwork:
    - cidr: <machine_network_cidr>
    serviceNetwork:
    - "fd02::/112"
  provisionRequirements:
    controlPlaneAgents: 1
    workerAgents: 0
  sshPublicKey: <public_key>
Update the NMStateConfig custom resource with the IPv6 addresses you defined.
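To make the relationship between cidr and hostPrefix concrete, the following sketch computes how many per-node subnets the example clusterNetwork can provide and how many addresses the example serviceNetwork contains. It assumes the usual interpretation that hostPrefix is the prefix length of the subnet assigned to each node. The function names are hypothetical.

```python
import ipaddress


def per_node_subnet_count(cluster_cidr, host_prefix):
    """Number of host_prefix-sized subnets that fit in the cluster network."""
    network = ipaddress.ip_network(cluster_cidr)
    if not network.prefixlen <= host_prefix <= network.max_prefixlen:
        raise ValueError("hostPrefix must lie between the CIDR prefix and the address width")
    return 2 ** (host_prefix - network.prefixlen)


def service_address_count(service_cidr):
    """Total number of addresses in the service network."""
    return ipaddress.ip_network(service_cidr).num_addresses


if __name__ == "__main__":
    # Values from the IPv6 AgentClusterInstall example.
    print(per_node_subnet_count("fd01::/48", 64))
    print(service_address_count("fd02::/112"))
```

For the example values, both results are 2 to the power of 16: the /48 cluster network splits into 65536 subnets of size /64, and the /112 service network holds 65536 addresses.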
Troubleshooting the managed cluster
Use this procedure to diagnose any installation issues that might occur with the managed clusters.
Procedure
Check the status of the managed cluster:
$ oc get managedcluster
Example output
NAME          HUB ACCEPTED   MANAGED CLUSTER URLS   JOINED   AVAILABLE   AGE
SNO-cluster   true                                  True     True        2d19h
If the status in the AVAILABLE column is True, the managed cluster is being managed by the hub.
If the status in the AVAILABLE column is Unknown, the managed cluster is not being managed by the hub. Use the following steps to get more information.
Check the AgentClusterInstall install status through the ClusterDeployment resource:
$ oc get clusterdeployment -n <cluster_name>
Example output
NAME      PLATFORM          REGION   CLUSTERTYPE   INSTALLED   INFRAID   VERSION   POWERSTATE    AGE
Sno0026   agent-baremetal                          false                           Initialized   2d14h
If the status in the INSTALLED column is false, the installation was unsuccessful.
If the installation failed, enter the following command to review the status of the AgentClusterInstall resource:
$ oc describe agentclusterinstall -n <cluster_name> <cluster_name>
Resolve the errors and reset the cluster:
Remove the cluster’s managed cluster resource:
$ oc delete managedcluster <cluster_name>
Remove the cluster’s namespace:
$ oc delete namespace <cluster_name>
This deletes all of the namespace-scoped custom resources created for this cluster. You must wait for the ManagedCluster CR deletion to complete before proceeding.
Recreate the custom resources for the managed cluster.
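When many clusters are involved, reading the AVAILABLE column by eye becomes error-prone. The following sketch filters the JSON form of the managed cluster list for clusters that are not available. It assumes that each ManagedCluster reports availability in a status condition of type ManagedClusterConditionAvailable; verify the condition type on your hub before relying on it. The function name and the sample data are hypothetical.

```python
import json

# Assumed condition type; confirm it against the ManagedCluster status on your hub.
AVAILABLE_CONDITION = "ManagedClusterConditionAvailable"


def unavailable_clusters(managed_clusters_json):
    """Return names of clusters whose Available condition is not 'True'."""
    items = json.loads(managed_clusters_json).get("items", [])
    names = []
    for item in items:
        conditions = item.get("status", {}).get("conditions", [])
        # A cluster with no matching condition is treated as Unknown.
        status = next(
            (c.get("status") for c in conditions if c.get("type") == AVAILABLE_CONDITION),
            "Unknown",
        )
        if status != "True":
            names.append(item["metadata"]["name"])
    return names


if __name__ == "__main__":
    # Hypothetical list with one available and one unknown cluster.
    sample = json.dumps({"items": [
        {"metadata": {"name": "sno-a"},
         "status": {"conditions": [{"type": AVAILABLE_CONDITION, "status": "True"}]}},
        {"metadata": {"name": "sno-b"},
         "status": {"conditions": [{"type": AVAILABLE_CONDITION, "status": "Unknown"}]}},
    ]})
    print(unavailable_clusters(sample))
```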
Applying the RAN policies for monitoring cluster activity
Zero touch provisioning (ZTP) uses Red Hat Advanced Cluster Management (RHACM) to apply the radio access network (RAN) policies using a policy-based governance approach to automatically monitor cluster activity.
The policy generator (PolicyGen) is a Kustomize plug-in that facilitates creating ACM policies from predefined custom resources. There are three main items: Policy Categorization, Source CR policy, and PolicyGenTemplate. PolicyGen relies on these to generate the policies and their placement bindings and rules.
The following diagram shows how the RAN policy generator interacts with GitOps and ACM.
RAN policies are categorized into three main groups:
Common
A policy that exists in the Common category is applied to all clusters to be represented by the site plan.
Groups
A policy that exists in the Groups category is applied to a group of clusters. Every group of clusters could have their own policies that exist under the Groups category. For example, Groups/group1 could have its own policies that are applied to the clusters belonging to group1.
Sites
A policy that exists in the Sites category is applied to a specific cluster. Any cluster could have its own policies that exist in the Sites category. For example, Sites/cluster1 will have its own policies applied to cluster1.
The following diagram shows how policies are generated.
Applying source custom resource policies
Source custom resource policies include the following:
SR-IOV policies
PTP policies
Performance Add-on Operator policies
MachineConfigPool policies
SCTP policies
You need to define the source custom resource that generates the ACM policy with consideration of possible overlay to its metadata or spec/data. For example, a common-namespace-policy contains a Namespace definition that exists in all managed clusters. This namespace is placed under the Common category and there are no changes for its spec or data across all clusters.
Namespace policy example
The following example shows the source custom resource for this namespace:
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-sriov-network-operator
  labels:
    openshift.io/run-level: "1"
Example output
The generated policy that applies this namespace includes the namespace as it is defined above without any change, as shown in this example:
apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: common-sriov-sub-ns-policy
  namespace: common-sub
  annotations:
    policy.open-cluster-management.io/categories: CM Configuration Management
    policy.open-cluster-management.io/controls: CM-2 Baseline Configuration
    policy.open-cluster-management.io/standards: NIST SP 800-53
spec:
  remediationAction: enforce
  disabled: false
  policy-templates:
    - objectDefinition:
        apiVersion: policy.open-cluster-management.io/v1
        kind: ConfigurationPolicy
        metadata:
          name: common-sriov-sub-ns-policy-config
        spec:
          remediationAction: enforce
          severity: low
          namespaceselector:
            exclude:
              - kube-*
            include:
              - '*'
          object-templates:
            - complianceType: musthave
              objectDefinition:
                apiVersion: v1
                kind: Namespace
                metadata:
                  labels:
                    openshift.io/run-level: "1"
                  name: openshift-sriov-network-operator
SRIOV policy example
The following example shows a SriovNetworkNodePolicy definition that exists in different clusters with a different specification for each cluster. The example also shows the source custom resource for the SriovNetworkNodePolicy:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: sriov-nnp
  namespace: openshift-sriov-network-operator
spec:
  # The $ tells the policy generator to overlay/remove the spec.item in the generated policy.
  deviceType: $deviceType
  isRdma: false
  nicSelector:
    pfNames: [$pfNames]
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  numVfs: $numVfs
  priority: $priority
  resourceName: $resourceName
Example output
The SriovNetworkNodePolicy name and namespace are the same for all clusters, so both are defined in the source SriovNetworkNodePolicy. However, the generated policy requires input parameters such as $deviceType and $numVfs in order to adjust the policy for each cluster. The generated policy is shown in this example:
apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: site-du-sno-1-sriov-nnp-mh-policy
  namespace: sites-sub
  annotations:
    policy.open-cluster-management.io/categories: CM Configuration Management
    policy.open-cluster-management.io/controls: CM-2 Baseline Configuration
    policy.open-cluster-management.io/standards: NIST SP 800-53
spec:
  remediationAction: enforce
  disabled: false
  policy-templates:
    - objectDefinition:
        apiVersion: policy.open-cluster-management.io/v1
        kind: ConfigurationPolicy
        metadata:
          name: site-du-sno-1-sriov-nnp-mh-policy-config
        spec:
          remediationAction: enforce
          severity: low
          namespaceselector:
            exclude:
              - kube-*
            include:
              - '*'
          object-templates:
            - complianceType: musthave
              objectDefinition:
                apiVersion: sriovnetwork.openshift.io/v1
                kind: SriovNetworkNodePolicy
                metadata:
                  name: sriov-nnp-du-mh
                  namespace: openshift-sriov-network-operator
                spec:
                  deviceType: vfio-pci
                  isRdma: false
                  nicSelector:
                    pfNames:
                      - ens7f0
                  nodeSelector:
                    node-role.kubernetes.io/worker: ""
                  numVfs: 8
                  resourceName: du_mh
Defining the required input parameters as |
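The effect of the $ placeholders in the SRIOV example can be illustrated with a short sketch. This is not the PolicyGen implementation; it is a simplified model, written under the assumption that a placeholder with a supplied value is overlaid and a placeholder with no supplied value is removed, which matches the generated policy above where priority does not appear. The function name and the helper sentinel are hypothetical.

```python
_MISSING = object()  # marks a placeholder for which no value was supplied

# Source spec from the SriovNetworkNodePolicy example, as a Python dictionary.
SOURCE_SPEC = {
    "deviceType": "$deviceType",
    "isRdma": False,
    "nicSelector": {"pfNames": ["$pfNames"]},
    "nodeSelector": {"node-role.kubernetes.io/worker": ""},
    "numVfs": "$numVfs",
    "priority": "$priority",
    "resourceName": "$resourceName",
}


def overlay(source, values):
    """Replace "$name" placeholders with values; drop entries that have no value."""
    if isinstance(source, dict):
        rendered = {key: overlay(item, values) for key, item in source.items()}
        return {key: item for key, item in rendered.items() if item is not _MISSING}
    if isinstance(source, list):
        rendered = [overlay(item, values) for item in source]
        return [item for item in rendered if item is not _MISSING]
    if isinstance(source, str) and source.startswith("$"):
        return values.get(source[1:], _MISSING)
    return source


if __name__ == "__main__":
    site_values = {
        "deviceType": "vfio-pci",
        "pfNames": "ens7f0",
        "numVfs": 8,
        "resourceName": "du_mh",
    }
    print(overlay(SOURCE_SPEC, site_values))
```

With the site values shown, the result has the same fields as the spec in the generated policy: the placeholders are filled in, literal values such as isRdma are kept, and priority is omitted.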
The PolicyGenTemplate
The PolicyGenTemplate.yaml file is a Custom Resource Definition (CRD) that tells PolicyGen where to categorize the generated policies and which items need to be overlaid.
The following example shows the PolicyGenTemplate.yaml file:
apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: "group-du-sno"
  namespace: "group-du-sno"
spec:
  bindingRules:
    group-du-sno: ""
  mcp: "master"
  sourceFiles:
    - fileName: ConsoleOperatorDisable.yaml
      policyName: "console-policy"
    - fileName: ClusterLogging.yaml
      policyName: "cluster-log-policy"
      spec:
        curation:
          curator:
            schedule: "30 3 * * *"
        collection:
          logs:
            type: "fluentd"
            fluentd: {}
The group-du-ranGen.yaml file defines a group of policies under a group named group-du. This file defines a MachineConfigPool worker-du that is used as the node selector for any other policy defined in sourceFiles. An ACM policy is generated for every source file that exists in sourceFiles. In addition, a single placement binding and placement rule is generated to apply the cluster selection rule for group-du policies.
Using the source file PtpConfigSlave.yaml as an example, the PtpConfigSlave has a definition of a PtpConfig custom resource (CR). The generated policy for the PtpConfigSlave example is named group-du-ptp-config-policy. The PtpConfig CR defined in the generated group-du-ptp-config-policy is named du-ptp-slave. The spec defined in PtpConfigSlave.yaml is placed under du-ptp-slave along with the other spec items defined under the source file.
The following example shows the group-du-ptp-config-policy:
apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: group-du-ptp-config-policy
  namespace: groups-sub
  annotations:
    policy.open-cluster-management.io/categories: CM Configuration Management
    policy.open-cluster-management.io/controls: CM-2 Baseline Configuration
    policy.open-cluster-management.io/standards: NIST SP 800-53
spec:
  remediationAction: enforce
  disabled: false
  policy-templates:
    - objectDefinition:
        apiVersion: policy.open-cluster-management.io/v1
        kind: ConfigurationPolicy
        metadata:
          name: group-du-ptp-config-policy-config
        spec:
          remediationAction: enforce
          severity: low
          namespaceselector:
            exclude:
              - kube-*
            include:
              - '*'
          object-templates:
            - complianceType: musthave (1)
              objectDefinition:
                apiVersion: ptp.openshift.io/v1
                kind: PtpConfig
                metadata:
                  name: slave
                  namespace: openshift-ptp
                spec:
                  recommend:
                    - match:
                        - nodeLabel: node-role.kubernetes.io/worker-du
                      priority: 4
                      profile: slave
                  profile:
                    - interface: ens5f0
                      name: slave
                      phc2sysOpts: -a -r -n 24
                      ptp4lConf: |
                        [global]
                        #
                        # Default Data Set
                        #
                        twoStepFlag 1
                        slaveOnly 0
                        priority1 128
                        priority2 128
                        domainNumber 24
                        .....
1 Displays the value of the complianceType field. The default value is musthave, which indicates that an object must exist with the same name as specified in object-templates. To find the exact matches to roles and objects, set the value to mustonlyhave. For more information about the accepted values, see Configuration policy YAML table.
Considerations when creating custom resource policies
The custom resources used to create the ACM policies should be defined with consideration of possible overlay to their metadata and spec/data. For example, if the custom resource metadata.name does not change between clusters, set the metadata.name value in the custom resource file. If the custom resource will have multiple instances in the same cluster, the custom resource metadata.name must be defined in the policy template file.
To apply the node selector for a specific machine config pool, set the node selector value to $mcp so that the policy generator overlays the $mcp value with the mcp defined in the policy template.
Subscription source files do not change.
Generating RAN policies
Prerequisites
Install Kustomize
Install the Kustomize Policy Generator plug-in
Procedure
Configure the kustomization.yaml file to reference the policyGenerator.yaml file. The following example shows the PolicyGenerator definition:
apiVersion: policyGenerator/v1
kind: PolicyGenerator
metadata:
  name: acm-policy
  namespace: acm-policy-generator
# The arguments should be given and defined as below with same order --policyGenTempPath= --sourcePath= --outPath= --stdout --customResources
argsOneLiner: ./ranPolicyGenTempExamples ./sourcePolicies ./out true false
Where:
policyGenTempPath: is the path to the policyGenTemp files.
sourcePath: is the path to the source policies.
outPath: is the path to save the generated ACM policies.
stdout: If true, prints the generated policies to the console.
customResources: If true, generates the CRs from the sourcePolicies files without ACM policies.
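Because argsOneLiner is positional, a value in the wrong position silently changes its meaning. The following sketch, which is not part of the plug-in, maps the five positions to the names listed above so that a line can be checked before it is committed. The class and function names are hypothetical, and the strict validation is an assumption about what is useful rather than documented plug-in behavior.

```python
from typing import NamedTuple


class PolicyGenArgs(NamedTuple):
    policy_gen_temp_path: str
    source_path: str
    out_path: str
    stdout: bool
    custom_resources: bool


def _to_bool(value: str) -> bool:
    """Accept only the literal strings true and false."""
    if value not in ("true", "false"):
        raise ValueError(f"expected true or false, got {value!r}")
    return value == "true"


def parse_args_one_liner(line: str) -> PolicyGenArgs:
    """Split argsOneLiner into its five positional fields, in the documented order."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 arguments, got {len(parts)}")
    temp_path, source_path, out_path, stdout, custom_resources = parts
    return PolicyGenArgs(temp_path, source_path, out_path,
                         _to_bool(stdout), _to_bool(custom_resources))


if __name__ == "__main__":
    print(parse_args_one_liner("./ranPolicyGenTempExamples ./sourcePolicies ./out true false"))
```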
Test PolicyGen by running the following commands:
$ cd cnf-features-deploy/ztp/ztp-policy-generator/
$ XDG_CONFIG_HOME=./ kustomize build --enable-alpha-plugins
An out directory is created with the expected policies, as shown in this example:
out
├── common
│   ├── common-log-sub-ns-policy.yaml
│   ├── common-log-sub-oper-policy.yaml
│   ├── common-log-sub-policy.yaml
│   ├── common-pao-sub-catalog-policy.yaml
│   ├── common-pao-sub-ns-policy.yaml
│   ├── common-pao-sub-oper-policy.yaml
│   ├── common-pao-sub-policy.yaml
│   ├── common-policies-placementbinding.yaml
│   ├── common-policies-placementrule.yaml
│   ├── common-ptp-sub-ns-policy.yaml
│   ├── common-ptp-sub-oper-policy.yaml
│   ├── common-ptp-sub-policy.yaml
│   ├── common-sriov-sub-ns-policy.yaml
│   ├── common-sriov-sub-oper-policy.yaml
│   └── common-sriov-sub-policy.yaml
├── groups
│   ├── group-du
│   │   ├── group-du-mc-chronyd-policy.yaml
│   │   ├── group-du-mc-mount-ns-policy.yaml
│   │   ├── group-du-mcp-du-policy.yaml
│   │   ├── group-du-mc-sctp-policy.yaml
│   │   ├── group-du-policies-placementbinding.yaml
│   │   ├── group-du-policies-placementrule.yaml
│   │   ├── group-du-ptp-config-policy.yaml
│   │   └── group-du-sriov-operconfig-policy.yaml
│   └── group-sno-du
│       ├── group-du-sno-policies-placementbinding.yaml
│       ├── group-du-sno-policies-placementrule.yaml
│       ├── group-sno-du-console-policy.yaml
│       ├── group-sno-du-log-forwarder-policy.yaml
│       └── group-sno-du-log-policy.yaml
└── sites
    └── site-du-sno-1
        ├── site-du-sno-1-policies-placementbinding.yaml
        ├── site-du-sno-1-policies-placementrule.yaml
        ├── site-du-sno-1-sriov-nn-fh-policy.yaml
        ├── site-du-sno-1-sriov-nnp-mh-policy.yaml
        ├── site-du-sno-1-sriov-nw-fh-policy.yaml
        ├── site-du-sno-1-sriov-nw-mh-policy.yaml
        └── site-du-sno-1-.yaml
The common policies are flat because they will be applied to all clusters. However, the groups and sites have subdirectories for each group and site as they will be applied to different clusters.
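The layout rule described above can be expressed in a few lines. The following sketch is an illustration of the directory convention visible in the example output, not code from the policy generator: common policies are written directly under common, while group and site policies are written under a subdirectory named for the group or site. The function name is hypothetical.

```python
from pathlib import PurePosixPath


def policy_output_path(out_dir, category, policy_file, name=None):
    """Build the expected path of a generated policy file under the out directory."""
    if category == "common":
        # Common policies apply to all clusters, so the directory is flat.
        return PurePosixPath(out_dir, "common", policy_file)
    if category in ("groups", "sites"):
        if not name:
            raise ValueError(f"a group or site name is required for {category}")
        return PurePosixPath(out_dir, category, name, policy_file)
    raise ValueError(f"unknown category: {category}")


if __name__ == "__main__":
    print(policy_output_path("out", "common", "common-ptp-sub-policy.yaml"))
    print(policy_output_path("out", "groups", "group-du-ptp-config-policy.yaml", "group-du"))
    print(policy_output_path("out", "sites", "site-du-sno-1-sriov-nnp-mh-policy.yaml", "site-du-sno-1"))
```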
Cluster provisioning
Zero touch provisioning (ZTP) provisions clusters using a layered approach. The base components consist of Fedora CoreOS (FCOS), the basic operating system for the cluster, and OKD. After these components are installed, the worker node can join the existing cluster. When the node has joined the existing cluster, the 5G RAN profile Operators are applied.
The following diagram illustrates this architecture.
The following RAN Operators are deployed on every cluster:
Machine Config
Precision Time Protocol (PTP)
Performance Addon Operator
SR-IOV
Local Storage Operator
Logging Operator
Machine Config Operator
The Machine Config Operator enables system definitions and low-level system settings such as workload partitioning, NTP, and SCTP. This Operator is installed with OKD.
A performance profile and the objects created from it are applied to a node according to an associated machine config pool (MCP). The MCP holds information about the progress of applying the machine configurations created by performance addons, which cover kernel args, kube config, huge pages allocation, and deployment of the real-time kernel (rt-kernel). The performance addons controller monitors changes in the MCP and updates the performance profile status accordingly.
Performance Addon Operator
The Performance Addon Operator provides the ability to enable advanced node performance tunings on a set of nodes.
OKD provides a Performance Addon Operator to implement automatic tuning to achieve low latency performance for OKD applications. The cluster administrator uses a performance profile configuration to make these changes more easily and reliably.
The administrator can specify updating the kernel to rt-kernel, reserving CPUs for management workloads, and using CPUs for running the workloads.
SR-IOV Operator
The Single Root I/O Virtualization (SR-IOV) Network Operator manages the SR-IOV network devices and network attachments in your cluster.
The SR-IOV Operator allows network interfaces to be virtual and shared at a device level with networking functions running within the cluster.
The SR-IOV Network Operator adds the SriovOperatorConfig.sriovnetwork.openshift.io CustomResourceDefinition resource. The Operator automatically creates a SriovOperatorConfig custom resource named default in the openshift-sriov-network-operator namespace. The default custom resource contains the SR-IOV Network Operator configuration for your cluster.
Precision Time Protocol Operator
Precision Time Protocol (PTP) is used to synchronize clocks in a network. The PTP Operator discovers PTP-capable devices in the cluster and creates and manages linuxptp services for those devices. The PTP Operator also deploys a PTP fast events infrastructure. vDU applications use PTP fast events notifications to report on clock events that can negatively affect the performance and reliability of the application. PTP fast events are distributed over an Advanced Message Queuing Protocol (AMQP) event notification bus.
Additional resources
- For more information about using PTP hardware in your cluster nodes, see Using PTP hardware.
Creating ZTP custom resources for multiple managed clusters
If you are installing multiple managed clusters, zero touch provisioning (ZTP) uses ArgoCD and SiteConfig to manage the processes that create the custom resources (CR) and generate and apply the policies for multiple clusters, in batches of no more than 100, using the GitOps approach.
Installing and deploying the clusters is a two-stage process, as shown here:
Prerequisites for deploying the ZTP pipeline
OKD cluster version 4.8 or higher and the Red Hat GitOps Operator are installed.
Red Hat Advanced Cluster Management (RHACM) version 2.3 or above is installed.
For disconnected environments, make sure your source data Git repository and ztp-site-generator container image are accessible from the hub cluster.
If you want additional custom content, such as extra install manifests or custom resources (CR) for policies, add them to the /usr/src/hook/ztp/source-crs/extra-manifest/ directory. Similarly, you can add additional configuration CRs, as referenced from a PolicyGenTemplate, to the /usr/src/hook/ztp/source-crs/ directory.
Create a Containerfile that adds your additional manifests to the Red Hat provided image, for example:
FROM <registry fqdn>/ztp-site-generator:latest (1)
COPY myInstallManifest.yaml /usr/src/hook/ztp/source-crs/extra-manifest/
COPY mySourceCR.yaml /usr/src/hook/ztp/source-crs/
1 <registry fqdn> must point to a registry containing the ztp-site-generator container image provided by Red Hat.
Build a new container image that includes these additional files:
$ podman build -f Containerfile.example .
Installing the GitOps ZTP pipeline
The procedures in this section tell you how to complete the following tasks:
Prepare the Git repository you need to host site configuration data.
Configure the hub cluster for generating the required installation and policy custom resources (CR).
Deploy the managed clusters using zero touch provisioning (ZTP).
Preparing the ZTP Git repository
Create a Git repository for hosting site configuration data. The zero touch provisioning (ZTP) pipeline requires read access to this repository.
Procedure
Create a directory structure with separate paths for the SiteConfig and PolicyGenTemplate custom resources (CR).
Add pre-sync.yaml and post-sync.yaml from resource-hook-example/<policygentemplates>/ to the path for the PolicyGenTemplate CRs.
Add pre-sync.yaml and post-sync.yaml from resource-hook-example/<siteconfig>/ to the path for the SiteConfig CRs.
If your hub cluster operates in a disconnected environment, you must update the image for all four pre and post sync hook CRs.
Apply the policygentemplates.ran.openshift.io and siteconfigs.ran.openshift.io CR definitions.
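The directory and hook steps above can be sketched as a small shell function. This is only an illustration: the function name prepare_ztp_repo, the target directory names siteconfig and policygentemplates, and the assumption that the hook examples sit in subdirectories with those same names are all hypothetical, so adjust them to match your extracted files and repository layout.

```shell
# Sketch only: create separate paths for the two CR types in a Git working
# tree and copy the sync hooks next to them.
# Assumption: hooks_dir contains siteconfig/ and policygentemplates/
# subdirectories, each holding pre-sync.yaml and post-sync.yaml.
prepare_ztp_repo() {
  local hooks_dir="$1"   # extracted resource-hook-example directory
  local repo_dir="$2"    # root of the site configuration Git repository
  local kind
  for kind in siteconfig policygentemplates; do
    mkdir -p "${repo_dir}/${kind}" || return 1
    cp "${hooks_dir}/${kind}/pre-sync.yaml" \
       "${hooks_dir}/${kind}/post-sync.yaml" \
       "${repo_dir}/${kind}/" || return 1
  done
}
```

After running the function, commit and push the new paths. In a disconnected environment, the image fields in the four copied hook files still need to be edited by hand.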
Preparing the hub cluster for ZTP
You can configure your hub cluster with a set of ArgoCD applications that generate the required installation and policy custom resources (CR) for each site based on a zero touch provisioning (ZTP) GitOps flow.
Procedure
Install the Red Hat OpenShift GitOps Operator on your hub cluster.
Extract the administrator password for ArgoCD:
$ oc get secret openshift-gitops-cluster -n openshift-gitops -o jsonpath='{.data.admin\.password}' | base64 -d
Prepare the ArgoCD pipeline configuration:
Extract the ArgoCD deployment CRs from the ZTP site generator container using the latest container image version:
$ mkdir ztp
$ podman run --rm -v `pwd`/ztp:/mnt/ztp:Z registry.redhat.io/openshift4/ztp-site-generate-rhel8:v4.10.0-1 /bin/bash -c "cp -ar /usr/src/hook/ztp/* /mnt/ztp/"
The remaining steps in this section relate to the ztp/gitops-subscriptions/argocd/ directory.
Modify the source values of the two ArgoCD applications, deployment/clusters-app.yaml and deployment/policies-app.yaml, with appropriate URL, targetRevision branch, and path values. The path values must match those used in your Git repository.
Modify deployment/clusters-app.yaml:
apiVersion: v1
kind: Namespace
metadata:
  name: clusters-sub
---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: clusters
  namespace: openshift-gitops
spec:
  destination:
    server: https://kubernetes.default.svc
    namespace: clusters-sub
  project: default
  source:
    path: ztp/gitops-subscriptions/argocd/resource-hook-example/siteconfig (1)
    repoURL: https://github.com/openshift-kni/cnf-features-deploy (2)
    targetRevision: master (3)
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true
1 The ztp/gitops-subscriptions/argocd/ file path that contains the siteconfig CRs for the clusters.
2 The URL of the Git repository that contains the siteconfig custom resources that define site configuration for installing clusters.
3 The branch on the Git repository that contains the relevant site configuration data.
Modify deployment/policies-app.yaml:
apiVersion: v1
kind: Namespace
metadata:
  name: policies-sub
---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: policies
  namespace: openshift-gitops
spec:
  destination:
    server: https://kubernetes.default.svc
    namespace: policies-sub
  project: default
  source:
    directory:
      recurse: true
    path: ztp/gitops-subscriptions/argocd/resource-hook-example/policygentemplates (1)
    repoURL: https://github.com/openshift-kni/cnf-features-deploy (2)
    targetRevision: master (3)
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true
1 The ztp/gitops-subscriptions/argocd/ file path that contains the policygentemplates CRs for the clusters.
2 The URL of the Git repository that contains the policygentemplates custom resources that specify configuration data for the site.
3 The branch on the Git repository that contains the relevant configuration data.
To apply the pipeline configuration to your hub cluster, enter this command:
$ oc apply -k ./deployment
Creating the site secrets
Add the required secrets for the site to the hub cluster. These resources must be in a namespace with a name that matches the cluster name.
Procedure
Create a secret for authenticating to the site Baseboard Management Controller (BMC). Ensure the secret name matches the name used in the SiteConfig. In this example, the secret name is test-sno-bmh-secret:
apiVersion: v1
kind: Secret
metadata:
  name: test-sno-bmh-secret
  namespace: test-sno
data:
  password: dGVtcA==
  username: cm9vdA==
type: Opaque
Create the pull secret for the site. The pull secret must contain all credentials necessary for installing OpenShift and all add-on Operators. In this example, the secret name is assisted-deployment-pull-secret:
apiVersion: v1
kind: Secret
metadata:
  name: assisted-deployment-pull-secret
  namespace: test-sno
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: <Your pull secret base64 encoded>
The secrets are referenced from the |
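Values under data in a Secret must be base64 encoded. The following is a minimal sketch, not part of the ZTP tooling, of how the BMC secret could be generated from plain-text credentials. The function names b64 and bmc_secret_manifest are hypothetical. The sample values in the BMC secret example decode to root and temp, which the sketch reproduces.

```shell
# Sketch only: helper names are illustrative, not part of any Red Hat tool.

# Encode a value the way the data field of a Secret expects it.
# printf avoids the trailing newline that echo would add to the encoded value.
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Print a BMC Secret manifest for one site to standard output.
bmc_secret_manifest() {
  local name="$1"
  local namespace="$2"
  local username="$3"
  local password="$4"
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: ${name}
  namespace: ${namespace}
data:
  password: $(b64 "$password")
  username: $(b64 "$username")
type: Opaque
EOF
}
```

For example, `bmc_secret_manifest test-sno-bmh-secret test-sno root temp | oc apply -f -` would create the secret shown in the example, assuming the test-sno namespace already exists on the hub cluster.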
Creating the SiteConfig custom resources
ArgoCD acts as the engine for the GitOps method of site deployment. After completing a site plan that contains the required custom resources for the site installation, a policy generator creates the manifests and applies them to the hub cluster.
Procedure
Create one or more SiteConfig custom resources, site-config.yaml files, that contain the site-plan data for the clusters. For example:
apiVersion: ran.openshift.io/v1
kind: SiteConfig
metadata:
  name: "test-sno"
  namespace: "test-sno"
spec:
  baseDomain: "clus2.t5g.lab.eng.bos.redhat.com"
  pullSecretRef:
    name: "assisted-deployment-pull-secret"
  clusterImageSetNameRef: "openshift-4.10"
  sshPublicKey: "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDB3dwhI5X0ZxGBb9VK7wclcPHLc8n7WAyKjTNInFjYNP9J+Zoc/ii+l3YbGUTuqilDwZN5rVIwBux2nUyVXDfaM5kPd9kACmxWtfEWTyVRootbrNWwRfKuC2h6cOd1IlcRBM1q6IzJ4d7+JVoltAxsabqLoCbK3svxaZoKAaK7jdGG030yvJzZaNM4PiTy39VQXXkCiMDmicxEBwZx1UsA8yWQsiOQ5brod9KQRXWAAST779gbvtgXR2L+MnVNROEHf1nEjZJwjwaHxoDQYHYKERxKRHlWFtmy5dNT6BbvOpJ2e5osDFPMEd41d2mUJTfxXiC1nvyjk9Irf8YJYnqJgBIxi0IxEllUKH7mTdKykHiPrDH5D2pRlp+Donl4n+sw6qoDc/3571O93+RQ6kUSAgAsvWiXrEfB/7kGgAa/BD5FeipkFrbSEpKPVu+gue1AQeJcz9BuLqdyPUQj2VUySkSg0FuGbG7fxkKeF1h3Sga7nuDOzRxck4I/8Z7FxMF/e8DmaBpgHAUIfxXnRqAImY9TyAZUEMT5ZPSvBRZNNmLbfex1n3NLcov/GEpQOqEYcjG5y57gJ60/av4oqjcVmgtaSOOAS0kZ3y9YDhjsaOcpmRYYijJn8URAH7NrW8EZsvAoF6GUt6xHq5T258c6xSYUm5L0iKvBqrOW9EjbLw== root@cnfdc2.clus2.t5g.lab.eng.bos.redhat.com"
  clusters:
  - clusterName: "test-sno"
    clusterType: "sno"
    clusterProfile: "du"
    clusterLabels:
      group-du-sno: ""
      common: true
      sites : "test-sno"
    clusterNetwork:
      - cidr: 1001:db9::/48
        hostPrefix: 64
    machineNetwork:
      - cidr: 2620:52:0:10e7::/64
    serviceNetwork:
      - 1001:db7::/112
    additionalNTPSources:
      - 2620:52:0:1310::1f6
    nodes:
      - hostName: "test-sno.clus2.t5g.lab.eng.bos.redhat.com"
        bmcAddress: "idrac-virtualmedia+https://[2620:52::10e7:f602:70ff:fee4:f4e2]/redfish/v1/Systems/System.Embedded.1"
        bmcCredentialsName:
          name: "test-sno-bmh-secret"
        bootMACAddress: "0C:42:A1:8A:74:EC"
        bootMode: "UEFI"
        rootDeviceHints:
          hctl: '0:1:0'
        cpuset: "0-1,52-53"
        nodeNetwork:
          interfaces:
            - name: eno1
              macAddress: "0C:42:A1:8A:74:EC"
          config:
            interfaces:
              - name: eno1
                type: ethernet
                state: up
                macAddress: "0C:42:A1:8A:74:EC"
                ipv4:
                  enabled: false
                ipv6:
                  enabled: true
                  address:
                  - ip: 2620:52::10e7:e42:a1ff:fe8a:900
                    prefix-length: 64
            dns-resolver:
              config:
                search:
                - clus2.t5g.lab.eng.bos.redhat.com
                server:
                - 2620:52:0:1310::1f6
            routes:
              config:
              - destination: ::/0
                next-hop-interface: eno1
                next-hop-address: 2620:52:0:10e7::fc
                table-id: 254
Save the files and push them to the zero touch provisioning (ZTP) Git repository accessible from the hub cluster and defined as a source repository of the ArgoCD application.
ArgoCD detects that the application is out of sync. Upon sync, either automatic or manual, ArgoCD synchronizes the PolicyGenTemplate to the hub cluster and launches the associated resource hooks. These hooks are responsible for generating the policy wrapped configuration CRs that apply to the spoke cluster. The resource hooks convert the site definitions to installation custom resources and apply them to the hub cluster:
Namespace - Unique per site
AgentClusterInstall
BareMetalHost
ClusterDeployment
InfraEnv
NMStateConfig
ExtraManifestsConfigMap - Extra manifests. The additional manifests include workload partitioning, chronyd, mountpoint hiding, sctp enablement, and more.
ManagedCluster
KlusterletAddonConfig
Red Hat Advanced Cluster Management (RHACM), running on the hub cluster, deploys the managed cluster.
Creating the PolicyGenTemplates
Use the following procedure to create the PolicyGenTemplates you need for generating policies in your Git repository for the hub cluster.
Procedure
Create the PolicyGenTemplates and save them to the zero touch provisioning (ZTP) Git repository accessible from the hub cluster and defined as a source repository of the ArgoCD application.
ArgoCD detects that the application is out of sync. Upon sync, either automatic or manual, ArgoCD applies the new PolicyGenTemplate to the hub cluster and launches the associated resource hooks. These hooks are responsible for generating the policy wrapped configuration CRs that apply to the spoke cluster and perform the following actions:
Create the Red Hat Advanced Cluster Management (RHACM) policies according to the basic distributed unit (DU) profile and required customizations.
Apply the generated policies to the hub cluster.
The ZTP process creates policies that direct ACM to apply the desired configuration to the cluster nodes.
Checking the installation status
The ArgoCD pipeline detects the SiteConfig and PolicyGenTemplate custom resources (CRs) in the Git repository and syncs them to the hub cluster. In the process, it generates installation and policy CRs and applies them to the hub cluster. You can monitor the progress of this synchronization in the ArgoCD dashboard.
Procedure
Monitor the progress of cluster installation using the following commands:
$ export CLUSTER=<cluster_name>
$ oc get agentclusterinstall -n $CLUSTER $CLUSTER -o jsonpath='{.status.conditions[?(@.type=="Completed")]}' | jq
$ curl -sk $(oc get agentclusterinstall -n $CLUSTER $CLUSTER -o jsonpath='{.status.debugInfo.eventsURL}') | jq '.[-2,-1]'
Use the Red Hat Advanced Cluster Management (RHACM) dashboard to monitor the progress of policy reconciliation.
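If you prefer to wait for the installation from a script rather than rerun the command by hand, the Completed condition can be polled. The following is a rough sketch under stated assumptions: the function name wait_for_install and its default retry count and delay are hypothetical, and it only distinguishes "completed" from "not completed yet", so a failed installation simply runs into the timeout.

```shell
# Sketch only: poll the AgentClusterInstall "Completed" condition until its
# status is True, or give up after a fixed number of checks.
wait_for_install() {
  local cluster="$1"
  local attempts="${2:-60}"   # assumed default: 60 checks
  local delay="${3:-60}"      # assumed default: 60 seconds between checks
  local status=""
  local i=1
  while [ "$i" -le "$attempts" ]; do
    status=$(oc get agentclusterinstall -n "$cluster" "$cluster" \
      -o jsonpath='{.status.conditions[?(@.type=="Completed")].status}' 2>/dev/null)
    if [ "$status" = "True" ]; then
      echo "cluster ${cluster}: installation completed"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "cluster ${cluster}: not completed after ${attempts} checks" >&2
  return 1
}
```

For example, `wait_for_install test-sno 120 30` checks every 30 seconds for up to an hour. When the function times out, inspect the condition message and the events URL with the commands in the procedure.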
Site cleanup
To remove a site and the associated installation and policy custom resources (CRs), remove the SiteConfig and site-specific PolicyGenTemplate CRs from the Git repository. The pipeline hooks remove the generated CRs.
Before removing a |
Removing the ArgoCD pipeline
Use the following procedure if you want to remove the ArgoCD pipeline and all generated artifacts.
Procedure
Detach all clusters from ACM.
Delete all SiteConfig and PolicyGenTemplate custom resources (CRs) from your Git repository.
Delete the following namespaces:
All policy namespaces:
$ oc get policy -A
clusters-sub
policies-sub
Process the directory using the Kustomize tool:
$ oc delete -k cnf-features-deploy/ztp/gitops-subscriptions/argocd/deployment
Troubleshooting GitOps ZTP
As noted, the ArgoCD pipeline synchronizes the SiteConfig and PolicyGenTemplate custom resources (CR) from the Git repository to the hub cluster. During this process, post-sync hooks create the installation and policy CRs that are also applied to the hub cluster. Use the following procedures to troubleshoot issues that might occur in this process.
Validating the generation of installation CRs
SiteConfig applies installation custom resources (CR) to the hub cluster in a namespace whose name matches the site name. To check the status, enter the following command:
$ oc get AgentClusterInstall -n <cluster_name>
If no object is returned, use the following procedure to troubleshoot the ArgoCD pipeline flow from SiteConfig to the installation CRs.
Procedure
Check the synchronization of the SiteConfig to the hub cluster using either of the following commands:
$ oc get siteconfig -A
or
$ oc get siteconfig -n clusters-sub
If the SiteConfig is missing, one of the following situations has occurred:
The clusters application failed to synchronize the CR from the Git repository to the hub. Use the following command to verify this:
$ oc describe -n openshift-gitops application clusters
Check for Status: Synced and that the Revision: is the SHA of the commit you pushed to the subscribed repository.
The pre-sync hook failed, possibly due to a failure to pull the container image. Check the ArgoCD dashboard for the status of the pre-sync job in the clusters application.
Verify the post hook job ran:
$ oc describe job -n clusters-sub siteconfig-post
If successful, the returned output indicates succeeded: 1.
If the job fails, ArgoCD retries it. In some cases, the first pass will fail and the second pass will indicate that the job passed.
Check for errors in the post hook job:
$ oc get pod -n clusters-sub
Note the name of the siteconfig-post-xxxxx pod:
$ oc logs -n clusters-sub siteconfig-post-xxxxx
If the logs indicate errors, correct the conditions and push the corrected SiteConfig or PolicyGenTemplate to the Git repository.
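The last two steps, finding the post hook pod and reading its logs, can be combined. The following is a small sketch, assuming the pod names start with siteconfig-post as in the procedure above; the function name post_hook_logs is hypothetical.

```shell
# Sketch only: print the logs of every post hook pod in a namespace.
# Defaults match the names used in the procedure (assumed, adjust as needed).
post_hook_logs() {
  local ns="${1:-clusters-sub}"
  local prefix="${2:-siteconfig-post}"
  local pod
  # "oc get pod -o name" prints one "pod/<name>" entry per line.
  for pod in $(oc get pod -n "$ns" -o name | grep "^pod/${prefix}"); do
    echo "=== ${pod} ==="
    oc logs -n "$ns" "$pod"
  done
}
```

Because ArgoCD retries a failed job, more than one pod can match. Printing all of them lets you compare a failed first pass with a later successful one.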
Validating the generation of policy CRs
ArgoCD generates the policy custom resources (CRs) in the same namespace as the PolicyGenTemplate from which they were created. The same troubleshooting flow applies to all policy CRs generated from PolicyGenTemplates regardless of whether they are common, group, or site based.
To check the status of the policy CRs, enter the following commands:
$ export NS=<namespace>
$ oc get policy -n $NS
The returned output displays the expected set of policy wrapped CRs. If no object is returned, use the following procedure to troubleshoot the ArgoCD pipeline flow from PolicyGenTemplate to the policy CRs.
Procedure
Check the synchronization of the PolicyGenTemplate to the hub cluster:
$ oc get policygentemplate -A
or
$ oc get policygentemplate -n $NS
If the PolicyGenTemplate is not synchronized, one of the following situations has occurred:
The policies application failed to synchronize the CR from the Git repository to the hub. Use the following command to verify this:
$ oc describe -n openshift-gitops application policies
Check for Status: Synced and that the Revision: is the SHA of the commit you pushed to the subscribed repository.
The pre-sync hook failed, possibly due to a failure to pull the container image. Check the ArgoCD dashboard for the status of the pre-sync job in the policies application.
Ensure the policies were copied to the cluster namespace. When ACM recognizes that policies apply to a ManagedCluster, ACM applies the policy CR objects to the cluster namespace:
$ oc get policy -n <cluster_name>
ACM copies all applicable common, group, and site policies here. The policy names have the form <policyNamespace>.<policyName>.
Check the placement rule for any policies not copied to the cluster namespace. The clusterSelector in the PlacementRule for those policies should match the labels on the ManagedCluster:
$ oc get placementrule -n $NS
Make a note of the PlacementRule name for the missing common, group, or site policy:
$ oc get placementrule -n $NS <placementRuleName> -o yaml
The status decisions value should include your cluster name.
The key value of the clusterSelector in the spec should match the labels on your managed cluster. Check the labels on ManagedCluster:
$ oc get ManagedCluster $CLUSTER -o jsonpath='{.metadata.labels}' | jq
Example
apiVersion: apps.open-cluster-management.io/v1
kind: PlacementRule
metadata:
  name: group-test1-policies-placementrules
  namespace: group-test1-policies
spec:
  clusterSelector:
    matchExpressions:
    - key: group-test1
      operator: In
      values:
      - ""
status:
  decisions:
  - clusterName: <cluster_name>
    clusterNamespace: <cluster_name>
Ensure all policies are compliant:
$ oc get policy -n $CLUSTER
If the Namespace, OperatorGroup, and Subscription policies are compliant but the Operator configuration policies are not, it is likely that the Operators did not install.
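Two of the checks in this procedure lend themselves to scripting: confirming that a PlacementRule selected your cluster, and listing only the policies that are not yet compliant. The following sketch assumes the status.decisions layout shown in the PlacementRule example and that each policy reports its state in status.compliant. The function names are hypothetical.

```shell
# Sketch only: helper names are illustrative.

# Succeed if the PlacementRule lists the cluster in status.decisions.
placementrule_selects_cluster() {
  local ns="$1"
  local rule="$2"
  local cluster="$3"
  local name
  for name in $(oc get placementrule -n "$ns" "$rule" \
      -o jsonpath='{.status.decisions[*].clusterName}'); do
    [ "$name" = "$cluster" ] && return 0
  done
  return 1
}

# Print the names of policies in a cluster namespace whose compliance state
# is anything other than Compliant (including policies with no state yet).
noncompliant_policies() {
  local ns="$1"
  oc get policy -n "$ns" --no-headers \
    -o custom-columns=NAME:.metadata.name,STATE:.status.compliant |
    awk '$2 != "Compliant" { print $1 }'
}
```

For example, `noncompliant_policies $CLUSTER` prints nothing once every policy copied to the cluster namespace is compliant, which makes it convenient to call from a wait loop.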