Advanced managed cluster configuration with PolicyGenTemplate resources

You can use PolicyGenTemplate CRs to deploy custom functionality in your managed clusters.

Deploying additional changes to clusters

If you require cluster configuration changes outside of the base GitOps ZTP pipeline configuration, there are three options:

Apply the additional configuration after the ZTP pipeline is complete

When the GitOps ZTP pipeline deployment is complete, the deployed cluster is ready for application workloads. At this point, you can install additional Operators and apply configurations specific to your requirements. Ensure that additional configurations do not negatively affect the performance of the platform or allocated CPU budget.

Add content to the ZTP library

The base source custom resources (CRs) that you deploy with the GitOps ZTP pipeline can be augmented with custom content as required.

Create extra manifests for the cluster installation

Extra manifests are applied during installation and make the installation process more efficient.

Providing additional source CRs or modifying existing source CRs can significantly impact the performance or CPU profile of OKD.

Using PolicyGenTemplate CRs to override source CRs content

PolicyGenTemplate custom resources (CRs) allow you to overlay additional configuration details on top of the base source CRs provided with the GitOps plugin in the ztp-site-generate container. You can think of PolicyGenTemplate CRs as a logical merge or patch to the base CR. Use PolicyGenTemplate CRs to update a single field of the base CR, or overlay the entire contents of the base CR. You can update values and insert fields that are not in the base CR.
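
As a minimal illustration of this merge behavior, consider a PolicyGenTemplate entry that adds a single field: the generated CR contains both the base content and the overlay. The fragments below are drawn from the PerformanceProfile example later in this section:

  # Base source CR fragment (source-crs/PerformanceProfile.yaml)
  spec:
    numa:
      topologyPolicy: "restricted"

  # PolicyGenTemplate overlay fragment that inserts a field not present in the base CR
  spec:
    globallyDisableIrqLoadBalancing: false

  # Generated CR fragment after the merge
  spec:
    globallyDisableIrqLoadBalancing: false
    numa:
      topologyPolicy: restricted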

The following example procedure describes how to update fields in the generated PerformanceProfile CR for the reference configuration based on the PolicyGenTemplate CR in the group-du-sno-ranGen.yaml file. Use the procedure as a basis for modifying other parts of the PolicyGenTemplate based on your requirements.

Prerequisites

  • Create a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for Argo CD.

Procedure

  1. Review the baseline source CR for existing content. You can review the source CRs listed in the reference PolicyGenTemplate CRs by extracting them from the zero touch provisioning (ZTP) container.

    1. Create an /out folder:

      $ mkdir -p ./out
    2. Extract the source CRs:

      $ podman run --log-driver=none --rm registry.redhat.io/openshift4/ztp-site-generate-rhel8:v4.12.1 extract /home/ztp --tar | tar x -C ./out
  2. Review the baseline PerformanceProfile CR in ./out/source-crs/PerformanceProfile.yaml:

    apiVersion: performance.openshift.io/v2
    kind: PerformanceProfile
    metadata:
      name: $name
      annotations:
        ran.openshift.io/ztp-deploy-wave: "10"
    spec:
      additionalKernelArgs:
      - "idle=poll"
      - "rcupdate.rcu_normal_after_boot=0"
      cpu:
        isolated: $isolated
        reserved: $reserved
      hugepages:
        defaultHugepagesSize: $defaultHugepagesSize
        pages:
          - size: $size
            count: $count
            node: $node
      machineConfigPoolSelector:
        pools.operator.machineconfiguration.openshift.io/$mcp: ""
      net:
        userLevelNetworking: true
      nodeSelector:
        node-role.kubernetes.io/$mcp: ''
      numa:
        topologyPolicy: "restricted"
      realTimeKernel:
        enabled: true

    Any fields in the source CR that contain a $-prefixed value, such as $name or $isolated, are removed from the generated CR if they are not provided in the PolicyGenTemplate CR.

  3. Update the PolicyGenTemplate entry for PerformanceProfile in the group-du-sno-ranGen.yaml reference file. The following example PolicyGenTemplate CR stanza supplies appropriate CPU specifications, sets the hugepages configuration, and adds a new field that sets globallyDisableIrqLoadBalancing to false.

    - fileName: PerformanceProfile.yaml
      policyName: "config-policy"
      metadata:
        name: openshift-node-performance-profile
      spec:
        cpu:
          # These must be tailored for the specific hardware platform
          isolated: "2-19,22-39"
          reserved: "0-1,20-21"
        hugepages:
          defaultHugepagesSize: 1G
          pages:
            - size: 1G
              count: 10
        globallyDisableIrqLoadBalancing: false
  4. Commit the PolicyGenTemplate change in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.

Example output

The ZTP application generates an RHACM policy that contains the generated PerformanceProfile CR. The contents of that CR are derived by merging the metadata and spec contents from the PerformanceProfile entry in the PolicyGenTemplate onto the source CR. The resulting CR has the following content:

  ---
  apiVersion: performance.openshift.io/v2
  kind: PerformanceProfile
  metadata:
    name: openshift-node-performance-profile
  spec:
    additionalKernelArgs:
    - idle=poll
    - rcupdate.rcu_normal_after_boot=0
    cpu:
      isolated: 2-19,22-39
      reserved: 0-1,20-21
    globallyDisableIrqLoadBalancing: false
    hugepages:
      defaultHugepagesSize: 1G
      pages:
      - count: 10
        size: 1G
    machineConfigPoolSelector:
      pools.operator.machineconfiguration.openshift.io/master: ""
    net:
      userLevelNetworking: true
    nodeSelector:
      node-role.kubernetes.io/master: ""
    numa:
      topologyPolicy: restricted
    realTimeKernel:
      enabled: true

In the /source-crs folder that you extract from the ztp-site-generate container, the $ syntax is not used for template substitution as the syntax might imply. Rather, if the policyGen tool sees the $ prefix on a string and you do not specify a value for that field in the related PolicyGenTemplate CR, the field is omitted from the output CR entirely.
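
For example, the pages entry in the baseline PerformanceProfile CR includes node: $node. Because the PolicyGenTemplate stanza in this procedure supplies only size and count, the node field is omitted from the generated CR shown above:

  # Baseline source CR
  pages:
    - size: $size
      count: $count
      node: $node

  # Generated CR when the PolicyGenTemplate supplies only size and count
  pages:
  - count: 10
    size: 1G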

An exception to this is the $mcp variable in /source-crs YAML files that is substituted with the specified value for mcp from the PolicyGenTemplate CR. For example, in example/policygentemplates/group-du-standard-ranGen.yaml, the value for mcp is worker:

  spec:
    bindingRules:
      group-du-standard: ""
    mcp: worker

The policyGen tool replaces instances of $mcp with worker in the output CRs.
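
For example, a selector in a source CR that uses the $mcp variable, such as the nodeSelector in PerformanceProfile.yaml, is rendered with the worker value in the output CR:

  # Source CR
  nodeSelector:
    node-role.kubernetes.io/$mcp: ''

  # Output CR after substitution with mcp: worker
  nodeSelector:
    node-role.kubernetes.io/worker: ""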

Adding new content to the GitOps ZTP pipeline

The source CRs in the GitOps ZTP site generator container provide a set of critical features and node tuning settings for RAN Distributed Unit (DU) applications. These are applied to the clusters that you deploy with ZTP. To add new source CRs or modify the existing source CRs in the ztp-site-generate container, rebuild the container and make it available to the hub cluster, typically from the disconnected registry associated with the hub cluster. Any valid OKD CR can be added.

Perform the following procedure to add new content to the ZTP pipeline.

Procedure

  1. Create a directory containing a Containerfile and the source CR YAML files that you want to include in the updated ztp-site-generate container, for example:

    ztp-update/
    ├── example-cr1.yaml
    ├── example-cr2.yaml
    └── ztp-update.in
  2. Add the following content to the ztp-update.in Containerfile:

    FROM registry.redhat.io/openshift4/ztp-site-generate-rhel8:v4.12
    ADD example-cr2.yaml /kustomize/plugin/ran.openshift.io/v1/policygentemplate/source-crs/
    ADD example-cr1.yaml /kustomize/plugin/ran.openshift.io/v1/policygentemplate/source-crs/
  3. Open a terminal at the ztp-update/ folder and rebuild the container:

    $ podman build -t ztp-site-generate-rhel8-custom:v4.12-custom-1 -f ztp-update.in .
  4. Push the built container image to your disconnected registry, for example:

    $ podman push localhost/ztp-site-generate-rhel8-custom:v4.12-custom-1 registry.example.com:5000/ztp-site-generate-rhel8-custom:v4.12-custom-1
  5. Patch the Argo CD instance on the hub cluster to point to the newly built container image:

    $ oc patch -n openshift-gitops argocd openshift-gitops --type=json -p '[{"op": "replace", "path":"/spec/repo/initContainers/0/image", "value": "registry.example.com:5000/ztp-site-generate-rhel8-custom:v4.12-custom-1"} ]'

    When the Argo CD instance is patched, the openshift-gitops-repo-server pod automatically restarts.

Verification

  1. Verify that the new openshift-gitops-repo-server pod has completed initialization and that the previous repo pod is terminated:

    $ oc get pods -n openshift-gitops | grep openshift-gitops-repo-server

    Example output

    openshift-gitops-repo-server-7df86f9774-db682   1/1     Running   1          28s

    You must wait until the new openshift-gitops-repo-server pod has completed initialization and the previous pod is terminated before the newly added container image content is available.

Additional resources

  • Alternatively, you can patch the ArgoCD instance as described in Configuring the hub cluster with ArgoCD by modifying argocd-openshift-gitops-patch.json with an updated initContainer image before applying the patch file.

Configuring policy compliance evaluation timeouts for PolicyGenTemplate CRs

Use Red Hat Advanced Cluster Management (RHACM) installed on a hub cluster to monitor and report on whether your managed clusters are compliant with applied policies. RHACM uses policy templates to apply predefined policy controllers and policies. Policy controllers are Kubernetes custom resource definition (CRD) instances.

You can override the default policy evaluation intervals with PolicyGenTemplate custom resources (CRs). You configure duration settings that define how long a ConfigurationPolicy CR can be in a state of policy compliance or non-compliance before RHACM re-evaluates the applied cluster policies.

The zero touch provisioning (ZTP) policy generator generates ConfigurationPolicy CR policies with pre-defined policy evaluation intervals. The default value for the noncompliant state is 10 seconds. The default value for the compliant state is 10 minutes. To disable the evaluation interval, set the value to never.
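
For reference, relying on the defaults is equivalent to explicitly setting the following values in a PolicyGenTemplate CR:

  evaluationInterval:
    compliant: 10m
    noncompliant: 10s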

Prerequisites

  • You have installed the OpenShift CLI (oc).

  • You have logged in to the hub cluster as a user with cluster-admin privileges.

  • You have created a Git repository where you manage your custom site configuration data.

Procedure

  1. To configure the evaluation interval for all policies in a PolicyGenTemplate CR, add evaluationInterval to the spec field, and then set the appropriate compliant and noncompliant values. For example:

    spec:
      evaluationInterval:
        compliant: 30m
        noncompliant: 20s
  2. To configure the evaluation interval for the spec.sourceFiles object in a PolicyGenTemplate CR, add evaluationInterval to the sourceFiles field, for example:

    spec:
      sourceFiles:
        - fileName: SriovSubscription.yaml
          policyName: "sriov-sub-policy"
          evaluationInterval:
            compliant: never
            noncompliant: 10s
  3. Commit the PolicyGenTemplate CR files in the Git repository and push your changes.

Verification

Check that the managed spoke cluster policies are monitored at the expected intervals.

  1. Log in as a user with cluster-admin privileges on the managed cluster.

  2. Get the pods that are running in the open-cluster-management-agent-addon namespace. Run the following command:

    $ oc get pods -n open-cluster-management-agent-addon

    Example output

    NAME                                        READY   STATUS    RESTARTS        AGE
    config-policy-controller-858b894c68-v4xdb   1/1     Running   22 (5d8h ago)   10d
  3. Check that the applied policies are being evaluated at the expected interval in the logs for the config-policy-controller pod:

    $ oc logs -n open-cluster-management-agent-addon config-policy-controller-858b894c68-v4xdb

    Example output

    2022-05-10T15:10:25.280Z info configuration-policy-controller controllers/configurationpolicy_controller.go:166 Skipping the policy evaluation due to the policy not reaching the evaluation interval {"policy": "compute-1-config-policy-config"}
    2022-05-10T15:10:25.280Z info configuration-policy-controller controllers/configurationpolicy_controller.go:166 Skipping the policy evaluation due to the policy not reaching the evaluation interval {"policy": "compute-1-common-compute-1-catalog-policy-config"}

Signalling ZTP cluster deployment completion with validator inform policies

Create a validator inform policy that signals when the zero touch provisioning (ZTP) installation and configuration of the deployed cluster is complete. This policy can be used for deployments of single-node OpenShift clusters, three-node clusters, and standard clusters.

Procedure

  1. Create a standalone PolicyGenTemplate custom resource (CR) that contains the source file validatorCRs/informDuValidator.yaml. You only need one standalone PolicyGenTemplate CR for each cluster type. For example, this CR applies a validator inform policy for single-node OpenShift clusters:

    Example single-node cluster validator inform policy CR (group-du-sno-validator-ranGen.yaml)

    apiVersion: ran.openshift.io/v1
    kind: PolicyGenTemplate
    metadata:
      name: "group-du-sno-validator" (1)
      namespace: "ztp-group" (2)
    spec:
      bindingRules:
        group-du-sno: "" (3)
      bindingExcludedRules:
        ztp-done: "" (4)
      mcp: "master" (5)
      sourceFiles:
        - fileName: validatorCRs/informDuValidator.yaml
          remediationAction: inform (6)
          policyName: "du-policy" (7)
    (1) The name of the PolicyGenTemplate object. This name is also used as part of the names for the placementBinding, placementRule, and policy that are created in the requested namespace.
    (2) This value should match the namespace used in the group PolicyGenTemplates.
    (3) The group-du-* label defined in bindingRules must exist in the SiteConfig files.
    (4) The label defined in bindingExcludedRules must be ztp-done:. The ztp-done label is used in coordination with the Topology Aware Lifecycle Manager.
    (5) mcp defines the MachineConfigPool object that is used in the source file validatorCRs/informDuValidator.yaml. It should be master for single-node and three-node cluster deployments and worker for standard cluster deployments.
    (6) Optional. The default value is inform.
    (7) This value is used as part of the name for the generated RHACM policy. The generated validator policy for the single-node example is group-du-sno-validator-du-policy.
  2. Commit the PolicyGenTemplate CR file in your Git repository and push the changes.

Configuring PTP fast events using PolicyGenTemplate CRs

You can configure PTP fast events for vRAN clusters that are deployed using the GitOps Zero Touch Provisioning (ZTP) pipeline. Use PolicyGenTemplate custom resources (CRs) as the basis to create a hierarchy of configuration files tailored to your specific site requirements.

Prerequisites

  • Create a Git repository where you manage your custom site configuration data.

Procedure

  1. Add the following YAML into .spec.sourceFiles in the common-ranGen.yaml file to configure the AMQ Interconnect Operator:

    # AMQ interconnect operator for fast events
    - fileName: AmqSubscriptionNS.yaml
      policyName: "subscriptions-policy"
    - fileName: AmqSubscriptionOperGroup.yaml
      policyName: "subscriptions-policy"
    - fileName: AmqSubscription.yaml
      policyName: "subscriptions-policy"
  2. Apply the following PolicyGenTemplate changes to group-du-3node-ranGen.yaml, group-du-sno-ranGen.yaml, or group-du-standard-ranGen.yaml files according to your requirements:

    1. In .sourceFiles, add the PtpOperatorConfig CR file that configures the AMQ transport host to the config-policy:

      - fileName: PtpOperatorConfigForEvent.yaml
        policyName: "config-policy"
    2. Configure the linuxptp and phc2sys for the PTP clock type and interface. For example, add the following stanza into .sourceFiles:

      - fileName: PtpConfigSlave.yaml (1)
        policyName: "config-policy"
        metadata:
          name: "du-ptp-slave"
        spec:
          profile:
            - name: "slave"
              interface: "ens5f1" (2)
              ptp4lOpts: "-2 -s --summary_interval -4" (3)
              phc2sysOpts: "-a -r -m -n 24 -N 8 -R 16" (4)
          ptpClockThreshold: (5)
            holdOverTimeout: 30 #secs
            maxOffsetThreshold: 100 #nano secs
            minOffsetThreshold: -100 #nano secs
      (1) Can be one of PtpConfigMaster.yaml, PtpConfigSlave.yaml, or PtpConfigSlaveCvl.yaml depending on your requirements. PtpConfigSlaveCvl.yaml configures linuxptp services for an Intel E810 Columbiaville NIC. For configurations based on group-du-sno-ranGen.yaml or group-du-3node-ranGen.yaml, use PtpConfigSlave.yaml.
      (2) Device-specific interface name.
      (3) You must append the --summary_interval -4 value to ptp4lOpts in .spec.sourceFiles.spec.profile to enable PTP fast events.
      (4) Required phc2sysOpts values. -m prints messages to stdout. The linuxptp-daemon DaemonSet parses the logs and generates Prometheus metrics.
      (5) Optional. If the ptpClockThreshold stanza is not present, default values are used for the ptpClockThreshold fields. The stanza shows default ptpClockThreshold values. The ptpClockThreshold values configure how long after the PTP master clock is disconnected before PTP events are triggered. holdOverTimeout is the time value in seconds before the PTP clock event state changes to FREERUN when the PTP master clock is disconnected. The maxOffsetThreshold and minOffsetThreshold settings configure offset values in nanoseconds that compare against the values for CLOCK_REALTIME (phc2sys) or master offset (ptp4l). When the ptp4l or phc2sys offset value is outside this range, the PTP clock state is set to FREERUN. When the offset value is within this range, the PTP clock state is set to LOCKED.
  3. Apply the following PolicyGenTemplate changes to your specific site YAML files, for example, example-sno-site.yaml:

    1. In .sourceFiles, add the Interconnect CR file that configures the AMQ router to the config-policy:

      - fileName: AmqInstance.yaml
        policyName: "config-policy"
  4. Merge any other required changes and files with your custom site repository.

  5. Push the changes to your site configuration repository to deploy PTP fast events to new sites using GitOps ZTP.

Configuring the Image Registry Operator for local caching of images

OKD manages image caching using a local registry. In edge computing use cases, clusters are often subject to bandwidth restrictions when communicating with centralized image registries, which might result in long image download times.

Long download times are unavoidable during initial deployment. Over time, there is a risk that CRI-O will erase the /var/lib/containers/storage directory in the case of an unexpected shutdown. To address long image download times, you can create a local image registry on remote managed clusters using GitOps ZTP. This is useful in Edge computing scenarios where clusters are deployed at the far edge of the network.

Before you can set up the local image registry with GitOps ZTP, you need to configure disk partitioning in the SiteConfig CR that you use to install the remote managed cluster. After installation, you configure the local image registry using a PolicyGenTemplate CR. Then, the ZTP pipeline creates Persistent Volume (PV) and Persistent Volume Claim (PVC) CRs and patches the imageregistry configuration.

The local image registry can only be used for user application images and cannot be used for the OKD or Operator Lifecycle Manager operator images.

Configuring disk partitioning with SiteConfig

Configure disk partitioning for a managed cluster using a SiteConfig CR and GitOps ZTP. The disk partition details in the SiteConfig CR must match the underlying disk.

Use persistent naming for devices to avoid device names such as /dev/sda and /dev/sdb being switched at every reboot. You can use rootDeviceHints to choose the bootable device and then use the same device for further partitioning.

Prerequisites

  • You have installed the OpenShift CLI (oc).

  • You have logged in to the hub cluster as a user with cluster-admin privileges.

  • You have created a Git repository where you manage your custom site configuration data for use with GitOps Zero Touch Provisioning (ZTP).

Procedure

  1. Add the following YAML that describes the host disk partitioning to the SiteConfig CR that you use to install the managed cluster:

    nodes:
      rootDeviceHints:
        wwn: "0x62cea7f05c98c2002708a0a22ff480ea"
      diskPartition:
        - device: /dev/disk/by-id/wwn-0x62cea7f05c98c2002708a0a22ff480ea (1)
          partitions:
            - mount_point: /var/imageregistry
              size: 102500 (2)
              start: 344844 (3)
    (1) This setting depends on the hardware. The setting can be a serial number or device name. The value must match the value set for rootDeviceHints.
    (2) The minimum value for size is 102500 MiB.
    (3) The minimum value for start is 25000 MiB. The total value of size and start must not exceed the disk size, or the installation will fail.
  2. Save the SiteConfig CR and push it to the site configuration repo.

The ZTP pipeline provisions the cluster using the SiteConfig CR and configures the disk partition.
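
For example, with the values shown above, the partition starts at 344844 MiB and ends at 447344 MiB (344844 + 102500), which is approximately 437 GiB, so the target disk must be at least that large.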

Configuring the image registry using PolicyGenTemplate CRs

Use PolicyGenTemplate (PGT) CRs to apply the CRs required to configure the image registry and patch the imageregistry configuration.

Prerequisites

  • You have configured a disk partition in the managed cluster.

  • You have installed the OpenShift CLI (oc).

  • You have logged in to the hub cluster as a user with cluster-admin privileges.

  • You have created a Git repository where you manage your custom site configuration data for use with GitOps Zero Touch Provisioning (ZTP).

Procedure

  1. Configure the storage class, persistent volume claim, persistent volume, and image registry configuration in the appropriate PolicyGenTemplate CR. For example, to configure an individual site, add the following YAML to the file example-sno-site.yaml:

    sourceFiles:
      # storage class
      - fileName: StorageClass.yaml
        policyName: "sc-for-image-registry"
        metadata:
          name: image-registry-sc
          annotations:
            ran.openshift.io/ztp-deploy-wave: "100" (1)
      # persistent volume claim
      - fileName: StoragePVC.yaml
        policyName: "pvc-for-image-registry"
        metadata:
          name: image-registry-pvc
          namespace: openshift-image-registry
          annotations:
            ran.openshift.io/ztp-deploy-wave: "100"
        spec:
          accessModes:
            - ReadWriteMany
          resources:
            requests:
              storage: 100Gi
          storageClassName: image-registry-sc
          volumeMode: Filesystem
      # persistent volume
      - fileName: ImageRegistryPV.yaml (2)
        policyName: "pv-for-image-registry"
        metadata:
          annotations:
            ran.openshift.io/ztp-deploy-wave: "100"
      - fileName: ImageRegistryConfig.yaml
        policyName: "config-for-image-registry"
        complianceType: musthave
        metadata:
          annotations:
            ran.openshift.io/ztp-deploy-wave: "100"
        spec:
          storage:
            pvc:
              claim: "image-registry-pvc"
    (1) Set the appropriate value for ztp-deploy-wave depending on whether you are configuring image registries at the site, common, or group level. ztp-deploy-wave: "100" is suitable for development or testing because it allows you to group the referenced source files together.
    (2) In ImageRegistryPV.yaml, ensure that the spec.local.path field is set to /var/imageregistry to match the value set for the mount_point field in the SiteConfig CR.

    Do not set complianceType: mustonlyhave for the - fileName: ImageRegistryConfig.yaml configuration. This can cause the registry pod deployment to fail.

  2. Commit the PolicyGenTemplate change in Git, and then push to the Git repository being monitored by the GitOps ZTP ArgoCD application.

Verification

Use the following steps to troubleshoot errors with the local image registry on the managed clusters:

  • Verify successful login to the registry while logged in to the managed cluster. Run the following commands:

    1. Export the managed cluster name:

      $ cluster=<managed_cluster_name>
    2. Get the managed cluster kubeconfig details:

      $ oc get secret -n $cluster $cluster-admin-password -o jsonpath='{.data.password}' | base64 -d > kubeadmin-password-$cluster
    3. Download and export the cluster kubeconfig:

      $ oc get secret -n $cluster $cluster-admin-kubeconfig -o jsonpath='{.data.kubeconfig}' | base64 -d > kubeconfig-$cluster && export KUBECONFIG=./kubeconfig-$cluster
    4. Verify access to the image registry from the managed cluster. See “Accessing the registry”.

  • Check that the Config CRD in the imageregistry.operator.openshift.io group instance is not reporting errors. Run the following command while logged in to the managed cluster:

    $ oc get image.config.openshift.io cluster -o yaml

    Example output

    apiVersion: config.openshift.io/v1
    kind: Image
    metadata:
      annotations:
        include.release.openshift.io/ibm-cloud-managed: "true"
        include.release.openshift.io/self-managed-high-availability: "true"
        include.release.openshift.io/single-node-developer: "true"
        release.openshift.io/create-only: "true"
      creationTimestamp: "2021-10-08T19:02:39Z"
      generation: 5
      name: cluster
      resourceVersion: "688678648"
      uid: 0406521b-39c0-4cda-ba75-873697da75a4
    spec:
      additionalTrustedCA:
        name: acm-ice
  • Check that the PersistentVolumeClaim on the managed cluster is populated with data. Run the following command while logged in to the managed cluster:

    $ oc get pv image-registry-sc
  • Check that the registry* pods are running in the openshift-image-registry namespace:

    $ oc get pods -n openshift-image-registry | grep registry*

    Example output

    cluster-image-registry-operator-68f5c9c589-42cfg   1/1     Running   0   8d
    image-registry-5f8987879-6nx6h                      1/1     Running   0   8d
  • Check that the disk partition on the managed cluster is correct:

    1. Open a debug shell to the managed cluster:

      $ oc debug node/sno-1.example.com
    2. Run lsblk to check the host disk partitions:

      sh-4.4# lsblk
      NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
      sda      8:0    0 446.6G  0 disk
      |-sda1   8:1    0     1M  0 part
      |-sda2   8:2    0   127M  0 part
      |-sda3   8:3    0   384M  0 part /boot
      |-sda4   8:4    0 336.3G  0 part /sysroot
      `-sda5   8:5    0 100.1G  0 part /var/imageregistry (1)
      sdb      8:16   0 446.6G  0 disk
      sr0     11:0    1   104M  0 rom
      (1) /var/imageregistry indicates that the disk is correctly partitioned.

Configuring bare-metal event monitoring using PolicyGenTemplate CRs

You can configure bare-metal hardware events for vRAN clusters that are deployed using the GitOps Zero Touch Provisioning (ZTP) pipeline.

Prerequisites

  • Install the OpenShift CLI (oc).

  • Log in as a user with cluster-admin privileges.

  • Create a Git repository where you manage your custom site configuration data.

Procedure

  1. To configure the AMQ Interconnect Operator and the Bare Metal Event Relay Operator, add the following YAML to spec.sourceFiles in the common-ranGen.yaml file:

    # AMQ interconnect operator for fast events
    - fileName: AmqSubscriptionNS.yaml
      policyName: "subscriptions-policy"
    - fileName: AmqSubscriptionOperGroup.yaml
      policyName: "subscriptions-policy"
    - fileName: AmqSubscription.yaml
      policyName: "subscriptions-policy"
    # Bare Metal Event Relay operator
    - fileName: BareMetalEventRelaySubscriptionNS.yaml
      policyName: "subscriptions-policy"
    - fileName: BareMetalEventRelaySubscriptionOperGroup.yaml
      policyName: "subscriptions-policy"
    - fileName: BareMetalEventRelaySubscription.yaml
      policyName: "subscriptions-policy"
  2. Add the Interconnect CR to .spec.sourceFiles in the site configuration file, for example, the example-sno-site.yaml file:

    - fileName: AmqInstance.yaml
      policyName: "config-policy"
  3. Add the HardwareEvent CR to spec.sourceFiles in your specific group configuration file, for example, in the group-du-sno-ranGen.yaml file:

    - fileName: HardwareEvent.yaml
      policyName: "config-policy"
      spec:
        nodeSelector: {}
        transportHost: "amqp://<amq_interconnect_name>.<amq_interconnect_namespace>.svc.cluster.local" (1)
        logLevel: "info"
    (1) The transportHost URL is composed of the existing AMQ Interconnect CR name and namespace. For example, in transportHost: "amqp://amq-router.amq-router.svc.cluster.local", the AMQ Interconnect name and namespace are both set to amq-router.

    Each baseboard management controller (BMC) requires a single HardwareEvent resource only.

  4. Commit the PolicyGenTemplate change in Git, and then push the changes to your site configuration repository to deploy bare-metal events monitoring to new sites using GitOps ZTP.

  5. Create the Redfish Secret by running the following command:

    $ oc -n openshift-bare-metal-events create secret generic redfish-basic-auth \
      --from-literal=username=<bmc_username> --from-literal=password=<bmc_password> \
      --from-literal=hostaddr="<bmc_host_ip_addr>"
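
    If you prefer to manage the Redfish credentials declaratively, the following Secret manifest is equivalent to the command above; the values are placeholders:

    apiVersion: v1
    kind: Secret
    metadata:
      name: redfish-basic-auth
      namespace: openshift-bare-metal-events
    type: Opaque
    stringData:
      username: <bmc_username>
      password: <bmc_password>
      hostaddr: <bmc_host_ip_addr>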

Using hub templates in PolicyGenTemplate CRs

Topology Aware Lifecycle Manager (TALM) supports a subset of Red Hat Advanced Cluster Management (RHACM) hub cluster template functions in configuration policies used with GitOps ZTP.

Hub-side cluster templates allow you to define configuration policies that can be dynamically customized to the target clusters. This reduces the need to create separate policies for many clusters with similar configurations but with different values.

Policy templates are restricted to the same namespace as the namespace where the policy is defined. This means that you must create the objects referenced in the hub template in the same namespace where the policy is created.
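
For example, the site PolicyGenTemplate CR in the host NIC example later in this section is created in the ztp-site namespace, so the sriovdata ConfigMap that it references must also be created in ztp-site. The following fragments show the namespaces that must match:

  # ConfigMap referenced by the hub template
  apiVersion: v1
  kind: ConfigMap
  metadata:
    name: sriovdata
    namespace: ztp-site

  # PolicyGenTemplate whose generated policy resolves the hub template
  apiVersion: ran.openshift.io/v1
  kind: PolicyGenTemplate
  metadata:
    name: "site"
    namespace: "ztp-site"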

The following supported hub template functions are available for use in GitOps ZTP with TALM:

  • fromConfigmap returns the value of the provided data key in the named ConfigMap resource.

    There is a 1 MiB size limit for ConfigMap CRs. The effective size for ConfigMap CRs is further limited by the last-applied-configuration annotation. To avoid the last-applied-configuration limitation, add the following annotation to the template ConfigMap:

    argocd.argoproj.io/sync-options: Replace=true
  • base64enc returns the base64-encoded value of the input string

  • base64dec returns the decoded value of the base64-encoded input string

  • indent returns the input string with added indent spaces

  • autoindent returns the input string with added indent spaces based on the spacing used in the parent template

  • toInt casts and returns the integer value of the input value

  • toBool converts the input string into a boolean value, and returns the boolean

Various open source community functions are also available for use with GitOps ZTP.

Example hub templates

The following code examples are valid hub templates. Each of these templates returns values from the ConfigMap CR named test-config in the default namespace.

  • Returns the value with the key common-key:

    {{hub fromConfigMap "default" "test-config" "common-key" hub}}
  • Returns a string by using the concatenated value of the .ManagedClusterName field and the string -name:

    {{hub fromConfigMap "default" "test-config" (printf "%s-name" .ManagedClusterName) hub}}
  • Casts and returns a boolean value from the concatenated value of the .ManagedClusterName field and the string -name:

    {{hub fromConfigMap "default" "test-config" (printf "%s-name" .ManagedClusterName) | toBool hub}}
  • Casts and returns an integer value from the concatenated value of the .ManagedClusterName field and the string -name:

    {{hub (printf "%s-name" .ManagedClusterName) | fromConfigMap "default" "test-config" | toInt hub}}
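
For reference, a ConfigMap that satisfies these lookups might look like the following. The key names are derived from the templates above; the values are hypothetical:

  apiVersion: v1
  kind: ConfigMap
  metadata:
    name: test-config
    namespace: default
  data:
    common-key: common-value
    <managed_cluster_name>-name: "true"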

Specifying host NICs in site PolicyGenTemplate CRs with hub cluster templates

You can manage host NICs in a single ConfigMap CR and use hub cluster templates to populate the custom NIC values in the generated policies that get applied to the cluster hosts. Using hub cluster templates in site PolicyGenTemplate (PGT) CRs means that you do not need to create multiple single-site PGT CRs for each site.

The following example shows you how to use a single ConfigMap CR to manage cluster host NICs and apply them to the cluster as policies by using a single PolicyGenTemplate site CR.

When you use the fromConfigmap function, the printf variable is only available for the template resource data key fields. You cannot use it with name and namespace fields.
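
For example, the following usage from the site PolicyGenTemplate CR in this procedure is supported because printf builds only the data key. The ConfigMap name ("sriovdata") and namespace ("ztp-site") arguments must be literal strings:

  vlan: '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_fh-vlan" .ManagedClusterName) | toInt hub}}'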

Prerequisites

  • You have installed the OpenShift CLI (oc).

  • You have logged in to the hub cluster as a user with cluster-admin privileges.

  • You have created a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for the GitOps ZTP ArgoCD application.

Procedure

  1. Create a ConfigMap resource that describes the NICs for a group of hosts. For example:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: sriovdata
      namespace: ztp-site
      annotations:
        argocd.argoproj.io/sync-options: Replace=true (1)
    data:
      example-sno-du_fh-numVfs: "8"
      example-sno-du_fh-pf: ens1f0
      example-sno-du_fh-priority: "10"
      example-sno-du_fh-vlan: "140"
      example-sno-du_mh-numVfs: "8"
      example-sno-du_mh-pf: ens3f0
      example-sno-du_mh-priority: "10"
      example-sno-du_mh-vlan: "150"
    (1) The argocd.argoproj.io/sync-options annotation is required only if the ConfigMap is larger than 1 MiB in size.

    The ConfigMap must be in the same namespace as the policy that has the hub template substitution.

  2. Commit the ConfigMap CR in Git, and then push to the Git repository being monitored by the Argo CD application.

  3. Create a site PGT CR that uses templates to pull the required data from the ConfigMap object. For example:

    apiVersion: ran.openshift.io/v1
    kind: PolicyGenTemplate
    metadata:
      name: "site"
      namespace: "ztp-site"
    spec:
      remediationAction: inform
      bindingRules:
        group-du-sno: ""
      mcp: "master"
      sourceFiles:
        - fileName: SriovNetwork.yaml
          policyName: "config-policy"
          metadata:
            name: "sriov-nw-du-fh"
          spec:
            resourceName: du_fh
            vlan: '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_fh-vlan" .ManagedClusterName) | toInt hub}}'
        - fileName: SriovNetworkNodePolicy.yaml
          policyName: "config-policy"
          metadata:
            name: "sriov-nnp-du-fh"
          spec:
            deviceType: netdevice
            isRdma: true
            nicSelector:
              pfNames:
                - '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_fh-pf" .ManagedClusterName) | autoindent hub}}'
            numVfs: '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_fh-numVfs" .ManagedClusterName) | toInt hub}}'
            priority: '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_fh-priority" .ManagedClusterName) | toInt hub}}'
            resourceName: du_fh
        - fileName: SriovNetwork.yaml
          policyName: "config-policy"
          metadata:
            name: "sriov-nw-du-mh"
          spec:
            resourceName: du_mh
            vlan: '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_mh-vlan" .ManagedClusterName) | toInt hub}}'
        - fileName: SriovNetworkNodePolicy.yaml
          policyName: "config-policy"
          metadata:
            name: "sriov-nnp-du-mh"
          spec:
            deviceType: vfio-pci
            isRdma: false
            nicSelector:
              pfNames:
                - '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_mh-pf" .ManagedClusterName) hub}}'
            numVfs: '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_mh-numVfs" .ManagedClusterName) | toInt hub}}'
            priority: '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_mh-priority" .ManagedClusterName) | toInt hub}}'
            resourceName: du_mh
  4. Commit the site PolicyGenTemplate CR in Git and push to the Git repository that is monitored by the ArgoCD application.

    Subsequent changes to the referenced ConfigMap CR are not automatically synced to the applied policies. You need to manually sync the new ConfigMap changes to update existing PolicyGenTemplate CRs. See “Syncing new ConfigMap changes to existing PolicyGenTemplate CRs”.

Specifying VLAN IDs in group PolicyGenTemplate CRs with hub cluster templates

You can manage VLAN IDs for managed clusters in a single ConfigMap CR and use hub cluster templates to populate the VLAN IDs in the generated policies that get applied to the clusters.

The following example shows how to manage VLAN IDs in a single ConfigMap CR and apply them in individual cluster policies by using a single PolicyGenTemplate group CR.

When using the fromConfigmap function, the printf variable is only available for the template resource data key fields. You cannot use it with name and namespace fields.

Prerequisites

  • You have installed the OpenShift CLI (oc).

  • You have logged in to the hub cluster as a user with cluster-admin privileges.

  • You have created a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for the Argo CD application.

Procedure

  1. Create a ConfigMap CR that describes the VLAN IDs for a group of cluster hosts. For example:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: site-data
      namespace: ztp-group
      annotations:
        argocd.argoproj.io/sync-options: Replace=true (1)
    data:
      site-1-vlan: "101"
      site-2-vlan: "234"
    (1) The argocd.argoproj.io/sync-options annotation is required only if the ConfigMap is larger than 1 MiB in size.

    The ConfigMap must be in the same namespace as the policy that has the hub template substitution.

  2. Commit the ConfigMap CR in Git, and then push to the Git repository being monitored by the Argo CD application.

  3. Create a group PGT CR that uses a hub template to pull the required VLAN IDs from the ConfigMap object. For example, add the following YAML snippet to the group PGT CR:

    - fileName: SriovNetwork.yaml
      policyName: "config-policy"
      metadata:
        name: "sriov-nw-du-mh"
        annotations:
          ran.openshift.io/ztp-deploy-wave: "10"
      spec:
        resourceName: du_mh
        vlan: '{{hub fromConfigMap "" "site-data" (printf "%s-vlan" .ManagedClusterName) | toInt hub}}'
  4. Commit the group PolicyGenTemplate CR in Git, and then push to the Git repository being monitored by the Argo CD application.

    Subsequent changes to the referenced ConfigMap CR are not automatically synced to the applied policies. You need to manually sync the new ConfigMap changes to update existing PolicyGenTemplate CRs. See “Syncing new ConfigMap changes to existing PolicyGenTemplate CRs”.

Syncing new ConfigMap changes to existing PolicyGenTemplate CRs

Prerequisites

  • You have installed the OpenShift CLI (oc).

  • You have logged in to the hub cluster as a user with cluster-admin privileges.

  • You have created a PolicyGenTemplate CR that pulls information from a ConfigMap CR using hub cluster templates.

Procedure

  1. Update the contents of your ConfigMap CR, and apply the changes in the hub cluster.

  2. To sync the contents of the updated ConfigMap CR to the deployed policy, do either of the following:

    1. Option 1: Delete the existing policy. ArgoCD uses the PolicyGenTemplate CR to immediately recreate the deleted policy. For example, run the following command:

      $ oc delete policy <policy_name> -n <policy_namespace>
    2. Option 2: Apply a special annotation policy.open-cluster-management.io/trigger-update to the policy with a different value every time you update the ConfigMap. For example:

      $ oc annotate policy <policy_name> -n <policy_namespace> policy.open-cluster-management.io/trigger-update="1"

      You must apply the updated policy for the changes to take effect. For more information, see Special annotation for reprocessing.

  3. Optional: If it exists, delete the ClusterGroupUpgrade CR that contains the policy. For example:

    $ oc delete clustergroupupgrade <cgu_name> -n <cgu_namespace>
    1. Create a new ClusterGroupUpgrade CR that includes the policy to apply with the updated ConfigMap changes. For example, add the following YAML to the file cgr-example.yaml:

      apiVersion: ran.openshift.io/v1alpha1
      kind: ClusterGroupUpgrade
      metadata:
        name: <cgr_name>
        namespace: <policy_namespace>
      spec:
        managedPolicies:
        - <managed_policy>
        enable: true
        clusters:
        - <managed_cluster_1>
        - <managed_cluster_2>
        remediationStrategy:
          maxConcurrency: 2
          timeout: 240
    2. Apply the updated policy:

      $ oc apply -f cgr-example.yaml