- Managing Compliance Operator result and remediation
- Filters for compliance check results
- Reviewing a remediation
- Applying remediation when using customized machine config pools
- Applying a remediation
- Remediating a platform check manually
- Updating remediations
- Unapplying a remediation
- Removing a KubeletConfig remediation
- Inconsistent ComplianceScan
- Additional resources
Managing Compliance Operator result and remediation
Each ComplianceCheckResult object represents the result of one compliance rule check. If the rule can be remediated automatically, a ComplianceRemediation object with the same name, owned by the ComplianceCheckResult object, is created. Unless requested, remediations are not applied automatically, which gives an OKD administrator the opportunity to review what a remediation does and only apply it once it has been verified.
Filters for compliance check results
By default, the ComplianceCheckResult objects are labeled with several useful labels that allow you to query the checks and decide on the next steps after the results are generated.
List checks that belong to a specific suite:
$ oc get compliancecheckresults -l compliance.openshift.io/suite=example-compliancesuite
List checks that belong to a specific scan:
$ oc get compliancecheckresults -l compliance.openshift.io/scan=example-compliancescan
Not all ComplianceCheckResult objects create ComplianceRemediation objects. Only ComplianceCheckResult objects that can be remediated automatically do. A ComplianceCheckResult object has a related remediation if it is labeled with the compliance.openshift.io/automated-remediation label. The name of the remediation is the same as the name of the check.
List all failing checks that can be remediated automatically:
$ oc get compliancecheckresults -l 'compliance.openshift.io/check-status=FAIL,compliance.openshift.io/automated-remediation'
List all failing checks that must be remediated manually:
$ oc get compliancecheckresults -l 'compliance.openshift.io/check-status=FAIL,!compliance.openshift.io/automated-remediation'
The manual remediation steps are typically stored in the description attribute of the ComplianceCheckResult object.
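For example, you can print the manual remediation steps for a single check by querying the description attribute directly; here <check_name> is a placeholder for a real check name:
$ oc get compliancecheckresults/<check_name> -o jsonpath='{.description}'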
| ComplianceCheckResult Status | Description |
|---|---|
| PASS | Compliance check ran to completion and passed. |
| FAIL | Compliance check ran to completion and failed. |
| INFO | Compliance check ran to completion and found something not severe enough to be considered an error. |
| MANUAL | Compliance check does not have a way to automatically assess the success or failure and must be checked manually. |
| INCONSISTENT | Compliance check reports different results from different sources, typically cluster nodes. |
| ERROR | Compliance check ran, but could not complete properly. |
| NOT-APPLICABLE | Compliance check did not run because it is not applicable or not selected. |
Reviewing a remediation
Review both the ComplianceRemediation object and the ComplianceCheckResult object that owns the remediation. The ComplianceCheckResult object contains human-readable descriptions of what the check does and of what the hardening attempts to prevent, as well as other metadata such as the severity and the associated security controls. The ComplianceRemediation object represents a way to fix the problem described in the ComplianceCheckResult object. After the first scan, check for remediations with the state MissingDependencies.
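To retrieve a remediation for review, you can run a command similar to the following; the remediation name shown here matches the example below and openshift-compliance is the Operator's default namespace:
$ oc get complianceremediations/<scan_name>-sysctl-net-ipv4-conf-all-accept-redirects -n openshift-compliance -o yaml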
Below is an example of a check and a remediation called sysctl-net-ipv4-conf-all-accept-redirects. This example is redacted to only show spec and status and omits metadata:
spec:
apply: false
current:
object:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
spec:
config:
ignition:
version: 3.2.0
storage:
files:
- path: /etc/sysctl.d/75-sysctl_net_ipv4_conf_all_accept_redirects.conf
mode: 0644
contents:
source: data:,net.ipv4.conf.all.accept_redirects%3D0
outdated: {}
status:
applicationState: NotApplied
The remediation payload is stored in the spec.current attribute. The payload can be any Kubernetes object, but because this remediation was produced by a node scan, the remediation payload in the above example is a MachineConfig object. For Platform scans, the remediation payload is often a different kind of object (for example, a ConfigMap or Secret object), but typically applying that remediation is up to the administrator, because otherwise the Compliance Operator would have required a very broad set of permissions to manipulate any generic Kubernetes object. An example of remediating a Platform check is provided later in the text.
To see exactly what the remediation does when applied, note that the MachineConfig object contents use Ignition objects for the configuration. See the Ignition specification for further information about the format. In our example, the spec.config.storage.files[0].path attribute specifies the file that is created by this remediation (/etc/sysctl.d/75-sysctl_net_ipv4_conf_all_accept_redirects.conf) and the spec.config.storage.files[0].contents.source attribute specifies the contents of that file.
The contents of the files are URL-encoded.
Use the following Python one-liner to view the contents:
$ echo "net.ipv4.conf.all.accept_redirects%3D0" | python3 -c "import sys, urllib.parse; print(urllib.parse.unquote(''.join(sys.stdin.readlines())))"
Example output
net.ipv4.conf.all.accept_redirects=0
Applying remediation when using customized machine config pools
When you create a custom MachineConfigPool, add a label to the MachineConfigPool so that the machineConfigPoolSelector present in the KubeletConfig can match the label with the MachineConfigPool.
Do not set protectKernelDefaults: false in the KubeletConfig file, because the MachineConfigPool object might fail to unpause unexpectedly after the Compliance Operator finishes applying remediation.
Procedure
List the nodes.
$ oc get nodes
Example output
NAME STATUS ROLES AGE VERSION
ip-10-0-128-92.us-east-2.compute.internal Ready master 5h21m v1.24.0
ip-10-0-158-32.us-east-2.compute.internal Ready worker 5h17m v1.24.0
ip-10-0-166-81.us-east-2.compute.internal Ready worker 5h17m v1.24.0
ip-10-0-171-170.us-east-2.compute.internal Ready master 5h21m v1.24.0
ip-10-0-197-35.us-east-2.compute.internal Ready master 5h22m v1.24.0
Add a label to the node.
$ oc label node ip-10-0-166-81.us-east-2.compute.internal node-role.kubernetes.io/<machine_config_pool_name>=
Example output
node/ip-10-0-166-81.us-east-2.compute.internal labeled
Create a custom MachineConfigPool CR:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
name: <machine_config_pool_name>
labels:
pools.operator.machineconfiguration.openshift.io/<machine_config_pool_name>: '' (1)
spec:
machineConfigSelector:
matchExpressions:
- {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,<machine_config_pool_name>]}
nodeSelector:
matchLabels:
node-role.kubernetes.io/<machine_config_pool_name>: ""
1 The labels field defines the label name to add for the machine config pool (MCP).
Verify that the MCP was created successfully:
$ oc get mcp -w
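A KubeletConfig remediation can then target this pool through its machineConfigPoolSelector. The following is a minimal sketch, assuming the label added above; the metadata name and the kubeletConfig value are illustrative placeholders:
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: <kubelet_config_name>
spec:
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/<machine_config_pool_name>: ""
  kubeletConfig:
    podsPerCore: 10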
Applying a remediation
The boolean attribute spec.apply controls whether the remediation should be applied by the Compliance Operator. You can apply the remediation by setting the attribute to true:
$ oc patch complianceremediations/<scan_name>-sysctl-net-ipv4-conf-all-accept-redirects --patch '{"spec":{"apply":true}}' --type=merge
After the Compliance Operator processes the applied remediation, the status.applicationState attribute changes to Applied, or to Error if incorrect. When a machine config remediation is applied, that remediation along with all other applied remediations is rendered into a MachineConfig object named 75-$scan-name-$suite-name. That MachineConfig object is subsequently rendered by the Machine Config Operator and finally applied to all the nodes in a machine config pool by an instance of the machine config daemon running on each node.
Note that when the Machine Config Operator applies a new MachineConfig object to nodes in a pool, all the nodes belonging to the pool are rebooted. This might be inconvenient when applying multiple remediations, each of which re-renders the composite 75-$scan-name-$suite-name MachineConfig object. To prevent applying the remediation immediately, you can pause the machine config pool by setting the .spec.paused attribute of a MachineConfigPool object to true.
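For example, pausing and later unpausing a pool could look like the following; the worker pool name is used here for illustration:
$ oc patch mcp/worker --type merge -p '{"spec":{"paused":true}}'
$ oc patch mcp/worker --type merge -p '{"spec":{"paused":false}}'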
Make sure the pools are unpaused when the CA certificate rotation happens. If the MCPs are paused, the MCO cannot push the newly rotated certificates to those nodes. This causes the cluster to become degraded and causes failure in multiple oc debug, oc logs, oc exec, and oc attach commands.
The Compliance Operator can apply remediations automatically. Set autoApplyRemediations: true in the ScanSetting top-level object.
Applying remediations automatically should only be done with careful consideration.
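For reference, a ScanSetting with automatic remediation enabled might look like the following sketch; the name, schedule, and roles are illustrative placeholders:
apiVersion: compliance.openshift.io/v1alpha1
kind: ScanSetting
metadata:
  name: <scan_setting_name>
  namespace: openshift-compliance
autoApplyRemediations: true
schedule: '0 1 * * *'
roles:
- worker
- master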
Remediating a platform check manually
Checks for Platform scans typically have to be remediated manually by the administrator for two reasons:
It is not always possible to automatically determine the value that must be set. One of the checks requires that a list of allowed registries is provided, but the scanner has no way of knowing which registries the organization wants to allow.
Different checks modify different API objects, requiring automated remediation to possess root or superuser access to modify objects in the cluster, which is not advised.
Procedure
The example below uses the ocp4-ocp-allowed-registries-for-import rule, which would fail on a default OKD installation. Inspect the rule with oc get rule.compliance/ocp4-ocp-allowed-registries-for-import -o yaml. The rule limits the registries that users are allowed to import images from by setting the allowedRegistriesForImport attribute. The warning attribute of the rule also shows the API object that is checked, so it can be modified to remediate the issue:
$ oc edit image.config.openshift.io/cluster
Example output
apiVersion: config.openshift.io/v1
kind: Image
metadata:
annotations:
release.openshift.io/create-only: "true"
creationTimestamp: "2020-09-10T10:12:54Z"
generation: 2
name: cluster
resourceVersion: "363096"
selfLink: /apis/config.openshift.io/v1/images/cluster
uid: 2dcb614e-2f8a-4a23-ba9a-8e33cd0ff77e
spec:
allowedRegistriesForImport:
- domainName: registry.redhat.io
status:
externalRegistryHostnames:
- default-route-openshift-image-registry.apps.user-cluster-09-10-12-07.devcluster.openshift.com
internalRegistryHostname: image-registry.openshift-image-registry.svc:5000
Re-run the scan:
$ oc annotate compliancescans/<scan_name> compliance.openshift.io/rescan=
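After the rescan completes, you can verify that the check now passes by filtering on the labels described earlier; this command is an illustrative example and <scan_name> is a placeholder:
$ oc get compliancecheckresults -l compliance.openshift.io/scan=<scan_name> | grep allowed-registries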
Updating remediations
When a new version of compliance content is used, it might deliver a new and different version of a remediation than the previous version. The Compliance Operator will keep the old version of the remediation applied. The OKD administrator is also notified of the new version to review and apply. A ComplianceRemediation object that had been applied earlier, but was then updated, changes its status to Outdated. The outdated objects are labeled so that they can be searched for easily.
The previously applied remediation contents would then be stored in the spec.outdated attribute of a ComplianceRemediation object and the new updated contents would be stored in the spec.current attribute. After updating the content to a newer version, the administrator then needs to review the remediation. As long as the spec.outdated attribute exists, it would be used to render the resulting MachineConfig object. After the spec.outdated attribute is removed, the Compliance Operator re-renders the resulting MachineConfig object, which causes the Operator to push the configuration to the nodes.
Procedure
Search for any outdated remediations:
$ oc get complianceremediations -lcomplianceoperator.openshift.io/outdated-remediation=
Example output
NAME STATE
workers-scan-no-empty-passwords Outdated
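To compare the two versions before deciding, you can inspect the object; the spec.outdated and spec.current attributes hold the old and new payloads, respectively:
$ oc get complianceremediations workers-scan-no-empty-passwords -o yaml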
The currently applied remediation is stored in the spec.outdated attribute and the new, unapplied remediation is stored in the spec.current attribute. If you are satisfied with the new version, remove the spec.outdated field. If you want to keep the updated content, remove the spec.current and spec.outdated attributes.
Apply the newer version of the remediation:
$ oc patch complianceremediations workers-scan-no-empty-passwords --type json -p '[{"op":"remove", "path":"/spec/outdated"}]'
The remediation state will switch from Outdated to Applied:
$ oc get complianceremediations workers-scan-no-empty-passwords
Example output
NAME STATE
workers-scan-no-empty-passwords Applied
The nodes will apply the newer remediation version and reboot.
Unapplying a remediation
It might be required to unapply a remediation that was previously applied.
Procedure
Set the apply flag to false:
$ oc patch complianceremediations/<scan_name>-sysctl-net-ipv4-conf-all-accept-redirects -p '{"spec":{"apply":false}}' --type=merge
The remediation status will change to NotApplied and the composite MachineConfig object would be re-rendered to not include the remediation.
All affected nodes with the remediation will be rebooted.
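You can confirm the new state with a command like the following; the remediation name mirrors the example above:
$ oc get complianceremediations/<scan_name>-sysctl-net-ipv4-conf-all-accept-redirects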
Removing a KubeletConfig remediation
KubeletConfig remediations are included in node-level profiles. In order to remove a KubeletConfig remediation, you must manually remove it from the KubeletConfig objects. This example demonstrates how to remove the compliance check for the one-rule-tp-node-master-kubelet-eviction-thresholds-set-hard-imagefs-available remediation.
Procedure
Locate the scan-name and compliance check for the one-rule-tp-node-master-kubelet-eviction-thresholds-set-hard-imagefs-available remediation:
$ oc get remediation one-rule-tp-node-master-kubelet-eviction-thresholds-set-hard-imagefs-available -o yaml
Example output
apiVersion: compliance.openshift.io/v1alpha1
kind: ComplianceRemediation
metadata:
annotations:
compliance.openshift.io/xccdf-value-used: var-kubelet-evictionhard-imagefs-available
creationTimestamp: "2022-01-05T19:52:27Z"
generation: 1
labels:
compliance.openshift.io/scan-name: one-rule-tp-node-master (1)
compliance.openshift.io/suite: one-rule-ssb-node
name: one-rule-tp-node-master-kubelet-eviction-thresholds-set-hard-imagefs-available
namespace: openshift-compliance
ownerReferences:
- apiVersion: compliance.openshift.io/v1alpha1
blockOwnerDeletion: true
controller: true
kind: ComplianceCheckResult
name: one-rule-tp-node-master-kubelet-eviction-thresholds-set-hard-imagefs-available
uid: fe8e1577-9060-4c59-95b2-3e2c51709adc
resourceVersion: "84820"
uid: 5339d21a-24d7-40cb-84d2-7a2ebb015355
spec:
apply: true
current:
object:
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
spec:
kubeletConfig:
evictionHard:
imagefs.available: 10% (2)
outdated: {}
type: Configuration
status:
applicationState: Applied
1 The scan name of the remediation.
2 The remediation that was added to the KubeletConfig objects.
If the remediation invokes an evictionHard kubelet configuration, you must specify all of the evictionHard parameters: memory.available, nodefs.available, nodefs.inodesFree, imagefs.available, and imagefs.inodesFree. If you do not specify all parameters, only the specified parameters are applied and the remediation will not function properly.
Remove the remediation:
Set apply to false for the remediation object:
$ oc patch complianceremediations/one-rule-tp-node-master-kubelet-eviction-thresholds-set-hard-imagefs-available -p '{"spec":{"apply":false}}' --type=merge
Using the scan-name, find the KubeletConfig object that the remediation was applied to:
$ oc get kubeletconfig --selector compliance.openshift.io/scan-name=one-rule-tp-node-master
Example output
NAME AGE
compliance-operator-kubelet-master 2m34s
Manually remove the remediation, imagefs.available: 10%, from the KubeletConfig object:
$ oc edit KubeletConfig compliance-operator-kubelet-master
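In the editor, the relevant section looks similar to the following sketch; delete the imagefs.available entry while leaving any other evictionHard parameters intact (the value shown is illustrative):
spec:
  kubeletConfig:
    evictionHard:
      imagefs.available: 10%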
All affected nodes with the remediation will be rebooted.
You must also exclude the rule from any scheduled scans in your tailored profiles that auto-apply remediations; otherwise, the remediation will be re-applied during the next scheduled scan.
Inconsistent ComplianceScan
The ScanSetting object lists the node roles that the compliance scans generated from the ScanSetting or ScanSettingBinding objects would scan. Each node role usually maps to a machine config pool.
It is expected that all machines in a machine config pool are identical and all scan results from the nodes in a pool should be identical.
If some of the results are different from others, the Compliance Operator flags a ComplianceCheckResult object where some of the nodes will report as INCONSISTENT. All ComplianceCheckResult objects are also labeled with compliance.openshift.io/inconsistent-check.
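For example, you can list all inconsistent checks by filtering on that label:
$ oc get compliancecheckresults -l 'compliance.openshift.io/inconsistent-check'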
Because the number of machines in a pool might be quite large, the Compliance Operator attempts to find the most common state and list the nodes that differ from the common state. The most common state is stored in the compliance.openshift.io/most-common-status annotation, and the compliance.openshift.io/inconsistent-source annotation contains hostname:status pairs of check statuses that differ from the most common status. If no common state can be found, all the hostname:status pairs are listed in the compliance.openshift.io/inconsistent-source annotation.
If possible, a remediation is still created so that the cluster can converge to a compliant status. However, this might not always be possible and correcting the difference between nodes must be done manually. The compliance scan must be re-run to get a consistent result by annotating the scan with the compliance.openshift.io/rescan= option:
$ oc annotate compliancescans/<scan_name> compliance.openshift.io/rescan=