Configuring the Linux cgroup version on your nodes
By default, OKD uses Linux control group version 2 (cgroup v2) in your cluster. You can switch to Linux control group version 1 (cgroup v1), if needed.
cgroup v2 is the next version of the kernel control group and offers multiple improvements. However, it can have some unwanted effects on your nodes.
Configuring Linux cgroup
You can switch to Linux control group version 1 (cgroup v1), if needed, by using a machine config. Enabling cgroup v1 in OKD disables the cgroup v2 controllers and hierarchies in your cluster.
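Before you apply the change, you can check which cgroup version a node currently uses. The following one-liner is a minimal sketch; <node_name> is a placeholder for one of your node names:
$ oc debug node/<node_name> -- chroot /host stat -c %T -f /sys/fs/cgroup
The command prints cgroup2fs when the node uses cgroup v2 and tmpfs when it uses cgroup v1.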
Prerequisites
- You have administrative privileges on a working OKD cluster.
Procedure
Create a MachineConfig object file that identifies the kernel argument (for example, 05-worker-cgroup-v1.yaml):
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker (1)
  name: 05-worker-cgroup-v1 (2)
spec:
  config:
    ignition:
      version: 3.2.0
  kernelArguments:
    - systemd.unified_cgroup_hierarchy=0 (3)
1 Applies the new kernel argument only to worker nodes.
2 Applies a name to the machine config.
3 Configures cgroup v1 on the associated nodes.
Create the new machine config:
$ oc create -f 05-worker-cgroup-v1.yaml
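Optionally, confirm that the kernel argument was recorded in the object you just created. This sketch assumes the example name used in the file above:
$ oc get machineconfig 05-worker-cgroup-v1 -o jsonpath='{.spec.kernelArguments}'
The output should include systemd.unified_cgroup_hierarchy=0.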
Check to see that the new machine config was added:
$ oc get MachineConfig
Example output
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          52dd3ba6a9a527fc3ab42afac8d12b693534c8c9   3.2.0             33m
00-worker                                          52dd3ba6a9a527fc3ab42afac8d12b693534c8c9   3.2.0             33m
01-master-container-runtime                        52dd3ba6a9a527fc3ab42afac8d12b693534c8c9   3.2.0             33m
01-master-kubelet                                  52dd3ba6a9a527fc3ab42afac8d12b693534c8c9   3.2.0             33m
01-worker-container-runtime                        52dd3ba6a9a527fc3ab42afac8d12b693534c8c9   3.2.0             33m
01-worker-kubelet                                  52dd3ba6a9a527fc3ab42afac8d12b693534c8c9   3.2.0             33m
05-worker-cgroup-v1                                                                           3.2.0             105s
99-master-generated-registries                     52dd3ba6a9a527fc3ab42afac8d12b693534c8c9   3.2.0             33m
99-master-ssh                                                                                 3.2.0             40m
99-worker-generated-registries                     52dd3ba6a9a527fc3ab42afac8d12b693534c8c9   3.2.0             33m
99-worker-ssh                                                                                 3.2.0             40m
rendered-master-23e785de7587df95a4b517e0647e5ab7   52dd3ba6a9a527fc3ab42afac8d12b693534c8c9   3.2.0             33m
rendered-worker-5d596d9293ca3ea80c896a1191735bb1   52dd3ba6a9a527fc3ab42afac8d12b693534c8c9   3.2.0             33m
rendered-worker-c5e92d98103061c4818cfcefcf462770   60746a843e7ef8855ae00f2ffcb655c53e0e8296   3.2.0             115s
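The Machine Config Operator applies the change by rolling it out through the worker machine config pool, one node at a time. As a rough guide, you can follow the rollout by checking the pool status; worker here is the pool that matches the role label in the machine config:
$ oc get machineconfigpool worker
The UPDATED and UPDATING columns show whether the pool has finished applying the new rendered configuration.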
Check the nodes:
$ oc get nodes
Example output
NAME                           STATUS                     ROLES    AGE   VERSION
ip-10-0-136-161.ec2.internal   Ready                      worker   28m   v1.26.0
ip-10-0-136-243.ec2.internal   Ready                      master   34m   v1.26.0
ip-10-0-141-105.ec2.internal   Ready,SchedulingDisabled   worker   28m   v1.26.0
ip-10-0-142-249.ec2.internal   Ready                      master   34m   v1.26.0
ip-10-0-153-11.ec2.internal    Ready                      worker   28m   v1.26.0
ip-10-0-153-150.ec2.internal   Ready                      master   34m   v1.26.0
You can see that scheduling is disabled on each worker node, one at a time, as the change is applied.
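If you prefer to wait for the rollout to complete instead of repeatedly checking the node list, one possible approach is to wait for the Updated condition on the worker machine config pool; the timeout shown here is only an example value:
$ oc wait machineconfigpool/worker --for=condition=Updated --timeout=30m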
After a node returns to the Ready state, start a debug session for that node:
$ oc debug node/<node_name>
Set /host as the root directory within the debug shell:
sh-4.4# chroot /host
Check that the file system type reported for /sys/fs/cgroup has changed from cgroup2fs to tmpfs, which indicates that cgroup v1 is in use:
sh-4.4# stat -c %T -f /sys/fs/cgroup
Example output
tmpfs
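To verify the change on every worker node without opening an interactive debug shell on each one, you can run the same check in a loop. This is a minimal sketch that assumes the default node-role.kubernetes.io/worker label on your worker nodes:
$ for node in $(oc get nodes -l node-role.kubernetes.io/worker -o name); do oc debug "$node" -- chroot /host stat -c %T -f /sys/fs/cgroup; done
Each node that is running cgroup v1 reports tmpfs; a node still using cgroup v2 reports cgroup2fs.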