FCOS image layering

FCOS image layering

Fedora CoreOS (FCOS) image layering allows you to easily extend the functionality of your base FCOS image by layering additional images onto the base image. This layering does not modify the base FCOS image. Instead, it creates a custom layered image that includes all FCOS functionality and adds additional functionality to specific nodes in the cluster.

You create a custom layered image by using a Containerfile and applying it to nodes by using a MachineConfig object. The Machine Config Operator overrides the base FCOS image, as specified by the osImageURL value in the associated machine config, and boots the new image. You can remove the custom layered image by deleting the machine config, The MCO reboots the nodes back to the base FCOS image.

With FCOS image layering, you can install RPMs into your base image, and your custom content will be booted alongside FCOS. The Machine Config Operator (MCO) can roll out these custom layered images and monitor these custom containers in the same way it does for the default FCOS image. FCOS image layering gives you greater flexibility in how you manage your FCOS nodes.

Installing realtime kernel and extensions RPMs as custom layered content is not recommended. This is because these RPMs can conflict with RPMs installed by using a machine config. If there is a conflict, the MCO enters a degraded state when it tries to install the machine config RPM. You need to remove the conflicting extension from your machine config before proceeding.

As soon as you apply the custom layered image to your cluster, you effectively take ownership of your custom layered images and those nodes. While Red Hat remains responsible for maintaining and updating the base FCOS image on standard nodes, you are responsible for maintaining and updating images on nodes that use a custom layered image. You assume the responsibility for the package you applied with the custom layered image and any issues that might arise with the package.

To apply a custom layered image, you create a Containerfile that references an OKD image and the RPM that you want to apply. You then push the resulting custom layered image to an image registry. In a non-production OKD cluster, create a MachineConfig object for the targeted node pool that points to the new image.

Use the same base FCOS image installed on the rest of your cluster. Use the oc adm release info —image-for rhel-coreos command to obtain the base image used in your cluster.

FCOS image layering allows you to use the following types of images to create custom layered images:

OKD Hotfixes. You can work with Customer Experience and Engagement (CEE) to obtain and apply Hotfix packages on top of your FCOS image. In some instances, you might want a bug fix or enhancement before it is included in an official OKD release. FCOS image layering allows you to easily add the Hotfix before it is officially released and remove the Hotfix when the underlying FCOS image incorporates the fix.

Some Hotfixes require a Red Hat Support Exception and are outside of the normal scope of OKD support coverage or life cycle policies.

In the event you want a Hotfix, it will be provided to you based on Red Hat Hotfix policy. Apply it on top of the base image and test that new custom layered image in a non-production environment. When you are satisfied that the custom layered image is safe to use in production, you can roll it out on your own schedule to specific node pools. For any reason, you can easily roll back the custom layered image and return to using the default FCOS.

Example Containerfile to apply a Hotfix
```
# Using a 4.12.0 image
FROM quay.io/openshift-release-dev/ocp-release@sha256:6499bc69a0707fcad481c3cb73225b867d
#Install hotfix rpm
RUN rpm-ostree override replace https://example.com/myrepo/haproxy-1.0.16-5.el8.src.rpm && \
    rpm-ostree cleanup -m && \
    ostree container commit
```

Fedora packages. You can download Fedora packages from the Red Hat Customer Portal, such as chrony, firewalld, and iputils.

Example Containerfile to apply the firewalld utility

FROM quay.io/openshift-release-dev/ocp-release@sha256:6499bc69a0707fcad481c3cb73225b867d
ADD configure-firewall-playbook.yml .
RUN rpm-ostree install firewalld ansible && \
    ansible-playbook configure-firewall-playbook.yml && \
    rpm -e ansible && \
    ostree container commit

Example Containerfile to apply the libreswan utility

# Get RHCOS base image of target cluster `oc adm release info --image-for rhel-coreos`
# hadolint ignore=DL3006
FROM quay.io/openshift-release/ocp-release@sha256...
# Install our config file
COPY my-host-to-host.conf /etc/ipsec.d/
# RHEL entitled host is needed here to access RHEL packages
# Install libreswan as extra RHEL package
RUN rpm-ostree install libreswan && \
    systemctl enable ipsec && \
    ostree container commit

Because libreswan requires additional RHEL packages, the image must be built on an entitled Fedora host.

Third-party packages. You can download and install RPMs from third-party organizations, such as the following types of packages:

Bleeding edge drivers and kernel enhancements to improve performance or add capabilities.
Forensic client tools to investigate possible and actual break-ins.
Security agents.
Inventory agents that provide a coherent view of the entire cluster.
SSH Key management packages.

Example Containerfile to apply a third-party package from EPEL

# Get RHCOS base image of target cluster `oc adm release info --image-for rhel-coreos`
# hadolint ignore=DL3006
FROM quay.io/openshift-release/ocp-release@sha256...
#Enable EPEL (more info at https://docs.fedoraproject.org/en-US/epel/ ) and install htop
RUN rpm-ostree install https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm && \
    rpm-ostree install htop && \
    ostree container commit

Example Containerfile to apply a third-party package that has Fedora dependencies

# Get RHCOS base image of target cluster `oc adm release info --image-for rhel-coreos`
# hadolint ignore=DL3006
FROM quay.io/openshift-release/ocp-release@sha256...
# RHEL entitled host is needed here to access RHEL packages
# Install fish as third party package from EPEL
RUN rpm-ostree install https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/f/fish-3.3.1-3.el9.x86_64.rpm && \
    ostree container commit

This Containerfile installs the Linux fish program. Because fish requires additional RHEL packages, the image must be built on an entitled Fedora host.

After you create the machine config, the Machine Config Operator (MCO) performs the following steps:

Renders a new machine config for the specified pool or pools.
Performs cordon and drain operations on the nodes in the pool or pools.
Writes the rest of the machine config parameters onto the nodes.
Applies the custom layered image to the node.
Reboots the node using the new image.

It is strongly recommended that you test your images outside of your production environment before rolling out to your cluster.

Applying a FCOS custom layered image

You can easily configure Fedora CoreOS (FCOS) image layering on the nodes in specific machine config pools. The Machine Config Operator (MCO) reboots those nodes with the new custom layered image, overriding the base Fedora CoreOS (FCOS) image.

To apply a custom layered image to your cluster, you must have the custom layered image in a repository that your cluster can access. Then, create a MachineConfig object that points to the custom layered image. You need a separate MachineConfig object for each machine config pool that you want to configure.

When you configure a custom layered image, OKD no longer automatically updates any node that uses the custom layered image. You become responsible for manually updating your nodes as appropriate. If you roll back the custom layer, OKD will again automatically update the node. See the Additional resources section that follows for important information about updating nodes that use a custom layered image.

Prerequisites

You must create a custom layered image that is based on an OKD image digest, not a tag.

You should use the same base FCOS image that is installed on the rest of your cluster. Use the oc adm release info —image-for rhel-coreos command to obtain the base image being used in your cluster.

For example, the following Containerfile creates a custom layered image from an OKD 4.13 image and a Hotfix package:

Example Containerfile for a custom layer image
```
# Using a 4.12.0 image
FROM quay.io/openshift-release/ocp-release@sha256:6499bc69a0707fcad481c3cb73225b867d (1)
#Install hotfix rpm
RUN rpm-ostree override replace https://example.com/hotfixes/haproxy-1.0.16-5.el8.src.rpm && \ (2)
    rpm-ostree cleanup -m && \
    ostree container commit
```
1 Specifies an OKD release image.
2 Specifies the path to the Hotfix package.

Instructions on how to create a Containerfile are beyond the scope of this documentation.
Because the process for building a custom layered image is performed outside of the cluster, you must use the --authfile /path/to/pull-secret option with Podman or Buildah. Alternatively, to have the pull secret read by these tools automatically, you can add it to one of the default file locations: ~/.docker/config.json, $XDG_RUNTIME_DIR/containers/auth.json, ~/.docker/config.json, or ~/.dockercfg. Refer to the containers-auth.json man page for more information.
You must push the custom layered image to a repository that your cluster can access.

Procedure

Create a machine config file.
1. Create a YAML file similar to the following:
```
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker (1)
  name: os-layer-custom
spec:
  osImageURL: quay.io/my-registry/custom-image@sha256:306b606615dcf8f0e5e7d87fee3 (2)
```
  1 Specifies the machine config pool to apply the custom layered image.
  2 Specifies the path to the custom layered image in the repository.
2. Create the MachineConfig object:
```
$ oc create -f <file_name>.yaml
```
  It is strongly recommended that you test your images outside of your production environment before rolling out to your cluster.

Verification

You can verify that the custom layered image is applied by performing any of the following checks:

Check that the worker machine config pool has rolled out with the new machine config:

Check that the new machine config is created:

$ oc get mc

Sample output

NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          5bdb57489b720096ef912f738b46330a8f577803   3.2.0             95m
00-worker                                          5bdb57489b720096ef912f738b46330a8f577803   3.2.0             95m
01-master-container-runtime                        5bdb57489b720096ef912f738b46330a8f577803   3.2.0             95m
01-master-kubelet                                  5bdb57489b720096ef912f738b46330a8f577803   3.2.0             95m
01-worker-container-runtime                        5bdb57489b720096ef912f738b46330a8f577803   3.2.0             95m
01-worker-kubelet                                  5bdb57489b720096ef912f738b46330a8f577803   3.2.0             95m
99-master-generated-registries                     5bdb57489b720096ef912f738b46330a8f577803   3.2.0             95m
99-master-ssh                                                                                 3.2.0             98m
99-worker-generated-registries                     5bdb57489b720096ef912f738b46330a8f577803   3.2.0             95m
99-worker-ssh                                                                                 3.2.0             98m
os-layer-custom                                                                                                 10s (1)
rendered-master-15961f1da260f7be141006404d17d39b   5bdb57489b720096ef912f738b46330a8f577803   3.2.0             95m
rendered-worker-5aff604cb1381a4fe07feaf1595a797e   5bdb57489b720096ef912f738b46330a8f577803   3.2.0             95m
rendered-worker-5de4837625b1cbc237de6b22bc0bc873   5bdb57489b720096ef912f738b46330a8f577803   3.2.0             4s  (2)

1	New machine config
2	New rendered machine config

Check that the osImageURL value in the new machine config points to the expected image:

$ oc describe mc rendered-master-4e8be63aef68b843b546827b6ebe0913

Example output

Name:         rendered-master-4e8be63aef68b843b546827b6ebe0913
Namespace:
Labels:       <none>
Annotations:  machineconfiguration.openshift.io/generated-by-controller-version: 8276d9c1f574481043d3661a1ace1f36cd8c3b62
              machineconfiguration.openshift.io/release-image-version: 4.13.0-ec.3
API Version:  machineconfiguration.openshift.io/v1
Kind:         MachineConfig
...
  Os Image URL: quay.io/my-registry/custom-image@sha256:306b606615dcf8f0e5e7d87fee3

Check that the associated machine config pool is updating with the new machine config:

$ oc get mcp

Sample output

NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-6faecdfa1b25c114a58cf178fbaa45e2   True      False      False      3              3                   3                     0                      39m
worker   rendered-worker-6b000dbc31aaee63c6a2d56d04cd4c1b   False     True       False      3              0                   0                     0                      39m (1)

1	When the `UPDATING` field is `True`, the machine config pool is updating with the new machine config. When the field becomes `False`, the worker machine config pool has rolled out to the new machine config.

Check the nodes to see that scheduling on the nodes is disabled. This indicates that the change is being applied:

$ oc get nodes

Example output

NAME                                         STATUS                     ROLES                  AGE   VERSION
ip-10-0-148-79.us-west-1.compute.internal    Ready                      worker                 32m   v1.26.0
ip-10-0-155-125.us-west-1.compute.internal   Ready,SchedulingDisabled   worker                 35m   v1.26.0
ip-10-0-170-47.us-west-1.compute.internal    Ready                      control-plane,master   42m   v1.26.0
ip-10-0-174-77.us-west-1.compute.internal    Ready                      control-plane,master   42m   v1.26.0
ip-10-0-211-49.us-west-1.compute.internal    Ready                      control-plane,master   42m   v1.26.0
ip-10-0-218-151.us-west-1.compute.internal   Ready                      worker                 31m   v1.26.0

When the node is back in the Ready state, check that the node is using the custom layered image:

Open an oc debug session to the node. For example:

$ oc debug node/ip-10-0-155-125.us-west-1.compute.internal

Set /host as the root directory within the debug shell:
```
sh-4.4# chroot /host
```

Run the rpm-ostree status command to view that the custom layered image is in use:

sh-4.4# sudo rpm-ostree status

Example output

State: idle
Deployments:
* ostree-unverified-registry:quay.io/my-registry/custom-image@sha256:306b606615dcf8f0e5e7d87fee3
                   Digest: sha256:306b606615dcf8f0e5e7d87fee3

Additional resources

Updating with a FCOS custom layered image

Removing a FCOS custom layered image

You can easily revert Fedora CoreOS (FCOS) image layering from the nodes in specific machine config pools. The Machine Config Operator (MCO) reboots those nodes with the cluster base Fedora CoreOS (FCOS) image, overriding the custom layered image.

To remove a Fedora CoreOS (FCOS) custom layered image from your cluster, you need to delete the machine config that applied the image.

Procedure

Delete the machine config that applied the custom layered image.
```
$ oc delete mc os-layer-custom
```
After deleting the machine config, the nodes reboot.

Verification

You can verify that the custom layered image is removed by performing any of the following checks:

Check that the worker machine config pool is updating with the previous machine config:

$ oc get mcp

Sample output

NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-6faecdfa1b25c114a58cf178fbaa45e2   True      False      False      3              3                   3                     0                      39m
worker   rendered-worker-6b000dbc31aaee63c6a2d56d04cd4c1b   False     True       False      3              0                   0                     0                      39m (1)

1	When the `UPDATING` field is `True`, the machine config pool is updating with the previous machine config. When the field becomes `False`, the worker machine config pool has rolled out to the previous machine config.

Check the nodes to see that scheduling on the nodes is disabled. This indicates that the change is being applied:

$ oc get nodes

Example output

NAME                                         STATUS                     ROLES                  AGE   VERSION
ip-10-0-148-79.us-west-1.compute.internal    Ready                      worker                 32m   v1.26.0
ip-10-0-155-125.us-west-1.compute.internal   Ready,SchedulingDisabled   worker                 35m   v1.26.0
ip-10-0-170-47.us-west-1.compute.internal    Ready                      control-plane,master   42m   v1.26.0
ip-10-0-174-77.us-west-1.compute.internal    Ready                      control-plane,master   42m   v1.26.0
ip-10-0-211-49.us-west-1.compute.internal    Ready                      control-plane,master   42m   v1.26.0
ip-10-0-218-151.us-west-1.compute.internal   Ready                      worker                 31m   v1.26.0

When the node is back in the Ready state, check that the node is using the base image:

Open an oc debug session to the node. For example:

$ oc debug node/ip-10-0-155-125.us-west-1.compute.internal

Set /host as the root directory within the debug shell:
```
sh-4.4# chroot /host
```

Run the rpm-ostree status command to view that the custom layered image is in use:

sh-4.4# sudo rpm-ostree status

Example output

State: idle
Deployments:
* ostree-unverified-registry:podman pull quay.io/openshift-release-dev/ocp-release@sha256:e2044c3cfebe0ff3a99fc207ac5efe6e07878ad59fd4ad5e41f88cb016dacd73
                   Digest: sha256:e2044c3cfebe0ff3a99fc207ac5efe6e07878ad59fd4ad5e41f88cb016dacd73

Updating with a FCOS custom layered image

When you configure Fedora CoreOS (FCOS) image layering, OKD no longer automatically updates the node pool that uses the custom layered image. You become responsible to manually update your nodes as appropriate.

To update a node that uses a custom layered image, follow these general steps:

The cluster automatically upgrades to version x.y.z+1, except for the nodes that use the custom layered image.
You could then create a new Containerfile that references the updated OKD image and the RPM that you had previously applied.
Create a new machine config that points to the updated custom layered image.

Updating a node with a custom layered image is not required. However, if that node gets too far behind the current OKD version, you could experience unexpected results.

1	Specifies an OKD release image.
2	Specifies the path to the Hotfix package.

1	Specifies the machine config pool to apply the custom layered image.
2	Specifies the path to the custom layered image in the repository.