To apply a custom layered image to your cluster by using the on-cluster build process, create a MachineOSConfig
custom resource (CR) that specifies the following parameters:
-
the Containerfile to build
-
the machine config pool to associate the build with
-
the registry location that the final image is pushed to and pulled from
-
the push and pull secrets to use
When you create the object, the Machine Config Operator (MCO) creates a MachineOSBuild
object and a builder pod. The build process also creates transient objects, such as config maps, which are cleaned up after the build is complete. The MachineOSBuild
object and the associated build-*
pod use the same naming scheme, <MachineOSConfig_CR_name>-<hash>
, for example:
Example MachineOSBuild
object
NAME PREPARED BUILDING SUCCEEDED INTERRUPTED FAILED
layered-c8765e26ebc87e1e17a7d6e0a78e8bae False False True False False
Example builder pod
NAME READY STATUS RESTARTS AGE
build-layered-c8765e26ebc87e1e17a7d6e0a78e8bae 2/2 Running 0 11m
When the build is complete, the MCO pushes the new custom layered image to your repository for use when deploying new nodes. You can see the digested image pull spec for the new custom layered image in the MachineOSBuild
object and machine-os-builder
pod.
You should not need to interact with these new objects or the machine-os-builder
pod. However, you can use all of these resources for troubleshooting, if necessary.
You need a separate MachineOSConfig
CR for each machine config pool where you want to use a custom layered image.
|
On-cluster image layering is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
|
Making certain changes to a MachineOSConfig
object triggers an automatic rebuild of the associated custom layered image. You can mitigate the effects of the rebuild by pausing the machine config pool where the custom layered image is applied as described in "Pausing the machine config pools." For example, if you want to remove and replace a MachineOSConfig
object, pausing the machine config pools before making the change prevents the MCO from reverting the associated nodes to the base image, reducing the number of reboots needed.
When a machine config pool is paused, the oc get machineconfigpools
command reports the following status:
Example output
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
layered rendered-layered-221507009cbcdec0eec8ab3ccd789d18 False False False 1 0 0 0 3h23m (1)
master rendered-master-a0b404d061a6183cc36d302363422aba True False False 3 3 3 0 4h14m
worker rendered-worker-221507009cbcdec0eec8ab3ccd789d18 True False False 2 2 2 0 4h14m
1 |
The layered machine config pool is paused, as indicated by the three False statuses and the READYMACHINECOUNT at 0 . |
After the changes have been rolled out, you can unpause the machine config pool.
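The pause and unpause steps described above can be sketched as a pair of shell helpers. This is a minimal sketch, assuming that a pool is paused by setting the spec.paused field of the MachineConfigPool object with oc patch. The function names pause_patch and set_pool_paused are hypothetical and are not part of the oc CLI.

```shell
# Build the merge patch that sets spec.paused on a machine config pool.
# $1 must be "true" (pause) or "false" (unpause).
pause_patch() {
  case "$1" in
    true|false) printf '{"spec":{"paused":%s}}\n' "$1" ;;
    *) echo "usage: pause_patch true|false" >&2; return 1 ;;
  esac
}

# Apply the patch to a pool. Requires a logged-in oc client.
# $1 = pool name, $2 = true|false
# Assumption: the pool is paused through the spec.paused field.
set_pool_paused() {
  patch=$(pause_patch "$2") || return 1
  oc patch machineconfigpool "$1" --type merge --patch "$patch"
}
```

For example, you might run set_pool_paused layered true before you change the MachineOSConfig object, and set_pool_paused layered false after the changes have rolled out.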
In the case of a build failure, for example due to network issues or an invalid secret, the MCO retries the build three additional times before the job fails. The MCO creates a different build pod for each build attempt. You can use the build pod logs to troubleshoot any build failures. However, the MCO automatically removes these build pods after a short period of time.
Example failed MachineOSBuild
object
NAME PREPARED BUILDING SUCCEEDED INTERRUPTED FAILED
layered-c8765e26ebc87e1e17a7d6e0a78e8bae False False False False True
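When you troubleshoot builds, it can help to reduce the MachineOSBuild table to one state per build. The following is a minimal sketch that reads the table from standard input and reports the first condition column that is True. It assumes the column layout shown in the examples in this documentation, and the function name build_states is hypothetical.

```shell
# Read the MachineOSBuild table on stdin and print "<name> <state>" per row,
# where <state> is the first condition column (PREPARED, BUILDING, SUCCEEDED,
# INTERRUPTED, FAILED) that reports True, or "Unknown" if none does.
build_states() {
  awk '
    NF == 0 { next }                       # skip blank lines
    !seen_header {                         # remember the column names
      for (i = 2; i <= NF; i++) col[i] = $i
      ncol = NF
      seen_header = 1
      next
    }
    {
      state = "Unknown"
      for (i = 2; i <= ncol; i++)
        if ($i == "True") { state = col[i]; break }
      print $1, state
    }'
}
```

A possible use is to pipe the output of oc get machineosbuild into build_states and look for rows that report FAILED.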
On-cluster layering Technology Preview known limitations
Note the following limitations when working with the on-cluster layering feature:
-
On-cluster layering is supported only for OpenShift Container Platform clusters on the AMD64 architecture.
-
On-cluster layering is not supported on multi-architecture compute machines, single-node OpenShift, or disconnected clusters.
-
If you scale up a machine set that uses a custom layered image, the nodes reboot two times. The first reboot occurs when the node is initially created with the base image, and the second occurs when the custom layered image is applied.
-
Node disruption policies are not supported on nodes with a custom layered image. As a result, the following configuration changes cause a node reboot:
-
Modifying the configuration files in the /var
or /etc
directory
-
Adding or modifying a systemd service
-
Changing SSH keys
-
Removing mirroring rules from ICSP
, ITMS
, and IDMS
objects
-
Changing the trusted CA, by updating the user-ca-bundle
config map in the openshift-config
namespace
-
The images used in creating custom layered images take up space in your push registry. Always be aware of the free space in your registry and prune the images as needed.
Prerequisites
-
You have enabled the TechPreviewNoUpgrade
feature set by using the feature gates. For more information, see "Enabling features using feature gates".
-
You have a copy of the pull secret in the openshift-machine-config-operator
namespace that the MCO needs to pull the base operating system image.
-
You have the push secret of the registry that the MCO needs to push the new custom layered image to.
-
You have a pull secret that your nodes need to pull the new custom layered image from your registry. This should be a different secret than the one used to push the image to the repository.
-
You are familiar with how to configure a Containerfile. Instructions on how to create a Containerfile are beyond the scope of this documentation.
-
Optional: You have a separate machine config pool for the nodes where you want to apply the custom layered image.
Procedure
-
Create a MachineOSConfig
object:
-
Create a YAML file similar to the following:
apiVersion: machineconfiguration.openshift.io/v1alpha1 (1)
kind: MachineOSConfig
metadata:
  name: layered (2)
spec:
  machineConfigPool:
    name: <mcp_name> (3)
  buildInputs:
    containerFile: (4)
    - containerfileArch: noarch
      content: |-
        FROM configs AS final
        RUN rpm-ostree install tree && \
            ostree container commit
    imageBuilder: (5)
      imageBuilderType: PodImageBuilder
    baseImagePullSecret: (6)
      name: global-pull-secret-copy
    renderedImagePushspec: image-registry.openshift-image-registry.svc:5000/openshift/os-image:latest (7)
    renderedImagePushSecret: (8)
      name: builder-dockercfg-7lzwl
  buildOutputs: (9)
    currentImagePullSecret:
      name: builder-dockercfg-7lzwl
1 |
Specifies the machineconfiguration.openshift.io/v1alpha1 API that is required for MachineOSConfig CRs. |
2 |
Specifies a name for the MachineOSConfig object. This name is used with other on-cluster layering resources. The examples in this documentation use the name layered . |
3 |
Specifies the name of the machine config pool associated with the nodes where you want to deploy the custom layered image. |
4 |
Specifies the Containerfile to configure the custom layered image. |
5 |
Specifies the name of the image builder to use. This must be PodImageBuilder . |
6 |
Specifies the name of the pull secret that the MCO needs in order to pull the base operating system image from the registry. |
7 |
Specifies the image registry to push the newly-built custom layered image to. This can be any registry that your cluster has access to. This example uses the internal OpenShift Container Platform registry. |
8 |
Specifies the name of the push secret that the MCO needs in order to push the newly-built custom layered image to that registry. |
9 |
Specifies the secret required by the image registry that the nodes need in order to pull the newly-built custom layered image. This should be a different secret than the one used to push the image to your repository. |
-
Create the MachineOSConfig
object:
$ oc create -f <file_name>.yaml
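Because each machine config pool needs its own MachineOSConfig CR, it can be convenient to generate the manifest from a few parameters instead of copying the YAML file. The following is a minimal sketch under that assumption. The function name render_machineosconfig and its argument order are hypothetical, and the Containerfile content is the same example used in this procedure.

```shell
# Print a MachineOSConfig manifest to stdout.
# $1 = CR name, $2 = machine config pool name, $3 = image push spec,
# $4 = base image pull secret, $5 = push secret, $6 = pull secret for the nodes
render_machineosconfig() {
  cat <<EOF
apiVersion: machineconfiguration.openshift.io/v1alpha1
kind: MachineOSConfig
metadata:
  name: $1
spec:
  machineConfigPool:
    name: $2
  buildInputs:
    containerFile:
    - containerfileArch: noarch
      content: |-
        FROM configs AS final
        RUN rpm-ostree install tree && \\
            ostree container commit
    imageBuilder:
      imageBuilderType: PodImageBuilder
    baseImagePullSecret:
      name: $4
    renderedImagePushspec: $3
    renderedImagePushSecret:
      name: $5
  buildOutputs:
    currentImagePullSecret:
      name: $6
EOF
}
```

You could review the rendered output first and then pipe it to oc create -f - to create the object.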
-
If necessary, when the MachineOSBuild
object has been created and is in the READY
state, modify the node spec for the nodes where you want to use the new custom layered image:
-
Check that the MachineOSBuild
object is READY
. When the SUCCEEDED
value is True
, the build is complete.
Example output showing that the MachineOSBuild
object is ready
NAME PREPARED BUILDING SUCCEEDED INTERRUPTED FAILED
layered-ad5a3cad36303c363cf458ab0524e7c0-builder False False True False False
-
Edit the nodes where you want to deploy the custom layered image by adding a label for the machine config pool you specified in the MachineOSConfig
object:
$ oc label node <node_name> 'node-role.kubernetes.io/<mcp_name>='
- node-role.kubernetes.io/<mcp_name>=
-
Specifies a node selector that identifies the nodes to deploy the custom layered image.
After you apply the label, the MCO cordons, drains, and reboots the nodes. After the reboot, the nodes use the new custom layered image.
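If you script this procedure, you need to wait until the SUCCEEDED value is True before you label the nodes. The following is a minimal sketch of a polling helper. The function names wait_until_true and build_succeeded are hypothetical, and the JSONPath expression assumes that the Succeeded condition is exposed under .status.conditions of the MachineOSBuild object.

```shell
# Retry a check command until it prints "True" or the attempts run out.
# $1 = maximum attempts, $2 = seconds between attempts, remaining args = command
wait_until_true() {
  attempts=$1
  delay=$2
  shift 2
  while [ "$attempts" -gt 0 ]; do
    if [ "$("$@" 2>/dev/null)" = "True" ]; then
      return 0
    fi
    attempts=$((attempts - 1))
    if [ "$attempts" -gt 0 ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Hypothetical check: print the status of the Succeeded condition.
# Assumption: conditions are stored under .status.conditions.
build_succeeded() {
  oc get machineosbuild "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Succeeded")].status}'
}
```

For example, wait_until_true 60 30 build_succeeded <object_name> checks every 30 seconds for up to 60 attempts and returns a nonzero status if the build never reports True.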
Verification
-
Verify that the new pods are ready by running the following command:
$ oc get pods -n openshift-machine-config-operator
Example output
NAME READY STATUS RESTARTS AGE
build-layered-ad5a3cad36303c363cf458ab0524e7c0-hxrws 2/2 Running 0 2m40s (1)
# ...
machine-os-builder-6fb66cfb99-zcpvq 1/1 Running 0 2m42s (2)
1 |
This is the build pod where the custom layered image is building, named in the build-<MachineOSConfig_CR_name>-<hash> format. |
2 |
This pod can be used for troubleshooting. |
-
Check the current status of the MachineOSBuild
object, for example by running the oc get machineosbuild command:
Example output
NAME PREPARED BUILDING SUCCEEDED INTERRUPTED FAILED
layered-ad5a3cad36303c363cf458ab0524e7c0 False True False False False (1)
1 |
The MachineOSBuild is named in the <MachineOSConfig_CR_name>-<hash> format. |
-
Verify that the MachineOSBuild
object contains a reference to the new custom layered image by running the following command:
$ oc describe machineosbuild <object_name>
Example output
Name: layered-ad5a3cad36303c363cf458ab0524e7c0
# ...
API Version: machineconfiguration.openshift.io/v1alpha1
Kind: MachineOSBuild
# ...
Spec:
Config Generation: 1
Desired Config:
Name: rendered-layered-ad5a3cad36303c363cf458ab0524e7c0
Machine OS Config:
Name: layered-alpha1
Rendered Image Pushspec: image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/os-images:layered-ad5a3cad36303c363cf458ab0524e7c0
# ...
Last Transition Time: 2025-02-12T19:21:28Z
Message: Build Ready
Reason: Ready
Status: True
Type: Succeeded
Final Image Pullspec: image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/os-images@sha256:312e48825e074b01a913deedd6de68abd44894ede50b2d14f99d722f13cda04b (1)
1 |
Digested image pull spec for the new custom layered image. |
-
Verify that the appropriate nodes are using the new custom layered image:
-
Start a debug session as root for a control plane node:
$ oc debug node/<node_name>
-
Set /host
as the root directory within the debug shell:
sh-5.1# chroot /host
-
Run the rpm-ostree status
command to verify that the custom layered image is in use:
sh-5.1# rpm-ostree status
Example output
# ...
Deployments:
* ostree-unverified-registry:image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/os-images@sha256:312e48825e074b01a913deedd6de68abd44894ede50b2d14f99d722f13cda04b
Digest: sha256:312e48825e074b01a913deedd6de68abd44894ede50b2d14f99d722f13cda04b (1)
Version: 418.94.202502100215-0 (2025-02-12T19:20:44Z)
1 |
Digested image pull spec for the new custom layered image. |
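The last two verification steps compare the same digest in two places: the Final Image Pullspec of the MachineOSBuild object and the Digest of the booted deployment on the node. The following is a minimal sketch of that comparison that works on saved command output. The function names are hypothetical, and the parsing assumes the output layouts shown in the examples above.

```shell
# Read `oc describe machineosbuild` output on stdin and print the digested pull spec.
final_image_pullspec() {
  awk '/Final Image Pullspec:/ { print $4; exit }'
}

# Read `rpm-ostree status` output on stdin and print the digest of the
# booted deployment, which is the entry marked with "*" (or "●").
booted_digest() {
  awk '
    $1 == "*" || $1 == "●" { booted = 1; next }
    booted && $1 == "Digest:" { print $2; exit }
  '
}

# Succeed when the digest part of a pull spec ($1) equals a digest ($2).
digests_match() {
  [ "${1##*@}" = "$2" ]
}
```

A possible workflow is to save the output of both commands to files, extract the two values with these helpers, and pass them to digests_match.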
Modifying a custom layered image
You can modify an on-cluster custom layered image, as needed. This allows you to install additional packages, remove existing packages, change the pull or push repositories, update secrets, or make other similar changes. You can edit the MachineOSConfig
object, apply changes to the YAML file that created the MachineOSConfig
object, or create a new YAML file for that purpose.
If you modify and apply the MachineOSConfig
object YAML or create a new YAML file, the YAML overwrites any changes you made directly to the MachineOSConfig
object itself.
Making certain changes to a MachineOSConfig
object triggers an automatic rebuild of the associated custom layered image. You can mitigate the effects of the rebuild by pausing the machine config pool where the custom layered image is applied as described in "Pausing the machine config pools." For example, if you want to remove and replace a MachineOSConfig
object, pausing the machine config pools before making the change prevents the MCO from reverting the associated nodes to the base image, reducing the number of reboots needed.
When a machine config pool is paused, the oc get machineconfigpools
command reports the following status:
Example output
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
layered rendered-layered-221507009cbcdec0eec8ab3ccd789d18 False False False 1 0 0 0 3h23m (1)
master rendered-master-a0b404d061a6183cc36d302363422aba True False False 3 3 3 0 4h14m
worker rendered-worker-221507009cbcdec0eec8ab3ccd789d18 True False False 2 2 2 0 4h14m
1 |
The layered machine config pool is paused, as indicated by the three False statuses and the READYMACHINECOUNT at 0 . |
After the changes have been rolled out, you can unpause the machine config pool.
Verification
-
Verify that the new MachineOSBuild
object was created, for example by running the oc get machineosbuild command:
Example output
NAME PREPARED BUILDING SUCCEEDED INTERRUPTED FAILED
layered-a5457b883f5239cdcb71b57e1a30b6ef False False True False False
layered-f91f0f5593dd337d89bf4d38c877590b False True False False False (1)
1 |
The value True in the BUILDING column indicates that the MachineOSBuild object is building. When the SUCCEEDED column reports True , the build is complete. |
-
You can watch as the new machine config is rolled out to the nodes by using the following command:
$ oc get machineconfigpools
Example output
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
layered rendered-layered-221507009cbcdec0eec8ab3ccd789d18 False True False 1 0 0 0 167m (1)
master rendered-master-a0b404d061a6183cc36d302363422aba True False False 3 3 3 0 3h38m
worker rendered-worker-221507009cbcdec0eec8ab3ccd789d18 True False False 2 2 2 0 3h38m
1 |
The value False in the UPDATED column indicates that the rollout is not complete, for example because the MachineOSBuild object is still building. When the UPDATED column reports True , the new custom layered image has rolled out to the nodes. |
-
When the node is back in the Ready
state, check that the changes were applied:
-
Open an oc debug
session to the node by running the following command:
$ oc debug node/<node_name>
-
Set /host
as the root directory within the debug shell by running the following command:
sh-5.1# chroot /host
-
Use an appropriate command to verify that the change was applied. The following example shows that the rngd
daemon was installed:
sh-5.1# rpm -qa |grep rng-tools
Example output
rng-tools-6.17-3.fc41.x86_64
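To follow the rollout from a script rather than by reading the table, you can classify a pool from the UPDATED, UPDATING, and DEGRADED columns of the machine config pool output. The following is a minimal sketch that assumes the column order shown in the examples in this documentation. The function name pool_rollout_state and the state labels are hypothetical.

```shell
# Read `oc get machineconfigpools` output on stdin and classify one pool.
# $1 = pool name. Assumed columns: NAME CONFIG UPDATED UPDATING DEGRADED ...
# Returns a nonzero status if the pool is not listed.
pool_rollout_state() {
  awk -v pool="$1" '
    $1 == pool {
      if ($5 == "True")      print "degraded"
      else if ($3 == "True") print "updated"
      else if ($4 == "True") print "updating"
      else                   print "paused-or-pending"
      found = 1
    }
    END { if (!found) exit 1 }
  '
}
```

For example, piping the output of oc get machineconfigpools into pool_rollout_state layered prints updating while the new image rolls out and updated when the rollout is complete.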
Reverting an on-cluster custom layered image
You can remove an on-cluster custom layered image from nodes, reverting them to the base image, by removing the label for the machine config pool (MCP) that you specified in the MachineOSConfig
object. After you remove the label, the Machine Config Operator (MCO) reboots the nodes in that MCP with the cluster base Red Hat Enterprise Linux CoreOS (RHCOS) image, along with any previously-made machine config changes, overriding the custom layered image.
|
If the node where the custom layered image is deployed uses a custom machine config pool, before you remove the label, make sure the node is associated with a second MCP.
|
You can reapply the custom layered image to the node by using the oc label node/<node_name> 'node-role.kubernetes.io/<mcp_name>='
command.
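Removing and reapplying the label differ only in the last character of the argument that is passed to oc label: a trailing = sets the label, and a trailing - removes it. The following is a minimal sketch of a helper that builds either argument. The function name pool_label_arg is hypothetical.

```shell
# Print the `oc label node` argument that adds or removes the pool role label.
# $1 = add|remove, $2 = machine config pool name
pool_label_arg() {
  case "$1" in
    add)    printf 'node-role.kubernetes.io/%s=\n' "$2" ;;
    remove) printf 'node-role.kubernetes.io/%s-\n' "$2" ;;
    *) echo "usage: pool_label_arg add|remove <mcp_name>" >&2; return 1 ;;
  esac
}
```

For example, oc label node <node_name> "$(pool_label_arg remove layered)" removes the label for a pool named layered, and the same command with add reapplies it.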
Verification
You can verify that the custom layered image is removed by performing the following checks:
-
Check that the worker machine config pool is updating with the previous machine config, for example by running the oc get machineconfigpools command:
Sample output
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
layered rendered-layered-bde4e4206442c0a48b1a1fb35ba56e85 True False False 0 0 0 0 4h46m
master rendered-master-8332482204e0b76002f15ecad15b6c2d True False False 3 3 3 0 5h26m
worker rendered-worker-bde4e4206442c0a48b1a1fb35ba56e85 False True False 3 2 2 0 5h26m (1)
1 |
The value False in the UPDATED column indicates that the worker machine config pool is still updating. When the UPDATED column reports True , the base image has rolled out to the nodes. |
-
Check the nodes, for example by running the oc get nodes command, to see that scheduling on the nodes is disabled. This indicates that the change is being applied:
Example output
NAME STATUS ROLES AGE VERSION
ip-10-0-148-79.us-west-1.compute.internal Ready worker 32m v1.31.3
ip-10-0-155-125.us-west-1.compute.internal Ready,SchedulingDisabled worker 35m v1.31.3
ip-10-0-170-47.us-west-1.compute.internal Ready control-plane,master 42m v1.31.3
ip-10-0-174-77.us-west-1.compute.internal Ready control-plane,master 42m v1.31.3
ip-10-0-211-49.us-west-1.compute.internal Ready control-plane,master 42m v1.31.3
ip-10-0-218-151.us-west-1.compute.internal Ready worker 31m v1.31.3
-
When the node is back in the Ready
state, check that the node is using the base image:
-
Open an oc debug
session to the node. For example:
$ oc debug node/<node_name>
-
Set /host
as the root directory within the debug shell:
sh-5.1# chroot /host
-
Run the rpm-ostree status
command to verify that the base image is in use:
sh-5.1# rpm-ostree status
Example output
State: idle
Deployments:
* ostree-unverified-image:containers-storage:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:76721c875a2b79688be46b1dca654c2c6619a6be28b29a2822cd86c3f9d8e3c1
Digest: sha256:76721c875a2b79688be46b1dca654c2c6619a6be28b29a2822cd86c3f9d8e3c1
Version: 418.94.202501300706-0 (2025-01-30T07:10:58Z)
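While the base image rolls out, the nodes that are currently being updated are the ones that report SchedulingDisabled in the node list. The following is a minimal sketch that extracts those node names from the node table shown in the verification steps. The function name cordoned_nodes is hypothetical, and the parsing assumes that the status is in the second column.

```shell
# Read the node table on stdin and print the names of the nodes
# whose STATUS column contains SchedulingDisabled.
cordoned_nodes() {
  awk 'NR > 1 && $2 ~ /SchedulingDisabled/ { print $1 }'
}
```

When piping the output of oc get nodes into cordoned_nodes prints nothing, no node is cordoned, and you can continue with the rpm-ostree status check on each node.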