×

Release notes for Red Hat Windows Machine Config Operator 10.15.2

This release of the WMCO provides new features and bug fixes for running Windows compute nodes in an OpenShift Container Platform cluster. The components of the WMCO 10.15.2 were released in RHBA-2024:2704.

Bug fixes

  • Previously, on Azure clusters the WMCO would check if an external Cloud Controller Manager (CCM) was being used on the cluster. CCM use is the default. If a CCM is being used, the Operator would adjust configuration logic accordingly. Because the status condition that the WMCO used to check for the CCM was removed, the WMCO proceeded as if a CCM was not in use. This fix removes the check. As a result, the WMCO always configures the required logic on Azure clusters. (OCPBUGS-31704)

  • Previously, the kubelet was unable to authenticate with private Elastic Container Registries (ECR) registries. Because of this error, the kubelet was not able to pull images from these registries. With this fix, the kubelet is able to pull images from these registries as expected. (OCPBUGS-26602)

  • Previously, the WMCO was logging error messages when any commands being run through an SSH connection to a Windows instance failed. This was incorrect behavior because some commands are expected to fail. For example, when WMCO reboots a node the Operator runs PowerShell commands on the instance until they fail, meaning the SSH connection rebooted as expected. With this fix, only actualy errors are now logged. (OCPBUGS-20255)

Release notes for Red Hat Windows Machine Config Operator 10.15.1

This release of the WMCO provides new features and bug fixes for running Windows compute nodes in an OpenShift Container Platform cluster. The components of the WMCO 10.15.1 were released in RHBA-2024:1191.

Due to an internal issue, the planned WMCO 10.15.0 could not be released. Multiple bug and security fixes, described below, were included in WMCO 10.15.0. These fixes are included in WMCO 10.15.1. For specific details about these bug and security fixes, see the RHSA-2024:0954 errata.

New features and improvements

CPU and memory usage metrics are now available

CPU and memory usage metrics for Windows pods are now available in Prometheus. The metrics are shown in the OpenShift Container Platform web console on the Metrics tab for each Windows pod and can be queried by users.

Operator SDK upgrade

The WMCO now uses the Operator SDK version 1.32.0.

Kubernetes upgrade

The WMCO now uses Kubernetes 1.28.

Bug fixes

  • Previously, there was a flaw in the handling of multiplexed streams in the HTTP/2 protocol, which is utilized by the WMCO. A client could repeatedly make a request for a new multiplex stream and then immediately send an RST_STREAM frame to cancel those requests. This activity created additional work for the server by setting up and dismantling streams, but avoided any server-side limitations on the maximum number of active streams per connection. As a result, a denial of service occurred due to server resource consumption. This issue has been fixed. (BZ-2243296)

  • Previously, there was a flaw in Kubernetes, where a user who can create pods and persistent volumes on Windows nodes was able to escalate to admin privileges on those nodes. Kubernetes clusters were only affected if they were using an in-tree storage plugin for Windows nodes. This issue has been fixed. (BZ-2247163)

  • Previously, there was a flaw in the SSH channel integrity. By manipulating sequence numbers during the handshake, an attacker could remove the initial messages on the secure channel without causing a MAC failure. For example, an attacker could disable the ping extension and thus disable the new countermeasure in OpenSSH 9.5 against keystroke timing attacks. This issue has been fixed. (BZ-2254210)

  • Previously, the routes from a Windows Bring-Your-Own-Host (BYOH) VM to the metadata endpoint were being added as non-persistent routes, so the routes were removed when a VM was removed (deconfigured) or re-configured. This would cause the node to fail if configured again, as the metadata endpoint was unreachable. With this fix, the WMCO runs the AWS EC2 launch v2 service after removal or re-configuration. As a result, the routes are restored so that the VM can be configured into a node, as expected. (OCPBUGS-15988)

  • Previously, the WMCO did not properly wait for Windows virtual machines (VMs) to finish rebooting. This led to occasional timing issues where the WMCO would attempt to interact with a node that was in the middle of a reboot, causing WMCO to log an error and restart node configuration. Now, the WMCO waits for the instance to completely reboot. (OCPBUGS-17217)

  • Previously, the WMCO configuration was missing the DeleteEmptyDirData: true field, which is required for draining nodes that have emptyDir volumes attached. As a consequence, customers that had nodes with emptyDir volumes would see the following error in the logs: cannot delete Pods with local storage. With this fix, the DeleteEmptyDirData: true field was added to the node drain helper struct in the WMCO. As a result, customers are able to drain nodes with emptyDir volumes attached. (OCPBUGS-27300)

  • Previously, because of a lack of synchronization between Windows machine set nodes and BYOH instances, during an update the machine set nodes and the BYOH instances could update simultaneously. This could impact running workloads. This fix introduces a locking mechanism so that machine set nodes and BYOH instances update individually. (OCPBUGS-8996)

  • Previously, because of a missing secret, the WMCO could not configure proper credentials for the WICD on Nutanix clusters. As a consequence, the WMCO could not create Windows nodes. With this fix, the WMCO creates long-lived credentials for the WICD service account. As a result, the WMCO is able to configure a Windows node on Nutanix clusters. (OCPBUGS-25350)

  • Previously, because of bad logic in the networking configuration script, the WICD was incorrectly reading carriage returns in the CNI configuration file as changes, and identified the file as modified. This caused the CNI configuration to be unnecessarily reloaded, potentially resulting in container restarts and brief network outages. With this fix, the WICD now reloads the CNI configuration only when the CNI configuration is actually modified. (OCPBUGS-25756)