MachineHealthChecks automatically repair unhealthy machines in a particular machine pool.
To monitor machine health, you create a MachineHealthCheck resource that defines
the configuration for a controller. You set a condition to check for, such as
staying in the NotReady status for 15 minutes or reporting a permanent condition
through the node-problem-detector, and a label that selects the set of machines to monitor.
You cannot apply a MachineHealthCheck to a machine with the master role.
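
The condition-and-duration check described above can be sketched roughly as follows. This is an illustration only, not the controller's actual code: the names UnhealthyCondition, NodeCondition, and is_unhealthy are hypothetical, and the NotReady status is modeled here, as an assumption, as a "Ready" condition whose status is "False".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class UnhealthyCondition:
    """Rule to check for (hypothetical model of one health check condition)."""
    type: str            # condition type, e.g. "Ready"
    status: str          # status that counts as unhealthy, e.g. "False"
    timeout: timedelta   # how long the status must persist before the check fails


@dataclass
class NodeCondition:
    """Condition as observed on a node (hypothetical model)."""
    type: str
    status: str
    last_transition: datetime  # when the condition entered this status


def is_unhealthy(observed, rules, now):
    """Return True if any observed condition has matched a rule for at least its timeout."""
    for rule in rules:
        for cond in observed:
            same_state = cond.type == rule.type and cond.status == rule.status
            if same_state and now - cond.last_transition >= rule.timeout:
                return True
    return False


# Assumption: "NotReady for 15 minutes" is expressed as Ready == "False" for 15 minutes.
rules = [UnhealthyCondition("Ready", "False", timedelta(minutes=15))]
now = datetime(2024, 1, 1, 12, 0)
observed = [NodeCondition("Ready", "False", now - timedelta(minutes=20))]
print(is_unhealthy(observed, rules, now))
```

A node that has been in the matching status for less than the timeout does not fail the check, which avoids replacing machines over short-lived problems.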
The controller that observes a MachineHealthCheck resource checks for the status
that you defined. If a machine fails the health check, it is automatically deleted
and a new one is created to take its place. When a machine is deleted, you see a
machine deleted event. To limit the disruptive impact of the machine
deletion, the controller drains and deletes only one node at a time. If there
are more unhealthy machines than the
maxUnhealthy threshold allows for in the
targeted pool of machines, remediation stops so that manual intervention can take
place. To stop the check, you remove the resource.
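
The two remediation limits described above (one node at a time, and no remediation once the maxUnhealthy threshold is exceeded) can be sketched as a single decision function. This is a simplified model under stated assumptions, not the controller's implementation: the function name machines_to_remediate is hypothetical, and max_unhealthy is treated as a plain integer count.

```python
def machines_to_remediate(unhealthy, max_unhealthy):
    """Return the machines to remediate in this pass (at most one).

    unhealthy: list of names of machines that failed the health check.
    max_unhealthy: assumed here to be an integer count of tolerated failures.
    """
    if len(unhealthy) > max_unhealthy:
        # More failures than the threshold allows: stop remediation
        # so that manual intervention can take place.
        return []
    # Otherwise drain and delete only one node at a time.
    return unhealthy[:1]


print(machines_to_remediate(["worker-a", "worker-b"], 1))
print(machines_to_remediate(["worker-a", "worker-b"], 2))
```

Stopping entirely when the threshold is exceeded, rather than remediating up to the limit, reflects the idea that widespread failure usually points to a cause that replacing machines will not fix.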