Product of (Maximum number of nodes that can drain in parallel) and (Highest total VM memory request allocations across nodes)
Live migration is the process of moving a running virtual machine (VM) to another node in the cluster without interrupting the virtual workload. Live migration enables smooth transitions during cluster upgrades or any time a node needs to be drained for maintenance or configuration changes.
By default, live migration traffic is encrypted using Transport Layer Security (TLS).
Live migration has the following requirements:
The cluster must have shared storage with ReadWriteMany
(RWX) access mode.
The cluster must have sufficient RAM and network bandwidth.
You must ensure that there is enough memory request capacity in the cluster to support node drains that result in live migrations. You can determine the approximate required spare memory by using the following calculation: Product of (Maximum number of nodes that can drain in parallel) and (Highest total VM memory request allocations across nodes) The default number of migrations that can run in parallel in the cluster is 5. |
If a VM uses a host model CPU, the nodes must support the CPU.
Configuring a dedicated Multus network for live migration is highly recommended. A dedicated network minimizes the effects of network saturation on tenant workloads during migration.
You can adjust your cluster-wide live migration settings based on the type of workload and migration scenario. This enables you to control how many VMs migrate at the same time, the network bandwidth you want to use for each migration, and how long OpenShift Virtualization attempts to complete the migration before canceling the process. Configure these settings in the HyperConverged
custom resource (CR).
If you are migrating multiple VMs per node at the same time, set a bandwidthPerMigration
limit to prevent a large or busy VM from using a large portion of the node’s network bandwidth. By default, the bandwidthPerMigration
value is 0
, which means unlimited.
A large VM running a heavy workload (for example, database processing), with higher memory dirty rates, requires a higher bandwidth to complete the migration.
Post copy mode, when enabled, triggers if the initial pre-copy phase does not complete within the defined timeout. During post copy, the VM CPUs pause on the source host while transferring the minimum required memory pages. Then the VM CPUs activate on the destination host, and the remaining memory pages transfer into the destination node at runtime. This can impact performance during the transfer. Post copy mode should not be used for critical data, or with unstable networks. |
You can perform the following live migration tasks:
Monitor the progress of all live migrations in the Migration tab of the OpenShift Virtualization web console.
View VM migration metrics in the Metrics tab of the web console.