
Offline migration to the OVN-Kubernetes network plugin overview

The offline migration method is a manual process that includes some downtime, during which your cluster is unreachable. This method is primarily used for self-managed OpenShift Container Platform deployments.

Although a rollback procedure is provided, the offline migration is intended to be a one-way process.

OpenShift SDN CNI is deprecated as of OpenShift Container Platform 4.14. As of OpenShift Container Platform 4.15, the network plugin is not an option for new installations. In a future release, the OpenShift SDN network plugin is planned to be removed and will no longer be supported. Red Hat will provide bug fixes and support for this feature until it is removed, but this feature will no longer receive enhancements. As an alternative to OpenShift SDN CNI, you can use OVN-Kubernetes CNI instead.

The following sections provide more information about the offline migration method.

Supported platforms when using the offline migration method

The following table provides information about the supported platforms for the offline migration type.

Table 1. Supported platforms for the offline migration method
Platform                                             Offline migration

Bare metal hardware (IPI and UPI)                    Supported

Amazon Web Services (AWS) (IPI and UPI)              Supported

Google Cloud Platform (GCP) (IPI and UPI)            Supported

IBM Cloud® (IPI and UPI)                             Supported

Microsoft Azure (IPI and UPI)                        Supported

Red Hat OpenStack Platform (RHOSP) (IPI and UPI)     Supported

VMware vSphere (IPI and UPI)                         Supported

AliCloud (IPI and UPI)                               Supported

Nutanix (IPI and UPI)                                Supported

Considerations for offline migration to the OVN-Kubernetes network plugin

If you have more than 150 nodes in your OpenShift Container Platform cluster, then open a support case for consultation on your migration to the OVN-Kubernetes network plugin.

The subnets assigned to nodes and the IP addresses assigned to individual pods are not preserved during the migration.

While the OVN-Kubernetes network plugin implements many of the capabilities present in the OpenShift SDN network plugin, the configuration is not the same.

  • If your cluster uses any of the following OpenShift SDN network plugin capabilities, you must manually configure the same capability in the OVN-Kubernetes network plugin:

    • Namespace isolation

    • Egress router pods

  • OVN-Kubernetes, the default network provider in OpenShift Container Platform 4.14 and later versions, uses the following IP address ranges internally: 100.64.0.0/16, 169.254.169.0/29, 100.88.0.0/16, fd98::/64, fd69::/125, and fd97::/64. If your cluster uses OVN-Kubernetes, do not include any of these IP address ranges in any other CIDR definitions in your cluster or infrastructure.
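    One way to review the address ranges that are currently configured for your cluster network and service network before you migrate is to inspect the Network config object. The following command is a quick check, not part of the official procedure:

    $ oc get network.config cluster -o jsonpath='{.spec.clusterNetwork}{"\n"}{.spec.serviceNetwork}{"\n"}'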

The following sections highlight the differences in how these capabilities are configured in the OVN-Kubernetes and OpenShift SDN network plugins.

Namespace isolation

OVN-Kubernetes supports only the network policy isolation mode.

If your cluster uses OpenShift SDN configured in either the multitenant or subnet isolation mode, you can still migrate to the OVN-Kubernetes network plugin. Note that after the migration, multitenant isolation mode is dropped, so you must manually configure network policies to achieve the same level of project-level isolation for pods and services.
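For example, a NetworkPolicy object similar to the following, created in each project, allows ingress traffic only from pods in the same namespace, which approximates the isolation that multitenant mode provided. This is a minimal sketch; adjust it to your requirements, for example to also allow traffic from the ingress router and monitoring components.

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-same-namespace
    spec:
      podSelector: {}
      ingress:
      - from:
        - podSelector: {}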

Egress IP addresses

OpenShift SDN supports two different Egress IP modes:

  • In the automatically assigned approach, an egress IP address range is assigned to a node.

  • In the manually assigned approach, a list of one or more egress IP addresses is assigned to a node.

The migration process supports migrating Egress IP configurations that use the automatically assigned mode.

The differences in configuring an egress IP address between OVN-Kubernetes and OpenShift SDN are described in the following table:

Table 2. Differences in egress IP address configuration
OVN-Kubernetes:

  • Create an EgressIPs object

  • Add an annotation on a Node object

OpenShift SDN:

  • Patch a NetNamespace object

  • Patch a HostSubnet object

For more information on using egress IP addresses in OVN-Kubernetes, see "Configuring an egress IP address".
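For example, in OVN-Kubernetes an EgressIP object similar to the following assigns an egress IP address to pods in namespaces that match the selector. This is a minimal sketch; the object name, IP address, and labels are placeholders.

    apiVersion: k8s.ovn.org/v1
    kind: EgressIP
    metadata:
      name: egressip-prod
    spec:
      egressIPs:
      - 192.168.126.10
      namespaceSelector:
        matchLabels:
          env: prod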

Egress network policies

The difference in configuring an egress network policy, also known as an egress firewall, between OVN-Kubernetes and OpenShift SDN is described in the following table:

Table 3. Differences in egress network policy configuration
OVN-Kubernetes:

  • Create an EgressFirewall object in a namespace

OpenShift SDN:

  • Create an EgressNetworkPolicy object in a namespace

Because the name of an EgressFirewall object can only be set to default, after the migration all migrated EgressNetworkPolicy objects are named default, regardless of what the name was under OpenShift SDN.

If you subsequently roll back to OpenShift SDN, all EgressNetworkPolicy objects are named default because the prior name is lost.

For more information on using an egress firewall in OVN-Kubernetes, see "Configuring an egress firewall for a project".
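For example, an OVN-Kubernetes EgressFirewall object similar to the following allows egress traffic from a namespace to one CIDR block and denies all other external traffic. This is a minimal sketch; the CIDR values are placeholders.

    apiVersion: k8s.ovn.org/v1
    kind: EgressFirewall
    metadata:
      name: default
    spec:
      egress:
      - type: Allow
        to:
          cidrSelector: 192.168.0.0/16
      - type: Deny
        to:
          cidrSelector: 0.0.0.0/0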

Egress router pods

OVN-Kubernetes supports egress router pods in redirect mode. OVN-Kubernetes does not support egress router pods in HTTP proxy mode or DNS proxy mode.

When you deploy an egress router with the Cluster Network Operator, you cannot specify a node selector to control which node is used to host the egress router pod.

Multicast

The difference between enabling multicast traffic on OVN-Kubernetes and OpenShift SDN is described in the following table:

Table 4. Differences in multicast configuration
OVN-Kubernetes:

  • Add an annotation on a Namespace object

OpenShift SDN:

  • Add an annotation on a NetNamespace object

For more information on using multicast in OVN-Kubernetes, see "Enabling multicast for a project".
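For example, in OVN-Kubernetes you can enable multicast for a project by annotating its Namespace object. The command is shown as an illustration; replace <namespace> with your project name.

    $ oc annotate namespace <namespace> k8s.ovn.org/multicast-enabled=true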

Network policies

OVN-Kubernetes fully supports the Kubernetes NetworkPolicy API in the networking.k8s.io/v1 API group. No changes are necessary in your network policies when migrating from OpenShift SDN.

How the offline migration process works

The following table summarizes the migration process, separating the user-initiated steps from the actions that the migration performs in response.

Table 5. Offline migration to OVN-Kubernetes from OpenShift SDN
  1. User-initiated step: Set the migration field of the Network.operator.openshift.io custom resource (CR) named cluster to OVNKubernetes. Make sure the migration field is null before setting it to a value.

    Migration activity:

    • The Cluster Network Operator (CNO) updates the status of the Network.config.openshift.io CR named cluster accordingly.

    • The Machine Config Operator (MCO) rolls out an update to the systemd configuration necessary for OVN-Kubernetes. The MCO updates a single machine per pool at a time by default, so the total time that the migration takes increases with the size of the cluster.

  2. User-initiated step: Update the networkType field of the Network.config.openshift.io CR.

    Migration activity: The CNO performs the following actions:

    • Destroys the OpenShift SDN control plane pods.

    • Deploys the OVN-Kubernetes control plane pods.

    • Updates the Multus daemon sets and config map objects to reflect the new network plugin.

  3. User-initiated step: Reboot each node in the cluster.

    Migration activity: As nodes reboot, the cluster assigns IP addresses to pods on the OVN-Kubernetes cluster network.

Migrating to the OVN-Kubernetes network plugin by using the offline migration method

As a cluster administrator, you can change the network plugin for your cluster to OVN-Kubernetes. During the migration, you must reboot every node in your cluster.

While performing the migration, your cluster is unavailable and workloads might be interrupted. Perform the migration only when an interruption in service is acceptable.

Prerequisites
  • Your cluster is configured with the OpenShift SDN CNI network plugin in the network policy isolation mode.

  • You have installed the OpenShift CLI (oc).

  • You have access to the cluster as a user with the cluster-admin role.

  • A recent backup of the etcd database is available.

  • A reboot can be triggered manually for each node.

  • The cluster is in a known good state, without any errors.

  • A security group rule is in place that allows UDP packets on port 6081 for all nodes on all cloud platforms.
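    How you open UDP port 6081 depends on your platform. As an illustration only, on Amazon Web Services (AWS) you might add a rule similar to the following to the security group that is attached to your cluster nodes; the group ID is a placeholder:

      $ aws ec2 authorize-security-group-ingress \
          --group-id <security_group_id> \
          --protocol udp \
          --port 6081 \
          --source-group <security_group_id>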

Procedure
  1. To back up the configuration for the cluster network, enter the following command:

    $ oc get Network.config.openshift.io cluster -o yaml > cluster-openshift-sdn.yaml
  2. To prepare all the nodes for the migration, set the migration field on the Cluster Network Operator configuration object by entering the following command:

    $ oc patch Network.operator.openshift.io cluster --type='merge' \
      --patch '{ "spec": { "migration": { "networkType": "OVNKubernetes" } } }'

    This step does not deploy OVN-Kubernetes immediately. Instead, specifying the migration field triggers the Machine Config Operator (MCO) to apply new machine configs to all the nodes in the cluster in preparation for the OVN-Kubernetes deployment.

  3. Optional: You can disable automatic migration of several OpenShift SDN capabilities to the OVN-Kubernetes equivalents:

    • Egress IPs

    • Egress firewall

    • Multicast

    To disable automatic migration of the configuration for any of the previously noted OpenShift SDN features, specify the following keys:

    $ oc patch Network.operator.openshift.io cluster --type='merge' \
      --patch '{
        "spec": {
          "migration": {
            "networkType": "OVNKubernetes",
            "features": {
              "egressIP": <bool>,
              "egressFirewall": <bool>,
              "multicast": <bool>
            }
          }
        }
      }'

    where:

    bool: Specifies whether to enable migration of the feature. The default is true.
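    For example, the following patch keeps the automatic migration of egress firewall and multicast configurations but skips egress IP configurations. The values shown are illustrative; adjust them to your needs.

    $ oc patch Network.operator.openshift.io cluster --type='merge' \
      --patch '{
        "spec": {
          "migration": {
            "networkType": "OVNKubernetes",
            "features": {
              "egressIP": false,
              "egressFirewall": true,
              "multicast": true
            }
          }
        }
      }'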

  4. Optional: You can customize the following settings for OVN-Kubernetes to meet your network infrastructure requirements:

    • Maximum transmission unit (MTU). Consider the following before customizing the MTU for this optional step:

      • If you use the default MTU, and you want to keep the default MTU during migration, you can skip this step.

      • If you used a custom MTU, and you want to keep the custom MTU during migration, you must declare the custom MTU value in this step.

      • This step does not work if you want to change the MTU value during migration. Instead, you must first follow the instructions for "Changing the cluster MTU". You can then keep the custom MTU value by performing this procedure and declaring the custom MTU value in this step.

        OpenShift SDN and OVN-Kubernetes have different overlay overheads. Select MTU values by following the guidelines on the "MTU value selection" page.

    • Geneve (Generic Network Virtualization Encapsulation) overlay network port

    • OVN-Kubernetes IPv4 internal subnet

    • OVN-Kubernetes IPv6 internal subnet

    To customize any of the previously noted settings, enter and customize the following command. If you do not need to change the default value, omit the key from the patch.

    $ oc patch Network.operator.openshift.io cluster --type=merge \
      --patch '{
        "spec":{
          "defaultNetwork":{
            "ovnKubernetesConfig":{
              "mtu":<mtu>,
              "genevePort":<port>,
              "v4InternalSubnet":"<ipv4_subnet>",
              "v6InternalSubnet":"<ipv6_subnet>"
        }}}}'

    where:

    mtu

    The MTU for the Geneve overlay network. This value is normally configured automatically, but if the nodes in your cluster do not all use the same MTU, then you must set this explicitly to 100 less than the smallest node MTU value.

    port

    The UDP port for the Geneve overlay network. If a value is not specified, the default is 6081. The port cannot be the same as the VXLAN port that is used by OpenShift SDN. The default value for the VXLAN port is 4789.

    ipv4_subnet

    An IPv4 address range for internal use by OVN-Kubernetes. You must ensure that the IP address range does not overlap with any other subnet used by your OpenShift Container Platform installation. The IP address range must be larger than the maximum number of nodes that can be added to the cluster. The default value is 100.64.0.0/16.

    ipv6_subnet

    An IPv6 address range for internal use by OVN-Kubernetes. You must ensure that the IP address range does not overlap with any other subnet used by your OpenShift Container Platform installation. The IP address range must be larger than the maximum number of nodes that can be added to the cluster. The default value is fd98::/48.

    Example patch command to update mtu field
    $ oc patch Network.operator.openshift.io cluster --type=merge \
      --patch '{
        "spec":{
          "defaultNetwork":{
            "ovnKubernetesConfig":{
              "mtu":1200
        }}}}'
  5. As the MCO updates machines in each machine config pool, it reboots each node one by one. You must wait until all the nodes are updated. Check the machine config pool status by entering the following command:

    $ oc get mcp

    A successfully updated node has the following status: UPDATED=true, UPDATING=false, DEGRADED=false.
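    Example output (illustrative; the pool names, configuration names, counts, and ages vary by cluster):

    NAME     CONFIG                                               UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
    master   rendered-master-1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d     True      False      False      3              3                   3                     0                      5h
    worker   rendered-worker-1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d     True      False      False      3              3                   3                     0                      5h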

    By default, the MCO updates one machine per pool at a time, causing the total time the migration takes to increase with the size of the cluster.

  6. Confirm the status of the new machine configuration on the hosts:

    1. To list the machine configuration state and the name of the applied machine configuration, enter the following command:

      $ oc describe node | egrep "hostname|machineconfig"
      Example output
      kubernetes.io/hostname=master-0
      machineconfiguration.openshift.io/currentConfig: rendered-master-c53e221d9d24e1c8bb6ee89dd3d8ad7b
      machineconfiguration.openshift.io/desiredConfig: rendered-master-c53e221d9d24e1c8bb6ee89dd3d8ad7b
      machineconfiguration.openshift.io/reason:
      machineconfiguration.openshift.io/state: Done

      Verify that the following statements are true:

      • The value of the machineconfiguration.openshift.io/state field is Done.

      • The value of the machineconfiguration.openshift.io/currentConfig field is equal to the value of the machineconfiguration.openshift.io/desiredConfig field.

    2. To confirm that the machine config is correct, enter the following command:

      $ oc get machineconfig <config_name> -o yaml | grep ExecStart

      where <config_name> is the name of the machine config from the machineconfiguration.openshift.io/currentConfig field.

      The machine config must include the following update to the systemd configuration:

      ExecStart=/usr/local/bin/configure-ovs.sh OVNKubernetes
    3. If a node is stuck in the NotReady state, investigate the machine config daemon pod logs and resolve any errors.

      1. To list the pods, enter the following command:

        $ oc get pod -n openshift-machine-config-operator
        Example output
        NAME                                         READY   STATUS    RESTARTS   AGE
        machine-config-controller-75f756f89d-sjp8b   1/1     Running   0          37m
        machine-config-daemon-5cf4b                  2/2     Running   0          43h
        machine-config-daemon-7wzcd                  2/2     Running   0          43h
        machine-config-daemon-fc946                  2/2     Running   0          43h
        machine-config-daemon-g2v28                  2/2     Running   0          43h
        machine-config-daemon-gcl4f                  2/2     Running   0          43h
        machine-config-daemon-l5tnv                  2/2     Running   0          43h
        machine-config-operator-79d9c55d5-hth92      1/1     Running   0          37m
        machine-config-server-bsc8h                  1/1     Running   0          43h
        machine-config-server-hklrm                  1/1     Running   0          43h
        machine-config-server-k9rtx                  1/1     Running   0          43h

        The names of the config daemon pods are in the following format: machine-config-daemon-<seq>. The <seq> value is a random five-character alphanumeric sequence.

      2. Display the pod log for the first machine config daemon pod shown in the previous output by entering the following command:

        $ oc logs <pod> -n openshift-machine-config-operator

        where <pod> is the name of a machine config daemon pod.

      3. Resolve any errors in the logs shown by the output from the previous command.

  7. To start the migration, configure the OVN-Kubernetes network plugin by using one of the following commands:

    • To specify the network provider without changing the cluster network IP address block, enter the following command:

      $ oc patch Network.config.openshift.io cluster \
        --type='merge' --patch '{ "spec": { "networkType": "OVNKubernetes" } }'
    • To specify a different cluster network IP address block, enter the following command:

      $ oc patch Network.config.openshift.io cluster \
        --type='merge' --patch '{
          "spec": {
            "clusterNetwork": [
              {
                "cidr": "<cidr>",
                "hostPrefix": <prefix>
              }
            ],
            "networkType": "OVNKubernetes"
          }
        }'

      where <cidr> is a CIDR block and <prefix> is the subnet prefix length apportioned to each node in your cluster. You cannot use any CIDR block that overlaps with the 100.64.0.0/16 CIDR block because the OVN-Kubernetes network provider uses this block internally.

      You cannot change the service network address block during the migration.
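      Example patch command that sets an illustrative cluster network of 10.128.0.0/14 with a host prefix of 23. These are commonly used default values; substitute values that are appropriate for your environment.

      $ oc patch Network.config.openshift.io cluster \
        --type='merge' --patch '{
          "spec": {
            "clusterNetwork": [
              {
                "cidr": "10.128.0.0/14",
                "hostPrefix": 23
              }
            ],
            "networkType": "OVNKubernetes"
          }
        }'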

  8. Verify that the Multus daemon set rollout is complete before continuing with subsequent steps:

    $ oc -n openshift-multus rollout status daemonset/multus

    The names of the Multus pods are in the form multus-<xxxxx>, where <xxxxx> is a random sequence of letters. It might take several moments for the pods to restart.

    Example output
    Waiting for daemon set "multus" rollout to finish: 1 out of 6 new pods have been updated...
    ...
    Waiting for daemon set "multus" rollout to finish: 5 of 6 updated pods are available...
    daemon set "multus" successfully rolled out
  9. To complete changing the network plugin, reboot each node in your cluster. You can reboot the nodes in your cluster with either of the following approaches:

    • With the oc rsh command, you can use a bash script similar to the following:

      #!/bin/bash
      readarray -t POD_NODES <<< "$(oc get pod -n openshift-machine-config-operator -o wide| grep daemon|awk '{print $1" "$7}')"
      
      for i in "${POD_NODES[@]}"
      do
        read -r POD NODE <<< "$i"
        until oc rsh -n openshift-machine-config-operator "$POD" chroot /rootfs shutdown -r +1
          do
            echo "cannot reboot node $NODE, retry" && sleep 3
          done
      done
    • With the ssh command, you can use a bash script similar to the following. The script assumes that you have configured sudo to not prompt for a password.

      #!/bin/bash
      
      for ip in $(oc get nodes  -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}')
      do
         echo "reboot node $ip"
         ssh -o StrictHostKeyChecking=no core@$ip sudo shutdown -r -t 3
      done
  10. Confirm that the migration succeeded:

    1. To confirm that the network plugin is OVN-Kubernetes, enter the following command. The value of status.networkType must be OVNKubernetes.

      $ oc get network.config/cluster -o jsonpath='{.status.networkType}{"\n"}'
    2. To confirm that the cluster nodes are in the Ready state, enter the following command:

      $ oc get nodes
    3. To confirm that your pods are not in an error state, enter the following command:

      $ oc get pods --all-namespaces -o wide --sort-by='{.spec.nodeName}'

      If pods on a node are in an error state, reboot that node.

    4. To confirm that no cluster Operators are in an abnormal state, enter the following command:

      $ oc get co

      The status of every cluster Operator must be the following: AVAILABLE="True", PROGRESSING="False", DEGRADED="False". If a cluster Operator is not available or degraded, check the logs for the cluster Operator for more information.

  11. Complete the following steps only if the migration succeeds and your cluster is in a good state:

    1. To remove the migration configuration from the CNO configuration object, enter the following command:

      $ oc patch Network.operator.openshift.io cluster --type='merge' \
        --patch '{ "spec": { "migration": null } }'
    2. To remove custom configuration for the OpenShift SDN network provider, enter the following command:

      $ oc patch Network.operator.openshift.io cluster --type='merge' \
        --patch '{ "spec": { "defaultNetwork": { "openshiftSDNConfig": null } } }'
    3. To remove the OpenShift SDN network provider namespace, enter the following command:

      $ oc delete namespace openshift-sdn

Limited live migration to the OVN-Kubernetes network plugin overview

The limited live migration method is the process in which the OpenShift SDN network plugin and its network configurations, connections, and associated resources are migrated to the OVN-Kubernetes network plugin without service interruption. It is available for OpenShift Container Platform, Red Hat OpenShift Dedicated, Red Hat OpenShift Service on AWS, and Azure Red Hat OpenShift deployment types. It is not available for HyperShift deployment types. This migration method is valuable for deployment types that require constant service availability and offers the following benefits:

  • Continuous service availability

  • Minimized downtime

  • Automatic node rebooting

  • Seamless transition from the OpenShift SDN network plugin to the OVN-Kubernetes network plugin

Although a rollback procedure is provided, the limited live migration is intended to be a one-way process.

OpenShift SDN CNI is deprecated as of OpenShift Container Platform 4.14. As of OpenShift Container Platform 4.15, the network plugin is not an option for new installations. In a future release, the OpenShift SDN network plugin is planned to be removed and will no longer be supported. Red Hat will provide bug fixes and support for this feature until it is removed, but this feature will no longer receive enhancements. As an alternative to OpenShift SDN CNI, you can use OVN-Kubernetes CNI instead.

The following sections provide more information about the limited live migration method.

Supported platforms when using the limited live migration method

The following table provides information about the supported platforms for the limited live migration type.

Table 6. Supported platforms for the limited live migration method
Platform                                             Limited live migration

Bare metal hardware (IPI and UPI)                    Supported

Amazon Web Services (AWS) (IPI and UPI)              Supported

Google Cloud Platform (GCP) (IPI and UPI)            Supported

IBM Cloud® (IPI and UPI)                             Supported

Microsoft Azure (IPI and UPI)                        Supported

Red Hat OpenStack Platform (RHOSP) (IPI and UPI)     Supported

VMware vSphere (IPI and UPI)                         Supported

AliCloud (IPI and UPI)                               Supported

Nutanix (IPI and UPI)                                Supported

Considerations for limited live migration to the OVN-Kubernetes network plugin

Before using the limited live migration method to the OVN-Kubernetes network plugin, cluster administrators should consider the following information:

  • The limited live migration procedure is unsupported for clusters with OpenShift SDN multitenant mode enabled.

  • Egress router pods block the limited live migration process. They must be removed before beginning the limited live migration process.

  • During the limited live migration, multicast, egress IP addresses, and egress firewalls are temporarily disabled. They can be migrated from OpenShift SDN to OVN-Kubernetes after the limited live migration process has finished.

  • The migration is intended to be a one-way process. However, if you want to roll back to OpenShift SDN, the migration from OpenShift SDN to OVN-Kubernetes must have succeeded. You can then follow the same procedure to migrate from the OVN-Kubernetes network plugin back to the OpenShift SDN network plugin.

  • The limited live migration is not supported on HyperShift clusters.

  • OpenShift SDN does not support IPsec. After the migration, cluster administrators can enable IPsec.

  • OpenShift SDN does not support IPv6. After the migration, cluster administrators can enable dual-stack.

  • The cluster MTU is the MTU value for pod interfaces. It is always less than your hardware MTU to account for the cluster network overlay overhead. The overhead is 100 bytes for OVN-Kubernetes and 50 bytes for OpenShift SDN.

    During the limited live migration, both OVN-Kubernetes and OpenShift SDN run in parallel. OVN-Kubernetes manages the cluster network of some nodes, while OpenShift SDN manages the cluster network of others. To ensure that cross-CNI traffic remains functional, the Cluster Network Operator updates the routable MTU so that both CNIs share the same overlay MTU. As a result, after the migration has completed, the cluster MTU is 50 bytes less. For example, with a hardware MTU of 1500, the cluster MTU is 1450 under OpenShift SDN and 1400 under OVN-Kubernetes after the migration.

  • Some parameters of OVN-Kubernetes cannot be changed after installation. The following parameters can be set only before starting the limited live migration:

    • internalTransitSwitchSubnet

    • internalJoinSubnet

  • OVN-Kubernetes reserves the 100.64.0.0/16 and 100.88.0.0/16 IP address ranges. If OpenShift SDN has been configured to use either of these IP address ranges, you must patch them to use a different IP address range before starting the limited live migration.

    • 100.64.0.0/16. This IP address range is used for the internalJoinSubnet parameter of OVN-Kubernetes by default. If this IP address range is already in use, enter the following command to update it to a different range, for example, 100.63.0.0/16:

      $ oc patch network.operator.openshift.io cluster --type='merge' -p='{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"ipv4":{"internalJoinSubnet": "100.63.0.0/16"}}}}}'
    • 100.88.0.0/16. This IP address range is used for the internalTransitSwitchSubnet parameter of OVN-Kubernetes by default. If this IP address range is already in use, enter the following command to update it to a different range, for example, 100.99.0.0/16:

      $ oc patch network.operator.openshift.io cluster --type='merge' -p='{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"ipv4":{"internalTransitSwitchSubnet": "100.99.0.0/16"}}}}}'
  • In most cases, the limited live migration is independent of the secondary interfaces of pods created by the Multus CNI plugin. However, if these secondary interfaces were set up on the default network interface controller (NIC) of the host, for example, using MACVLAN, IPVLAN, SR-IOV, or bridge interfaces with the default NIC as the control node, OVN-Kubernetes might encounter malfunctions. Users should remove such configurations before proceeding with the limited live migration.

  • When there are multiple NICs inside of the host, and the default route is not on the interface that has the Kubernetes NodeIP, you must use the offline migration instead.

  • All DaemonSet objects in the openshift-sdn namespace, which are not managed by the Cluster Network Operator (CNO), must be removed before initiating the limited live migration. These unmanaged daemon sets can cause the migration status to remain incomplete if not properly handled.
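    One way to review the daemon sets that currently exist in the openshift-sdn namespace before you start the migration is to list them, for example with the following command. Any daemon set in the output that was created by you or by a third-party component, rather than by the CNO, must be deleted first.

    $ oc get daemonset -n openshift-sdn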

How the live migration process works

The following table summarizes the live migration process, separating the user-initiated steps from the actions that the migration script performs in response.

Table 7. Live migration to OVN-Kubernetes from OpenShift SDN
User-initiated steps Migration activity

Patch the cluster-level networking configuration by changing the networkType from OpenShiftSDN to OVNKubernetes.

Cluster Network Operator (CNO)
  • Sets migration related fields in the network.operator custom resource (CR) and waits for routable MTUs to be applied to all nodes.

  • Patches the network.operator CR to set the migration mode to Live for OVN-Kubernetes and deploys the OpenShift SDN network plugin in migration mode.

  • Deploys OVN-Kubernetes with hybrid overlay enabled, ensuring that no racing conditions occur.

  • Waits for the OVN-Kubernetes deployment and updates the conditions in the status of the network.config CR.

  • Triggers the Machine Config Operator (MCO) to apply the new machine config to each machine config pool, which includes node cordoning, draining, and rebooting.

  • OVN-Kubernetes adds nodes to the appropriate zones and recreates pods using OVN-Kubernetes as the default CNI plugin.

  • Removes migration-related fields from the network.operator CR and performs cleanup actions, such as deleting OpenShift SDN resources and redeploying OVN-Kubernetes in normal mode with the necessary configurations.

  • Waits for the OVN-Kubernetes redeployment and updates the status conditions in the network.config CR to indicate migration completion. If your migration is blocked, see "Checking limited live migration metrics" for information on troubleshooting the issue.

Migrating to the OVN-Kubernetes network plugin by using the limited live migration method

The following procedure checks for egress router resources and uses the limited live migration method to migrate from the OpenShift SDN network plugin to the OVN-Kubernetes network plugin.

Prerequisites
  • A cluster has been configured with the OpenShift SDN CNI network plugin in the network policy isolation mode.

  • You have installed the OpenShift CLI (oc).

  • You have access to the cluster as a user with the cluster-admin role.

  • You have created a recent backup of the etcd database.

  • The cluster is in a known good state without any errors.

  • Before migration to OVN-Kubernetes, a security group rule must be in place to allow UDP packets on port 6081 for all nodes on all cloud platforms.

  • Cluster administrators must remove any egress router pods before beginning the limited live migration. For more information about egress router pods, see "Deploying an egress router pod in redirect mode".

  • You have reviewed the "Considerations for limited live migration to the OVN-Kubernetes network plugin" section of this document.

Procedure
  1. Before initiating the limited live migration, you must check for any egress router pods. If there is an egress router pod on the cluster when performing a limited live migration, the Network Operator blocks the migration and returns the following error:

    The cluster configuration is invalid (network type live migration is not supported for pods with `pod.network.openshift.io/assign-macvlan` annotation. Please remove all egress router pods). Use `oc edit network.config.openshift.io cluster` to fix.
    • Enter the following command to locate egress router pods on your cluster:

      $ oc get pods --all-namespaces -o json | jq '.items[] | select(.metadata.annotations."pod.network.openshift.io/assign-macvlan" == "true") | {name: .metadata.name, namespace: .metadata.namespace}'
      Example output
      {
        "name": "egress-multi",
        "namespace": "egress-router-project"
      }
    • Alternatively, you can query metrics on the OpenShift Container Platform web console or by using the oc CLI to check for egress router pods. For more information, see "Checking limited live migration metrics".

  2. Enter the following command to remove an egress router pod:

    $ oc delete pod <egress_pod_name> -n <egress_router_project>
  3. Enter the following command to patch the cluster-level networking configuration and initiate the migration from OpenShift SDN to OVN-Kubernetes:

    $ oc patch Network.config.openshift.io cluster --type='merge' --patch '{"metadata":{"annotations":{"network.openshift.io/network-type-migration":""}},"spec":{"networkType":"OVNKubernetes"}}'

    After running these commands, the migration process begins. During this process, the Machine Config Operator reboots the nodes in your cluster twice. It is expected that the migration takes approximately twice as long as a cluster upgrade.

  4. Optional: You can enter the following commands to ensure that the migration process has completed, and to check the status of the network.config:

    $ oc get network.config.openshift.io cluster -o jsonpath='{.status.networkType}'
    $ oc get network.config cluster -o=jsonpath='{.status.conditions}' | jq .

    You can check limited live migration metrics for troubleshooting issues. For more information, see "Checking limited live migration metrics".

  5. Complete the following steps only if the migration succeeds and your cluster is in a good state:

    1. To remove the migration configuration from the Cluster Network Operator (CNO) configuration object, enter the following command:

      $ oc patch Network.operator.openshift.io cluster --type='merge' \
        --patch '{ "spec": { "migration": null } }'
    2. To remove custom configuration for the OpenShift SDN network provider, enter the following command:

      $ oc patch Network.operator.openshift.io cluster --type='merge' \
        --patch '{ "spec": { "defaultNetwork": { "openshiftSDNConfig": null } } }'
    3. To remove the OpenShift SDN network provider namespace, enter the following command:

      $ oc delete namespace openshift-sdn

Checking limited live migration metrics

Metrics are available to monitor the progress of the limited live migration. Metrics can be viewed on the OpenShift Container Platform web console, or by using the oc CLI.

Prerequisites
  • You have initiated a limited live migration to OVN-Kubernetes.

Procedure
  1. To view limited live migration metrics on the OpenShift Container Platform web console:

    1. Click Observe → Metrics.

    2. In the Expression box, type openshift_network and click the openshift_network_operator_live_migration_procedure option.

  2. To view metrics by using the oc CLI:

    1. Enter the following command to generate a token for the prometheus-k8s service account in the openshift-monitoring namespace:

      $ oc create token prometheus-k8s -n openshift-monitoring
      Example output
      eyJhbGciOiJSUzI1NiIsImtpZCI6IlZiSUtwclcwbEJ2VW9We...
    2. Enter the following command to request information about the openshift_network_operator_live_migration_condition metric:

      $ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer <eyJhbGciOiJSUzI1NiIsImtpZCI6IlZiSUtwclcwbEJ2VW9We...>" "https://<openshift_API_endpoint>" --data-urlencode "query=openshift_network_operator_live_migration_condition" | jq
      Example output
       "status": "success",
        "data": {
          "resultType": "vector",
          "result": [
            {
              "metric": {
                "__name__": "openshift_network_operator_live_migration_condition",
                "container": "network-operator",
                "endpoint": "metrics",
                "instance": "10.0.83.62:9104",
                "job": "metrics",
                "namespace": "openshift-network-operator",
                "pod": "network-operator-6c87754bc6-c8qld",
                "prometheus": "openshift-monitoring/k8s",
                "service": "metrics",
                "type": "NetworkTypeMigrationInProgress"
              },
              "value": [
                1717653579.587,
                "1"
              ]
            },
      ...

The table in "Information about limited live migration metrics" shows you the available metrics and the label values populated from the openshift_network_operator_live_migration_procedure expression. Use this information to monitor progress or to troubleshoot the migration.

Information about limited live migration metrics

The following table shows you the available metrics and the label values populated from the openshift_network_operator_live_migration_procedure expression. Use this information to monitor progress or to troubleshoot the migration.

Table 8. Limited live migration metrics
Metric Label values
openshift_network_operator_live_migration_blocked:

A Prometheus gauge vector metric that contains a constant value of 1, labeled with the reason that the CNI limited live migration is blocked from starting. This metric is available after the CNI limited live migration has been started by annotating the Network custom resource.
This metric is not published unless the limited live migration is blocked.

The list of label values includes the following:
  • UnsupportedCNI: Unable to migrate to the unsupported target CNI. Valid CNI is OVNKubernetes when migrating from OpenShift SDN.

  • UnsupportedHyperShiftCluster: Limited live migration is unsupported within an HCP cluster.

  • UnsupportedSDNNetworkIsolationMode: OpenShift SDN is configured with an unsupported network isolation mode Multitenant. Migrate to a supported network isolation mode before performing limited live migration.

  • UnsupportedMACVLANInterface: Remove the egress router or any pods that contain the pod annotation pod.network.openshift.io/assign-macvlan. Find the offending pod's namespace or pod name with the following command:

    oc get pods -Ao=jsonpath='{range .items[?(@.metadata.annotations.pod\.network\.openshift\.io/assign-macvlan=="")]}{@.metadata.namespace}{"\t"}{@.metadata.name}{"\n"}'

openshift_network_operator_live_migration_condition:

A metric which represents the status of each condition type for the CNI limited live migration. The set of status condition types is defined for network.config to support observability of the CNI limited live migration.
A 1 value represents condition status true. A 0 value represents false. -1 represents unknown. This metric is available when the CNI limited live migration has started by annotating the Network custom resource (CR).
This metric is only available when the limited live migration has been triggered by adding the relevant annotation to the Network CR cluster; otherwise, it is not published. If the following condition types are not present within the Network CR cluster, the metric and its labels are cleared.

The list of label values includes the following:
  • NetworkTypeMigrationInProgress

  • NetworkTypeMigrationTargetCNIAvailable

  • NetworkTypeMigrationTargetCNIInUse

  • NetworkTypeMigrationOriginalCNIPurged

  • NetworkTypeMigrationMTUReady