You can view the migration Custom Resources (CRs) and download logs to troubleshoot a failed migration.

If the application was stopped during the failed migration, you must roll it back manually to prevent data corruption.

Manual rollback is not required if the application was not stopped during migration, because the original application is still running on the source cluster.

Viewing migration Custom Resources (CRs)

The Cluster Application Migration (CAM) tool creates the following CRs for migration:

(Figure: migration architecture diagram)

  • MigCluster (configuration, CAM cluster): Cluster definition

  • MigStorage (configuration, CAM cluster): Storage definition

  • MigPlan (configuration, CAM cluster): Migration plan

The MigPlan CR describes the source and target clusters, repository, and namespace(s) being migrated. It is associated with 0, 1, or many MigMigration CRs.

Deleting a MigPlan CR deletes the associated MigMigration CRs.

  • BackupStorageLocation (configuration, CAM cluster): Location of Velero backup objects

  • VolumeSnapshotLocation (configuration, CAM cluster): Location of Velero volume snapshots

  • MigMigration (action, CAM cluster): Migration, created during migration

A MigMigration CR is created every time you stage or migrate data. Each MigMigration CR is associated with a MigPlan CR.

  • Backup (action, source cluster): When you run a migration plan, the MigMigration CR creates two Velero backup CRs on the source cluster:

      • Backup CR #1 for Kubernetes objects

      • Backup CR #2 for PV data

  • Restore (action, target cluster): When you run a migration plan, the MigMigration CR creates two Velero restore CRs on the target cluster:

      • Restore CR #1 (using Backup CR #2) for PV data

      • Restore CR #2 (using Backup CR #1) for Kubernetes objects
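The configuration and action CRs above can also be listed together from the CLI. A hypothetical invocation, assuming the default openshift-migration namespace used throughout this section:

```
$ oc get migcluster,migstorage,migplan,migmigration -n openshift-migration
```

The Velero Backup and Restore CRs live in the same namespace on their respective clusters and can be listed the same way with `oc get backup` and `oc get restore`.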

  1. Obtain the CR name:

    $ oc get <cr> -n openshift-migration (1)
    NAME                                   AGE
    88435fe0-c9f8-11e9-85e6-5d593ce65e10   6m42s
    1 Specify the migration CR you want to view.
  2. View the CR:

    $ oc describe <cr> 88435fe0-c9f8-11e9-85e6-5d593ce65e10 -n openshift-migration

    The output is similar to the following examples.

MigMigration example
$ oc describe migmigration 88435fe0-c9f8-11e9-85e6-5d593ce65e10 -n openshift-migration
Name:         88435fe0-c9f8-11e9-85e6-5d593ce65e10
Namespace:    openshift-migration
Labels:       <none>
Annotations:  touch: 3b48b543-b53e-4e44-9d34-33563f0f8147
API Version:
Kind:         MigMigration
Metadata:
  Creation Timestamp:  2019-08-29T01:01:29Z
  Generation:          20
  Resource Version:    88179
  Self Link:           /apis/
  UID:                 8886de4c-c9f8-11e9-95ad-0205fe66cbb6
Spec:
  Mig Plan Ref:
    Name:        socks-shop-mig-plan
    Namespace:   openshift-migration
  Quiesce Pods:  true
  Stage:         false
Status:
  Conditions:
    Category:              Advisory
    Durable:               true
    Last Transition Time:  2019-08-29T01:03:40Z
    Message:               The migration has completed successfully.
    Reason:                Completed
    Status:                True
    Type:                  Succeeded
  Phase:                   Completed
  Start Timestamp:         2019-08-29T01:01:29Z
Events:                    <none>
Velero backup CR #2 example (PV data)
apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    openshift.io/migrate-copy-phase: final
    openshift.io/migrate-quiesce-pods: "true"
    openshift.io/migration-registry-dir: /socks-shop-mig-plan-registry-44dd3bd5-c9f8-11e9-95ad-0205fe66cbb6
  creationTimestamp: "2019-08-29T01:03:15Z"
  generateName: 88435fe0-c9f8-11e9-85e6-5d593ce65e10-
  generation: 1
  labels:
    app.kubernetes.io/part-of: migration
    migmigration: 8886de4c-c9f8-11e9-95ad-0205fe66cbb6
    migration-stage-backup: 8886de4c-c9f8-11e9-95ad-0205fe66cbb6
    velero.io/storage-location: myrepo-vpzq9
  name: 88435fe0-c9f8-11e9-85e6-5d593ce65e10-59gb7
  namespace: openshift-migration
  resourceVersion: "87313"
  selfLink: /apis/
  uid: c80dbbc0-c9f8-11e9-95ad-0205fe66cbb6
spec:
  excludedNamespaces: []
  excludedResources: []
  hooks:
    resources: []
  includeClusterResources: null
  includedNamespaces:
  - sock-shop
  includedResources:
  - persistentvolumes
  - persistentvolumeclaims
  - namespaces
  - imagestreams
  - imagestreamtags
  - secrets
  - configmaps
  - pods
  labelSelector:
    matchLabels:
      migration-included-stage-backup: 8886de4c-c9f8-11e9-95ad-0205fe66cbb6
  storageLocation: myrepo-vpzq9
  ttl: 720h0m0s
  volumeSnapshotLocations:
  - myrepo-wv6fx
status:
  completionTimestamp: "2019-08-29T01:02:36Z"
  errors: 0
  expiration: "2019-09-28T01:02:35Z"
  phase: Completed
  startTimestamp: "2019-08-29T01:02:35Z"
  validationErrors: null
  version: 1
  volumeSnapshotsAttempted: 0
  volumeSnapshotsCompleted: 0
  warnings: 0
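For scripted checks, a saved Backup CR can be inspected with standard text tools instead of the console. A minimal sketch, assuming the CR was first exported with a command such as `oc get backup <name> -n openshift-migration -o yaml`; the file below reproduces only illustrative status fields, not a full CR:

```shell
# Write a sample of the status section of a saved Velero Backup CR
# (illustrative values, matching the example above)
cat > /tmp/backup-status.yaml <<'EOF'
status:
  errors: 0
  phase: Completed
  warnings: 0
EOF

# A backup is healthy when phase is Completed and errors is 0
phase=$(awk '/^  phase:/ {print $2}' /tmp/backup-status.yaml)
errors=$(awk '/^  errors:/ {print $2}' /tmp/backup-status.yaml)
echo "phase=${phase} errors=${errors}"   # prints "phase=Completed errors=0"
```

The same pattern applies to Restore CRs, whose status additionally carries `failureReason` and `warnings` counts.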
Velero restore CR #2 example (Kubernetes resources)
apiVersion: velero.io/v1
kind: Restore
metadata:
  annotations:
    openshift.io/migrate-copy-phase: final
    openshift.io/migrate-quiesce-pods: "true"
    openshift.io/migration-registry-dir: /socks-shop-mig-plan-registry-36f54ca7-c925-11e9-825a-06fa9fb68c88
  creationTimestamp: "2019-08-28T00:09:49Z"
  generateName: e13a1b60-c927-11e9-9555-d129df7f3b96-
  generation: 3
  labels:
    app.kubernetes.io/part-of: migration
    migmigration: e18252c9-c927-11e9-825a-06fa9fb68c88
    migration-final-restore: e18252c9-c927-11e9-825a-06fa9fb68c88
  name: e13a1b60-c927-11e9-9555-d129df7f3b96-gb8nx
  namespace: openshift-migration
  resourceVersion: "82329"
  selfLink: /apis/
  uid: 26983ec0-c928-11e9-825a-06fa9fb68c88
spec:
  backupName: e13a1b60-c927-11e9-9555-d129df7f3b96-sz24f
  excludedNamespaces: null
  excludedResources:
  - nodes
  - events
  includedNamespaces: null
  includedResources: null
  namespaceMapping: null
  restorePVs: true
status:
  errors: 0
  failureReason: ""
  phase: Completed
  validationErrors: null
  warnings: 15

Downloading migration logs

You can download the Velero, Restic, and Migration controller logs in the CAM web console to troubleshoot a failed migration.

  1. Log in to the CAM console.

  2. Click Plans to view the list of migration plans.

  3. Click the Options menu of a specific migration plan and select Logs.

  4. Click Download Logs to download the logs of the Migration controller, Velero, and Restic for all clusters.

  5. To download a specific log:

    1. Specify the log options:

      • Cluster: Select the source, target, or CAM host cluster.

      • Log source: Select Velero, Restic, or Controller.

      • Pod source: Select the Pod name, for example, controller-manager-78c469849c-v6wcf

        The selected log is displayed.

        You can clear the log selection settings by changing your selection.

    2. Click Download Selected to download the selected log.

Optionally, you can access the logs by using the CLI, as in the following example:

$ oc get pods -n openshift-migration | grep controller
controller-manager-78c469849c-v6wcf           1/1     Running     0          4h49m

$ oc logs controller-manager-78c469849c-v6wcf -f -n openshift-migration
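Once a log is saved to a file, standard tools can narrow it to error-level lines. A minimal sketch; the sample lines below are fabricated for illustration and are not real controller output:

```shell
# Save a few sample log lines (fabricated for this example)
cat > /tmp/controller.log <<'EOF'
level=info msg="Step: 18/47" migMigration=88435fe0
level=error msg="timed out waiting for all PodVolumeBackups to complete"
level=info msg="Step: 19/47" migMigration=88435fe0
EOF

# Show only error-level lines
grep 'level=error' /tmp/controller.log
```

In practice you would redirect `oc logs … -n openshift-migration` to the file instead of the heredoc.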

Restic timeout error

If a migration fails because Restic times out, the following error appears in the Velero log:

level=error msg="Error backing up item" backup=velero/monitoring error="timed out waiting for all PodVolumeBackups to complete" error.file="/go/src/" error.function="*backupper).BackupPodVolumes" group=v1

The default value of restic_timeout is one hour. You can increase it for large migrations, keeping in mind that a higher value may delay the return of error messages.

  1. In the OpenShift Container Platform web console, navigate to Operators → Installed Operators.

  2. Click Cluster Application Migration Operator.

  3. In the MigrationController tab, click migration-controller.

  4. In the YAML tab, update the following parameter value:

      restic_timeout: 1h (1)
    1 Valid units are h (hours), m (minutes), and s (seconds), for example, 3h30m15s.
  5. Click Save.
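If you prefer the CLI, the same parameter can be patched directly; a hypothetical invocation, assuming the MigrationController resource is named migration-controller as in step 3:

```
$ oc patch migrationcontroller migration-controller -n openshift-migration \
    --type=merge -p '{"spec":{"restic_timeout":"3h"}}'
```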

Manually rolling back a migration

If your application was stopped during a failed migration, you must roll it back manually to prevent data corruption in the PV.

This procedure is not required if the application was not stopped during migration, because the original application is still running on the source cluster.

  1. On the target cluster, switch to the migrated project:

    $ oc project <project>
  2. Get the deployed resources:

    $ oc get all
  3. Delete the deployed resources to ensure that the application is not running on the target cluster and accessing data on the PVC:

    $ oc delete <resource_type>
  4. To stop a DaemonSet without deleting it, update the nodeSelector in the YAML file:

    apiVersion: extensions/v1beta1
    kind: DaemonSet
    metadata:
      name: hello-daemonset
    spec:
      selector:
        matchLabels:
          name: hello-daemonset
      template:
        metadata:
          labels:
            name: hello-daemonset
        spec:
          nodeSelector:
            role: worker (1)
    1 Specify a nodeSelector value that does not exist on any node.
  5. Update each PV’s reclaim policy so that unnecessary data is removed. During migration, the reclaim policy for bound PVs is Retain, to ensure that data is not lost when an application is removed from the source cluster. You can remove these PVs during rollback.

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pv0001
    spec:
      capacity:
        storage: 5Gi
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain (1)
    1 Specify Recycle or Delete.
  6. On the source cluster, switch to the migrated project and get its deployed resources:

    $ oc project <project>
    $ oc get all
  7. Start one or more replicas of each deployed resource:

    $ oc scale --replicas=1 <resource_type>/<resource_name>
  8. Update the nodeSelector of a DaemonSet to its original value, if you changed it during the procedure.
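The reclaim-policy change in step 5 can also be applied as a patch instead of editing the YAML; a sketch, assuming the example PV name pv0001:

```
$ oc patch pv pv0001 -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'
```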

Known issues

This release has the following known issues:

  • During migration, the Cluster Application Migration (CAM) tool preserves the following namespace annotations:

      • openshift.io/sa.scc.mcs

      • openshift.io/sa.scc.supplemental-groups

      • openshift.io/sa.scc.uid-range

        These annotations preserve the UID range, ensuring that the containers retain their file system permissions on the target cluster. There is a risk that the migrated UIDs could duplicate UIDs within an existing or future namespace on the target cluster. (BZ#1748440)

  • If an AWS bucket is added to the CAM web console and then deleted, its status remains True because the MigStorage CR is not updated. (BZ#1738564)

  • Migration fails if the Migration controller is running on a cluster other than the target cluster. The EnsureCloudSecretPropagated phase is skipped with a logged warning. (BZ#1757571)

  • Most cluster-scoped resources are not yet handled by the CAM tool. If your applications require cluster-scoped resources, you may have to create them manually on the target cluster.

  • Incorrect source cluster storage class is displayed when creating the migration plan. (BZ#1777869)

  • If a cluster in the CAM web console becomes inaccessible, it blocks attempts to close open migration plans. (BZ#1758269)

  • If a migration fails, the migration plan does not retain custom PV settings for quiesced pods. You must manually roll back the migration, delete the migration plan, and create a new migration plan with your PV settings. (BZ#1784899)