You can view the migration Custom Resources (CRs) and download logs to troubleshoot a failed migration.
If the application was stopped during the failed migration, you must roll it back manually in order to prevent data corruption.
Manual rollback is not required if the application was not stopped during migration, because the original application is still running on the source cluster.
The Cluster Application Migration (CAM) tool creates the following CRs for migration:
MigCluster (configuration, CAM cluster): Cluster definition
MigStorage (configuration, CAM cluster): Storage definition
MigPlan (configuration, CAM cluster): Migration plan
The MigPlan CR describes the source and target clusters, repository, and namespace(s) being migrated. It is associated with 0, 1, or many MigMigration CRs.
Deleting a MigPlan CR deletes the associated MigMigration CRs.
BackupStorageLocation (configuration, CAM cluster): Location of Velero backup objects
VolumeSnapshotLocation (configuration, CAM cluster): Location of Velero volume snapshots
MigMigration (action, CAM cluster): Migration, created during migration
A MigMigration CR is created every time you stage or migrate data. Each MigMigration CR is associated with a MigPlan CR.
Backup (action, source cluster): When you run a migration plan, the MigMigration CR creates two Velero backup CRs on the source cluster:
Backup CR #1 for Kubernetes objects
Backup CR #2 for PV data
Restore (action, target cluster): When you run a migration plan, the MigMigration CR creates two Velero restore CRs on the target cluster:
Restore CR #1 (using Backup CR #2) for PV data
Restore CR #2 (using Backup CR #1) for Kubernetes objects
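For example, assuming the default openshift-migration namespace, you can list these CRs with oc get. The MigPlan and MigMigration CRs live on the CAM cluster, while the Velero Backup and Restore CRs live on the source and target clusters, respectively:

$ oc get migplan,migmigration -n openshift-migration   # CAM cluster
$ oc get backups.velero.io -n openshift-migration      # source cluster
$ oc get restores.velero.io -n openshift-migration     # target cluster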
Obtain the CR name:
$ oc get <cr> -n openshift-migration (1)

NAME                                   AGE
88435fe0-c9f8-11e9-85e6-5d593ce65e10   6m42s
(1) Specify the migration CR you want to view.
View the CR:
$ oc describe <cr> 88435fe0-c9f8-11e9-85e6-5d593ce65e10 -n openshift-migration
The output is similar to the following examples.
$ oc describe migmigration 88435fe0-c9f8-11e9-85e6-5d593ce65e10 -n openshift-migration

Name:         88435fe0-c9f8-11e9-85e6-5d593ce65e10
Namespace:    openshift-migration
Labels:       <none>
Annotations:  touch: 3b48b543-b53e-4e44-9d34-33563f0f8147
API Version:  migration.openshift.io/v1alpha1
Kind:         MigMigration
Metadata:
  Creation Timestamp:  2019-08-29T01:01:29Z
  Generation:          20
  Resource Version:    88179
  Self Link:           /apis/migration.openshift.io/v1alpha1/namespaces/openshift-migration/migmigrations/88435fe0-c9f8-11e9-85e6-5d593ce65e10
  UID:                 8886de4c-c9f8-11e9-95ad-0205fe66cbb6
Spec:
  Mig Plan Ref:
    Name:        socks-shop-mig-plan
    Namespace:   openshift-migration
  Quiesce Pods:  true
  Stage:         false
Status:
  Conditions:
    Category:              Advisory
    Durable:               true
    Last Transition Time:  2019-08-29T01:03:40Z
    Message:               The migration has completed successfully.
    Reason:                Completed
    Status:                True
    Type:                  Succeeded
  Phase:            Completed
  Start Timestamp:  2019-08-29T01:01:29Z
Events:             <none>
apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    openshift.io/migrate-copy-phase: final
    openshift.io/migrate-quiesce-pods: "true"
    openshift.io/migration-registry: 172.30.105.179:5000
    openshift.io/migration-registry-dir: /socks-shop-mig-plan-registry-44dd3bd5-c9f8-11e9-95ad-0205fe66cbb6
  creationTimestamp: "2019-08-29T01:03:15Z"
  generateName: 88435fe0-c9f8-11e9-85e6-5d593ce65e10-
  generation: 1
  labels:
    app.kubernetes.io/part-of: migration
    migmigration: 8886de4c-c9f8-11e9-95ad-0205fe66cbb6
    migration-stage-backup: 8886de4c-c9f8-11e9-95ad-0205fe66cbb6
    velero.io/storage-location: myrepo-vpzq9
  name: 88435fe0-c9f8-11e9-85e6-5d593ce65e10-59gb7
  namespace: openshift-migration
  resourceVersion: "87313"
  selfLink: /apis/velero.io/v1/namespaces/openshift-migration/backups/88435fe0-c9f8-11e9-85e6-5d593ce65e10-59gb7
  uid: c80dbbc0-c9f8-11e9-95ad-0205fe66cbb6
spec:
  excludedNamespaces:
  excludedResources:
  hooks:
    resources:
  includeClusterResources: null
  includedNamespaces:
  - sock-shop
  includedResources:
  - persistentvolumes
  - persistentvolumeclaims
  - namespaces
  - imagestreams
  - imagestreamtags
  - secrets
  - configmaps
  - pods
  labelSelector:
    matchLabels:
      migration-included-stage-backup: 8886de4c-c9f8-11e9-95ad-0205fe66cbb6
  storageLocation: myrepo-vpzq9
  ttl: 720h0m0s
  volumeSnapshotLocations:
  - myrepo-wv6fx
status:
  completionTimestamp: "2019-08-29T01:02:36Z"
  errors: 0
  expiration: "2019-09-28T01:02:35Z"
  phase: Completed
  startTimestamp: "2019-08-29T01:02:35Z"
  validationErrors: null
  version: 1
  volumeSnapshotsAttempted: 0
  volumeSnapshotsCompleted: 0
  warnings: 0
apiVersion: velero.io/v1
kind: Restore
metadata:
  annotations:
    openshift.io/migrate-copy-phase: final
    openshift.io/migrate-quiesce-pods: "true"
    openshift.io/migration-registry: 172.30.90.187:5000
    openshift.io/migration-registry-dir: /socks-shop-mig-plan-registry-36f54ca7-c925-11e9-825a-06fa9fb68c88
  creationTimestamp: "2019-08-28T00:09:49Z"
  generateName: e13a1b60-c927-11e9-9555-d129df7f3b96-
  generation: 3
  labels:
    app.kubernetes.io/part-of: migration
    migmigration: e18252c9-c927-11e9-825a-06fa9fb68c88
    migration-final-restore: e18252c9-c927-11e9-825a-06fa9fb68c88
  name: e13a1b60-c927-11e9-9555-d129df7f3b96-gb8nx
  namespace: openshift-migration
  resourceVersion: "82329"
  selfLink: /apis/velero.io/v1/namespaces/openshift-migration/restores/e13a1b60-c927-11e9-9555-d129df7f3b96-gb8nx
  uid: 26983ec0-c928-11e9-825a-06fa9fb68c88
spec:
  backupName: e13a1b60-c927-11e9-9555-d129df7f3b96-sz24f
  excludedNamespaces: null
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  includedNamespaces: null
  includedResources: null
  namespaceMapping: null
  restorePVs: true
status:
  errors: 0
  failureReason: ""
  phase: Completed
  validationErrors: null
  warnings: 15
You can download the Velero, Restic, and Migration controller logs in the CAM web console to troubleshoot a failed migration.
Log in to the CAM web console.
Click Plans to view the list of migration plans.
Click the Options menu of a specific migration plan and select Logs.
Click Download Logs to download the logs of the Migration controller, Velero, and Restic for all clusters.
To download a specific log:
Specify the log options:
Cluster: Select the source, target, or CAM host cluster.
Log source: Select Velero, Restic, or Controller.
Pod source: Select the Pod name, for example, controller-manager-78c469849c-v6wcf.
The selected log is displayed.
You can clear the log selection settings by changing your selection.
Click Download Selected to download the selected log.
Optionally, you can access the logs by using the CLI, as in the following example:
$ oc get pods -n openshift-migration | grep controller

controller-manager-78c469849c-v6wcf   1/1   Running   0   4h49m

$ oc logs controller-manager-78c469849c-v6wcf -f -n openshift-migration
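Similarly, assuming the default pod names created by the Velero deployment and the Restic DaemonSet in the openshift-migration namespace, you can retrieve the Velero and Restic logs from the relevant cluster:

$ oc get pods -n openshift-migration | grep -E 'velero|restic'
$ oc logs <velero_or_restic_pod_name> -f -n openshift-migration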
If a migration fails because Restic times out, the following error appears in the Velero log:
level=error msg="Error backing up item" backup=velero/monitoring error="timed out waiting for all PodVolumeBackups to complete" error.file="/go/src/github.com/heptio/velero/pkg/restic/backupper.go:165" error.function="github.com/heptio/velero/pkg/restic.(*backupper).BackupPodVolumes" group=v1
The default value of restic_timeout is one hour. You can increase this value for large migrations, keeping in mind that a higher value may delay the return of error messages.
In the OpenShift Container Platform web console, navigate to Operators → Installed Operators.
Click Cluster Application Migration Operator.
In the MigrationController tab, click migration-controller.
In the YAML tab, update the following parameter value:
spec:
  restic_timeout: 1h (1)
(1) Valid units are h (hours), m (minutes), and s (seconds), for example, 3h30m15s.
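For example, a MigrationController fragment with a longer timeout for a large migration might look like the following; the 3h value is illustrative, not a recommendation:

spec:
  restic_timeout: 3h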
If your application was stopped during a failed migration, you must roll it back manually in order to prevent data corruption in the PV.
This procedure is not required if the application was not stopped during migration, because the original application is still running on the source cluster.
On the target cluster, switch to the migrated project:
$ oc project <project>
Get the deployed resources:
$ oc get all
Delete the deployed resources to ensure that the application is not running on the target cluster and accessing data on the PVC:
$ oc delete <resource_type>
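For example, assuming the application consists of Deployments, Services, and Routes, the cleanup might look like the following; the resource types are illustrative and should be adjusted to match the output of oc get all:

$ oc delete deployment,service,route --all -n <project>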
To stop a DaemonSet without deleting it, update the nodeSelector in the YAML file:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: hello-daemonset
spec:
  selector:
    matchLabels:
      name: hello-daemonset
  template:
    metadata:
      labels:
        name: hello-daemonset
    spec:
      nodeSelector:
        role: worker (1)
(1) Update the nodeSelector so that it does not match any node. The DaemonSet Pods are then removed from the nodes without deleting the DaemonSet.
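Alternatively, a sketch of applying a non-matching nodeSelector with oc patch; the role: none label is illustrative and assumed not to exist on any node:

$ oc patch daemonset hello-daemonset -n <project> \
    -p '{"spec":{"template":{"spec":{"nodeSelector":{"role":"none"}}}}}'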
Update each PV’s reclaim policy so that unnecessary data is removed. During migration, the reclaim policy for bound PVs is Retain, to ensure that data is not lost when an application is removed from the source cluster. You can remove these PVs during rollback.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0001
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain (1)
  ...
status:
  ...
(1) Update the reclaim policy, for example to Delete, so that the PV and its data are removed when the PV is released.
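For example, a sketch that switches the reclaim policy of the PV shown above to Delete; pv0001 is the example name, and you can verify the change with oc get pv:

$ oc patch pv pv0001 -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'
$ oc get pv pv0001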
On the source cluster, switch to the migrated project and get its deployed resources:
$ oc project <project>
$ oc get all
Start one or more replicas of each deployed resource:
$ oc scale --replicas=1 <resource_type>/<resource_name>
Update the nodeSelector of each DaemonSet to its original value, if you changed it during this procedure.
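For example, a sketch that restores the nodeSelector value shown in the earlier DaemonSet example:

$ oc patch daemonset hello-daemonset -n <project> \
    -p '{"spec":{"template":{"spec":{"nodeSelector":{"role":"worker"}}}}}'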
This release has the following known issues:
During migration, the CAM tool preserves the following namespace annotations:
openshift.io/sa.scc.mcs
openshift.io/sa.scc.supplemental-groups
openshift.io/sa.scc.uid-range
These annotations preserve the UID range, ensuring that the containers retain their file system permissions on the target cluster. There is a risk that the migrated UIDs could duplicate UIDs within an existing or future namespace on the target cluster. (BZ#1748440)
When adding an S3 endpoint to the CAM web console, https:// is supported only for AWS. For other S3 providers, use http://.
If an AWS bucket is added to the CAM web console and then deleted, its status remains True because the MigStorage CR is not updated. (BZ#1738564)
Migration fails if the Migration controller is running on a cluster other than the target cluster. The EnsureCloudSecretPropagated phase is skipped with a logged warning. (BZ#1757571)
Cluster-scoped resources, including Cluster Role Bindings and Security Context Constraints, are not yet handled by the CAM tool. If your applications require cluster-scoped resources, you must create them manually on the target cluster. (BZ#1759804)
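For example, a minimal sketch of recreating a custom Security Context Constraint on the target cluster; the SCC name is illustrative, and you should review the exported YAML and remove cluster-specific metadata before applying it:

$ oc get scc <scc_name> -o yaml > scc.yaml    # run against the source cluster
$ oc create -f scc.yaml                       # run against the target cluster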
Incorrect source cluster storage class is displayed when creating the migration plan. (BZ#1777869)
If a cluster in the CAM web console becomes inaccessible, it blocks attempts to close open migration plans. (BZ#1758269)
If a migration fails, the migration plan does not retain custom PV settings for quiesced pods. You must manually roll back the migration, delete the migration plan, and create a new migration plan with your PV settings. (BZ#1784899)