Understanding the data copy methods for migration

The CAM tool supports the file system and snapshot data copy methods for migrating data from the source cluster to the target cluster. You can select a method that is suited for your environment and is supported by your storage provider.

File system copy method

The CAM tool copies data files from the source cluster to the replication repository, and from there to the target cluster.

Table 1. File system copy method summary
Benefits Limitations
  • Clusters can have different storage classes

  • Supported for all S3 storage providers

  • Optional data verification with checksum

  • Slower than the snapshot copy method

  • Optional data verification significantly reduces performance

Snapshot copy method

The CAM tool copies a snapshot of the source cluster’s data to a cloud provider’s object storage, configured as a replication repository. The data is restored on the target cluster.

AWS, Google Cloud Provider, and Microsoft Azure support the snapshot copy method.

Table 2. Snapshot copy method summary
Benefits Limitations
  • Faster than the file system copy method

  • Cloud provider must support snapshots.

  • Clusters must be on the same cloud provider.

  • Clusters must be in the same location or region.

  • Clusters must have the same storage class.

  • Storage class must be compatible with snapshots.

Configuring a Multi-Cloud Object Gateway storage bucket as a replication repository

You can install the OpenShift Container Storage Operator and configure a Multi-Cloud Object Gateway (MCG) storage bucket as a replication repository.

Installing the OpenShift Container Storage Operator

You can install the OpenShift Container Storage Operator from OperatorHub.

Procedure
  1. In the OpenShift Container Platform web console, click OperatorsOperatorHub.

  2. Use Filter by keyword (in this case, OCS) to find the OpenShift Container Storage Operator.

  3. Select the OpenShift Container Storage Operator and click Install.

  4. Select an Update Channel, Installation Mode, and Approval Strategy.

  5. Click Subscribe.

    On the Installed Operators page, the OpenShift Container Storage Operator appears in the openshift-storage project with the status Succeeded.

Creating the Multi-Cloud Object Gateway storage bucket

You can create the Multi-Cloud Object Gateway (MCG) storage bucket’s Custom Resources (CRs).

Procedure
  1. Log in to the OpenShift Container Platform cluster:

    $ oc login
  2. Create the NooBaa CR configuration file, noobaa.yml, with the following content:

    apiVersion: noobaa.io/v1alpha1
    kind: NooBaa
    metadata:
      name: noobaa
      namespace: openshift-storage
    spec:
     dbResources:
       requests:
         cpu: 0.5 (1)
         memory: 1Gi
     coreResources:
       requests:
         cpu: 0.5 (1)
         memory: 1Gi
    1 For a very small cluster, you can change the cpu value to 0.1.
  3. Create the NooBaa object:

    $ oc create -f noobaa.yml
  4. Create the BackingStore CR configuration file, bs.yml, with the following content:

    apiVersion: noobaa.io/v1alpha1
    kind: BackingStore
    metadata:
      finalizers:
      - noobaa.io/finalizer
      labels:
        app: noobaa
      name: mcg-pv-pool-bs
      namespace: openshift-storage
    spec:
      pvPool:
        numVolumes: 3 (1)
        resources:
          requests:
            storage: 50Gi (2)
        storageClass: gp2 (3)
      type: pv-pool
    1 Specify the number of volumes in the PV pool.
    2 Specify the size of the volumes.
    3 Specify the storage class.
  5. Create the BackingStore object:

    $ oc create -f bs.yml
  6. Create the BucketClass CR configuration file, bc.yml, with the following content:

    apiVersion: noobaa.io/v1alpha1
    kind: BucketClass
    metadata:
      labels:
        app: noobaa
      name: mcg-pv-pool-bc
      namespace: openshift-storage
    spec:
      placementPolicy:
        tiers:
        - backingStores:
          - mcg-pv-pool-bs
          placement: Spread
  7. Create the BucketClass object:

    $ oc create -f bc.yml
  8. Create the ObjectBucketClaim CR configuration file, obc.yml, with the following content:

    apiVersion: objectbucket.io/v1alpha1
    kind: ObjectBucketClaim
    metadata:
      name: migstorage
      namespace: openshift-storage
    spec:
      bucketName: migstorage (1)
      storageClassName: openshift-storage.noobaa.io
      additionalConfig:
        bucketclass: mcg-pv-pool-bc
    1 Record the bucket name for adding the replication repository to the CAM web console.
  9. Create the ObjectBucketClaim object:

    $ oc create -f obc.yml
  10. Watch the resource creation process to verify that the ObjectBucketClaim status is Bound:

    $ watch -n 30 'oc get -n openshift-storage objectbucketclaim migstorage -o yaml'

    This process can take five to ten minutes.

  11. Obtain and record the following values, which are required when you add the replication repository to the CAM web console:

    • S3 endpoint:

      $ oc get route -n openshift-storage s3
    • S3 provider access key:

      $ oc get secret -n openshift-storage migstorage -o go-template='{{ .data.AWS_ACCESS_KEY_ID }}' | base64 -d
    • S3 provider secret access key:

      $ oc get secret -n openshift-storage migstorage -o go-template='{{ .data.AWS_SECRET_ACCESS_KEY }}' | base64 -d

Configuring an AWS S3 storage bucket as a replication repository

You can configure an AWS S3 storage bucket as a replication repository.

Prerequisites
  • The AWS S3 storage bucket must be accessible to the source and target clusters.

  • You must have the AWS CLI installed.

  • If you are using the snapshot copy method:

    • You must have access to EC2 Elastic Block Storage (EBS).

    • The source and target clusters must be in the same region.

    • The source and target clusters must have the same storage class.

    • The storage class must be compatible with snapshots.

Procedure
  1. Create an AWS S3 bucket:

    $ aws s3api create-bucket \
        --bucket <bucket_name> \ (1)
        --region <bucket_region> (2)
    1 Specify your S3 bucket name.
    2 Specify your S3 bucket region, for example, us-east-1.
  2. Create the IAM user velero:

    $ aws iam create-user --user-name velero
  3. Create an EC2 EBS snapshot policy:

    $ cat > velero-ec2-snapshot-policy.json <<EOF
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:DescribeVolumes",
                    "ec2:DescribeSnapshots",
                    "ec2:CreateTags",
                    "ec2:CreateVolume",
                    "ec2:CreateSnapshot",
                    "ec2:DeleteSnapshot"
                ],
                "Resource": "*"
            }
        ]
    }
    EOF
  4. Create an AWS S3 access policy for one or for all S3 buckets:

    $ cat > velero-s3-policy.json <<EOF
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:DeleteObject",
                    "s3:PutObject",
                    "s3:AbortMultipartUpload",
                    "s3:ListMultipartUploadParts"
                ],
                "Resource": [
                    "arn:aws:s3:::<bucket_name>/*" (1)
                ]
            },
            {
                "Effect": "Allow",
                "Action": [
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                    "s3:ListBucketMultipartUploads"
                ],
                "Resource": [
                    "arn:aws:s3:::<bucket_name>" (1)
                ]
            }
        ]
    }
    EOF
    1 To grant access to a single S3 bucket, specify the bucket name. To grant access to all AWS S3 buckets, specify * instead of a bucket name:
    "Resource": [
        "arn:aws:s3:::*"
  5. Attach the EC2 EBS policy to velero:

    $ aws iam put-user-policy \
      --user-name velero \
      --policy-name velero-ebs \
      --policy-document file://velero-ec2-snapshot-policy.json
  6. Attach the AWS S3 policy to velero:

    $ aws iam put-user-policy \
      --user-name velero \
      --policy-name velero-s3 \
      --policy-document file://velero-s3-policy.json
  7. Create an access key for velero:

    $ aws iam create-access-key --user-name velero
    {
      "AccessKey": {
            "UserName": "velero",
            "Status": "Active",
            "CreateDate": "2017-07-31T22:24:41.576Z",
            "SecretAccessKey": <AWS_SECRET_ACCESS_KEY>, (1)
            "AccessKeyId": <AWS_ACCESS_KEY_ID> (1)
        }
    }
    1 Record the AWS_SECRET_ACCESS_KEY and the AWS_ACCESS_KEY_ID for adding the AWS repository to the CAM web console.

Configuring a Google Cloud Provider storage bucket as a replication repository

You can configure a Google Cloud Provider (GCP) storage bucket as a replication repository.

Prerequisites
  • The GCP storage bucket must be accessible to the source and target clusters.

  • You must have gsutil installed.

  • If you are using the snapshot copy method:

    • The source and target clusters must be in the same region.

    • The source and target clusters must have the same storage class.

    • The storage class must be compatible with snapshots.

Procedure
  1. Run gsutil init to log in:

    $ gsutil init
    Welcome! This command will take you through the configuration of gcloud.
    
    Your current configuration has been set to: [default]
    
    To continue, you must login. Would you like to login (Y/n)?
  2. Set the BUCKET variable:

    $ BUCKET=<bucket_name> (1)
    1 Specify your bucket name.
  3. Create a storage bucket:

    $ gsutil mb gs://$BUCKET/
  4. Set the PROJECT_ID variable to your active project:

    $ PROJECT_ID=$(gcloud config get-value project)
  5. Create a velero service account:

    $ gcloud iam service-accounts create velero \
        --display-name "Velero Storage"
  6. Set the SERVICE_ACCOUNT_EMAIL variable to the service account’s email address:

    $ SERVICE_ACCOUNT_EMAIL=$(gcloud iam service-accounts list \
      --filter="displayName:Velero Storage" \
      --format 'value(email)')
  7. Grant permissions to the service account:

    $ ROLE_PERMISSIONS=(
        compute.disks.get
        compute.disks.create
        compute.disks.createSnapshot
        compute.snapshots.get
        compute.snapshots.create
        compute.snapshots.useReadOnly
        compute.snapshots.delete
        compute.zones.get
    )
    
    gcloud iam roles create velero.server \
        --project $PROJECT_ID \
        --title "Velero Server" \
        --permissions "$(IFS=","; echo "${ROLE_PERMISSIONS[*]}")"
    
    gcloud projects add-iam-policy-binding $PROJECT_ID \
        --member serviceAccount:$SERVICE_ACCOUNT_EMAIL \
        --role projects/$PROJECT_ID/roles/velero.server
    
    gsutil iam ch serviceAccount:$SERVICE_ACCOUNT_EMAIL:objectAdmin gs://${BUCKET}
  8. Save the service account’s keys to the credentials-velero file in the current directory:

    $ gcloud iam service-accounts keys create credentials-velero \
      --iam-account $SERVICE_ACCOUNT_EMAIL

Configuring a Microsoft Azure Blob storage container as a replication repository

You can configure a Microsoft Azure Blob storage container as a replication repository.

Prerequisites
  • You must have an Azure storage account.

  • You must have the Azure CLI installed.

  • The Azure Blob storage container must be accessible to the source and target clusters.

  • If you are using the snapshot copy method:

    • The source and target clusters must be in the same region.

    • The source and target clusters must have the same storage class.

    • The storage class must be compatible with snapshots.

Procedure
  1. Set the AZURE_RESOURCE_GROUP variable:

    $ AZURE_RESOURCE_GROUP=Velero_Backups
  2. Create an Azure resource group:

    $ az group create -n $AZURE_RESOURCE_GROUP --location <CentralUS> (1)
    1 Specify your location.
  3. Set the AZURE_STORAGE_ACCOUNT_ID variable:

    $ AZURE_STORAGE_ACCOUNT_ID=velerobackups
  4. Create an Azure storage account:

    $ az storage account create \
      --name $AZURE_STORAGE_ACCOUNT_ID \
      --resource-group $AZURE_RESOURCE_GROUP \
      --sku Standard_GRS \
      --encryption-services blob \
      --https-only true \
      --kind BlobStorage \
      --access-tier Hot
  5. Set the BLOB_CONTAINER variable:

    $ BLOB_CONTAINER=velero
  6. Create an Azure Blob storage container:

    $ az storage container create \
      -n $BLOB_CONTAINER \
      --public-access off \
      --account-name $AZURE_STORAGE_ACCOUNT_ID
  7. Create a service principal and credentials for velero:

    $ AZURE_SUBSCRIPTION_ID=`az account list --query '[?isDefault].id' -o tsv`
    $ AZURE_TENANT_ID=`az account list --query '[?isDefault].tenantId' -o tsv`
    $ AZURE_CLIENT_SECRET=`az ad sp create-for-rbac --name "velero" --role "Contributor" --query 'password' -o tsv`
    $ AZURE_CLIENT_ID=`az ad sp list --display-name "velero" --query '[0].appId' -o tsv`
  8. Save the service principal’s credentials in the credentials-velero file:

    $ cat << EOF  > ./credentials-velero
    AZURE_SUBSCRIPTION_ID=${AZURE_SUBSCRIPTION_ID}
    AZURE_TENANT_ID=${AZURE_TENANT_ID}
    AZURE_CLIENT_ID=${AZURE_CLIENT_ID}
    AZURE_CLIENT_SECRET=${AZURE_CLIENT_SECRET}
    AZURE_RESOURCE_GROUP=${AZURE_RESOURCE_GROUP}
    AZURE_CLOUD_NAME=AzurePublicCloud
    EOF