×

As a cluster administrator, you install the OpenShift API for Data Protection (OADP) by installing the OADP Operator. The OADP Operator installs Velero 1.11.

Starting from OADP 1.0.4, all OADP 1.0.z versions can only be used as a dependency of the MTC Operator and are not available as a standalone Operator.

To back up Kubernetes resources and internal images, you must have object storage as a backup location, such as one of the following storage types:

Unless specified otherwise, "NooBaa" refers to the open source project that provides lightweight object storage, while "Multicloud Object Gateway (MCG)" refers to the Red Hat distribution of NooBaa.

The CloudStorage API, which automates the creation of a bucket for object storage, is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

You can back up persistent volumes (PVs) by using snapshots or Restic.

To back up PVs with snapshots, you must have a cloud provider that supports either a native snapshot API or Container Storage Interface (CSI) snapshots, such as one of the following cloud providers:

If you want to use CSI backup on OCP 4.11 and later, install OADP 1.1.x.

OADP 1.0.x does not support CSI backup on OCP 4.11 and later. OADP 1.0.x includes Velero 1.7.x and expects the API group snapshot.storage.k8s.io/v1beta1, which is not present on OCP 4.11 and later.

If your cloud provider does not support snapshots or if your storage is NFS, you can back up applications with Restic backups on object storage.

You create a default Secret and then you install the Data Protection Application.

AWS S3 compatible backup storage providers

OADP is compatible with many object storage providers for use with different backup and snapshot operations. Several object storage providers are fully supported, several are unsupported but known to work, and some have known limitations.

Supported backup storage providers

The following AWS S3 compatible object storage providers are fully supported by OADP through the AWS plugin for use as backup storage locations:

  • MinIO

  • Multicloud Object Gateway (MCG)

  • Amazon Web Services (AWS) S3

The following compatible object storage providers are supported and have their own Velero object store plugins:

  • Google Cloud Platform (GCP)

  • Microsoft Azure

Unsupported backup storage providers

The following AWS S3 compatible object storage providers, are known to work with Velero through the AWS plugin, for use as backup storage locations, however, they are unsupported and have not been tested by Red Hat:

  • IBM Cloud

  • Oracle Cloud

  • DigitalOcean

  • NooBaa, unless installed using Multicloud Object Gateway (MCG)

  • Tencent Cloud

  • Ceph RADOS v12.2.7

  • Quobyte

  • Cloudian HyperStore

Unless specified otherwise, "NooBaa" refers to the open source project that provides lightweight object storage, while "Multicloud Object Gateway (MCG)" refers to the Red Hat distribution of NooBaa.

Backup storage providers with known limitations

The following AWS S3 compatible object storage providers are known to work with Velero through the AWS plugin with a limited feature set:

  • Swift - It works for use as a backup storage location for backup storage, but is not compatible with Restic for filesystem-based volume backup and restore.

Configuring Multicloud Object Gateway (MCG) for disaster recovery on OpenShift Data Foundation

If you use cluster storage for your MCG bucket backupStorageLocation on OpenShift Data Foundation, configure MCG as an external object store.

Failure to configure MCG as an external object store might lead to backups not being available.

Unless specified otherwise, "NooBaa" refers to the open source project that provides lightweight object storage, while "Multicloud Object Gateway (MCG)" refers to the Red Hat distribution of NooBaa.

Procedure

About OADP update channels

When you install an OADP Operator, you choose an update channel. This channel determines which upgrades to the OADP Operator and to Velero you receive. You can switch channels at any time.

The following update channels are available:

  • The stable channel is now deprecated. The stable channel contains the patches (z-stream updates) of OADP ClusterServiceVersion for oadp.v1.1.z and older versions from oadp.v1.0.z.

  • The stable-1.0 channel contains oadp.v1.0.z, the most recent OADP 1.0 ClusterServiceVersion.

  • The stable-1.1 channel contains oadp.v1.1.z, the most recent OADP 1.1 ClusterServiceVersion.

  • The stable-1.2 channel contains oadp.v1.2.z, the most recent OADP 1.2 ClusterServiceVersion.

  • The stable-1.3 channel contains oadp.v1.3.z, the most recent OADP 1.3 ClusterServiceVersion.

Which update channel is right for you?

  • The stable channel is now deprecated. If you are already using the stable channel, you will continue to get updates from oadp.v1.1.z.

  • Choose the stable-1.y update channel to install OADP 1.y and to continue receiving patches for it. If you choose this channel, you will receive all z-stream patches for version 1.y.z.

When must you switch update channels?

  • If you have OADP 1.y installed, and you want to receive patches only for that y-stream, you must switch from the stable update channel to the stable-1.y update channel. You will then receive all z-stream patches for version 1.y.z.

  • If you have OADP 1.0 installed, want to upgrade to OADP 1.1, and then receive patches only for OADP 1.1, you must switch from the stable-1.0 update channel to the stable-1.1 update channel. You will then receive all z-stream patches for version 1.1.z.

  • If you have OADP 1.y installed, with y greater than 0, and want to switch to OADP 1.0, you must uninstall your OADP Operator and then reinstall it using the stable-1.0 update channel. You will then receive all z-stream patches for version 1.0.z.

You cannot switch from OADP 1.y to OADP 1.0 by switching update channels. You must uninstall the Operator and then reinstall it.

Installation of OADP on multiple namespaces

You can install OADP into multiple namespaces on the same cluster so that multiple project owners can manage their own OADP instance. This use case has been validated with Restic and CSI.

You install each instance of OADP as specified by the per-platform procedures contained in this document with the following additional requirements:

  • All deployments of OADP on the same cluster must be the same version, for example, 1.1.4. Installing different versions of OADP on the same cluster is not supported.

  • Each individual deployment of OADP must have a unique set of credentials and a unique BackupStorageLocation configuration.

  • By default, each OADP deployment has cluster-level access across namespaces. OpenShift Container Platform administrators need to review security and RBAC settings carefully and make any necessary changes to them to ensure that each OADP instance has the correct permissions.

Additional resources

Velero CPU and memory requirements based on collected data

The following recommendations are based on observations of performance made in the scale and performance lab. The backup and restore resources can be impacted by the type of plugin, the amount of resources required by that backup or restore, and the respective data contained in the persistent volumes (PVs) related to those resources.

CPU and memory requirement for configurations

Configuration types [1] Average usage [2] Large usage resourceTimeouts

CSI

Velero:

CPU- Request 200m, Limits 1000m

Memory - Request 256Mi, Limits 1024Mi

Velero:

CPU- Request 200m, Limits 2000m

Memory- Request 256Mi, Limits 2048Mi

N/A

Restic

[3] Restic:

CPU- Request 1000m, Limits 2000m

Memory - Request 16Gi, Limits 32Gi

[4] Restic:

CPU - Request 2000m, Limits 8000m

Memory - Request 16Gi, Limits 40Gi

900m

[5] DataMover

N/A

N/A

10m - average usage

60m - large usage

  1. Average usage - use these settings for most usage situations.

  2. Large usage - use these settings for large usage situations, such as a large PV (500GB Usage), multiple namespaces (100+), or many pods within a single namespace (2000 pods+), and for optimal performance for backup and restore involving large datasets.

  3. Restic resource usage corresponds to the amount of data, and type of data. For example, many small files or large amounts of data can cause Restic to utilize large amounts of resources. The Velero documentation references 500m as a supplied default, for most of our testing we found 200m request suitable with 1000m limit. As cited in the Velero documentation, exact CPU and memory usage is dependent on the scale of files and directories, in addition to environmental limitations.

  4. Increasing the CPU has a significant impact on improving backup and restore times.

  5. DataMover - DataMover default resourceTimeout is 10m. Our tests show that for restoring a large PV (500GB usage), it is required to increase the resourceTimeout to 60m.

The resource requirements listed throughout the guide are for average usage only. For large usage, adjust the settings as described in the table above.

NodeAgent CPU for large usage

Testing shows that increasing NodeAgent CPU can significantly improve backup and restore times when using OpenShift API for Data Protection (OADP).

It is not recommended to use Kopia without limits in production environments on nodes running production workloads due to Kopia’s aggressive consumption of resources. However, running Kopia with limits that are too low results in CPU limiting and slow backups and restore situations. Testing showed that running Kopia with 20 cores and 32 Gi memory supported backup and restore operations of over 100 GB of data, multiple namespaces, or over 2000 pods in a single namespace.

Testing detected no CPU limiting or memory saturation with these resource specifications.

You can set these limits in Ceph MDS pods by following the procedure in Changing the CPU and memory resources on the rook-ceph pods.

You need to add the following lines to the storage cluster Custom Resource (CR) to set the limits:

   resources:
     mds:
       limits:
         cpu: "3"
         memory: 128Gi
       requests:
         cpu: "3"
         memory: 8Gi