The OpenShift Container Platform Vertical Pod Autoscaler Operator (VPA) automatically reviews the historic and current CPU and memory resources for containers in Pods and can update the resource limits and requests based on the usage values it learns. The VPA uses individual custom resources (CRs) to update all of the Pods associated with a workload object, such as a Deployment, DeploymentConfig, StatefulSet, Job, DaemonSet, ReplicaSet, or ReplicationController.

The VPA helps you to understand the optimal CPU and memory usage for your Pods and can automatically maintain Pod resources through the Pod lifecycle.

Vertical Pod Autoscaler is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see https://access.redhat.com/support/offerings/techpreview/.

About the Vertical Pod Autoscaler Operator

The Vertical Pod Autoscaler Operator (VPA) is implemented as an API resource and a custom resource (CR). The CR determines the actions that the Vertical Pod Autoscaler Operator should take with the Pods associated with a specific workload object, such as a DaemonSet, a ReplicationController, and so forth.

The VPA automatically computes historic and current CPU and memory usage for the containers in those Pods and can use this data to automatically re-deploy Pods with optimized resource limits and requests to ensure that these Pods are operating efficiently at all times. When re-deploying Pods, the VPA honors any Pod Disruption Budget set for applications. If you do not want the VPA to automatically re-deploy Pods, you can use this resource information to manually update the Pods as needed.
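
If you want to limit how many Pods the VPA can evict at the same time when it re-deploys a workload, you can define a Pod Disruption Budget for the application. The following is a minimal sketch; the name and the app: frontend label are illustrative and must match your own application, and the API group depends on your cluster version:

apiVersion: policy/v1        # use policy/v1beta1 on older clusters
kind: PodDisruptionBudget
metadata:
  name: frontend-pdb         # illustrative name
spec:
  minAvailable: 1            # keep at least one Pod available during VPA-driven evictions
  selector:
    matchLabels:
      app: frontend          # must match the labels on your application Pods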

When configured to update Pods automatically, the VPA reduces resources for Pods that are requesting more resources than they are using and increases resources for Pods that are not requesting enough.

For example, if you have a Pod that uses 50% of the CPU but only requests 10%, the VPA determines that the Pod is consuming more CPU than requested and restarts the Pod with higher resources.

For developers, the VPA helps ensure their Pods stay up during periods of high demand by scheduling Pods onto nodes so that appropriate resources are available for each Pod.

Administrators can use the VPA to better utilize cluster resources, such as preventing Pods from reserving more CPU resources than needed. The VPA monitors the resources that workloads are actually using and adjusts the resource requirements so capacity is available to other workloads. The VPA also maintains the ratios between limits and requests that are specified in the initial container configuration.

If you stop running the VPA or delete a specific VPA CR in your cluster, the resource requests for the Pods already modified by the VPA do not change. Any new Pods get the resources defined in the workload object, not the previous recommendations made by the VPA.
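
For example, you can check the requests that are currently set on a Pod with a command similar to the following; the Pod name is a placeholder:

$ oc get pod <pod-name> --output jsonpath='{.spec.containers[*].resources.requests}'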

Installing the Vertical Pod Autoscaler Operator

You can use the OpenShift Container Platform web console to install the Vertical Pod Autoscaler Operator (VPA).

Procedure
  1. In the OpenShift Container Platform web console, click Operators → OperatorHub.

  2. Choose VerticalPodAutoscaler from the list of available Operators, and click Install.

  3. On the Install Operator page, ensure that the Operator recommended namespace option is selected. This installs the Operator in the mandatory openshift-vertical-pod-autoscaler namespace, which is automatically created if it does not exist.

  4. Click Install.

  5. Verify the installation by listing the VPA Operator components:

    1. Navigate to Workloads → Pods.

    2. Select the openshift-vertical-pod-autoscaler project from the drop-down menu and verify that there are four Pods running.

    3. Navigate to Workloads → Deployments to verify that there are four Deployments running.

  6. Optional: Verify the installation from the OpenShift Container Platform CLI by using the following command:

    $ oc get all -n openshift-vertical-pod-autoscaler

    The output shows four Pods and four Deployments:

    Example output
    NAME                                                    READY   STATUS    RESTARTS   AGE
    pod/vertical-pod-autoscaler-operator-85b4569c47-2gmhc   1/1     Running   0          3m13s
    pod/vpa-admission-plugin-default-67644fc87f-xq7k9       1/1     Running   0          2m56s
    pod/vpa-recommender-default-7c54764b59-8gckt            1/1     Running   0          2m56s
    pod/vpa-updater-default-7f6cc87858-47vw9                1/1     Running   0          2m56s
    
    NAME                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
    service/vpa-webhook   ClusterIP   172.30.53.206   <none>        443/TCP   2m56s
    
    NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/vertical-pod-autoscaler-operator   1/1     1            1           3m13s
    deployment.apps/vpa-admission-plugin-default       1/1     1            1           2m56s
    deployment.apps/vpa-recommender-default            1/1     1            1           2m56s
    deployment.apps/vpa-updater-default                1/1     1            1           2m56s
    
    NAME                                                          DESIRED   CURRENT   READY   AGE
    replicaset.apps/vertical-pod-autoscaler-operator-85b4569c47   1         1         1       3m13s
    replicaset.apps/vpa-admission-plugin-default-67644fc87f       1         1         1       2m56s
    replicaset.apps/vpa-recommender-default-7c54764b59            1         1         1       2m56s
    replicaset.apps/vpa-updater-default-7f6cc87858                1         1         1       2m56s

About using the Vertical Pod Autoscaler Operator

To use the Vertical Pod Autoscaler Operator (VPA), you create a VPA custom resource (CR) for a workload object in your cluster. The VPA learns and applies the optimal CPU and memory resources for the Pods associated with that workload object. You can use a VPA with a Deployment, StatefulSet, Job, DaemonSet, ReplicaSet, or ReplicationController workload object. The VPA CR must be in the same project as the Pods you want to monitor.

You use the VPA CR to associate a workload object and specify which mode the VPA operates in:

  • The Auto and Recreate modes automatically apply the VPA CPU and memory recommendations throughout the Pod lifetime.

  • The Initial mode automatically applies VPA recommendations only at Pod creation.

  • The Off mode only provides recommended resource limits and requests, allowing you to manually apply the recommendations. The Off mode does not update Pods.

You can also use the CR to opt out certain containers from VPA evaluation and updates.

For example, a Pod has the following limits and requests:

resources:
  limits:
    cpu: 1
    memory: 500Mi
  requests:
    cpu: 500m
    memory: 100Mi

After creating a VPA CR that is set to Auto, the VPA learns the resource usage and terminates and recreates the Pod with new resource limits and requests:

resources:
  limits:
    cpu: 50m
    memory: 1250Mi
  requests:
    cpu: 25m
    memory: 262144k
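
In this example, the VPA preserves the limit-to-request ratios from the original configuration: the CPU limit remains twice the CPU request (50m compared to 25m, as 1 was to 500m), and the memory limit remains five times the memory request (1250Mi compared to 262144k, which equals 250Mi).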

You can view the VPA recommendations using the following command:

$ oc get vpa <vpa-name> --output yaml

The output shows the recommendations for CPU and memory requests, similar to the following:

Example output
...
status:

...

  recommendation:
    containerRecommendations:
    - containerName: frontend
      lowerBound:
        cpu: 25m
        memory: 262144k
      target:
        cpu: 25m
        memory: 262144k
      uncappedTarget:
        cpu: 25m
        memory: 262144k
      upperBound:
        cpu: 262m
        memory: "274357142"
    - containerName: backend
      lowerBound:
        cpu: 12m
        memory: 131072k
      target:
        cpu: 12m
        memory: 131072k
      uncappedTarget:
        cpu: 12m
        memory: 131072k
      upperBound:
        cpu: 476m
        memory: "498558823"

...

The output shows the recommended resources (target), the minimum recommended resources (lowerBound), the highest recommended resources (upperBound), and the most recent resource recommendations (uncappedTarget).

The VPA uses the lowerBound and upperBound values to determine if a Pod needs to be updated. If a Pod has resource requests below the lowerBound values or above the upperBound values, the VPA terminates and recreates the Pod with the target values.
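
For example, you can extract only the target recommendations with a command similar to the following; the jsonpath expression follows the status fields shown in the example output:

$ oc get vpa <vpa-name> --output jsonpath='{.status.recommendation.containerRecommendations[*].target}'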

Automatically applying VPA recommendations

To use the VPA to automatically update Pods, create a VPA CR for a specific workload object with updateMode set to Auto or Recreate.

When the Pods are created for the workload object, the VPA constantly monitors the containers to analyze their CPU and memory needs. The VPA deletes and redeploys Pods with new container resource limits and requests to meet those needs, honoring any Pod Disruption Budget set for your applications. The recommendations are added to the status field of the VPA CR for reference.

Example VPA CR for the Auto mode
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: vpa-recommender
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind:       Deployment (1)
    name:       frontend (2)
  updatePolicy:
    updateMode: "Auto" (3)
1 The type of workload object you want this VPA CR to manage.
2 The name of the workload object you want this VPA CR to manage.
3 Set the mode to Auto or Recreate:
  • Auto. The VPA assigns resource requests on Pod creation and updates the existing Pods by terminating them when the requested resources differ significantly from the new recommendation.

  • Recreate. The VPA assigns resource requests on Pod creation and updates the existing Pods by terminating them when the requested resources differ significantly from the new recommendation. This mode should be used rarely, only if you need to ensure that the pods are restarted whenever the resource request changes.

There must be operating Pods in the project before the VPA can determine recommended resources and apply the recommendations to new pods.
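
For example, you can confirm that there are operating Pods in the project by using the following command:

$ oc get pods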

Automatically applying VPA recommendations on Pod creation

To use the VPA to apply the recommended resources only when a Pod is first deployed, create a VPA CR for a specific workload object with updateMode set to Initial.

When the Pods are created for that workload object, the VPA analyzes the CPU and memory needs of the containers and assigns the recommended container resource limits and requests. The VPA does not update the Pods as it learns new resource recommendations.

Example VPA CR for the Initial mode
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: vpa-recommender
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind:       Deployment (1)
    name:       frontend (2)
  updatePolicy:
    updateMode: "Initial" (3)
1 The type of workload object you want this VPA CR to manage.
2 The name of the workload object you want this VPA CR to manage.
3 Set the mode to Initial. The VPA assigns resources when Pods are created and does not change the resources during the lifetime of the Pod.

There must be operating Pods in the project before a VPA can determine recommended resources and apply the recommendations to new pods.

Manually applying VPA recommendations

To use the VPA to only determine the recommended CPU and memory values, create a VPA CR for a specific workload object with updateMode set to Off.

When the Pods are created for that workload object, the VPA analyzes the CPU and memory needs of the containers and records those recommendations in the status field of the VPA CR. The VPA does not update the Pods as it determines new resource recommendations.

Example VPA CR for the Off mode
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: vpa-recommender
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind:       Deployment (1)
    name:       frontend (2)
  updatePolicy:
    updateMode: "Off" (3)
1 The type of workload object you want this VPA CR to manage.
2 The name of the workload object you want this VPA CR to manage.
3 Set the mode to Off.

You can view the recommendations by using the following command:

$ oc get vpa <vpa-name> --output yaml

With the recommendations, you can edit the workload object to add CPU and memory requests, then delete and redeploy the Pods using the recommended resources.
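
For example, the following command shows one way to apply the recommendations manually, assuming a Deployment and container both named frontend and the target values from the earlier example output. Updating the Deployment triggers a new rollout, so the Pods are recreated with the new requests:

$ oc set resources deployment frontend --containers=frontend --requests=cpu=25m,memory=262144k

You can also pass --limits=cpu=...,memory=... in the same command if you want to adjust the container limits at the same time.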

There must be operating Pods in the project before a VPA can determine recommended resources.

Exempting containers from applying VPA recommendations

If your workload object has multiple containers and you do not want the VPA to evaluate and act on all of them, create a VPA CR for a specific workload object and add a resourcePolicy to opt out specific containers.

When the VPA updates the Pods with recommended resources, any containers with a resourcePolicy are not updated and the VPA does not present recommendations for those containers in the Pod.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: vpa-recommender
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind:       Deployment (1)
    name:       frontend (2)
  updatePolicy:
    updateMode: "Auto" (3)
  resourcePolicy: (4)
    containerPolicies:
    - containerName: my-opt-sidecar
      mode: "Off"
1 The type of workload object you want this VPA CR to manage.
2 The name of the workload object you want this VPA CR to manage.
3 Set the mode to Auto, Recreate, or Off. The Recreate mode should be used rarely, only if you need to ensure that the Pods are restarted whenever the resource request changes.
4 Specify the containers you want to opt out and set the mode to Off.

For example, a Pod has two containers with the same resource requests and limits:

...
spec:
  containers:
  - name: frontend
    resources:
      limits:
        cpu: 1
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 100Mi
...
  - name: backend
    resources:
      limits:
        cpu: "1"
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 100Mi
...

After launching a VPA CR with the backend container set to opt out, the VPA terminates and recreates the Pod with the recommended resources applied only to the frontend container:

...
spec:
  containers:
  - name: frontend
    resources:
      limits:
        cpu: 50m
        memory: 1250Mi
      requests:
        cpu: 25m
        memory: 262144k
...
  - name: backend
    resources:
      limits:
        cpu: "1"
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 100Mi
...

Using the Vertical Pod Autoscaler Operator

You can use the Vertical Pod Autoscaler Operator (VPA) by creating a VPA custom resource (CR). The CR indicates which Pods the VPA should analyze and determines the actions the VPA should take with those Pods.

Procedure

To create a VPA CR for a specific workload object:

  1. Change to the project where the workload object you want to scale is located.
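
    For example, you can switch to the project by using the following command:

    $ oc project <project-name>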

    1. Create a VPA CR YAML file:

      apiVersion: autoscaling.k8s.io/v1
      kind: VerticalPodAutoscaler
      metadata:
        name: vpa-recommender
      spec:
        targetRef:
          apiVersion: "apps/v1"
          kind:       Deployment (1)
          name:       frontend (2)
        updatePolicy:
          updateMode: "Auto" (3)
        resourcePolicy: (4)
          containerPolicies:
          - containerName: my-opt-sidecar
            mode: "Off"
      1 Specify the type of workload object you want this VPA to manage: Deployment, StatefulSet, Job, DaemonSet, ReplicaSet, or ReplicationController.
      2 Specify the name of an existing workload object you want this VPA to manage.
      3 Specify the VPA mode:
      • auto to automatically apply the recommended resources on Pods associated with the controller. The VPA terminates existing Pods and creates new Pods with the recommended resource limits and requests.

      • recreate to automatically apply the recommended resources on Pods associated with the workload object. The VPA terminates existing Pods and creates new Pods with the recommended resource limits and requests. The recreate mode should be used rarely, only if you need to ensure that the pods are restarted whenever the resource request changes.

      • initial to automatically apply the recommended resources when Pods associated with the workload object are created. The VPA does not update the Pods as it learns new resource recommendations.

      • off to only generate resource recommendations for the Pods associated with the workload object. The VPA does not update the Pods as it learns new resource recommendations and does not apply the recommendations to new Pods.

      4 Optional: Specify the containers you want to opt out and set the mode to Off.
    2. Create the VPA CR:

      $ oc create -f <file-name>.yaml

      After a few moments, the VPA learns the resource usage of the containers in the Pods associated with the workload object.

      You can view the VPA recommendations using the following command:

      $ oc get vpa <vpa-name> --output yaml

      The output shows the recommendations for CPU and memory requests, similar to the following:

      Example output
      ...
      status:
      
      ...
      
        recommendation:
          containerRecommendations:
          - containerName: frontend
            lowerBound: (1)
              cpu: 25m
              memory: 262144k
            target: (2)
              cpu: 25m
              memory: 262144k
            uncappedTarget: (3)
              cpu: 25m
              memory: 262144k
            upperBound: (4)
              cpu: 262m
              memory: "274357142"
          - containerName: backend
            lowerBound:
              cpu: 12m
              memory: 131072k
            target:
              cpu: 12m
              memory: 131072k
            uncappedTarget:
              cpu: 12m
              memory: 131072k
            upperBound:
              cpu: 476m
              memory: "498558823"
      
      ...
      1 lowerBound is the minimum recommended resource levels.
      2 target is the recommended resource levels.
      3 uncappedTarget is the most recent resource recommendations.
      4 upperBound is the highest recommended resource levels.

Uninstalling the Vertical Pod Autoscaler Operator

You can remove the Vertical Pod Autoscaler Operator (VPA) from your OpenShift Container Platform cluster. After uninstalling, the resource requests for the Pods already modified by an existing VPA CR do not change. Any new Pods get the resources defined in the workload object, not the previous recommendations made by the Vertical Pod Autoscaler Operator.

You can remove a specific VPA CR by using the oc delete vpa <vpa-name> command. Deleting a VPA CR has the same effect on resource requests as uninstalling the Vertical Pod Autoscaler Operator: Pods that were already modified keep their current requests, and new Pods get the resources defined in the workload object.
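
For example, the following commands list the VPA CRs in the cluster and then delete a specific CR; the CR name and project are placeholders:

$ oc get vpa --all-namespaces
$ oc delete vpa <vpa-name> -n <project>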

Prerequisites
  • The Vertical Pod Autoscaler Operator must be installed.

Procedure
  1. In the OpenShift Container Platform web console, click Operators → Installed Operators.

  2. Switch to the openshift-vertical-pod-autoscaler project.

  3. Find the VerticalPodAutoscaler Operator, click the Options menu, and select Uninstall Operator.

  4. In the dialog box, click Uninstall.