
Knative Serving provides automatic scaling, or autoscaling, for applications to match incoming demand.

Enabling scale-to-zero

You can use the enable-scale-to-zero spec to enable or disable scale-to-zero globally for applications on the cluster.

Prerequisites

  • You have installed the OpenShift Serverless Operator and Knative Serving on your cluster.

  • You have cluster administrator permissions.

  • You are using the default Knative Pod Autoscaler. The scale-to-zero feature is not available if you are using the Kubernetes Horizontal Pod Autoscaler.

Procedure

  • Modify the enable-scale-to-zero spec in the KnativeServing custom resource (CR):

    Example KnativeServing CR
    apiVersion: operator.knative.dev/v1beta1
    kind: KnativeServing
    metadata:
      name: knative-serving
    spec:
      config:
        autoscaler:
          enable-scale-to-zero: "false" (1)
    1 The enable-scale-to-zero spec can be either "true" or "false". If set to "true", scale-to-zero is enabled. If set to "false", applications are scaled down to the configured minimum scale bound. The default value is "true".
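
The "configured minimum scale bound" mentioned above is set on each application rather than in the KnativeServing CR. The following is a minimal sketch, assuming the bound is set with the autoscaling.knative.dev/min-scale annotation on the revision template of a Knative Service. The service name, namespace, image, and the value "2" are hypothetical placeholders, not values from this procedure.

```yaml
# Hypothetical Knative Service; name, namespace, and image are placeholders.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-service
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Lower bound on replicas for this revision (illustrative value).
        # With enable-scale-to-zero set to "false", the application is
        # expected to scale down to this bound instead of to zero.
        autoscaling.knative.dev/min-scale: "2"
    spec:
      containers:
        - image: example.registry/example-image:latest
```

Under these assumptions, an idle example-service would keep two replicas running while scale-to-zero is disabled globally.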

Configuring the scale-to-zero grace period

Knative Serving can automatically scale applications down to zero pods. You can use the scale-to-zero-grace-period spec to set the maximum time that Knative waits for the scale-to-zero machinery to be in place before it removes the last replica of an application.

Prerequisites

  • You have installed the OpenShift Serverless Operator and Knative Serving on your cluster.

  • You have cluster administrator permissions.

  • You are using the default Knative Pod Autoscaler. The scale-to-zero feature is not available if you are using the Kubernetes Horizontal Pod Autoscaler.

Procedure

  • Modify the scale-to-zero-grace-period spec in the KnativeServing custom resource (CR):

    Example KnativeServing CR
    apiVersion: operator.knative.dev/v1beta1
    kind: KnativeServing
    metadata:
      name: knative-serving
    spec:
      config:
        autoscaler:
          scale-to-zero-grace-period: "30s" (1)
    1 The grace period, expressed as a duration string in seconds. The default value is "30s" (30 seconds).
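
Both autoscaler settings described in this document can be set together. The following is a sketch of a KnativeServing CR that carries both keys, assuming they sit side by side under spec.config.autoscaler. The "60s" grace period is an illustrative value, not a recommendation.

```yaml
# Sketch only: combines both autoscaler settings in one KnativeServing CR.
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
spec:
  config:
    autoscaler:
      # Keep scale-to-zero enabled (the documented default).
      enable-scale-to-zero: "true"
      # Illustrative value: wait up to 60 seconds before the last
      # replica is removed (the documented default is 30 seconds).
      scale-to-zero-grace-period: "60s"
```

The grace period only matters while scale-to-zero is enabled, because it governs the removal of the last replica.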