Overview of the Operator Lifecycle Manager

In OpenShift Container Platform 4.1, the Operator Lifecycle Manager (OLM) helps users install, update, and manage the lifecycle of all Operators and their associated services running across their clusters. It is part of the Operator Framework, an open source toolkit designed to manage Kubernetes native applications (Operators) in an effective, automated, and scalable way.

olm workflow
Figure 1. Operator Lifecycle Manager workflow

The OLM runs by default in OpenShift Container Platform 4.1, which aids cluster administrators in installing, upgrading, and granting access to Operators running on their cluster. The OpenShift Container Platform web console provides management screens for cluster administrators to install Operators, as well as grant specific projects access to use the catalog of Operators available on the cluster.

For developers, a self-service experience allows provisioning and configuring instances of databases, monitoring, and big data services without having to be subject matter experts, because the Operator has that knowledge baked into it.

ClusterServiceVersions (CSVs)

A ClusterServiceVersion (CSV) is a YAML manifest created from Operator metadata that assists the Operator Lifecycle Manager (OLM) in running the Operator in a cluster. It is the metadata that accompanies an Operator container image, used to populate user interfaces with information like its logo, description, and version. It is also a source of technical information needed to run the Operator, like the RBAC rules it requires and which Custom Resources (CRs) it manages or depends on.

A CSV is composed of:

Metadata
  • Application metadata:

    • Name, description, version, links, labels, icon, etc.

Install strategy
  • Type: Deployment

    • Set of service accounts and required permissions

    • Set of Deployments.

Custom Resource Definitions (CRDs)
  • Type

  • Owned: Managed by this service

  • Required: Must exist in the cluster for this service to run

  • Resources: A list of resources that the Operator interacts with

  • Descriptors: Annotate CRD spec and status fields to provide semantic information

Operator Lifecycle Manager architecture

The Operator Lifecycle Manager (OLM) is composed of two Operators: the OLM Operator and the Catalog Operator.

Each of these Operators are responsible for managing the Custom Resource Definitions (CRDs) that are the basis for the OLM framework:

Table 1. CRDs managed by OLM and Catalog Operators
Resource Short name Owner Description

ClusterServiceVersion

csv

OLM

Application metadata: name, version, icon, required resources, installation, etc.

InstallPlan

ip

Catalog

Calculated list of resources to be created in order to automatically install or upgrade a CSV.

CatalogSource

catsrc

Catalog

A repository of CSVs, CRDs, and packages that define an application.

Subscription

sub

Catalog

Keeps CSVs up to date by tracking a channel in a package.

OperatorGroup

og

OLM

Configures all Operators deployed in the same namespace as the OperatorGroup object to watch for their Custom Resource (CR) in a list of namespaces or cluster-wide.

Each of these Operators are also responsible for creating resources:

Table 2. Resources created by OLM and Catalog Operators
Resource Owner

Deployments

OLM

ServiceAccounts

(Cluster)Roles

(Cluster)RoleBindings

Custom Resource Definitions (CRDs)

Catalog

ClusterServiceVersions (CSVs)

OLM Operator

The OLM Operator is responsible for deploying applications defined by CSV resources after the required resources specified in the CSV are present in the cluster.

The OLM Operator is not concerned with the creation of the required resources; users can choose to manually create these resources using the CLI, or users can choose to create these resources using the Catalog Operator. This separation of concern enables users incremental buy-in in terms of how much of the OLM framework they choose to leverage for their application.

While the OLM Operator is often configured to watch all namespaces, it can also be operated alongside other OLM Operators so long as they all manage separate namespaces.

OLM Operator workflow
  • Watches for ClusterServiceVersion (CSVs) in a namespace and checks that requirements are met. If so, runs the install strategy for the CSV.

    A CSV must be an active member of an OperatorGroup in order for the install strategy to be run.

Catalog Operator

The Catalog Operator is responsible for resolving and installing CSVs and the required resources they specify. It is also responsible for watching CatalogSources for updates to packages in channels and upgrading them (optionally automatically) to the latest available versions.

A user that wishes to track a package in a channel creates a Subscription resource configuring the desired package, channel, and the CatalogSource from which to pull updates. When updates are found, an appropriate InstallPlan is written into the namespace on behalf of the user.

Users can also create an InstallPlan resource directly, containing the names of the desired CSV and an approval strategy, and the Catalog Operator creates an execution plan for the creation of all of the required resources. After it is approved, the Catalog Operator creates all of the resources in an InstallPlan; this then independently satisfies the OLM Operator, which proceeds to install the CSVs.

Catalog Operator workflow
  • Has a cache of CRDs and CSVs, indexed by name.

  • Watches for unresolved InstallPlans created by a user:

    • Finds the CSV matching the name requested and adds it as a resolved resource.

    • For each managed or required CRD, adds it as a resolved resource.

    • For each required CRD, finds the CSV that manages it.

  • Watches for resolved InstallPlans and creates all of the discovered resources for it (if approved by a user or automatically).

  • Watches for CatalogSources and Subscriptions and creates InstallPlans based on them.

Catalog Registry

The Catalog Registry stores CSVs and CRDs for creation in a cluster and stores metadata about packages and channels.

A package manifest is an entry in the Catalog Registry that associates a package identity with sets of CSVs. Within a package, channels point to a particular CSV. Because CSVs explicitly reference the CSV that they replace, a package manifest provides the Catalog Operator all of the information that is required to update a CSV to the latest version in a channel (stepping through each intermediate version).

OperatorGroups

An OperatorGroup is an OLM resource that provides multitenant configuration to OLM-installed Operators. An OperatorGroup selects a set of target namespaces in which to generate required RBAC access for its member Operators. The set of target namespaces is provided by a comma-delimited string stored in the CSV’s olm.targetNamespaces annotation. This annotation is applied to member Operator’s CSV instances and is projected into their deployments.

OperatorGroup membership

An Operator is considered a member of an OperatorGroup if the following conditions are true:

  • The Operator’s CSV exists in the same namespace as the OperatorGroup.

  • The Operator’s CSV’s InstallModes support the set of namespaces targeted by the OperatorGroup.

An InstallMode consists of an InstallModeType field and a boolean Supported field. A CSV’s spec can contain a set of InstallModes of four distinct InstallModeTypes:

Table 3. InstallModes and supported OperatorGroups
InstallModeType Description

OwnNamespace

The Operator can be a member of an OperatorGroup that selects its own namespace.

SingleNamespace

The Operator can be a member of an OperatorGroup that selects one namespace.

MultiNamespace

The Operator can be a member of an OperatorGroup that selects more than one namespace.

AllNamespaces

The Operator can be a member of an OperatorGroup that selects all namespaces (target namespace set is the empty string "").

If a CSV’s spec omits an entry of InstallModeType, then that type is considered unsupported unless support can be inferred by an existing entry that implicitly supports it.

Troubleshooting OperatorGroup membership

  • If more than one OperatorGroup exists in a single namespace, any CSV created in that namespace will transition to a failure state with the reason TooManyOperatorGroups. CSVs in a failed state for this reason will transition to pending once the number of OperatorGroups in their namespaces reaches one.

  • If a CSV’s InstallModes do not support the target namespace selection of the OperatorGroup in its namespace, the CSV will transition to a failure state with the reason UnsupportedOperatorGroup. CSVs in a failed state for this reason will transition to pending once either the OperatorGroup’s target namespace selection changes to a supported configuration, or the CSV’s InstallModes are modified to support the OperatorGroup’s target namespace selection.

Target namespace selection

Specify the set of namespaces for the OperatorGroup using a label selector with the spec.selector field:

apiVersion: operators.coreos.com/v1alpha2
kind: OperatorGroup
metadata:
  name: my-group
  namespace: my-namespace
  spec:
    selector:
      matchLabels:
        cool.io/prod: "true"

You can also explicitly name the target namespaces using the spec.targetNamespaces field:

apiVersion: operators.coreos.com/v1alpha2
kind: OperatorGroup
metadata:
  name: my-group
  namespace: my-namespace
spec:
  targetNamespaces:
  - my-namespace
  - my-other-namespace
  - my-other-other-namespace

If both spec.targetNamespaces and spec.selector are defined, spec.selector is ignored.

Alternatively, you can omit both spec.selector and spec.targetNamespaces to specify a global OperatorGroup, which selects all namespaces:

apiVersion: operators.coreos.com/v1alpha2
kind: OperatorGroup
metadata:
  name: my-group
  namespace: my-namespace

The resolved set of selected namespaces is shown in an OperatorGroup’s status.namespaces field. A global OperatorGroup’s status.namespace contains the empty string (""), which signals to a consuming Operator that it should watch all namespaces.

OperatorGroup CSV annotations

Member CSVs of an OperatorGroup have the following annotations:

Annotation Description

olm.operatorGroup=<group_name>

Contains the name of the OperatorGroup.

olm.operatorGroupNamespace=<group_namespace>

Contains the namespace of the OperatorGroup.

olm.targetNamespaces=<target_namespaces>

Contains a comma-delimited string that lists the OperatorGroup’s target namespace selection.

All annotations except olm.targetNamespaces are included with copied CSVs. Omitting the olm.targetNamespaces annotation on copied CSVs prevents the duplication of target namespaces between tenants.

Provided APIs annotation

Information about what GroupVersionKinds (GVKs) are provided by an OperatorGroup are shown in an olm.providedAPIs annotation. The annotation’s value is a string consisting of <kind>.<version>.<group> delimited with commas. The GVKs of CRDs and APIServices provided by all active member CSVs of an OperatorGroup are included.

Review the following example of an OperatorGroup with a single active member CSV that provides the PackageManifest resource:

apiVersion: operators.coreos.com/v1alpha2
kind: OperatorGroup
metadata:
  annotations:
    olm.providedAPIs: PackageManifest.v1alpha1.packages.apps.redhat.com
  name: olm-operators
  namespace: local
  ...
spec:
  selector: {}
  serviceAccount:
    metadata:
      creationTimestamp: null
  targetNamespaces:
  - local
status:
  lastUpdated: 2019-02-19T16:18:28Z
  namespaces:
  - local

Role-based access control

When an OperatorGroup is created, three ClusterRoles are generated. Each contains a single AggregationRule with a ClusterRoleSelector set to match a label, as shown below:

ClusterRole Label to match

<operatorgroup_name>-admin

olm.opgroup.permissions/aggregate-to-admin: <operatorgroup_name>

<operatorgroup_name>-edit

olm.opgroup.permissions/aggregate-to-edit: <operatorgroup_name>

<operatorgroup_name>-view

olm.opgroup.permissions/aggregate-to-view: <operatorgroup_name>

The following RBAC resources are generated when a CSV becomes an active member of an OperatorGroup, as long as the CSV is watching all namespaces with the AllNamespaces InstallMode and is not in a failed state with reason InterOperatorGroupOwnerConflict.

Table 4. ClusterRoles generated for each API resource from a CRD
ClusterRole Settings

<kind>.<group>-<version>-admin

Verbs on <kind>:

  • *

Aggregation labels:

  • rbac.authorization.k8s.io/aggregate-to-admin: true

  • olm.opgroup.permissions/aggregate-to-admin: <operatorgroup_name>

<kind>.<group>-<version>-edit

Verbs on <kind>:

  • create

  • update

  • patch

  • delete

Aggregation labels:

  • rbac.authorization.k8s.io/aggregate-to-edit: true

  • olm.opgroup.permissions/aggregate-to-edit: <operatorgroup_name>

<kind>.<group>-<version>-view

Verbs on <kind>:

  • get

  • list

  • watch

Aggregation labels:

  • rbac.authorization.k8s.io/aggregate-to-view: true

  • olm.opgroup.permissions/aggregate-to-view: <operatorgroup_name>

<kind>.<group>-<version>-view-crdview

Verbs on apiextensions.k8s.io customresourcedefinitions <crd-name>:

  • get

Aggregation labels:

  • rbac.authorization.k8s.io/aggregate-to-view: true

  • olm.opgroup.permissions/aggregate-to-view: <operatorgroup_name>

Table 5. ClusterRoles generated for each API resource from an APIService
ClusterRole Settings

<kind>.<group>-<version>-admin

Verbs on <kind>:

  • *

Aggregation labels:

  • rbac.authorization.k8s.io/aggregate-to-admin: true

  • olm.opgroup.permissions/aggregate-to-admin: <operatorgroup_name>

<kind>.<group>-<version>-edit

Verbs on <kind>:

  • create

  • update

  • patch

  • delete

Aggregation labels:

  • rbac.authorization.k8s.io/aggregate-to-edit: true

  • olm.opgroup.permissions/aggregate-to-edit: <operatorgroup_name>

<kind>.<group>-<version>-view

Verbs on <kind>:

  • get

  • list

  • watch

Aggregation labels:

  • rbac.authorization.k8s.io/aggregate-to-view: true

  • olm.opgroup.permissions/aggregate-to-view: <operatorgroup_name>

Additional Roles and RoleBindings
  • If the CSV defines exactly one target namespace that contains *, then a ClusterRole and corresponding ClusterRoleBinding are generated for each permission defined in the CSV’s permissions field. All resources generated are given the olm.owner: <csv_name> and olm.owner.namespace: <csv_namespace> labels.

  • If the CSV does not define exactly one target namespace that contains *, then all Roles and RoleBindings in the Operator namespace with the olm.owner: <csv_name> and olm.owner.namespace: <csv_namespace> labels are copied into the target namespace.

Copied CSVs

OLM creates copies of all active member CSVs of an OperatorGroup in each of that OperatorGroup’s target namespaces. The purpose of a copied CSV is to tell users of a target namespace that a specific Operator is configured to watch resources created there. Copied CSVs have a status reason Copied and are updated to match the status of their source CSV. The olm.targetNamespaces annotation is stripped from copied CSVs before they are created on the cluster. Omitting the target namespace selection avoids the duplication of target namespaces between tenants. Copied CSVs are deleted when their source CSV no longer exists or the OperatorGroup that their source CSV belongs to no longer targets the copied CSV’s namespace.

Static OperatorGroups

An OperatorGroup is static if its spec.staticProvidedAPIs field is set to true. As a result, OLM does not modify the OperatorGroup’s olm.providedAPIs annotation, which means that it can be set in advance. This is useful when a user wants to use an OperatorGroup to prevent resource contention in a set of namespaces but does not have active member CSVs that provide the APIs for those resources.

Below is an example of an OperatorGroup that protects Prometheus resources in all namespaces with the something.cool.io/cluster-monitoring: "true" annotation:

apiVersion: operators.coreos.com/v1alpha2
kind: OperatorGroup
metadata:
  name: cluster-monitoring
  namespace: cluster-monitoring
  annotations:
    olm.providedAPIs: Alertmanager.v1.monitoring.coreos.com,Prometheus.v1.monitoring.coreos.com,PrometheusRule.v1.monitoring.coreos.com,ServiceMonitor.v1.monitoring.coreos.com
spec:
  staticProvidedAPIs: true
  selector:
    matchLabels:
      something.cool.io/cluster-monitoring: "true"

OperatorGroup intersection

Two OperatorGroups are said to have intersecting provided APIs if the intersection of their target namespace sets is not an empty set and the intersection of their provided API sets, defined by olm.providedAPIs annotations, is not an empty set.

A potential issue is that OperatorGroups with intersecting provided APIs can compete for the same resources in the set of intersecting namespaces.

When checking intersection rules, an OperatorGroup’s namespace is always included as part of its selected target namespaces.

Rules for intersection

Each time an active member CSV synchronizes, OLM queries the cluster for the set of intersecting provided APIs between the CSV’s OperatorGroup and all others. OLM then checks if that set is an empty set:

  • If true and the CSV’s provided APIs are a subset of the OperatorGroup’s:

    • Continue transitioning.

  • If true and the CSV’s provided APIs are not a subset of the OperatorGroup’s:

    • If the OperatorGroup is static:

      • Clean up any deployments that belong to the CSV.

      • Transition the CSV to a failed state with status reason CannotModifyStaticOperatorGroupProvidedAPIs.

    • If the OperatorGroup is not static:

      • Replace the OperatorGroup’s olm.providedAPIs annotation with the union of itself and the CSV’s provided APIs.

  • If false and the CSV’s provided APIs are not a subset of the OperatorGroup’s:

    • Clean up any deployments that belong to the CSV.

    • Transition the CSV to a failed state with status reason InterOperatorGroupOwnerConflict.

  • If false and the CSV’s provided APIs are a subset of the OperatorGroup’s:

    • If the OperatorGroup is static:

      • Clean up any deployments that belong to the CSV.

      • Transition the CSV to a failed state with status reason CannotModifyStaticOperatorGroupProvidedAPIs.

    • If the OperatorGroup is not static:

      • Replace the OperatorGroup’s olm.providedAPIs annotation with the difference between itself and the CSV’s provided APIs.

Failure states caused by OperatorGroups are non-terminal.

The following actions are performed each time an OperatorGroup synchronizes:

  • The set of provided APIs from active member CSVs is calculated from the cluster. Note that copied CSVs are ignored.

  • The cluster set is compared to olm.providedAPIs, and if olm.providedAPIs contains any extra APIs, then those APIs are pruned.

  • All CSVs that provide the same APIs across all namespaces are requeued. This notifies conflicting CSVs in intersecting groups that their conflict has possibly been resolved, either through resizing or through deletion of the conflicting CSV.

Metrics

The OLM exposes certain OLM-specific resources for use by the Prometheus-based OpenShift Container Platform cluster monitoring stack.

Table 6. Metrics exposed by OLM
Name Description

csv_count

Number of CSVs successfully registered.

install_plan_count

Number of InstallPlans.

subscription_count

Number of Subscriptions.

csv_upgrade_count

Monotonic count of CatalogSources.