This guide outlines the concepts and architecture of the Operator Lifecycle Manager (OLM) in OpenShift Container Platform.
In OpenShift Container Platform 4.4, the Operator Lifecycle Manager (OLM) helps users install, update, and manage the lifecycle of all Operators and their associated services running across their clusters. It is part of the Operator Framework, an open source toolkit designed to manage Kubernetes native applications (Operators) in an effective, automated, and scalable way.
OLM runs by default in OpenShift Container Platform 4.4 and aids cluster administrators in installing, upgrading, and granting access to Operators running on their cluster. The OpenShift Container Platform web console provides management screens for cluster administrators to install Operators, as well as grant specific projects access to use the catalog of Operators available on the cluster.
For developers, a self-service experience allows provisioning and configuring instances of databases, monitoring, and big data services without having to be subject matter experts, because the Operator has that knowledge baked into it.
A ClusterServiceVersion (CSV) is a YAML manifest created from Operator metadata that assists the Operator Lifecycle Manager (OLM) in running the Operator in a cluster.
A CSV is the metadata that accompanies an Operator container image, used to populate user interfaces with information like its logo, description, and version. It is also a source of technical information needed to run the Operator, like the RBAC rules it requires and which Custom Resources (CRs) it manages or depends on.
A CSV is composed of:
- Application metadata: name, description, version (semver compliant), links, labels, icon, and so on.
- Install strategy:
  - Type: Deployment
    - Set of service accounts and required permissions
    - Set of Deployments
- Custom Resource Definitions (CRDs):
  - Type:
    - Owned: managed by this service
    - Required: must exist in the cluster for this service to run
  - Resources: a list of resources that the Operator interacts with
  - Descriptors: annotate CRD spec and status fields to provide semantic information
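As a rough illustration of how these parts can map onto a manifest, the following abbreviated CSV is a sketch only, not a complete or installable resource. The names `example.v0.1.3`, `example-operator`, `examples.example.com`, and `others.other.example.com` are hypothetical, and the Deployment spec is elided.

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: example.v0.1.3              # hypothetical CSV name
spec:
  # Application metadata
  displayName: Example
  description: Example Operator
  version: 0.1.3
  # Install strategy: service accounts, permissions, and Deployments
  install:
    strategy: deployment
    spec:
      permissions:
      - serviceAccountName: example-operator   # hypothetical service account
        rules:
        - apiGroups: [""]
          resources: ["pods"]
          verbs: ["get", "list", "watch"]
      deployments:
      - name: example-operator
        spec: {}                    # a regular Deployment spec, elided in this sketch
  # CRDs that the Operator owns or requires
  customresourcedefinitions:
    owned:
    - name: examples.example.com    # hypothetical CRD managed by this Operator
      version: v1alpha1
      kind: Example
      displayName: Example
      description: An example application
      resources:                    # resources the Operator interacts with
      - kind: Pod
        version: v1
      specDescriptors:              # semantic hints for fields under spec
      - path: size
        displayName: Size
        description: Desired number of Pods
        x-descriptors:
        - 'urn:alm:descriptor:com.tectonic.ui:podCount'
    required:
    - name: others.other.example.com   # hypothetical CRD that must already exist
      version: v1
      kind: Other
```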
In the Operator Lifecycle Manager (OLM) ecosystem, the following resources are used to resolve Operator installations and upgrades:
- ClusterServiceVersion (CSV)
- CatalogSource
- Subscription
Operator metadata, defined in CSVs, can be stored in a collection called a CatalogSource. OLM uses CatalogSources, which use the Operator Registry API, to query for available Operators as well as upgrades for installed Operators.
Within a CatalogSource, Operators are organized into packages and streams of updates called channels, which should be a familiar update pattern from OpenShift Container Platform or other software on a continuous release cycle like web browsers.
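For orientation, a CatalogSource backed by a registry image might look like the following sketch. The name `example-catalog` and the image reference are hypothetical, and the `openshift-marketplace` namespace is an assumption about where the catalog is created.

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: example-catalog             # hypothetical name
  namespace: openshift-marketplace  # assumed namespace for this sketch
spec:
  sourceType: grpc                  # catalog content is served over the Operator Registry gRPC API
  image: registry.example.com/example/catalog:latest   # hypothetical registry image
  displayName: Example Catalog
  publisher: Example
```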
A user indicates a particular package and channel in a particular CatalogSource in a Subscription, for example an `etcd` package and its `alpha` channel. If a Subscription is made to a package that has not yet been installed in the namespace, the latest Operator for that package is installed.
OLM deliberately avoids version comparisons, so the "latest" or "newest" Operator available from a given catalog → channel → package path does not necessarily need to be the highest version number. It should be thought of more as the head reference of a channel, similar to a Git repository.
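A Subscription to the `etcd` package and its `alpha` channel could be sketched as follows. The Subscription name, the namespaces, and the CatalogSource name `example-catalog` are hypothetical placeholders.

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: etcd                        # hypothetical Subscription name
  namespace: example-namespace      # hypothetical target namespace
spec:
  name: etcd                        # package to track
  channel: alpha                    # channel within that package
  source: example-catalog           # hypothetical CatalogSource to query
  sourceNamespace: openshift-marketplace   # assumed namespace of that CatalogSource
```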
Each CSV has a `replaces` parameter that indicates which Operator it replaces. This builds a graph of CSVs that can be queried by OLM, and updates can be shared between channels. Channels can be thought of as entry points into the graph of updates.
For example, a package manifest can define two channels that point at different CSVs:
```yaml
packageName: example
channels:
- name: alpha
  currentCSV: example.v0.1.2
- name: beta
  currentCSV: example.v0.1.3
defaultChannel: alpha
```
For OLM to successfully query for updates, given a CatalogSource, package, channel, and CSV, a catalog must be able to return, unambiguously and deterministically, a single CSV that `replaces` the input CSV.
For an example upgrade scenario, consider an installed Operator corresponding to CSV version `0.1.1`. OLM queries the CatalogSource and detects an upgrade in the subscribed channel with new CSV version `0.1.3` that replaces an older but not-installed CSV version `0.1.2`, which in turn replaces the older and installed CSV version `0.1.1`.
OLM walks back from the channel head to previous versions via the `replaces` field specified in the CSVs to determine the upgrade path `0.1.3` → `0.1.2` → `0.1.1`; the direction of the arrow indicates that the former replaces the latter. OLM upgrades the Operator one version at a time until it reaches the channel head.
For this given scenario, OLM installs Operator version `0.1.2` to replace the existing Operator version `0.1.1`. Then, it installs Operator version `0.1.3` to replace the previously installed Operator version `0.1.2`. At this point, the installed Operator version `0.1.3` matches the channel head and the upgrade is completed.
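The chain in this scenario can be pictured with the following CSV fragments. This is a sketch that shows only the fields relevant to the upgrade graph; the `example.v0.1.x` names are illustrative and each CSV would need its remaining fields to be usable.

```yaml
# Installed version: the start of the upgrade path.
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: example.v0.1.1
spec:
  version: 0.1.1
---
# Intermediate version: installed first, because it replaces 0.1.1.
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: example.v0.1.2
spec:
  version: 0.1.2
  replaces: example.v0.1.1
---
# Channel head: installed last, because it replaces 0.1.2.
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: example.v0.1.3
spec:
  version: 0.1.3
  replaces: example.v0.1.2
```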
OLM’s basic path for upgrades is:
1. A CatalogSource is updated with one or more updates to an Operator.
2. OLM traverses every version of the Operator until reaching the latest version the CatalogSource contains.
However, sometimes this is not a safe operation to perform. There will be cases where a published version of an Operator should never be installed on a cluster if it has not already been installed, for example because that version introduces a serious vulnerability.
In those cases, OLM must consider two cluster states and provide an update graph that supports both:
- The "bad" intermediate Operator has been seen by the cluster and installed.
- The "bad" intermediate Operator has not yet been installed onto the cluster.
Shipping a new catalog that adds a skipped release guarantees that OLM can always get a single unique update, regardless of the cluster state and whether it has seen the bad update yet.
For example:
```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: etcdoperator.v0.9.2
  namespace: placeholder
  annotations:
spec:
  displayName: etcd
  description: Etcd Operator
  replaces: etcdoperator.v0.9.0
  skips:
  - etcdoperator.v0.9.1
```
Consider the following example Old CatalogSource and New CatalogSource:
This graph maintains that:
- Any Operator found in Old CatalogSource has a single replacement in New CatalogSource.
- Any Operator found in New CatalogSource has a single replacement in New CatalogSource.
- If the bad update has not yet been installed, it will never be.
Creating the New CatalogSource as described requires publishing CSVs that replace one Operator but can skip several. This can be accomplished using the `skipRange` annotation:

```yaml
olm.skipRange: <semver_range>
```

where `<semver_range>` has the version range format supported by the semver library.
When searching catalogs for updates, if the head of a channel has a `skipRange` annotation and the currently installed Operator has a version field that falls in the range, OLM updates to the latest entry in the channel.
The order of precedence is:
1. Channel head in the source specified by `sourceName` on the Subscription, if the other criteria for skipping are met.
2. The next Operator that replaces the current one, in the source specified by `sourceName`.
3. Channel head in another source that is visible to the Subscription, if the other criteria for skipping are met.
4. The next Operator that replaces the current one in any source visible to the Subscription.
For example:
```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: elasticsearch-operator.v4.1.2
  namespace: <namespace>
  annotations:
    olm.skipRange: '>=4.1.0 <4.1.2'
```
A z-stream, or patch release, must replace all previous z-stream releases for the same minor version. OLM does not care about major, minor, or patch versions; it only needs to build the correct graph in a catalog.
In other words, OLM must be able to take a graph as in Old CatalogSource and, similar to before, generate a graph as in New CatalogSource:
This graph maintains that:
- Any Operator found in Old CatalogSource has a single replacement in New CatalogSource.
- Any Operator found in New CatalogSource has a single replacement in New CatalogSource.
- Any z-stream release in Old CatalogSource will update to the latest z-stream release in New CatalogSource.
Unavailable releases can be considered "virtual" graph nodes; their content does not need to exist, the registry just needs to respond as if the graph looks like this.
The Operator Lifecycle Manager is composed of two Operators: the OLM Operator and the Catalog Operator.
Each of these Operators is responsible for managing the Custom Resource Definitions (CRDs) that are the basis for the OLM framework:
Resource | Short name | Owner | Description |
---|---|---|---|
ClusterServiceVersion | | OLM | Application metadata: name, version, icon, required resources, installation, etc. |
InstallPlan | | Catalog | Calculated list of resources to be created in order to automatically install or upgrade a CSV. |
CatalogSource | | Catalog | A repository of CSVs, CRDs, and packages that define an application. |
Subscription | | Catalog | Used to keep CSVs up to date by tracking a channel in a package. |
OperatorGroup | | OLM | Used to group multiple namespaces and prepare them for use by an Operator. |
Each of these Operators is also responsible for creating resources:
Resource | Owner |
---|---|
Deployments | OLM |
ServiceAccounts | OLM |
(Cluster)Roles | OLM |
(Cluster)RoleBindings | OLM |
Custom Resource Definitions (CRDs) | Catalog |
ClusterServiceVersions (CSVs) | Catalog |
The OLM Operator is responsible for deploying applications defined by CSV resources after the required resources specified in the CSV are present in the cluster.
The OLM Operator is not concerned with the creation of the required resources; users can create these resources manually using the CLI or have the Catalog Operator create them. This separation of concerns allows users incremental buy-in in terms of how much of the OLM framework they choose to leverage for their application.
While the OLM Operator is often configured to watch all namespaces, it can also be operated alongside other OLM Operators so long as they all manage separate namespaces.
The OLM Operator watches for ClusterServiceVersions (CSVs) in a namespace and checks that requirements are met. If so, it runs the install strategy for the CSV.
A CSV must be an active member of an OperatorGroup in order for the install strategy to be run.
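As a point of reference, an OperatorGroup that selects a single namespace might look like the following sketch. The names `example-operatorgroup` and `example-namespace` are hypothetical.

```yaml
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: example-operatorgroup       # hypothetical name
  namespace: example-namespace      # namespace in which member CSVs are installed
spec:
  targetNamespaces:                 # namespaces the member Operators are prepared for
  - example-namespace
```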
The Catalog Operator is responsible for resolving and installing CSVs and the required resources they specify. It is also responsible for watching CatalogSources for updates to packages in channels and upgrading them (optionally automatically) to the latest available versions.
A user who wishes to track a package in a channel creates a Subscription resource configuring the desired package, channel, and the CatalogSource from which to pull updates. When updates are found, an appropriate InstallPlan is written into the namespace on behalf of the user.
Users can also create an InstallPlan resource directly, containing the names of the desired CSVs and an approval strategy, and the Catalog Operator creates an execution plan for the creation of all of the required resources. After the InstallPlan is approved, the Catalog Operator creates all of the resources in it; this then independently satisfies the OLM Operator, which proceeds to install the CSVs.
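A directly created InstallPlan could be sketched as follows, assuming a manual approval strategy. The InstallPlan name, namespace, and CSV name are hypothetical.

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: InstallPlan
metadata:
  name: install-example             # hypothetical name
  namespace: example-namespace      # hypothetical namespace
spec:
  clusterServiceVersionNames:       # names of the desired CSVs
  - example.v0.1.3
  approval: Manual                  # approval strategy; Automatic needs no user approval
  approved: false                   # set to true to let the Catalog Operator create the resources
```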
The Catalog Operator:
- Has a cache of CRDs and CSVs, indexed by name.
- Watches for unresolved InstallPlans created by a user:
  - Finds the CSV matching the name requested and adds it as a resolved resource.
  - For each managed or required CRD, adds it as a resolved resource.
  - For each required CRD, finds the CSV that manages it.
- Watches for resolved InstallPlans and creates all of the discovered resources for them (if approved by a user or automatically).
- Watches for CatalogSources and Subscriptions and creates InstallPlans based on them.
The Catalog Registry stores CSVs and CRDs for creation in a cluster and stores metadata about packages and channels.
A package manifest is an entry in the Catalog Registry that associates a package identity with sets of CSVs. Within a package, channels point to a particular CSV. Because CSVs explicitly reference the CSV that they replace, a package manifest provides the Catalog Operator all of the information that is required to update a CSV to the latest version in a channel, stepping through each intermediate version.
The Operator Lifecycle Manager (OLM) exposes certain OLM-specific resources for use by the Prometheus-based OpenShift Container Platform cluster monitoring stack.
Name | Description |
---|---|
 | Number of CatalogSources. |
 | When reconciling a ClusterServiceVersion (CSV), present whenever a CSV version is in any state other than |
 | Number of CSVs successfully registered. |
 | When reconciling a CSV, represents whether a CSV version is in a |
 | Monotonic count of CSV upgrades. |
 | Number of InstallPlans. |
 | Number of Subscriptions. |
 | Monotonic count of Subscription syncs. Includes the |