In OpenShift Dedicated, you can monitor your own projects in isolation from Red Hat Site Reliability Engineering (SRE) platform metrics, without the need for an additional monitoring solution.
The OpenShift Dedicated monitoring stack is based on the Prometheus open source project and its wider ecosystem. The monitoring stack includes the following:
Default platform monitoring components.
A set of platform monitoring components is installed in the openshift-monitoring project by default during an OpenShift Dedicated installation. Red Hat Site Reliability Engineers (SRE) use these components to monitor core cluster components, including Kubernetes services. This includes critical metrics, such as CPU and memory, collected from all of the workloads in every namespace.
These components are illustrated in the Installed by default section in the following diagram.
Components for monitoring user-defined projects.
A set of user-defined project monitoring components is installed in the openshift-user-workload-monitoring project by default during an OpenShift Dedicated installation. You can use these components to monitor services and pods in user-defined projects.
These components are illustrated in the User section in the following diagram.
Red Hat Site Reliability Engineers (SRE) monitor the following platform targets in your OpenShift Dedicated cluster:
Elasticsearch (if Logging is installed)
Fluentd (if Logging is installed)
Kubernetes API server
Kubernetes controller manager
OpenShift API server
OpenShift Controller Manager
Operator Lifecycle Manager (OLM)
OpenShift Dedicated includes an optional enhancement to the monitoring stack that enables you to monitor services and pods in user-defined projects. This feature includes the following components:
The Prometheus Operator (PO) in the
Prometheus is the monitoring system through which monitoring is provided for user-defined projects. Prometheus sends alerts to Alertmanager for processing.
The Thanos Ruler is a rule evaluation engine for Prometheus that is deployed as a separate process. In OpenShift Dedicated, Thanos Ruler provides rule and alerting evaluation for the monitoring of user-defined projects.
The Alertmanager service handles alerts received from Prometheus and Thanos Ruler. Alertmanager is also responsible for sending user-defined alerts to external notification systems. Deploying this service is optional.
All of these components are monitored by the stack and are automatically updated when OpenShift Dedicated is updated.
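To make the monitoring of services in user-defined projects more concrete, the following minimal sketch builds a ServiceMonitor manifest as a plain Python dictionary and prints it as JSON. The resource name, the namespace `ns1`, the `app` label, and the port name `web` are hypothetical values chosen for illustration only; adjust them to match your own service.

```python
import json


def service_monitor(name, namespace, app_label, port="web", interval="30s"):
    """Build a ServiceMonitor manifest as a plain dict.

    All argument values are illustrative assumptions, not values taken
    from a real cluster.
    """
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Scrape the named service port at the given interval.
            "endpoints": [{"port": port, "interval": interval, "scheme": "http"}],
            # Select the services to scrape by label.
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


if __name__ == "__main__":
    manifest = service_monitor("example-app-monitor", "ns1", "example-app")
    print(json.dumps(manifest, indent=2))
```

You can save the printed JSON to a file and review it before you apply it to a user-defined project.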
This glossary defines common terms that are used in OpenShift Dedicated architecture.
Alertmanager handles alerts received from Prometheus. Alertmanager is also responsible for sending the alerts to external notification systems.
Alerting rules contain a set of conditions that outline a particular state within a cluster. Alerts are triggered when those conditions are true. An alerting rule can be assigned a severity that defines how the alerts are routed.
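As an illustration of how a condition and a severity fit together, the following sketch builds a PrometheusRule manifest that contains a single alerting rule. The alert name, the expression, the namespace, and the `warning` severity are hypothetical examples and are not taken from a real cluster.

```python
import json


def alerting_rule(alert, expr, severity, duration="5m", summary=""):
    """Describe one alerting rule: a condition (expr) plus a severity label."""
    return {
        "alert": alert,
        "expr": expr,        # the condition; the alert fires while it is true
        "for": duration,     # how long the condition must hold before firing
        "labels": {"severity": severity},
        "annotations": {"summary": summary},
    }


def prometheus_rule(name, namespace, rules, group="example"):
    """Wrap alerting rules in a PrometheusRule manifest."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"groups": [{"name": group, "rules": list(rules)}]},
    }


if __name__ == "__main__":
    # Hypothetical rule: fire when the example job reports no targets up.
    rule = alerting_rule(
        alert="ExampleTargetDown",
        expr='up{job="example-app"} == 0',
        severity="warning",
        summary="The example-app target is down.",
    )
    print(json.dumps(prometheus_rule("example-alert", "ns1", [rule]), indent=2))
```

The `severity` label is an ordinary label on the alert, which is why it can be used to decide how the alert is routed.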
The Cluster Monitoring Operator (CMO) is a central component of the monitoring stack. It deploys and manages Prometheus instances, the Thanos Querier, the Telemeter Client, and metrics targets, and ensures that they are up to date. The CMO is deployed by the Cluster Version Operator (CVO).
The Cluster Version Operator (CVO) manages the lifecycle of cluster Operators, many of which are installed in OpenShift Dedicated by default.
A config map provides a way to inject configuration data into pods. You can reference the data stored in a config map in a volume of type ConfigMap. Applications running in a pod can use this data.
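The following sketch shows one way that a config map and a pod volume can reference each other. It builds both manifests as Python dictionaries. The names `app-config` and `example-pod`, the image reference, and the mount path are hypothetical.

```python
import json


def config_map(name, data):
    """Build a ConfigMap manifest that holds key-value configuration data."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": dict(data),
    }


def pod_with_config(name, image, config_map_name, mount_path):
    """Build a Pod manifest that mounts the config map as a volume."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": "app",
                "image": image,
                # Each key in the config map appears as a file under mount_path.
                "volumeMounts": [{"name": "config", "mountPath": mount_path}],
            }],
            # The volume refers to the config map by name.
            "volumes": [{"name": "config", "configMap": {"name": config_map_name}}],
        },
    }


if __name__ == "__main__":
    cm = config_map("app-config", {"log_level": "info"})
    pod = pod_with_config("example-pod", "registry.example.com/app:latest",
                          "app-config", "/etc/app")
    print(json.dumps([cm, pod], indent=2))
```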
A container is a lightweight and executable image that includes software and all its dependencies. Containers virtualize the operating system. As a result, you can run containers anywhere from a data center to a public or private cloud as well as a developer’s laptop.
A custom resource (CR) is an extension of the Kubernetes API. You can create custom resources.
etcd is the key-value store for OpenShift Dedicated, which stores the state of all resource objects.
Fluentd gathers logs from nodes and feeds them to Elasticsearch.
The kubelet runs on nodes and reads the container manifests. It ensures that the defined containers have started and are running.
Kubernetes API server validates and configures data for the API objects.
Kubernetes controller manager governs the state of the cluster.
Kubernetes scheduler allocates pods to nodes.
Labels are key-value pairs that you can use to organize and select subsets of objects such as a pod.
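The following sketch illustrates the idea of selecting a subset of objects by label. It implements only simple equality matching on in-memory dictionaries and is not the Kubernetes selector implementation. The pod names and labels are hypothetical.

```python
def matches(labels, selector):
    """Return True if every key-value pair in selector is present in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(objects, selector):
    """Return the objects whose metadata labels match the selector."""
    return [
        obj for obj in objects
        if matches(obj.get("metadata", {}).get("labels", {}), selector)
    ]


if __name__ == "__main__":
    # Hypothetical pods, reduced to the fields that matter for selection.
    pods = [
        {"metadata": {"name": "web-1", "labels": {"app": "web", "tier": "frontend"}}},
        {"metadata": {"name": "web-2", "labels": {"app": "web", "tier": "frontend"}}},
        {"metadata": {"name": "db-1", "labels": {"app": "db", "tier": "backend"}}},
    ]
    selected = select(pods, {"app": "web"})
    print([pod["metadata"]["name"] for pod in selected])
```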
A node is a worker machine in the OpenShift Dedicated cluster. A node is either a virtual machine (VM) or a physical machine.
An Operator is the preferred method of packaging, deploying, and managing a Kubernetes application in an OpenShift Dedicated cluster. An Operator takes human operational knowledge and encodes it into software that is packaged and shared with customers.
OLM helps you install, update, and manage the lifecycle of Kubernetes native applications. OLM is an open source toolkit designed to manage Operators in an effective, automated, and scalable way.
Persistent storage stores the data even after the device is shut down. Kubernetes uses persistent volumes to store the application data.
You can use a persistent volume claim (PVC) to mount a PersistentVolume into a Pod. You can access the storage without knowing the details of the cloud environment.
The pod is the smallest logical unit in Kubernetes. A pod consists of one or more containers that run on a worker node.
Prometheus is the monitoring system on which the OpenShift Dedicated monitoring stack is based. Prometheus is a time-series database and a rule evaluation engine for metrics. Prometheus sends alerts to Alertmanager for processing.
The Prometheus Adapter translates Kubernetes node and pod queries for use in Prometheus. The resource metrics that are translated include CPU and memory utilization. The Prometheus Adapter exposes the cluster resource metrics API for horizontal pod autoscaling.
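To show how resource metrics such as CPU utilization feed into horizontal pod autoscaling, the following sketch applies the scaling formula described in the upstream Kubernetes documentation: the desired replica count is the current replica count multiplied by the ratio of the current metric value to the target metric value, rounded up. The function name and the sample values are hypothetical, and the real autoscaler applies additional rules, such as minimum and maximum replica counts and stabilization windows, that are omitted here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Estimate the replica count that a horizontal pod autoscaler would request.

    tolerance mirrors the upstream default of 0.1: when the ratio is within
    10% of 1.0, no scaling takes place.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Hypothetical values: 4 replicas at 80% average CPU with a 50% target.
    print(desired_replicas(4, 80, 50))
```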
The Prometheus Operator (PO) in the openshift-monitoring project creates, configures, and manages platform Prometheus and Alertmanager instances. It also automatically generates monitoring target configurations based on Kubernetes label queries.
A silence can be applied to an alert to prevent notifications from being sent when the conditions for an alert are true. You can mute an alert after the initial notification, while you work on resolving the underlying issue.
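As a rough illustration of what a silence contains, the following sketch builds a request body in the shape that the Alertmanager v2 API uses for silences: a set of matchers, a start time, an end time, an author, and a comment. The alert name, author, and comment are hypothetical. How you submit the body, and whether your role permits you to create silences, depends on your cluster and is not shown here.

```python
import json
from datetime import datetime, timedelta, timezone


def silence_payload(alert_name, hours, created_by, comment, now=None):
    """Build a silence body that mutes one alert by name for a number of hours."""
    start = now or datetime.now(timezone.utc)
    end = start + timedelta(hours=hours)
    return {
        # Mute only alerts whose alertname label equals alert_name exactly.
        "matchers": [{"name": "alertname", "value": alert_name, "isRegex": False}],
        "startsAt": start.isoformat(),
        "endsAt": end.isoformat(),
        "createdBy": created_by,
        "comment": comment,
    }


if __name__ == "__main__":
    payload = silence_payload(
        "ExampleTargetDown", 2, "developer", "Investigating the example-app outage"
    )
    print(json.dumps(payload, indent=2))
```

Because the silence has an end time, notifications resume automatically if the alert condition is still true when the silence expires.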
OpenShift Dedicated supports many types of storage on AWS and GCP. You can manage container storage for persistent and non-persistent data in an OpenShift Dedicated cluster.
The Thanos Ruler is a rule evaluation engine for Prometheus that is deployed as a separate process. In OpenShift Dedicated, Thanos Ruler provides rule and alerting evaluation for the monitoring of user-defined projects.
The web console is a user interface (UI) to manage OpenShift Dedicated.