
OpenShift Container Platform provides various resources for monitoring at the cluster level.

About OpenShift Container Platform cluster monitoring

OpenShift Container Platform includes a pre-configured, pre-installed, and self-updating monitoring stack that is based on the Prometheus open source project and its wider ecosystem. It monitors cluster components and includes a set of alerts that immediately notify the cluster administrator of problems as they occur, along with a set of Grafana dashboards. The cluster monitoring stack is supported only for monitoring OpenShift Container Platform clusters.

To ensure compatibility with future OpenShift Container Platform updates, only the specified monitoring stack configuration options are supported.

Cluster logging components

The cluster logging components are based on Elasticsearch, Fluentd, and Kibana (EFK). The collector, Fluentd, is deployed to each node in the OpenShift Container Platform cluster. It collects all node and container logs and writes them to Elasticsearch (ES). Kibana is the centralized web UI where users and administrators can create rich visualizations and dashboards with the aggregated data.

There are currently five types of cluster logging components:

  • logStore - This is where the logs will be stored. The current implementation is Elasticsearch.

  • collection - This is the component that collects logs from the node, formats them, and stores them in the logStore. The current implementation is Fluentd.

  • visualization - This is the UI component used to view logs, graphs, charts, and so forth. The current implementation is Kibana.

  • curation - This is the component that trims logs by age. The current implementation is Curator.

  • event routing - This is the component that forwards OpenShift Container Platform events to cluster logging. The current implementation is Event Router.
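
The mapping between component types and their current implementations can be summarized in a short Python sketch. This is only an illustration of the list above: the dictionary name `CLUSTER_LOGGING_COMPONENTS` and the helper `implementation_for` are hypothetical and are not part of any OpenShift Container Platform API.

```python
# Illustrative only: the component types and implementations come from the
# list above. The dictionary and the helper function are hypothetical names,
# not an OpenShift Container Platform API.
CLUSTER_LOGGING_COMPONENTS = {
    "logStore": "Elasticsearch",
    "collection": "Fluentd",
    "visualization": "Kibana",
    "curation": "Curator",
    "event routing": "Event Router",
}


def implementation_for(component_type: str) -> str:
    """Return the current implementation for a cluster logging component type."""
    try:
        return CLUSTER_LOGGING_COMPONENTS[component_type]
    except KeyError:
        known = ", ".join(CLUSTER_LOGGING_COMPONENTS)
        raise ValueError(
            f"unknown component type {component_type!r}; expected one of: {known}"
        ) from None


if __name__ == "__main__":
    # Print each component type next to its current implementation.
    for component_type, implementation in CLUSTER_LOGGING_COMPONENTS.items():
        print(f"{component_type}: {implementation}")
```

Keeping the component type separate from its implementation mirrors the wording of the list, where each entry names a role and then states its current implementation.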

For more information on cluster logging, see the OpenShift Container Platform cluster logging documentation.

About Telemetry

Telemetry sends a carefully chosen subset of the cluster monitoring metrics to Red Hat. These metrics are sent continuously and describe:

  • The size of an OpenShift Container Platform cluster

  • The health and status of OpenShift Container Platform components

  • The health and status of any upgrade being performed

  • Limited usage information about OpenShift Container Platform components and features

  • Summary information about alerts reported by the cluster monitoring component

This continuous stream of data is used by Red Hat to monitor the health of clusters in real time and to react as necessary to problems that impact customers. It also allows Red Hat to roll out OpenShift Container Platform upgrades in a way that minimizes service impact, and to continuously improve the upgrade experience.

This debugging information is available to Red Hat Support and engineering teams with the same restrictions as accessing data reported via support cases. All connected cluster information is used by Red Hat to help make OpenShift Container Platform better and more intuitive to use. None of the information is shared with third parties.

Information collected by Telemetry

Primary information collected by Telemetry includes:

  • The number of updates available per cluster

  • Channel and image repository used for an update

  • The number of errors that occurred during an update

  • Progress information of running updates

  • The number of machines per cluster

  • The number of CPU cores and size of RAM of the machines

  • The number of members in the etcd cluster and number of objects currently stored in the etcd cluster

  • The number of CPU cores and amount of RAM used per machine type (infra or master)

  • The number of CPU cores and RAM used per cluster

  • Use of OpenShift Container Platform framework components per cluster

  • The version of the OpenShift Container Platform cluster

  • Health, condition, and status for any OpenShift Container Platform framework component that is installed on the cluster, for example Cluster Version Operator, Cluster Monitoring, Image Registry, and Elasticsearch for Logging

  • A unique random identifier that is generated during installation

  • The name of the platform that OpenShift Container Platform is deployed on, such as Amazon Web Services

Telemetry does not collect identifying information such as user names, passwords, or the names or addresses of user resources.

CLI troubleshooting and debugging commands

For a list of the oc client troubleshooting and debugging commands, see the OpenShift Container Platform CLI tools documentation.