This documentation is work in progress and might not be complete or fully tested.

OpenShift Container Platform provides various resources for monitoring at the cluster level.

About OpenShift Container Platform cluster monitoring

OpenShift Container Platform includes a pre-configured, pre-installed, and self-updating monitoring stack that is based on the Prometheus open source project and its wider ecosystem. It monitors cluster components and includes a set of Grafana dashboards and a set of alerts that immediately notify the cluster administrator about problems as they occur. The cluster monitoring stack is only supported for monitoring OpenShift Container Platform clusters.

To ensure compatibility with future OpenShift Container Platform updates, only the specified monitoring stack configuration options are supported.

About cluster logging components

The cluster logging components include a collector deployed to each node in the OpenShift Container Platform cluster that collects all node and container logs and writes them to a log store. You can use a centralized web UI to create rich visualizations and dashboards with the aggregated data.

The major components of cluster logging are:

  • collection - This is the component that collects logs from the cluster, formats them, and forwards them to the log store. The current implementation is Fluentd.

  • log store - This is where the logs are stored. The default implementation is Elasticsearch. You can use the default Elasticsearch log store or forward logs to external log stores. The default log store is optimized and tested for short-term storage.

  • visualization - This is the UI component you can use to view logs, graphs, charts, and so forth. The current implementation is Kibana.
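The flow between these three components can be pictured with a toy model. The following Python sketch is only an illustration of the collection, log store, and visualization roles described above. The class and function names (LogStore, Collector, search) are hypothetical and do not correspond to any Fluentd, Elasticsearch, or Kibana API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LogStore:
    """Hypothetical stand-in for the log store (Elasticsearch by default)."""
    records: List[Dict[str, str]] = field(default_factory=list)

    def write(self, record: Dict[str, str]) -> None:
        self.records.append(record)


@dataclass
class Collector:
    """Hypothetical stand-in for the per-node collector (currently Fluentd)."""
    node: str
    store: LogStore

    def collect(self, source: str, message: str) -> None:
        # Format the raw line into a structured record, then forward it to the store.
        record = {"node": self.node, "source": source, "message": message.strip()}
        self.store.write(record)


def search(store: LogStore, node: str) -> List[str]:
    """Hypothetical stand-in for a query issued from the visualization UI."""
    return [r["message"] for r in store.records if r["node"] == node]


store = LogStore()
Collector("node-a", store).collect("container", "started\n")
Collector("node-b", store).collect("node", "kubelet ready\n")
print(search(store, "node-a"))  # ['started']
```

One collector per node writes to a single shared store, which mirrors the description of a collector deployed to each node and one aggregated view of the data.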

For more information on cluster logging, see the OpenShift Container Platform cluster logging documentation.

About Telemetry

Telemetry sends a carefully chosen subset of the cluster monitoring metrics to Red Hat. The Telemeter Client fetches the metric values every four minutes and thirty seconds and uploads the data to Red Hat. These metrics are described in this document.
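To put the fetch interval in concrete terms, the following Python sketch converts four minutes and thirty seconds to seconds and counts how many complete intervals fit in an hour and in a day. The names FETCH_INTERVAL and fetches_per are hypothetical and exist only for this arithmetic. The sketch does not interact with the Telemeter Client.

```python
from datetime import timedelta

# Interval stated in the text: four minutes and thirty seconds.
FETCH_INTERVAL = timedelta(minutes=4, seconds=30)


def fetches_per(period: timedelta, interval: timedelta = FETCH_INTERVAL) -> int:
    """Return how many complete fetch intervals fit in the given period."""
    return period // interval


print(FETCH_INTERVAL.total_seconds())    # 270.0
print(fetches_per(timedelta(hours=1)))   # 13
print(fetches_per(timedelta(days=1)))    # 320
```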

This stream of data is used by Red Hat to monitor the clusters in real time and to react as necessary to problems that impact its customers. It also allows Red Hat to roll out OpenShift Container Platform upgrades to customers in a way that minimizes service impact and continuously improves the upgrade experience.

This debugging information is available to Red Hat Support and Engineering teams with the same restrictions as accessing data reported through support cases. All connected cluster information is used by Red Hat to help make OpenShift Container Platform better and more intuitive to use.

Information collected by Telemetry

The following information is collected by Telemetry:

  • The unique random identifier that is generated during an installation

  • Version information, including the OpenShift Container Platform cluster version and installed update details that are used to determine update version availability

  • Update information, including the number of updates available per cluster, the channel and image repository used for an update, update progress information, and the number of errors that occur in an update

  • The name of the provider platform that OpenShift Container Platform is deployed on and the data center location

  • Sizing information about clusters, machine types, and machines, including the number of CPU cores and the amount of RAM used for each

  • The number of running virtual machine instances in a cluster

  • The number of etcd members and the number of objects stored in the etcd cluster

  • The OpenShift Container Platform framework components installed in a cluster and their condition and status

  • Usage information about components, features, and extensions

  • Usage details about Technology Previews and unsupported configurations

  • Information about degraded software

  • Information about nodes that are marked as NotReady

  • Events for all namespaces listed as "related objects" for a degraded Operator

  • Configuration details that help Red Hat Support to provide effective support for customers. This includes node configuration at the cloud infrastructure level, host names, IP addresses, Kubernetes pod names, namespaces, and services.

  • Information about the validity of certificates

Telemetry does not collect identifying information such as user names or passwords. Red Hat does not intend to collect personal information. If Red Hat discovers that personal information has been inadvertently received, Red Hat will delete such information. To the extent that any telemetry data constitutes personal data, refer to the Red Hat Privacy Statement for more information about Red Hat’s privacy practices.

CLI troubleshooting and debugging commands

For a list of the oc client troubleshooting and debugging commands, see the OpenShift Container Platform CLI tools documentation.
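As one illustration of how output from the oc client can be inspected during troubleshooting, the following Python sketch lists nodes whose Ready condition is not "True". It assumes the input has the shape produced by `oc get nodes -o json`. The helper name not_ready_nodes and the sample data are hypothetical, and the sketch does not call the cluster itself.

```python
import json
from typing import List


def not_ready_nodes(nodes_json: str) -> List[str]:
    """Return the names of nodes whose Ready condition is not "True"."""
    names = []
    for item in json.loads(nodes_json).get("items", []):
        conditions = item.get("status", {}).get("conditions", [])
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        # A missing Ready condition, or a status of "False" or "Unknown",
        # is treated as not ready.
        if ready is None or ready.get("status") != "True":
            names.append(item["metadata"]["name"])
    return names


# Hypothetical sample in the shape of `oc get nodes -o json` output.
sample = json.dumps({
    "items": [
        {"metadata": {"name": "master-0"},
         "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
        {"metadata": {"name": "worker-1"},
         "status": {"conditions": [{"type": "Ready", "status": "Unknown"}]}},
    ]
})
print(not_ready_nodes(sample))  # ['worker-1']
```

In practice the JSON would be saved from the command output and passed to the function as a string. Reading it from a file or a pipe is left out to keep the sketch self-contained.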