OpenShift Container Platform provides various resources for monitoring at the cluster level.

About OpenShift Container Platform monitoring

OpenShift Container Platform includes a pre-configured, pre-installed, and self-updating monitoring stack that provides monitoring for core platform components. OpenShift Container Platform delivers monitoring best practices out of the box. A set of alerts is included by default and immediately notifies cluster administrators about issues with a cluster. Default dashboards in the OpenShift Container Platform web console include visual representations of cluster metrics to help you quickly understand the state of your cluster.

After installing OpenShift Container Platform 4.6, cluster administrators can optionally enable monitoring for user-defined projects. By using this feature, cluster administrators, developers, and other users can specify how services and pods are monitored in their own projects. You can then query metrics, review dashboards, and manage alerting rules and silences for your own projects in the OpenShift Container Platform web console.
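As a minimal sketch, monitoring for user-defined projects is typically enabled by setting enableUserWorkload to true in the cluster-monitoring-config ConfigMap object in the openshift-monitoring project. The manifest below assumes that no such ConfigMap object exists yet. If one already exists, add the field to its existing config.yaml data instead of replacing the object.

```yaml
# Sketch: enable monitoring for user-defined projects.
# Assumes the cluster-monitoring-config ConfigMap does not already exist;
# if it does, merge the enableUserWorkload field into the existing data.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
```

A cluster administrator could apply a manifest like this with oc apply -f and the file name.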

Cluster administrators can grant developers and other users permission to monitor their own projects. Privileges are granted by assigning one of the predefined monitoring roles.
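The predefined monitoring roles include monitoring-rules-view, monitoring-rules-edit, and monitoring-edit. One way to assign a role is a RoleBinding object in the target project, sketched below. The binding name, project name (ns1), and user name (example-user) are hypothetical placeholders, not values from this document.

```yaml
# Sketch: grant the monitoring-edit role to one user in one project.
# The metadata.name, namespace, and subject name are hypothetical placeholders.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: example-user-monitoring-edit
  namespace: ns1
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: monitoring-edit
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: example-user
```

Because the binding is namespaced, the user receives the role only in that project.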

About cluster logging components

The cluster logging components include a collector deployed to each node in the OpenShift Container Platform cluster that collects all node and container logs and writes them to a log store. You can use a centralized web UI to create rich visualizations and dashboards with the aggregated data.

The major components of cluster logging are:

  • collection - This is the component that collects logs from the cluster, formats them, and forwards them to the log store. The current implementation is Fluentd.

  • log store - This is where the logs are stored. The default implementation is Elasticsearch. You can use the default Elasticsearch log store or forward logs to external log stores. The default log store is optimized and tested for short-term storage.

  • visualization - This is the UI component you can use to view logs, graphs, charts, and so forth. The current implementation is Kibana.
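The three components listed above map onto sections of the ClusterLogging custom resource. The following is a rough sketch rather than a recommended configuration: the node count, redundancy policy, and replica count are illustrative assumptions, and a production deployment usually also needs storage and resource settings.

```yaml
# Sketch: a ClusterLogging custom resource showing the three components.
# nodeCount, redundancyPolicy, and replicas are illustrative assumptions.
apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  name: instance
  namespace: openshift-logging
spec:
  managementState: Managed
  collection:          # collects and forwards logs (Fluentd)
    logs:
      type: fluentd
      fluentd: {}
  logStore:            # stores the logs (Elasticsearch)
    type: elasticsearch
    elasticsearch:
      nodeCount: 3
      redundancyPolicy: SingleRedundancy
  visualization:       # UI for viewing logs (Kibana)
    type: kibana
    kibana:
      replicas: 1
```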

For more information on cluster logging, see the OpenShift Container Platform cluster logging documentation.

About Telemetry

Telemetry sends a carefully chosen subset of the cluster monitoring metrics to Red Hat. These metrics are sent continuously and describe:

  • The size of an OpenShift Container Platform cluster

  • The health and status of OpenShift Container Platform components

  • The health and status of any upgrade being performed

  • Limited usage information about OpenShift Container Platform components and features

  • Summary information about alerts reported by the cluster monitoring component

This continuous stream of data is used by Red Hat to monitor the health of clusters in real time and to react as necessary to problems that impact our customers. It also allows Red Hat to roll out OpenShift Container Platform upgrades to customers so as to minimize service impact and continuously improve the upgrade experience.

The information that Telemetry reports is available to Red Hat Support and engineering teams with the same restrictions as accessing data reported via support cases. All connected cluster information is used by Red Hat to help make OpenShift Container Platform better and more intuitive to use. None of the information is shared with third parties.

Information collected by Telemetry

Primary information collected by Telemetry includes:

  • The number of updates available per cluster

  • Channel and image repository used for an update

  • The number of errors that occurred during an update

  • Progress information of running updates

  • The number of machines per cluster

  • The number of CPU cores and the amount of RAM of the machines

  • The number of members in the etcd cluster and number of objects currently stored in the etcd cluster

  • The number of CPU cores and the amount of RAM used per machine type (infra or master)

  • The number of CPU cores and RAM used per cluster

  • The number of running virtual machine instances in the cluster

  • Use of OpenShift Container Platform framework components per cluster

  • The version of the OpenShift Container Platform cluster

  • Health, condition, and status for any OpenShift Container Platform framework component that is installed on the cluster, for example, the Cluster Version Operator, Cluster Monitoring, Image Registry, and Elasticsearch for Logging

  • A unique random identifier that is generated during installation

  • The name of the platform that OpenShift Container Platform is deployed on, such as Amazon Web Services

Telemetry does not collect identifying information such as user names, passwords, or the names or addresses of user resources.

CLI troubleshooting and debugging commands

For a list of the oc client troubleshooting and debugging commands, see the OpenShift Container Platform CLI tools documentation.