When you submit a support case to Red Hat Support, it is helpful to provide debugging information for Red Hat OpenShift Service on AWS and OpenShift Virtualization by using the following tools:
Prometheus is a time-series database and a rule evaluation engine for metrics. Prometheus sends alerts to Alertmanager for processing.
The Alertmanager service handles alerts received from Prometheus and sends them to external notification systems.
For information about the Red Hat OpenShift Service on AWS monitoring stack, see About Red Hat OpenShift Service on AWS monitoring.
Collecting data about your environment minimizes the time required to analyze an issue and determine its root cause.
Set the retention time for Prometheus metrics data to a minimum of seven days.
Configure Alertmanager to capture relevant alerts and to send alert notifications to a dedicated mailbox so that they can be viewed and persisted outside the cluster.
Record the exact number of affected nodes and virtual machines.
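As a quick check of the seven-day retention recommendation above, the following minimal Python sketch tests whether a Prometheus-style duration string such as `7d` or `1w` meets the minimum. The function names and the simplified parser are illustrative assumptions, not part of any Red Hat or Prometheus tooling. The parser is more lenient than Prometheus itself about unit ordering, so treat it as a sanity check only.

```python
import re

# Seconds per unit for Prometheus-style duration strings such as "7d" or "1h30m".
# Assumption: a week is 7 days and a year is 365 days.
_UNIT_SECONDS = {
    "ms": 0.001,
    "s": 1,
    "m": 60,
    "h": 3600,
    "d": 86400,
    "w": 7 * 86400,
    "y": 365 * 86400,
}

# "ms" is listed before "s" and "m" so that "5ms" is read as milliseconds.
_TOKEN = re.compile(r"(\d+)(ms|s|m|h|d|w|y)")

MIN_RETENTION_SECONDS = 7 * 86400  # seven days


def parse_duration(text):
    """Return the duration in seconds; raise ValueError if it is malformed."""
    pos = 0
    total = 0.0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0:
        raise ValueError("empty duration")
    return total


def meets_minimum_retention(text):
    """Return True if the retention string covers at least seven days."""
    return parse_duration(text) >= MIN_RETENTION_SECONDS
```

For example, `meets_minimum_retention("24h")` is false, while `meets_minimum_retention("7d")` and `meets_minimum_retention("15d")` are true.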
Collecting data about malfunctioning virtual machines (VMs) minimizes the time required to analyze an issue and determine its root cause.
Linux VMs: Install the latest QEMU guest agent.
Windows VMs:
Record the Windows patch update details.
If Remote Desktop Protocol (RDP) is enabled, connect by using the desktop viewer to determine whether there is a problem with the connection software.
Collect screenshots of VMs that have crashed before you restart them.
Collect memory dumps from VMs before remediation attempts.
Record factors that the malfunctioning VMs have in common. For example, the VMs run on the same host or use the same network.
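If you keep notes on each malfunctioning VM in a structured form, a short script can surface the factors they share. The following Python sketch assumes a hypothetical list of dictionaries with keys such as `node` and `network`. The record layout, the key names, and the `common_factors` helper are illustrative and do not come from any OpenShift API.

```python
from collections import Counter


def common_factors(vms, min_share=1.0):
    """Return {attribute: value} pairs shared by at least min_share of the VMs.

    vms is a list of dicts describing each malfunctioning VM (hypothetical layout).
    min_share=1.0 means every VM must have the same value for the attribute.
    """
    shared = {}
    if not vms:
        return shared
    # Collect every attribute name that appears on any VM record.
    keys = set().union(*(vm.keys() for vm in vms))
    for key in sorted(keys):
        counts = Counter(vm[key] for vm in vms if key in vm)
        value, count = counts.most_common(1)[0]
        if count / len(vms) >= min_share:
            shared[key] = value
    return shared


# Hypothetical notes on three malfunctioning VMs.
affected_vms = [
    {"name": "vm-a", "node": "worker-1", "network": "default", "os": "windows"},
    {"name": "vm-b", "node": "worker-1", "network": "vlan-20", "os": "linux"},
    {"name": "vm-c", "node": "worker-1", "network": "default", "os": "windows"},
]

print(common_factors(affected_vms))        # attributes shared by all VMs
print(common_factors(affected_vms, 0.6))   # attributes shared by at least 60% of VMs
```

Lowering `min_share` is useful when only most of the affected VMs share a factor, which can still point to a likely cause worth mentioning in the support case.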