To assist in troubleshooting a failed OpenShift Container Platform installation, you can gather logs from the bootstrap and control plane, or master, machines.

Prerequisites
  • You attempted to install an OpenShift Container Platform cluster, and installation failed.

  • You provided an SSH key to the installation program, and that key is in your running ssh-agent process.

Gathering logs from a failed installation

If you gave an SSH key to your installation program, you can gather data about your failed installation.

You use a different command to gather logs about an unsuccessful installation than to gather logs from a running cluster. If you must gather logs from a running cluster, use the oc adm must-gather command.

Prerequisites
  • Your OpenShift Container Platform installation failed before the bootstrap process finished. The bootstrap node must be running and accessible through SSH.

  • The ssh-agent process is active on your computer, and you provided the same SSH key to both the ssh-agent process and the installation program.

  • If you tried to install a cluster on infrastructure that you provisioned, you must have the fully-qualified domain names of the control plane, or master, machines.
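
One way to confirm the ssh-agent prerequisite before you gather logs is to ask the agent to list its keys. The following is a minimal sketch: the function name agent_has_key is hypothetical, and it relies only on the exit status of ssh-add -l, which is nonzero when the agent cannot be reached or holds no keys. It does not verify that the loaded key is the same one that you gave to the installation program.

```shell
# Hypothetical helper: succeeds only if ssh-agent is reachable and holds a key.
agent_has_key() {
  # ssh-add -l lists the keys held by the agent; it exits nonzero when the
  # agent cannot be contacted or has no identities loaded.
  ssh-add -l >/dev/null 2>&1
}

# Example use:
#   agent_has_key || echo "Load your installation SSH key into ssh-agent first." >&2
```

If the check fails, start an agent and add the key, for example with eval "$(ssh-agent -s)" followed by ssh-add <path_to_private_key>, where the path is a placeholder for your own key file.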

Procedure
  1. Generate the commands that are required to obtain the installation logs from the bootstrap and control plane machines:

    • If you used installer-provisioned infrastructure, run the following command:

      $ ./openshift-install gather bootstrap --dir=<directory> (1)
      1 <directory> is the directory in which you stored the OpenShift Container Platform definition files that the installation program creates.

      For installer-provisioned infrastructure, the installation program stores information about the cluster, so you do not specify the host names or IP addresses.

    • If you used infrastructure that you provisioned yourself, run the following command:

      $ ./openshift-install gather bootstrap --dir=<directory> \ (1)
          --bootstrap <bootstrap_address> \ (2)
          --master "<master_address> <master_address> <master_address>" (3)
      1 <directory> is the directory in which you stored the OpenShift Container Platform definition files that the installation program creates.
      2 <bootstrap_address> is the fully-qualified domain name or IP address of the cluster’s bootstrap machine.
      3 <master_address> is the fully-qualified domain name or IP address of a control plane, or master, machine in your cluster. Specify a space-delimited list that contains all the control plane machines in your cluster.

    The command output resembles the following example:

    INFO Use the following commands to gather logs from the cluster
    INFO ssh -A core@<bootstrap_address> '/usr/local/bin/installer-gather.sh <master_address> <master_address> <master_address>'
    INFO scp core@<bootstrap_address>:~/log-bundle.tar.gz .

    Use both of the displayed commands to gather and download the logs.

  2. Gather logs from the bootstrap and master machines:

    $ ssh -A core@<bootstrap_address> '/usr/local/bin/installer-gather.sh <master_address> <master_address> <master_address>'

    You SSH into the bootstrap machine and run the gather tool, which is designed to collect as much data as possible from the bootstrap and control plane machines in your cluster and compress all of the gathered files.

    It is normal to see errors in the command output. If the command output displays the instructions to download the compressed log files, log-bundle.tar.gz, then the command succeeded.

  3. Download the compressed file that contains the logs:

    $ scp core@<bootstrap_address>:~/log-bundle.tar.gz . (1)
    1 <bootstrap_address> is the fully-qualified domain name or IP address of the bootstrap machine.

    The command to download the log files is included at the end of the gather command output.

    If you open a Red Hat support case about your installation failure, include the compressed logs in the case.
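
Steps 2 and 3 of this procedure can be combined into one helper. The following is a rough sketch, not part of the installation program: the function name gather_and_download and the SSH and SCP override variables are assumptions introduced for illustration. The ssh and scp invocations are the ones that the procedure shows.

```shell
# Hypothetical wrapper around steps 2 and 3.
# Arguments: <bootstrap_address> <master_address>...
gather_and_download() {
  local bootstrap=$1
  shift
  # "$*" joins the remaining arguments with spaces, which gives the
  # space-delimited list of control plane machines the gather script takes.
  # Errors in the gather output are expected, so its status is not checked.
  "${SSH:-ssh}" -A "core@${bootstrap}" "/usr/local/bin/installer-gather.sh $*"
  # Download the compressed bundle into the current directory.
  "${SCP:-scp}" "core@${bootstrap}:~/log-bundle.tar.gz" .
}

# Example use (host names are placeholders):
#   gather_and_download <bootstrap_address> <master_address> <master_address> <master_address>
```

The function returns the exit status of the download, so a nonzero result indicates that log-bundle.tar.gz was not retrieved.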

Manually gathering logs with SSH access to your host(s)

Manually gather logs in situations where must-gather or automated collection methods do not work.

Prerequisites
  • You must have SSH access to your host(s).

Procedure
  1. Collect the bootkube.service service logs from the bootstrap host using the journalctl command by running:

    $ journalctl -b -f -u bootkube.service
  2. Collect the bootstrap host’s container logs using the podman logs command. The following loop retrieves the logs of every container on the host:

    $ for pod in $(sudo podman ps -a -q); do sudo podman logs $pod; done
  3. Alternatively, collect the host’s container logs using the tail command by running:

    # tail -f /var/lib/containers/storage/overlay-containers/*/userdata/ctr.log
  4. Collect the kubelet.service and crio.service service logs from the master and worker hosts using the journalctl command by running:

    $ journalctl -b -f -u kubelet.service -u crio.service
  5. Collect the master and worker host container logs using the tail command by running:

    $ sudo tail -f /var/log/containers/*
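
The commands in this procedure use the -f option, so they keep following the logs until you interrupt them. To save a snapshot to a file instead, for example to attach to a support case, you could wrap each command in a small helper. This is a sketch only: the function name save_output and the ./logs directory are assumptions, and the journalctl --no-pager option is used so that the output is not paged.

```shell
# Hypothetical helper: run a command and save its combined stdout and stderr
# to <outdir>/<name>.log. A failing command is reported but does not stop
# the rest of the collection.
save_output() {
  local outdir=$1 name=$2
  shift 2
  mkdir -p "$outdir"
  if ! "$@" >"$outdir/$name.log" 2>&1; then
    echo "warning: '$*' exited with an error; see $outdir/$name.log" >&2
  fi
}

# On the bootstrap host (without -f, journalctl exits after printing):
#   save_output ./logs bootkube journalctl -b --no-pager -u bootkube.service
# On a master or worker host:
#   save_output ./logs kubelet-crio journalctl -b --no-pager -u kubelet.service -u crio.service
```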

Manually gathering logs without SSH access to your host(s)

Manually gather logs in situations where must-gather or automated collection methods do not work.

If you do not have SSH access to your node, you can access the system journal to investigate what is happening on your host.

Prerequisites
  • Your OpenShift Container Platform installation must be complete.

  • Your API service is still functional.

  • You have system administrator privileges.

Procedure
  1. Access journald unit logs under /var/log by running:

    $ oc adm node-logs --role=master -u kubelet
  2. Access host file paths under /var/log by running:

    $ oc adm node-logs --role=master --path=openshift-apiserver
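
To keep the journal output for several units, you can redirect each oc adm node-logs call to its own file. The following sketch assumes a hypothetical function name, save_node_unit_logs, and an OC override variable that defaults to oc. The unit names in the example (kubelet and crio) are taken from the services named earlier in this section.

```shell
# Hypothetical helper: save journald logs for the given units from every
# node that has the given role.
# Arguments: <output_directory> <role> <unit>...
save_node_unit_logs() {
  local outdir=$1 role=$2 unit
  shift 2
  mkdir -p "$outdir"
  for unit in "$@"; do
    # One file per unit, named <role>-<unit>.log.
    "${OC:-oc}" adm node-logs --role="$role" -u "$unit" >"$outdir/$role-$unit.log"
  done
}

# Example use:
#   save_node_unit_logs ./node-logs master kubelet crio
```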