Overview

Following an OpenShift Enterprise upgrade, it may be desirable in extreme cases to downgrade your cluster to a previous version. The following sections outline the required steps for each system in a cluster to perform such a downgrade for the OpenShift Enterprise 3.2 to 3.1 downgrade path.

These steps are currently only supported for RPM-based installations of OpenShift Enterprise and assumes downtime of the entire cluster.

For the OpenShift Enterprise 3.1 to 3.0 downgrade path, see the OpenShift Enterprise 3.1 documentation, which has modified steps.

Verifying Backups

The Ansible playbook used during the upgrade process should have created a backup of the master-config.yaml file and the etcd data directory. Ensure these exist on your masters and etcd members:

/etc/origin/master/master-config.yaml.<timestamp>
/var/lib/origin/etcd-backup-<timestamp>

Also, back up the node-config.yaml file on each node (including masters, which have the node component on them) with a timestamp:

/etc/origin/node/node-config.yaml.<timestamp>

When using an external etcd cluster, the backup is created on all etcd members, though only one is required for the recovery process.

The RPM downgrade process in a later step should create .rpmsave backups of the following files, but it may be a good idea to keep a separate copy regardless:

/etc/sysconfig/atomic-openshift-master
/etc/etcd/etcd.conf (1)
1 Only required if using external etcd.

Shutting Down the Cluster

On all masters, nodes, and etcd members (if using an external etcd cluster), ensure the relevant services are stopped.

On the master in a single master cluster:

# systemctl stop atomic-openshift-master

On each master in a multi-master cluster:

# systemctl stop atomic-openshift-master-api
# systemctl stop atomic-openshift-master-controllers

On all master and node hosts:

# systemctl stop atomic-openshift-node

On any external etcd hosts:

# systemctl stop etcd

Removing RPMs

On all masters, nodes, and etcd members (if using an external etcd cluster), remove the following packages:

# yum remove atomic-openshift \
    atomic-openshift-clients \
    atomic-openshift-node \
    atomic-openshift-master \
    openvswitch \
    atomic-openshift-sdn-ovs \
    tuned-profiles-atomic-openshift-node

If you are using external etcd, also remove the etcd package:

# yum remove etcd

If using the embedded etcd, leave the etcd package installed. It is required for running the etcdctl command to issue operations in later steps.

Downgrading Docker

OpenShift Enterprise 3.2 requires Docker 1.9.1 and also supports Docker 1.10.3, however OpenShift Enterprise 3.1 requires Docker 1.8.2.

Downgrade to Docker 1.8.2 on each host using the following steps:

  1. Remove all local containers and images on the host. Any pods backed by a replication controller will be recreated.

    The following commands are destructive and should be used with caution.

    Delete all containers:

    # docker rm $(docker ps -a -q)

    Delete all images:

    # docker rmi $(docker images -q)
  2. Use yum swap (instead of yum downgrade) to install Docker 1.8.2:

    # yum swap docker-* docker-*1.8.2
    # sed -i 's/--storage-opt dm.use_deferred_deletion=true//' /etc/sysconfig/docker-storage
    # systemctl restart docker
  3. You should now have Docker 1.8.2 installed and running on the host. Verify with the following:

    # docker version
    Client:
     Version:      1.8.2-el7
     API version:  1.20
     Package Version: docker-1.8.2-10.el7.x86_64
    [...]
    
    # systemctl status docker
    ‚óŹ docker.service - Docker Application Container Engine
       Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
       Active: active (running) since Mon 2016-06-27 15:44:20 EDT; 33min ago
    [...]

Reinstalling RPMs

Disable the OpenShift Enterprise 3.3 repositories, and re-enable the 3.2 repositories:

# subscription-manager repos \
    --disable=rhel-7-server-ose-3.3-rpms \
    --enable=rhel-7-server-ose-3.2-rpms

On each master, install the following packages:

# yum install atomic-openshift \
    atomic-openshift-clients \
    atomic-openshift-node \
    atomic-openshift-master \
    openvswitch \
    atomic-openshift-sdn-ovs \
    tuned-profiles-atomic-openshift-node

On each node, install the following packages:

# yum install atomic-openshift \
    atomic-openshift-node \
    openvswitch \
    atomic-openshift-sdn-ovs \
    tuned-profiles-atomic-openshift-node

If using an external etcd cluster, install the following package on each etcd member:

# yum install etcd

Restoring etcd

Bringing OpenShift Enterprise Services Back Online

Verifying the Downgrade

To verify the downgrade, first check that all nodes are marked as Ready:

# oc get nodes
NAME                        STATUS                     AGE
master.example.com          Ready,SchedulingDisabled   165d
node1.example.com           Ready                      165d
node2.example.com           Ready                      165d

Then, verify that you are running the expected versions of the docker-registry and router images, if deployed:

# oc get -n default dc/docker-registry -o json | grep \"image\"
    "image": "openshift3/ose-docker-registry:v3.1.1.6",
# oc get -n default dc/router -o json | grep \"image\"
    "image": "openshift3/ose-haproxy-router:v3.1.1.6",

You can use the diagnostics tool on the master to look for common issues and provide suggestions. In OpenShift Enterprise 3.1, the oc adm diagnostics tool is available as openshift ex diagnostics:

# openshift ex diagnostics
...
[Note] Summary of diagnostics execution:
[Note] Completed with no errors or warnings seen.