Deploying CloudForms on OpenShift | Installation and Configuration

Overview
Requirements
Role Variables
Pre-flight Checks
Running the Playbook
Verifying the Deployment
- Describing the CFME Pod
- Opening a Remote Shell to the CFME Pod
Manual Cleanup

Overview

Deploying Red Hat CloudForms on OpenShift Container Platform is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs), might not be functionally complete, and Red Hat does not recommend to use them for production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information on Red Hat Technology Preview features support scope, see https://access.redhat.com/support/offerings/techpreview/.

The OpenShift Container Platform installer includes an Ansible role and playbook for deploying Red Hat CloudForms 4.5 (CloudForms Management Engine 5.8, or CFME) on OpenShift Container Platform. This deployment must be run on a dedicated OpenShift Container Platform cluster, meaning that no applications or infrastructure other than those required for the CFME deployment should be running on it.

This role is based on the work in the upstream manageiq/manageiq-pods project. For additional details on configurations specific to ManageIQ (MIQ), such as optional post-installation tasks, see the project’s upstream documentation.

The terms CFME and MIQ are interchangeable in this deployment.

Requirements

Prerequisites

A running OpenShift Container Platform 3.6 cluster
NFS or other compatible persistent volume storage provider
A cluster-admin user (created by this role, if required)

Cluster Sizing

In order to avoid random deployment failures due to insufficient resources, the following minimum cluster size is recommended:

Type	Size	CPUs	Memory
Masters	1+	8	12 GB
Nodes	2+	4	8 GB
Persistent Volume (PV) Storage	25 GB	N/A	N/A

Type

Size

CPUs

Memory

Masters

12 GB

Nodes

8 GB

Persistent Volume (PV) Storage

25 GB

N/A

CFME has hard requirements for memory. CFME will not install if your infrastructure does not meet or exceed the requirements given above.

Other Sizing Considerations

Recommendations assume CFME will be the only application running on this OpenShift Container Platform cluster.
Alternatively, you can provision an infrastructure node to run registry, router, metrics, and/or logging pods.
Each CFME application pod will consume at least 3 GB of RAM on initial deployment (a blank deployment without providers).
RAM consumption will ramp up higher depending on appliance use; after providers are added, expect higher resource consumption.

Assumptions

You must meet or exceed the Cluster Sizing requirements.
Your NFS server is on your master host.
Your PV backing the NFS storage volume is mounted on /exports/.

NFS must export the following directories to back the PVs:

/exports/miq-pv0[123]

If the required directories are not present during installation, they will be created using the recommended permissions per the upstream documentation:

UID/GID: root/root
Mode: 0775

If you are using a separate volume (/dev/vdX) for NFS storage, ensure it is mounted on /exports/ before running this role.

Role Variables

Core variables in this role:

Name Default Value Description

Name	Default Value	Description
`openshift_cfme_install_app`	`False`	`True`: Install everything and create a new CFME application. `False`: Only install the templates and scaffolding.

openshift_cfme_install_app

False

True: Install everything and create a new CFME application. False: Only install the templates and scaffolding.

Variables that you can override have defaults defined in /usr/share/ansible/openshift-ansible/roles/openshift_Cfme/defaults/main.yml.

Pre-flight Checks

As documented in Prerequisites, you must already have your OpenShift Container Platform cluster up and running.

The CFME pod is fairly large (about 1.7 GB), so to save some spin-up time post-deployment, you can optionally begin pre-pulling the following to each of your nodes now:

# docker pull registry.access.redhat.com/cloudforms45/cfme-openshift-app
# docker pull registry.access.redhat.com/cloudforms45/cfme-openshift-postgresql
# docker pull registry.access.redhat.com/cloudforms45/cfme-openshift-memcached

Running the Playbook

Do not re-run the entrypoint playbook multiple times in a row without cleaning up after previous runs if some of the CFME steps have ran. This is a known issue. Further instructions are provided in Manual Clean-up.

To run the playbook to deploy CFME:

In your existing inventory file, set the openshift_cfme_install_app parameter to True under the [OSEv3:vars] section:
```
[OSEv3:vars]
openshift_cfme_install_app=True
```

Run the following entrypoint playbook to install CFME using your existing inventory file:

# ansible-playbook -v [-i /path/to/file] \
    /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cfme/config.yml

Verifying the Deployment

After the installation completes, the playbook shows the following information:

TASK [openshift_cfme : Status update] *********************************************************
ok: [cfme-node.example.com] => {
    "msg": "CFME has been deployed. Note that there will be a delay before it is fully initialized.\n"
}

This will take several minutes (possibly 10 or more), depending on your network connection.

Describing the CFME Pod

To gain further insight into the deployment process during initialization, use the oc describe command to view details about the CFME pod:

$ oc describe pod manageiq-0

Readiness probes will take a while to become Healthy in this output. The initial health probes will not happen for at least eight minutes depending on how long it takes you to pull down the large images. CFME is a large application so it may take a considerable amount of time for it to deploy and be marked as Healthy.

You can find which node the application is running on by checking the oc describe output, as well:

Successfully assigned manageiq-0 to <host|ip>

You can run a docker pull command on the node to monitor the progress of the image pull:

# docker pull registry.access.redhat.com/cloudforms45/cfme-openshift-app
Using default tag: latest
Trying to pull repository registry.access.redhat.com/cloudforms45/cfme-openshift-app ...
sha256:bc6baac5aeba5affe0bada1bfbe330cd2d58da82767d66b3fa9ab12471a1b0f5: Pulling from registry.access.redhat.com/cloudforms45/cfme-openshift-app

d55ab3b04d8b: Already exists
b94f985aad49: Already exists
3cd23d7690bd: Already exists
Digest: sha256:bc6baac5aeba5affe0bada1bfbe330cd2d58da82767d66b3fa9ab12471a1b0f5
Status: Image is up to date for registry.access.redhat.com/cloudforms45/cfme-openshift-app:latest

The output above demonstrates the case where the image has been successfully pulled already. If the image is not completely pulled already, then you will see multiple progress bars detailing each image layer download status.

Opening a Remote Shell to the CFME Pod

You can use the oc rsh command to open a remote shell session to the CFME pod, allowing for additional inspection and progress monitoring techniques.

On your master node, switch to the cfme project (or whatever you named it if you overrode the openshift_cfme_project variable), and check on the pod states:
```
$ oc project cfme
Now using project "cfme" on server "https://10.10.0.100:8443".

$ oc get pod
NAME                 READY     STATUS    RESTARTS   AGE
manageiq-0           0/1       Running   0          14m
memcached-1-3lk7g    1/1       Running   0          14m
postgresql-1-12slb   1/1       Running   0          14m
```
Note how the manageiq-0 pod says 0/1 under the READY column. After some time (depending on your network connection), you will be able to oc rsh into the pod to find out more of what is happening in real time:

Verify that the CFME pod has completed deploying and initializing. You can do this one of two ways:

For a simple verification, run the following command after the pod has entered a ready state:
```
$ oc rsh manageiq-0 journalctl -f -u appliance-initialize.service
```
Watch until the output says:
```
Started Initialize Appliance Database
```
At this point, you have verified that the CFME pod has completed deploying and initializing successfully.

For a more detailed verification, including a fuller explanation on the initialization process and more interactive inspection techniques:

Open a remote shell session to the manageiq pod:
```
$ oc rsh manageiq-0 bash -l
```
The oc rsh command opens a shell in your pod. In this case, it is the pod called manageiq-0. Systemd is managing the services in this pod, so you can use the list-units command to see what is running currently:
```
# systemctl list-units | grep appliance
```
If you see the appliance-initialize service running, this indicates that the basic setup is still in progress.

You can monitor the appliance-initialize process with the journalctl command:

# journalctl -f -u appliance-initialize.service
Jun 14 14:55:52 manageiq-0 appliance-initialize.sh[58]: == Checking deployment status ==
Jun 14 14:55:52 manageiq-0 appliance-initialize.sh[58]: No pre-existing EVM configuration found on region PV
Jun 14 14:55:52 manageiq-0 appliance-initialize.sh[58]: == Checking for existing data on server PV ==
Jun 14 14:55:52 manageiq-0 appliance-initialize.sh[58]: == Starting New Deployment ==
Jun 14 14:55:52 manageiq-0 appliance-initialize.sh[58]: == Applying memcached config ==
Jun 14 14:55:53 manageiq-0 appliance-initialize.sh[58]: == Initializing Appliance ==
Jun 14 14:55:57 manageiq-0 appliance-initialize.sh[58]: create encryption key
Jun 14 14:55:57 manageiq-0 appliance-initialize.sh[58]: configuring external database
Jun 14 14:55:57 manageiq-0 appliance-initialize.sh[58]: Checking for connections to the database...
Jun 14 14:56:09 manageiq-0 appliance-initialize.sh[58]: Create region starting
Jun 14 14:58:15 manageiq-0 appliance-initialize.sh[58]: Create region complete
Jun 14 14:58:15 manageiq-0 appliance-initialize.sh[58]: == Initializing PV data ==
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: == Initializing PV data backup ==
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: sending incremental file list
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: created directory /persistent/server-deploy/backup/backup_2017_06_14_145816
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/www/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/www/miq/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/www/miq/vmdb/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/www/miq/vmdb/REGION
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/www/miq/vmdb/certs/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/www/miq/vmdb/certs/v2_key
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/www/miq/vmdb/config/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/www/miq/vmdb/config/database.yml
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: server-data/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: server-data/var/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: server-data/var/www/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: server-data/var/www/miq/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: server-data/var/www/miq/vmdb/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: server-data/var/www/miq/vmdb/GUID
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: sent 1330 bytes  received 136 bytes  2932.00 bytes/sec
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: total size is 770  speedup is 0.53
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: == Restoring PV data symlinks ==
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: /var/www/miq/vmdb/REGION symlink is already in place, skipping
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: /var/www/miq/vmdb/config/database.yml symlink is already in place, skipping
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: /var/www/miq/vmdb/certs/v2_key symlink is already in place, skipping
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: /var/www/miq/vmdb/log symlink is already in place, skipping
Jun 14 14:58:28 manageiq-0 systemctl[304]: Removed symlink /etc/systemd/system/multi-user.target.wants/appliance-initialize.service.
Jun 14 14:58:29 manageiq-0 systemd[1]: Started Initialize Appliance Database.

Most of this output is the initial database seeding process. This process can be time consuming.

At the bottom of the log, there is a special line from the systemctl service:

Removed symlink
/etc/systemd/system/multi-user.target.wants/appliance-initialize.service

The appliance-initialize service is no longer marked as enabled. This indicates that the base application initialization is now complete.

Open a remote shell session to the manageiq pod, if you have not already:
```
$ oc rsh manageiq-0 bash -l
```

From the oc rsh session, use the ps command to monitor for the httpd processes starting. You will see output similar to the following when that stage has completed:

# ps aux | grep http
root       1941  0.0  0.1 249820  7640 ?        Ss   15:02   0:00 /usr/sbin/httpd -DFOREGROUND
apache     1942  0.0  0.0 250752  6012 ?        S    15:02   0:00 /usr/sbin/httpd -DFOREGROUND
apache     1943  0.0  0.0 250472  5952 ?        S    15:02   0:00 /usr/sbin/httpd -DFOREGROUND
apache     1944  0.0  0.0 250472  5916 ?        S    15:02   0:00 /usr/sbin/httpd -DFOREGROUND
apache     1945  0.0  0.0 250360  5764 ?        S    15:02   0:00 /usr/sbin/httpd -DFOREGROUND

Furthermore, you can find other related processes by just looking for ones with MIQ in their name:

# ps aux | grep -i miq
root        333 27.7  4.2 555884 315916 ?       Sl   14:58   3:59 MIQ Server
root       1976  0.6  4.0 507224 303740 ?       SNl  15:02   0:03 MIQ: MiqGenericWorker id: 1, queue: generic
root       1984  0.6  4.0 507224 304312 ?       SNl  15:02   0:03 MIQ: MiqGenericWorker id: 2, queue: generic
root       1992  0.9  4.0 508252 304888 ?       SNl  15:02   0:05 MIQ: MiqPriorityWorker id: 3, queue: generic
root       2000  0.7  4.0 510308 304696 ?       SNl  15:02   0:04 MIQ: MiqPriorityWorker id: 4, queue: generic
root       2008  1.2  4.0 514000 303612 ?       SNl  15:02   0:07 MIQ: MiqScheduleWorker id: 5
root       2026  0.2  4.0 517504 303644 ?       SNl  15:02   0:01 MIQ: MiqEventHandler id: 6, queue: ems
root       2036  0.2  4.0 518532 303768 ?       SNl  15:02   0:01 MIQ: MiqReportingWorker id: 7, queue: reporting
root       2044  0.2  4.0 519560 303812 ?       SNl  15:02   0:01 MIQ: MiqReportingWorker id: 8, queue: reporting
root       2059  0.2  4.0 528372 303956 ?       SNl  15:02   0:01 puma 3.3.0 (tcp://127.0.0.1:5000) [MIQ: Web Server Worker]
root       2067  0.9  4.0 529664 305716 ?       SNl  15:02   0:05 puma 3.3.0 (tcp://127.0.0.1:3000) [MIQ: Web Server Worker]
root       2075  0.2  4.0 529408 304056 ?       SNl  15:02   0:01 puma 3.3.0 (tcp://127.0.0.1:4000) [MIQ: Web Server Worker]
root       2329  0.0  0.0  10640   972 ?        S+   15:13   0:00 grep --color=auto -i miq

Finally, still in the oc rsh session, test if the application is running correctly by requesting the application homepage. If the page is available, the page title will be ManageIQ: Login:
```
# curl -s -k https://localhost | grep -A2 '<title>'
<title>
ManageIQ: Login
</title>
```

The -s flag makes curl operations silent and the -k flag to ignore errors about untrusted certificates.

Manual Cleanup

At this time, uninstallation and cleanup of CFME deployments on OpenShift Container Platform is still a manual process. You must follow these steps to fully remove CFME from your cluster:

Delete the project:
```
$ oc delete project cfme
```

Delete the PVs:

$ oc delete pv miq-pv01
$ oc delete pv miq-pv02
$ oc delete pv miq-pv03

Clean out the old PV data:

$ cd /exports/
$ find miq* -type f -delete
$ find miq* -type d -delete

Remove the NFS exports:

$ rm /etc/exports.d/openshift_cfme.exports
$ exportfs -ar

Delete the cfme user:

$ oc delete user cfme

The oc delete project cfme command will return quickly, however it will continue to operate in the background. Continue running oc get project after you have completed the other steps to monitor the pods and final project termination progress.