Overview

OpenShift Enterprise clusters can be provisioned with persistent storage using GlusterFS.

Persistent volumes (PVs) and persistent volume claims (PVCs) can share volumes across a single project. While the GlusterFS-specific information contained in a PV definition could also be defined directly in a pod definition, doing so does not create the volume as a distinct cluster resource, making the volume more susceptible to conflicts.

This topic presumes some familiarity with OpenShift Enterprise and GlusterFS. See the Persistent Storage topic for details on the OpenShift Enterprise PV framework in general.

High-availability of storage in the infrastructure is left to the underlying storage provider.

Provisioning

To provision GlusterFS volumes the following are required:

  • An existing storage device in your underlying infrastructure.

  • A distinct list of servers (IP addresses) in the Gluster cluster, to be defined as endpoints.

  • A service, to persist the endpoints (optional).

  • An existing Gluster volume to be referenced in the persistent volume object.

  • glusterfs-fuse installed on each schedulable OpenShift Enterprise node in your cluster:

    # yum install glusterfs-fuse

OpenShift Enterprise nodes can also host a gluster node (referred to as hyperconverged storage). However, performance may be less predictable and harder to manage.

Creating Gluster Endpoints

An endpoints definition defines the GlusterFS cluster as EndPoints and includes the IP addresses of your Gluster servers. The port value can be any numeric value within the accepted range of ports. Optionally, you can create a service that persists the endpoints.

  1. Define the following service:

    Example 1. Gluster Service Definition
    apiVersion: v1
    kind: Service
    metadata:
      name: glusterfs-cluster (1)
    spec:
      ports:
      - port: 1
    1 This name must be defined in the endpoints definition to match the endpoints to this service
  2. Save the service definition to a file, for example gluster-service.yaml, then create the service:

    $ oc create -f gluster-service.yaml
  3. Verify that the service was created:

    # oc get services
    NAME                       CLUSTER_IP       EXTERNAL_IP   PORT(S)    SELECTOR        AGE
    glusterfs-cluster          172.30.205.34    <none>        1/TCP      <none>          44s
  4. Define the Gluster endpoints:

    Example 2. Gluster Endpoints Definition
    apiVersion: v1
    kind: Endpoints
    metadata:
      name: glusterfs-cluster (1)
    subsets:
      - addresses:
          - ip: 192.168.122.221 (2)
        ports:
          - port: 1
      - addresses:
          - ip: 192.168.122.222 (2)
        ports:
          - port: 1  (3)
    1 This name must match the service name from step 1.
    2 The ip values must be the actual IP addresses of a Gluster server, not fully-qualified host names.
    3 The port number is ignored.
  5. Save the endpoints definition to a file, for example gluster-endpoints.yaml, then create the endpoints:

    $ oc create -f gluster-endpoints.yaml
    endpoints "glusterfs-cluster" created
  6. Verify that the endpoints were created:

    $ oc get endpoints
    NAME                ENDPOINTS                             AGE
    docker-registry     10.1.0.3:5000                         4h
    glusterfs-cluster   192.168.122.221:1,192.168.122.222:1   11s
    kubernetes          172.16.35.3:8443                      4d

Creating the Persistent Volume

  1. Next, define the PV in an object definition before creating it in OpenShift Enterprise:

    Example 3. Persistent Volume Object Definition Using GlusterFS
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: gluster-default-volume (1)
    spec:
      capacity:
        storage: 2Gi (2)
      accessModes: (3)
        - ReadWriteMany
      glusterfs: (4)
        endpoints: glusterfs-cluster (5)
        path: myVol1 (6)
        readOnly: false
      persistentVolumeReclaimPolicy: Recycle
    1 The name of the volume. This is how it is identified via persistent volume claims or from pods.
    2 The amount of storage allocated to this volume.
    3 accessModes are used as labels to match a PV and a PVC. They currently do not define any form of access control.
    4 The volume type being used, in this case the glusterfs plug-in.
    5 The endpoints name that defines the Gluster cluster created in Creating Gluster Endpoints.
    6 The Gluster volume that will be accessed, as shown in the gluster volume status command.
  2. Save the definition to a file, for example gluster-pv.yaml, and create the persistent volume:

    # oc create -f gluster-pv.yaml
  3. Verify that the persistent volume was created:

    # oc get pv
    NAME                     LABELS    CAPACITY     ACCESSMODES   STATUS      CLAIM     REASON    AGE
    gluster-default-volume   <none>    2147483648   RWX           Available                       2s

Creating the Persistent Volume Claim

Developers request GlusterFS storage by referencing either a PVC or the Gluster volume plug-in directly in the volumes section of a pod spec. A PVC exists only in the user’s project and can only be referenced by pods within that project. Any attempt to access a PV across a project causes the pod to fail.

  1. Create a PVC that will bind to the new PV:

    Example 4. PVC Object Definition
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: gluster-claim
    spec:
      accessModes:
      - ReadWriteMany (1)
      resources:
         requests:
           storage: 1Gi  (2)
    1 accessModes do not enforce security, but rather act as labels to match a PV to a PVC.
    2 This claim will look for PVs offering 1Gi or greater capacity.
  2. Save the definition to a file, for example gluster-claim.yaml, and create the PVC:

    # oc create -f gluster-claim.yaml

    PVs and PVCs make sharing a volume across a project simpler. The gluster-specific information contained in the PV definition can also be defined directly in a pod specification.

Gluster Volume Security

This section covers Gluster volume security, including matching permissions and SELinux considerations. Understanding the basics of POSIX permissions, process UIDs, supplemental groups, and SELinux is presumed.

See the full Volume Security topic before implementing Gluster volumes.

As an example, assume that the target Gluster volume, HadoopVol is mounted under /mnt/glusterfs/, with the following POSIX permissions and SELinux labels:

# ls -lZ /mnt/glusterfs/
drwxrwx---. yarn hadoop system_u:object_r:fusefs_t:s0    HadoopVol

# id yarn
uid=592(yarn) gid=590(hadoop) groups=590(hadoop)

In order to access the HadoopVol volume, containers must match the SELinux label, and run with a UID of 592 or 590 in their supplemental groups. The OpenShift Enterprise GlusterFS plug-in mounts the volume in the container with the same POSIX ownership and permissions found on the target gluster mount, namely the owner will be 592 and group ID will be 590. However, the container is not run with its effective UID equal to 592, nor with its GID equal to 590, which is the desired behavior. Instead, a container’s UID and supplemental groups are determined by Security Context Constraints (SCCs) and the project defaults.

Group IDs

Configure Gluster volume access by using supplemental groups, assuming it is not an option to change permissions on the Gluster mount. Supplemental groups in OpenShift Enterprise are used for shared storage, such as GlusterFS. In contrast, block storage, such as Ceph RBD or iSCSI, use the fsGroup SCC strategy and the fsGroup value in the pod’s securityContext.

Use supplemental group IDs instead of user IDs to gain access to persistent storage. Supplemental groups are covered further in the full Volume Security topic.

The group ID on the target Gluster mount example above is 590. Therefore, a pod can define that group ID using supplementalGroups under the pod-level securityContext definition. For example:

spec:
  containers:
    - name:
    ...
  securityContext: (1)
    supplementalGroups: [590] (2)
1 securityContext must be defined at the pod level, not under a specific container.
2 An array of GIDs defined at the pod level.

Assuming there are no custom SCCs that satisfy the pod’s requirements, the pod matches the restricted SCC. This SCC has the supplementalGroups strategy set to RunAsAny, meaning that any supplied group IDs are accepted without range checking.

As a result, the above pod will pass admissions and can be launched. However, if group ID range checking is desired, use a custom SCC, as described in pod security and custom SCCs. A custom SCC can be created to define minimum and maximum group IDs, enforce group ID range checking, and allow a group ID of 590.

User IDs

User IDs can be defined in the container image or in the pod definition. The full Volume Security topic covers controlling storage access based on user IDs, and should be read prior to setting up NFS persistent storage.

Use supplemental group IDs instead of user IDs to gain access to persistent storage.

In the target Gluster mount example above, the container needs a UID set to 592, so the following can be added to the pod definition:

spec:
  containers: (1)
  - name:
  ...
    securityContext:
      runAsUser: 592  (2)
1 Pods contain a securtityContext specific to each container and a pod-level securityContext, which applies to all containers defined in the pod.
2 The UID defined on the Gluster mount.

With the default project and the restricted SCC, a pod’s requested user ID of 592 will not be allowed, and the pod will fail. This is because:

  • The pod requests 592 as its user ID.

  • All SCCs available to the pod are examined to see which SCC will allow a user ID of 592.

  • Because all available SCCs use MustRunAsRange for their runAsUser strategy, UID range checking is required.

  • 592 is not included in the SCC or project’s user ID range.

Do not modify the predefined SCCs. Insead, create a custom SCC so that minimum and maximum user IDs are defined, UID range checking is still enforced, and the UID of 592 will be allowed.

SELinux

See the full Volume Security topic for information on controlling storage access in conjunction with using SELinux.

By default, SELinux does not allow writing from a pod to a remote Gluster server.

To enable writing to GlusterFS volumes with SELinux enforcing on each node, run:

$ sudo setsebool -P virt_sandbox_use_fusefs on

The virt_sandbox_use_fusefs boolean is defined by the docker-selinux package. If you get an error saying it is not defined, please ensure that this package is installed.

The -P option makes the bool persistent between reboots.