The File Integrity Operator is an OpenShift Container Platform Operator that continually runs file integrity checks on the cluster nodes. It deploys a daemon set that initializes and runs privileged Advanced Intrusion Detection Environment (AIDE) containers on each node, and it provides a status object with a log of the files that have been modified since the initial run of the daemon set pods.

Currently, only Red Hat Enterprise Linux CoreOS (RHCOS) nodes are supported.

Understanding the FileIntegrity custom resource

An instance of a FileIntegrity custom resource (CR) represents a set of continuous file integrity scans for one or more nodes.

Each FileIntegrity CR is backed by a daemon set running AIDE on the nodes matching the FileIntegrity CR specification.

The following example FileIntegrity CR enables scans on only the worker nodes, but otherwise uses the defaults.

Example FileIntegrity CR
apiVersion: fileintegrity.openshift.io/v1alpha1
kind: FileIntegrity
metadata:
  name: worker-fileintegrity
  namespace: openshift-file-integrity
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  config: {}

Checking the FileIntegrity custom resource status

The FileIntegrity custom resource (CR) reports its status through the .status.phase field.

  • To query the FileIntegrity CR status, run:

    $ oc get fileintegrities/worker-fileintegrity -o jsonpath="{ .status.phase }"
    Example output
    Active

FileIntegrity custom resource phases

  • Pending - The phase after the custom resource (CR) is created.

  • Active - The phase when the backing daemon set is up and running.

  • Initializing - The phase when the AIDE database is being reinitialized.
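
The phases above lend themselves to a simple readiness check. The following Python sketch polls the phase until it becomes Active. It is illustrative only: the names get_phase and wait_for_active are hypothetical, and the default getter assumes that oc is on the PATH, that you are logged in to the cluster, and that the CR lives in the openshift-file-integrity namespace.

```python
import subprocess
import time


def get_phase(name="worker-fileintegrity", namespace="openshift-file-integrity"):
    """Return .status.phase of a FileIntegrity CR by calling `oc`.

    Assumption: `oc` is installed and already logged in to the cluster.
    """
    out = subprocess.run(
        ["oc", "get", f"fileintegrities/{name}", "-n", namespace,
         "-o", "jsonpath={.status.phase}"],
        check=True, capture_output=True, text=True,
    )
    return out.stdout.strip()


def wait_for_active(fetch=get_phase, timeout=300.0, interval=5.0, sleep=time.sleep):
    """Poll `fetch` until it returns "Active".

    Returns True once the phase is Active, or False if roughly `timeout`
    seconds of waiting elapse first.
    """
    waited = 0.0
    while True:
        if fetch() == "Active":
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

Passing fetch and sleep as parameters keeps the polling logic testable without a cluster; in normal use, calling wait_for_active() with no arguments is enough.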

Understanding the FileIntegrityNodeStatuses object

The scan results of the FileIntegrity CR are reported in another object called FileIntegrityNodeStatuses.

$ oc get fileintegritynodestatuses
Example output
NAME                                                AGE
worker-fileintegrity-ip-10-0-130-192.ec2.internal   101s
worker-fileintegrity-ip-10-0-147-133.ec2.internal   109s
worker-fileintegrity-ip-10-0-165-160.ec2.internal   102s

The FileIntegrityNodeStatus object might not be created until the second run of the scanner is finished. The period is configurable.

There is one result object per node. The nodeName attribute of each FileIntegrityNodeStatus object corresponds to the node being scanned. The status of the file integrity scan is represented in the results array, which holds scan conditions.

$ oc get fileintegritynodestatuses -ojsonpath='{.items[*].results}' | jq
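
To relate each results array back to its node, the list output can be post-processed. The following Python sketch is one way to do that, assuming the input is the JSON printed by oc get fileintegritynodestatuses -o json and that each item carries the nodeName and results attributes described above. The function name latest_condition_per_node is hypothetical.

```python
import json


def latest_condition_per_node(status_list_json):
    """Map each nodeName to the condition of its most recent scan result.

    Assumption: `status_list_json` is the output of
    `oc get fileintegritynodestatuses -o json`.
    """
    latest = {}
    for item in json.loads(status_list_json).get("items", []):
        results = item.get("results", [])
        if not results:
            # No scan has reported for this node yet
            continue
        # Timestamps in the same ISO 8601 UTC format sort correctly as strings
        newest = max(results, key=lambda r: r.get("lastProbeTime", ""))
        latest[item["nodeName"]] = newest["condition"]
    return latest
```

Nodes whose results array is still empty are omitted from the returned mapping rather than reported with a placeholder value.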

The FileIntegrityNodeStatus object reports the latest status of an AIDE run and exposes it as Failed, Succeeded, or Errored in a status field.

$ oc get fileintegritynodestatuses -w
Example output
NAME   NODE   STATUS
...    ...    Succeeded
...    ...    Succeeded
...    ...    Succeeded
...    ...    Succeeded
...    ...    Failed
...    ...    Succeeded
...    ...    Succeeded
...    ...    Succeeded
...    ...    Failed
...    ...    Succeeded
...    ...    Succeeded

FileIntegrityNodeStatus CR status types

These conditions are reported in the results array of the corresponding FileIntegrityNodeStatus CR status:

  • Succeeded - The integrity check passed; the files and directories covered by the AIDE check have not been modified since the database was last initialized.

  • Failed - The integrity check failed; some files or directories covered by the AIDE check have been modified since the database was last initialized.

  • Errored - The AIDE scanner encountered an internal error.

FileIntegrityNodeStatus CR success example

Example output of a condition with a success status
[
  { "condition": "Succeeded", "lastProbeTime": "2020-09-15T12:45:57Z" }
]
[
  { "condition": "Succeeded", "lastProbeTime": "2020-09-15T12:46:03Z" }
]
[
  { "condition": "Succeeded", "lastProbeTime": "2020-09-15T12:45:48Z" }
]

In this case, all three scans succeeded and so far there are no other conditions.

FileIntegrityNodeStatus CR failure status example

To simulate a failure condition, modify one of the files AIDE tracks. For example, modify /etc/resolv.conf on one of the worker nodes:

$ oc debug node/ip-10-0-130-192.ec2.internal
Example output
Creating debug namespace/openshift-debug-node-ldfbj ...
Starting pod/ip-10-0-130-192ec2internal-debug ...
To use host binaries, run `chroot /host`
Pod IP:
If you don't see a command prompt, try pressing enter.
sh-4.2# echo "# integrity test" >> /host/etc/resolv.conf
sh-4.2# exit

Removing debug pod ...
Removing debug namespace/openshift-debug-node-ldfbj ...

After some time, the Failed condition is reported in the results array of the corresponding FileIntegrityNodeStatus object. The previous Succeeded condition is retained, which allows you to pinpoint the time the check failed.

$ oc get fileintegritynodestatuses/worker-fileintegrity-ip-10-0-130-192.ec2.internal -ojsonpath='{.results}' | jq -r

Alternatively, to query the results of all nodes without specifying an object name, run:

$ oc get fileintegritynodestatuses -ojsonpath='{.items[*].results}' | jq
Example output
[
  {
    "condition": "Succeeded",
    "lastProbeTime": "2020-09-15T12:54:14Z"
  },
  {
    "condition": "Failed",
    "filesChanged": 1,
    "lastProbeTime": "2020-09-15T12:57:20Z",
    "resultConfigMapName": "aide-ds-worker-fileintegrity-ip-10-0-130-192.ec2.internal-failed",
    "resultConfigMapNamespace": "openshift-file-integrity"
  }
]
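
Because the earlier Succeeded condition is retained, the change can be bracketed between two probe times. The following Python sketch shows one way to extract that window from a results array such as the one above. The function name first_failure_window is hypothetical, and the sketch assumes every entry has the condition and lastProbeTime keys shown in the example output.

```python
def first_failure_window(results):
    """Return (last_succeeded_time, first_failed_time) for the first failure.

    Returns None when the results contain no Failed condition. The first
    element of the tuple is None if no Succeeded probe precedes the failure.
    """
    # Order by probe time; same-format ISO 8601 UTC strings sort chronologically
    ordered = sorted(results, key=lambda r: r["lastProbeTime"])
    last_ok = None
    for entry in ordered:
        if entry["condition"] == "Succeeded":
            last_ok = entry["lastProbeTime"]
        elif entry["condition"] == "Failed":
            return last_ok, entry["lastProbeTime"]
    return None
```

Applied to the example output, the tracked file was modified at some point between the two returned timestamps.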

The Failed condition points to a config map that gives more details about what exactly failed and why:

$ oc describe cm aide-ds-worker-fileintegrity-ip-10-0-130-192.ec2.internal-failed
Example output
Name:         aide-ds-worker-fileintegrity-ip-10-0-130-192.ec2.internal-failed
Namespace:    openshift-file-integrity
Annotations: 0


AIDE 0.15.1 found differences between database and filesystem!!
Start timestamp: 2020-09-15 12:58:15

  Total number of files:  31553
  Added files:                0
  Removed files:              0
  Changed files:              1

Changed files:

changed: /hostroot/etc/resolv.conf

Detailed information about changes:

File: /hostroot/etc/resolv.conf
 SHA512   : sTQYpB/AL7FeoGtu/1g7opv6C+KT1CBJ , qAeM+a8yTgHPnIHMaRlS+so61EN8VOpg

Events:  <none>
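
When many nodes report failures, it can help to pull the summary counts out of each log programmatically. The following Python sketch parses the summary lines of a log in the format shown above. The function name parse_aide_summary is hypothetical, and the pattern is based only on this AIDE 0.15.1 example, so other AIDE versions might need a different expression.

```python
import re

# Matches summary lines such as "  Changed files:            1".
# The count must be on the same line, so the bare "Changed files:" section
# heading further down in the log is not matched.
_SUMMARY_LINE = re.compile(
    r"^[ \t]*(Total number of files|Added files|Removed files|Changed files):"
    r"[ \t]*(\d+)[ \t]*$",
    re.MULTILINE,
)


def parse_aide_summary(log_text):
    """Return the AIDE summary counters found in `log_text` as a dict of ints."""
    return {label: int(count) for label, count in _SUMMARY_LINE.findall(log_text)}
```

Labels that do not appear in the log are simply absent from the returned dictionary.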

Due to the config map data size limit, AIDE logs over 1 MB are added to the failure config map as a base64-encoded gzip archive. In this case, pipe the log data from the above config map to base64 --decode | gunzip to read it. Compressed logs are indicated by the presence of an annotation key in the config map.
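
The same decoding can be done in a script. The following Python sketch is the equivalent of base64 --decode | gunzip using only the standard library. The function name decode_compressed_log is hypothetical, and how you obtain the encoded string is an assumption: one option is to read the log entry from the config map data with oc get cm and a jsonpath expression.

```python
import base64
import gzip


def decode_compressed_log(encoded):
    """Decode a base64-encoded gzip archive and return the log as text.

    `encoded` may be a str or bytes. Bytes that are not valid UTF-8 are
    replaced rather than raising an error.
    """
    raw = gzip.decompress(base64.b64decode(encoded))
    return raw.decode("utf-8", errors="replace")
```

The decoded text has the same layout as an uncompressed AIDE log, so it can be read or processed in the same way.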