
OpenShift Container Platform uses Fluentd to collect operations and application logs from your cluster and enriches the logs with Kubernetes Pod and Namespace metadata.

You can configure log rotation and the log location, use an external log aggregator, and make other configurations.

You must set cluster logging to Unmanaged state before performing these configurations, unless otherwise noted. For more information, see Changing the cluster logging management state.

Viewing Fluentd pods

You can use the oc get pods -o wide command to see the nodes where the Fluentd pods are deployed.

Procedure

Run the following command in the openshift-logging project:

$ oc get pods -o wide | grep fluentd

NAME                         READY     STATUS    RESTARTS   AGE     IP            NODE                           NOMINATED NODE
fluentd-5mr28                1/1       Running   0          4m56s   10.129.2.12   ip-10-0-164-233.ec2.internal   <none>
fluentd-cnc4c                1/1       Running   0          4m56s   10.128.2.13   ip-10-0-155-142.ec2.internal   <none>
fluentd-nlp8z                1/1       Running   0          4m56s   10.131.0.13   ip-10-0-138-77.ec2.internal    <none>
fluentd-rknlk                1/1       Running   0          4m56s   10.128.0.33   ip-10-0-128-130.ec2.internal   <none>
fluentd-rsm49                1/1       Running   0          4m56s   10.129.0.37   ip-10-0-163-191.ec2.internal   <none>
fluentd-wjt8s                1/1       Running   0          4m56s   10.130.0.42   ip-10-0-156-251.ec2.internal   <none>

Viewing Fluentd logs

How you view logs depends upon the LOGGING_FILE_PATH setting.

  • If LOGGING_FILE_PATH points to a file, which is the default, use the logs utility from the project where the pod is located to print out the contents of the Fluentd log files:

    $ oc exec <any-fluentd-pod> -- logs (1)
    1 Specify the name of a Fluentd pod. Note the space before logs.

    For example:

    $ oc exec fluentd-ht42r -n openshift-logging -- logs

    To view the current setting (a sample of the output appears after this list):

    $ oc -n openshift-logging set env daemonset/fluentd --list | grep LOGGING_FILE_PATH
  • If you are using LOGGING_FILE_PATH=console, Fluentd writes logs to stdout/stderr. You can retrieve the logs with the oc logs [-f] <pod_name> command, where -f is optional, from the project where the pod is located.

    $ oc logs -f <any-fluentd-pod> (1)
    1 Specify the name of a Fluentd pod. Use the -f option to follow what is being written into the logs.

    For example:

    $ oc logs -f fluentd-ht42r -n openshift-logging

    The contents of log files are printed out, starting with the oldest log.
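
For reference, the --list command shown earlier prints the environment of the Fluentd daemonset. With the default file-based setting, the relevant line might resemble the following (illustrative sample output, not captured from a live cluster):

$ oc -n openshift-logging set env daemonset/fluentd --list | grep LOGGING_FILE_PATH
LOGGING_FILE_PATH=/var/log/fluentd/fluentd.log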

Configuring Fluentd CPU and memory limits

Each component specification allows for adjustments to both the CPU and memory limits.

Procedure
  1. Edit the Cluster Logging Custom Resource (CR) in the openshift-logging project:

    $ oc edit ClusterLogging instance
    
    apiVersion: "logging.openshift.io/v1"
    kind: "ClusterLogging"
    metadata:
      name: "instance"
    
    ....
    
    spec:
      collection:
        logs:
          fluentd:
            resources:
              limits: (1)
                cpu: 250m
                memory: 1Gi
              requests:
                cpu: 250m
                memory: 1Gi
    1 Specify the CPU and memory limits as needed. The values shown are the default values.
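
    After you save the CR, you can spot-check the resources that were applied to a collector pod. A minimal sketch, assuming the Fluentd pods carry the component=fluentd label as in the default deployment:

    $ oc -n openshift-logging get pod -l component=fluentd \
        -o jsonpath='{.items[0].spec.containers[0].resources}'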

Configuring Fluentd log location

Fluentd writes logs to a specified file or to the default location, /var/log/fluentd/fluentd.log, based on the LOGGING_FILE_PATH environment variable.

Prerequisite

Set cluster logging to the unmanaged state.

Procedure

To set the output location for the Fluentd logs:

  1. Edit the LOGGING_FILE_PATH parameter in the fluentd daemonset. You can specify a particular file or console:

    spec:
      template:
        spec:
          containers:
              env:
                - name: LOGGING_FILE_PATH
                  value: console (1)
    
    1 Specify the log output method:
    • Use console to write the logs to stdout/stderr. Retrieve the logs with the oc logs [-f] <pod_name> command.

    • Use <path-to-log/fluentd.log> to send the log output to the specified file. Retrieve the logs with the oc exec <pod_name> -- logs command. This is the default setting.

      Or, use the CLI:

      $ oc -n openshift-logging set env daemonset/fluentd LOGGING_FILE_PATH=console
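
      Similarly, to switch back to file-based logging, set the variable to a file path. A hedged example using the default path from this topic:

      $ oc -n openshift-logging set env daemonset/fluentd LOGGING_FILE_PATH=/var/log/fluentd/fluentd.log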

Throttling Fluentd logs

For projects that are especially verbose, an administrator can throttle down the rate at which Fluentd reads in logs before processing them. Because throttling deliberately slows the rate at which logs are read, Kibana might take longer to display records.

Throttling can contribute to log aggregation falling behind for the configured projects; log entries can be lost if a pod is deleted before Fluentd catches up.

Throttling does not work when using the systemd journal as the log source. The throttling implementation depends on being able to throttle the reading of the individual log files for each project. When reading from the journal, there is only a single log source and no log files, so no file-based throttling is available. There is no method of restricting the log entries that are read into the Fluentd process.

Prerequisite

Set cluster logging to the unmanaged state.

Procedure
  1. To configure Fluentd to restrict specific projects, edit the throttle configuration in the Fluentd ConfigMap after deployment:

    $ oc edit configmap/fluentd

    The throttle-config.yaml key contains a YAML document that lists project names and the desired rate at which logs are read in on each node. The default is 1000 lines at a time per node. For example:

throttle-config.yaml: |
  - openshift-logging:
      read_lines_limit: 10
  - .operations:
      read_lines_limit: 100
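
To verify the change, you can read the throttle configuration back from the ConfigMap. A minimal sketch:

$ oc -n openshift-logging get configmap/fluentd -o yaml | grep -A 5 throttle-config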

Understanding Buffer Chunk Limiting for Fluentd

If the Fluentd logger is unable to keep up with a high number of logs, it will need to switch to file buffering to reduce memory usage and prevent data loss.

Fluentd file buffering stores records in chunks. Chunks are stored in buffers.

The Fluentd buffer_chunk_limit is determined by the environment variable BUFFER_SIZE_LIMIT, which has the default value 8m. The file buffer size per output is determined by the environment variable FILE_BUFFER_LIMIT, which has the default value 256Mi. The permanent volume size must be larger than FILE_BUFFER_LIMIT multiplied by the number of outputs.

On the Fluentd pods, the permanent volume /var/lib/fluentd should be prepared by a PVC or hostmount, for example. That area is then used for the file buffers.

The buffer_type and buffer_path are configured in the Fluentd configuration files as follows:

$ egrep "buffer_type|buffer_path" *.conf
output-es-config.conf:
  buffer_type file
  buffer_path /var/lib/fluentd/buffer-output-es-config
output-es-ops-config.conf:
  buffer_type file
  buffer_path /var/lib/fluentd/buffer-output-es-ops-config

The Fluentd buffer_queue_limit is the value of the variable BUFFER_QUEUE_LIMIT. This value is 32 by default.

The environment variable BUFFER_QUEUE_LIMIT is calculated as (FILE_BUFFER_LIMIT / (number_of_outputs * BUFFER_SIZE_LIMIT)).

If the BUFFER_QUEUE_LIMIT variable has the default set of values:

  • FILE_BUFFER_LIMIT = 256Mi

  • number_of_outputs = 1

  • BUFFER_SIZE_LIMIT = 8Mi

The value of buffer_queue_limit will be 32. To change the buffer_queue_limit, you must change the value of FILE_BUFFER_LIMIT.

In this formula, number_of_outputs is 1 if all the logs are sent to a single resource, and it is incremented by 1 for each additional resource. For example, the value of number_of_outputs is:

  • 1 - if all logs are sent to a single Elasticsearch pod

  • 2 - if application logs are sent to an Elasticsearch pod and ops logs are sent to another Elasticsearch pod

  • 4 - if application logs are sent to an Elasticsearch pod, ops logs are sent to another Elasticsearch pod, and both of them are forwarded to other Fluentd instances
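
For example, with a single output and the default BUFFER_SIZE_LIMIT of 8Mi, doubling FILE_BUFFER_LIMIT to 512Mi raises buffer_queue_limit to 64 (512Mi / (1 * 8Mi) = 64). A hedged sketch of making that change:

$ oc -n openshift-logging set env daemonset/fluentd FILE_BUFFER_LIMIT=512Mi

Remember to size the permanent volume at /var/lib/fluentd accordingly; it must be larger than FILE_BUFFER_LIMIT multiplied by the number of outputs.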

Configuring Fluentd JSON parsing

You can configure Fluentd to inspect each log message to determine if the message is in JSON format and merge the message into the JSON payload document posted to Elasticsearch. This feature is disabled by default.

You can enable or disable this feature by editing the MERGE_JSON_LOG environment variable in the fluentd daemonset.

Enabling this feature comes with risks, including:

  • Possible log loss due to Elasticsearch rejecting documents because of inconsistent type mappings.

  • Potential buffer storage leak caused by rejected message cycling.

  • Overwriting of data for fields with the same names.

The features in this topic should be used only by experienced Fluentd and Elasticsearch users.

Prerequisites

Set cluster logging to the unmanaged state.

Procedure

Use the following command to enable this feature:

$ oc set env ds/fluentd MERGE_JSON_LOG=true (1)
1 Set this to false to disable this feature or true to enable this feature.
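
As a hypothetical illustration, if a container writes the following JSON line, MERGE_JSON_LOG=true causes Fluentd to parse it and merge level and msg as fields into the JSON payload document posted to Elasticsearch, instead of leaving them embedded in the message string:

{"level":"info","msg":"connection established"}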

Setting MERGE_JSON_LOG and CDM_UNDEFINED_TO_STRING

If you set the MERGE_JSON_LOG and CDM_UNDEFINED_TO_STRING environment variables to true, you might receive an Elasticsearch 400 error. The error occurs because when MERGE_JSON_LOG=true, Fluentd adds fields with data types other than string. When you set CDM_UNDEFINED_TO_STRING=true, Fluentd attempts to add those fields as string values, resulting in the Elasticsearch 400 error. The error clears when the indices roll over for the next day.

When Fluentd rolls over the indices for the next day’s logs, it creates a brand new index. The field definitions are updated and you will not get the 400 error.

Records that have hard errors, such as schema violations, corrupted data, and so forth, cannot be retried. Fluentd sends the records for error handling. If you add a <label @ERROR> section to your Fluentd config, as the last <label>, you can handle these records as needed.

For example:

data:
  fluent.conf:

....

    <label @ERROR>
      <match **>
        @type file
        path /var/log/fluent/dlq
        time_slice_format %Y%m%d
        time_slice_wait 10m
        time_format %Y%m%dT%H%M%S%z
        compress gzip
      </match>
    </label>

This section writes error records to the Elasticsearch dead letter queue (DLQ) file. See the Fluentd documentation for more information about the file output.

You can then edit the file to clean up the records manually, edit the file for use with the Elasticsearch /_bulk index API, and use cURL to add those records. For more information on the Elasticsearch Bulk API, see the Elasticsearch documentation.
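
A minimal sketch of replaying cleaned-up records, assuming you have reshaped the DLQ contents into newline-delimited bulk format and saved them as cleaned-records.ndjson, and that Elasticsearch is reachable at localhost:9200 (both names are hypothetical):

$ curl -s -XPOST "http://localhost:9200/_bulk" \
    -H "Content-Type: application/x-ndjson" \
    --data-binary @cleaned-records.ndjson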

Configuring how the log collector normalizes logs

Cluster Logging uses a specific data model, like a database schema, to store log records and their metadata in the logging store. There are some restrictions on the data:

  • There must be a "message" field containing the actual log message.

  • There must be a "@timestamp" field containing the log record timestamp in RFC 3339 format, preferably millisecond or better resolution.

  • There must be a "level" field with the log level, such as err, info, unknown, and so forth.

For more information on the data model, see Exported Fields.

Because of these requirements, conflicts and inconsistencies can arise with log data collected from different subsystems.

For example, if you use the MERGE_JSON_LOG feature (MERGE_JSON_LOG=true), it can be extremely useful to have your applications log their output in JSON, and have the log collector automatically parse and index the data in Elasticsearch. However, this leads to several problems, including:

  • field names can be empty, or contain characters that are illegal in Elasticsearch;

  • different applications in the same namespace might output the same field name with different value data types;

  • applications might emit too many fields;

  • fields may conflict with the cluster logging built-in fields.

You can configure how cluster logging treats fields from disparate sources by editing the log collector daemonset, Fluentd or Rsyslog, and setting environment variables in the table below.

  • Undefined fields. One of the problems with log data from disparate systems is that some fields might be unknown to the ViaQ data model. Such fields are called undefined. ViaQ requires all top-level fields to be defined and described.

    Use the parameters to configure how OpenShift Container Platform moves any undefined fields under a top-level field called undefined to avoid conflicting with the well-known ViaQ top-level fields. You can add undefined fields to the top-level fields and move others to an undefined container.

    You can also replace special characters in undefined fields and convert undefined fields to their JSON string representation. Converting to a JSON string preserves the structure of the value, so that you can retrieve the value later and convert it back to a map or an array.

    • Simple scalar values like numbers and booleans are changed to a quoted string. For example: 10 becomes "10", 3.1415 becomes "3.1415", false becomes "false".

    • Map/dict values and array values are converted to their JSON string representation: "mapfield":{"key":"value"} becomes "mapfield":"{\"key\":\"value\"}" and "arrayfield":[1,2,"three"] becomes "arrayfield":"[1,2,\"three\"]".

  • Defined fields. You can also configure which defined fields appear in the top levels of the logs.

    The default top-level fields, defined through the CDM_DEFAULT_KEEP_FIELDS parameter, are CEE, time, @timestamp, aushape, ci_job, collectd, docker, fedora-ci, file, foreman, geoip, hostname, ipaddr4, ipaddr6, kubernetes, level, message, namespace_name, namespace_uuid, offset, openstack, ovirt, pid, pipeline_metadata, rsyslog, service, systemd, tags, testcase, tlog, viaq_msg_id.

    Any fields not included in ${CDM_DEFAULT_KEEP_FIELDS} or ${CDM_EXTRA_KEEP_FIELDS} are moved to ${CDM_UNDEFINED_NAME} if CDM_USE_UNDEFINED is true.

    The CDM_DEFAULT_KEEP_FIELDS parameter is intended only for advanced users, or for cases when you are instructed to change it by Red Hat support.

  • Empty fields. You can determine which empty fields to retain from disparate logs.

Table 1. Environment parameters for log normalization
Parameters Definition Example

CDM_EXTRA_KEEP_FIELDS

Specify an extra set of defined fields to be kept at the top level of the logs in addition to the CDM_DEFAULT_KEEP_FIELDS. The default is "".

CDM_EXTRA_KEEP_FIELDS="broker"

CDM_KEEP_EMPTY_FIELDS

Specify fields to retain even if empty, in CSV format. Empty defined fields not specified are dropped. The default is "message", which keeps empty messages.

CDM_KEEP_EMPTY_FIELDS="message"

CDM_USE_UNDEFINED

Set to true to move undefined fields to the undefined top level field. The default is false. If true, values in CDM_DEFAULT_KEEP_FIELDS and CDM_EXTRA_KEEP_FIELDS are not moved to undefined.

CDM_USE_UNDEFINED=true

CDM_UNDEFINED_NAME

Specify a name for the undefined top-level field if using CDM_USE_UNDEFINED. The default is undefined. Enabled only when CDM_USE_UNDEFINED is true.

CDM_UNDEFINED_NAME="undef"

CDM_UNDEFINED_MAX_NUM_FIELDS

If the number of undefined fields in a record is greater than this number, no further processing takes place on those fields. Instead, all the undefined fields are converted to a single JSON string representation and stored in the top-level CDM_UNDEFINED_NAME field. Keeping the default of -1 allows for an unlimited number of undefined fields, which is not recommended.

This parameter is honored even if CDM_USE_UNDEFINED is false.

CDM_UNDEFINED_MAX_NUM_FIELDS=4

CDM_UNDEFINED_TO_STRING

Set to true to convert all undefined fields to their JSON string representation. The default is false.

CDM_UNDEFINED_TO_STRING=true

CDM_UNDEFINED_DOT_REPLACE_CHAR

Specify a character to use in place of a dot character '.' in an undefined field. MERGE_JSON_LOG must be true. The default is UNUSED. If you set the MERGE_JSON_LOG parameter to true, see the Note below.

CDM_UNDEFINED_DOT_REPLACE_CHAR="_"

If you set the MERGE_JSON_LOG and CDM_UNDEFINED_TO_STRING environment variables in the log collector daemonset to true, you might receive an Elasticsearch 400 error. The error occurs because when MERGE_JSON_LOG=true, the log collector adds fields with data types other than string. When you set CDM_UNDEFINED_TO_STRING=true, the log collector attempts to add those fields as string values, resulting in the Elasticsearch 400 error. The error clears when the log collector rolls over the indices for the next day’s logs.

When the log collector rolls over the indices, it creates a brand new index. The field definitions are updated and you will not get the 400 error.

Procedure

Use the CDM_* parameters to configure undefined and empty field processing.

  1. Configure how to process fields, as needed:

    1. Specify the fields to move using CDM_EXTRA_KEEP_FIELDS.

    2. Specify any empty fields to retain in the CDM_KEEP_EMPTY_FIELDS parameter in CSV format.

  2. Configure how to process undefined fields, as needed:

    1. Set CDM_USE_UNDEFINED to true to move undefined fields to the top-level undefined field.

    2. Specify a name for the undefined fields using the CDM_UNDEFINED_NAME parameter.

    3. Set CDM_UNDEFINED_MAX_NUM_FIELDS to a value other than the default -1, to set an upper bound on the number of undefined fields in a single record.

  3. Specify CDM_UNDEFINED_DOT_REPLACE_CHAR to change any dot . characters in an undefined field name to another character. For example, if CDM_UNDEFINED_DOT_REPLACE_CHAR=@@@ and there is a field named foo.bar.baz, the field is transformed into foo@@@bar@@@baz.

  4. Set CDM_UNDEFINED_TO_STRING to true to convert undefined fields to their JSON string representation. A combined example follows this procedure.
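
Putting the steps together, a hedged sketch that sets several of these parameters in one pass; the values are illustrative, taken from the examples in the table above:

$ oc -n openshift-logging set env daemonset/fluentd \
    CDM_EXTRA_KEEP_FIELDS="broker" \
    CDM_KEEP_EMPTY_FIELDS="message" \
    CDM_USE_UNDEFINED=true \
    CDM_UNDEFINED_NAME="undef" \
    CDM_UNDEFINED_MAX_NUM_FIELDS=4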

If you configure the CDM_UNDEFINED_TO_STRING or CDM_UNDEFINED_MAX_NUM_FIELDS parameter, use CDM_UNDEFINED_NAME to change the undefined field name. This is needed because CDM_UNDEFINED_TO_STRING and CDM_UNDEFINED_MAX_NUM_FIELDS can change the value type of the undefined field: when CDM_UNDEFINED_TO_STRING is true, or when a log has more undefined fields than CDM_UNDEFINED_MAX_NUM_FIELDS allows, the value type becomes string. Elasticsearch stops accepting records if the value type changes, for example, from JSON to JSON string.

For example, when CDM_UNDEFINED_TO_STRING is false or CDM_UNDEFINED_MAX_NUM_FIELDS is the default, -1, the value type of the undefined field is json. If you change CDM_UNDEFINED_MAX_NUM_FIELDS to a value other than the default and there are more undefined fields in a log, the value type becomes string (JSON string). Elasticsearch stops accepting records if the value type is changed.

Configuring Fluentd using environment variables

You can use environment variables to modify your Fluentd configuration.

Prerequisite

Set cluster logging to the unmanaged state.

Procedure

Set any of the Fluentd environment variables as needed:

$ oc set env ds/fluentd <env-var>=<value>

For example:

$ oc set env ds/fluentd LOGGING_FILE_AGE=30