Logging for Red Hat OpenShift collects operations and application logs from your cluster and enriches the data with Kubernetes pod and project metadata. All supported modifications to the log collector can be performed through the spec.collection stanza in the ClusterLogging custom resource (CR).

Configuring the log collector

You can configure which log collector type your logging uses by modifying the ClusterLogging custom resource (CR).

Fluentd is deprecated and is planned to be removed in a future release. Red Hat provides bug fixes and support for this feature during the current release lifecycle, but this feature no longer receives enhancements. As an alternative to Fluentd, you can use Vector.

Prerequisites
  • You have administrator permissions.

  • You have installed the OpenShift CLI (oc).

  • You have installed the Red Hat OpenShift Logging Operator.

  • You have created a ClusterLogging CR.

Procedure
  1. Modify the ClusterLogging CR collection spec:

    ClusterLogging CR example
    apiVersion: logging.openshift.io/v1
    kind: ClusterLogging
    metadata:
    # ...
    spec:
    # ...
      collection:
        type: <log_collector_type> (1)
        resources: {}
        tolerations: []
    # ...
    1 The log collector type you want to use for the logging. This can be vector or fluentd.
  2. Apply the ClusterLogging CR by running the following command:

    $ oc apply -f <filename>.yaml
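
As an illustration of step 1, a minimal sketch of a ClusterLogging CR that selects Vector might look like the following. The name instance and the namespace openshift-logging are borrowed from the other examples in this document and are assumptions about your deployment; adjust them to match your existing CR.

```yaml
apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  name: instance                  # assumed; matches the other examples in this document
  namespace: openshift-logging    # assumed; matches the other examples in this document
spec:
  collection:
    type: vector                  # the other accepted value is fluentd (deprecated)
```

Leaving out resources and tolerations, as here, keeps whatever defaults the Operator applies.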

Creating a LogFileMetricExporter resource

In logging version 5.8 and newer versions, the LogFileMetricExporter is no longer deployed with the collector by default. You must manually create a LogFileMetricExporter custom resource (CR) to generate metrics from the logs produced by running containers.

If you do not create the LogFileMetricExporter CR, you may see a No datapoints found message in the OpenShift Container Platform web console dashboard for Produced Logs.

Prerequisites
  • You have administrator permissions.

  • You have installed the Red Hat OpenShift Logging Operator.

  • You have installed the OpenShift CLI (oc).

Procedure
  1. Create a LogFileMetricExporter CR as a YAML file:

    Example LogFileMetricExporter CR
    apiVersion: logging.openshift.io/v1alpha1
    kind: LogFileMetricExporter
    metadata:
      name: instance
      namespace: openshift-logging
    spec:
      nodeSelector: {} (1)
      resources: (2)
        limits:
          cpu: 500m
          memory: 256Mi
        requests:
          cpu: 200m
          memory: 128Mi
      tolerations: [] (3)
    # ...
    1 Optional: The nodeSelector stanza defines which nodes the pods are scheduled on.
    2 The resources stanza defines resource requirements for the LogFileMetricExporter CR.
    3 Optional: The tolerations stanza defines the tolerations that the pods accept.
  2. Apply the LogFileMetricExporter CR by running the following command:

    $ oc apply -f <filename>.yaml

Verification

A logfilesmetricexporter pod runs concurrently with a collector pod on each node.

  • Verify that the logfilesmetricexporter pods are running in the namespace where you have created the LogFileMetricExporter CR, by running the following command and observing the output:

    $ oc get pods -l app.kubernetes.io/component=logfilesmetricexporter -n openshift-logging
    Example output
    NAME                           READY   STATUS    RESTARTS   AGE
    logfilesmetricexporter-9qbjj   1/1     Running   0          2m46s
    logfilesmetricexporter-cbc4v   1/1     Running   0          2m46s
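
For reference, the optional nodeSelector and tolerations stanzas from the procedure could be filled in as shown in the following sketch. The node label and the taint key are illustrative assumptions, not values required by the Operator; replace them with labels and taints that exist on your nodes.

```yaml
apiVersion: logging.openshift.io/v1alpha1
kind: LogFileMetricExporter
metadata:
  name: instance
  namespace: openshift-logging
spec:
  nodeSelector:
    kubernetes.io/os: linux                  # illustrative label; use one present on your nodes
  resources:
    limits:
      cpu: 500m
      memory: 256Mi
    requests:
      cpu: 200m
      memory: 128Mi
  tolerations:
    - key: node-role.kubernetes.io/master    # illustrative taint key
      operator: Exists
      effect: NoSchedule
```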

Configuring resources and scheduling for logging collectors

Administrators can modify the resources or scheduling of the collector by creating a ClusterLogging custom resource (CR) that is in the same namespace and has the same name as the ClusterLogForwarder CR that it supports.

The applicable stanzas for the ClusterLogging CR when using multiple log forwarders in a deployment are managementState and collection. All other stanzas are ignored.

Prerequisites
  • You have administrator permissions.

  • You have installed the Red Hat OpenShift Logging Operator version 5.8 or newer.

  • You have created a ClusterLogForwarder CR.

Procedure
  1. Create a ClusterLogging CR that supports your existing ClusterLogForwarder CR:

    Example ClusterLogging CR YAML
    apiVersion: logging.openshift.io/v1
    kind: ClusterLogging
    metadata:
      name: <name> (1)
      namespace: <namespace> (2)
    spec:
      managementState: "Managed"
      collection:
        type: "vector"
        tolerations:
        - key: "logging"
          operator: "Exists"
          effect: "NoExecute"
          tolerationSeconds: 6000
        resources:
          limits:
            memory: 1Gi
          requests:
            cpu: 100m
            memory: 1Gi
        nodeSelector:
          collector: needed
    # ...
    1 The name must be the same name as the ClusterLogForwarder CR.
    2 The namespace must be the same namespace as the ClusterLogForwarder CR.
  2. Apply the ClusterLogging CR by running the following command:

    $ oc apply -f <filename>.yaml

Viewing logging collector pods

You can view the logging collector pods and the corresponding nodes that they are running on.

Procedure
  • Run the following command in a project to view the logging collector pods and their details:

    $ oc get pods --selector component=collector -o wide -n <project_name>
    Example output
    NAME              READY   STATUS    RESTARTS   AGE    IP            NODE                  NOMINATED NODE   READINESS GATES
    collector-8d69v   1/1     Running   0          134m   10.130.2.30   master1.example.com   <none>           <none>
    collector-bd225   1/1     Running   0          134m   10.131.1.11   master2.example.com   <none>           <none>
    collector-cvrzs   1/1     Running   0          134m   10.130.0.21   master3.example.com   <none>           <none>
    collector-gpqg2   1/1     Running   0          134m   10.128.2.27   worker1.example.com   <none>           <none>
    collector-l9j7j   1/1     Running   0          134m   10.129.2.31   worker2.example.com   <none>           <none>

Configuring log collector CPU and memory limits

You can adjust both the CPU and memory limits of the log collector.

Procedure
  • Edit the ClusterLogging custom resource (CR) in the openshift-logging project:

    $ oc -n openshift-logging edit ClusterLogging instance
    apiVersion: logging.openshift.io/v1
    kind: ClusterLogging
    metadata:
      name: instance
      namespace: openshift-logging
    spec:
      collection:
        type: fluentd
        resources:
          limits: (1)
            memory: 736Mi
          requests:
            cpu: 100m
            memory: 736Mi
    # ...
    1 Specify the CPU and memory limits and requests as needed. The values shown are the default values.
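
The example in this section sets only a memory limit. As a sketch of how a CPU limit could be added alongside it, consider the following fragment. The 500m CPU limit is a hypothetical value chosen for illustration, not a documented default, and <log_collector_type> is a placeholder for the collector type you use.

```yaml
spec:
  collection:
    type: <log_collector_type>
    resources:
      limits:
        cpu: 500m        # hypothetical value; size this for your workload
        memory: 736Mi
      requests:
        cpu: 100m
        memory: 736Mi
```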

Configuring the collector to receive audit logs as an HTTP server

You can configure your log collector to listen for HTTP connections and receive audit logs as an HTTP server by specifying http as a receiver input in the ClusterLogForwarder custom resource (CR). This enables you to use a common log store for audit logs that are collected from both inside and outside of your OpenShift Container Platform cluster.

Prerequisites
  • You have administrator permissions.

  • You have installed the OpenShift CLI (oc).

  • You have installed the Red Hat OpenShift Logging Operator.

  • You have created a ClusterLogForwarder CR.

Procedure
  1. Modify the ClusterLogForwarder CR to add configuration for the http receiver input:

    apiVersion: logging.openshift.io/v1beta1
    kind: ClusterLogForwarder
    metadata:
    # ...
    spec:
      serviceAccountName: <service_account_name>
      inputs:
        - name: http-receiver (1)
          receiver:
            type: http (2)
            http:
              format: kubeAPIAudit (3)
              port: 8443 (4)
      pipelines: (5)
        - name: http-pipeline
          inputRefs:
            - http-receiver
    # ...
    1 Specify a name for your input receiver.
    2 Specify the input receiver type as http.
    3 Currently, only the kube-apiserver webhook format is supported for http input receivers.
    4 Optional: Specify the port that the input receiver listens on. This must be a value between 1024 and 65535. The default value is 8443 if this is not specified.
    5 Configure a pipeline for your input receiver.
  2. Apply the changes to the ClusterLogForwarder CR:

    $ oc apply -f <filename>.yaml
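
The pipeline in the procedure lists only inputRefs. As a sketch of how the received audit logs could be routed onward, the following fragment adds an outputRefs entry and a non-default port. Here <output_name> is a placeholder for an output that you have defined in your ClusterLogForwarder CR, and 9443 is an arbitrary value within the allowed range; both are assumptions for illustration.

```yaml
spec:
  serviceAccountName: <service_account_name>
  inputs:
    - name: http-receiver
      receiver:
        type: http
        http:
          format: kubeAPIAudit
          port: 9443              # illustrative; any value from 1024 to 65535
  pipelines:
    - name: http-pipeline
      inputRefs:
        - http-receiver
      outputRefs:
        - <output_name>           # placeholder; must name an output defined in the CR
```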

Advanced configuration for the Fluentd log forwarder

Fluentd is deprecated and is planned to be removed in a future release. Red Hat provides bug fixes and support for this feature during the current release lifecycle, but this feature no longer receives enhancements. As an alternative to Fluentd, you can use Vector.

Logging includes multiple Fluentd parameters that you can use for tuning the performance of the Fluentd log forwarder. With these parameters, you can change the following Fluentd behaviors:

  • Chunk and chunk buffer sizes

  • Chunk flushing behavior

  • Chunk forwarding retry behavior

Fluentd collects log data in a single blob called a chunk. When Fluentd creates a chunk, the chunk is considered to be in the stage, where the chunk gets filled with data. When the chunk is full, Fluentd moves the chunk to the queue, where chunks are held before being flushed, or written out to their destination. Fluentd can fail to flush a chunk for a number of reasons, such as network issues or capacity issues at the destination. If a chunk cannot be flushed, Fluentd retries flushing as configured.

By default in OpenShift Container Platform, Fluentd retries flushing by using the exponential backoff method: it doubles the wait time between successive retry attempts, which helps reduce connection requests to the destination. You can disable exponential backoff and use the periodic retry method instead, which retries flushing the chunks at a specified interval.

These parameters can help you determine the trade-offs between latency and throughput.

  • To optimize Fluentd for throughput, you could use these parameters to reduce network packet count by configuring larger buffers and queues, delaying flushes, and setting longer times between retries. Be aware that larger buffers require more space on the node file system.

  • To optimize for low latency, you could use the parameters to send data as soon as possible, avoid the build-up of batches, have shorter queues and buffers, and use more frequent flushes and retries.

You can configure the chunking and flushing behavior using the following parameters in the ClusterLogging custom resource (CR). The parameters are then automatically added to the Fluentd config map for use by Fluentd.

These parameters are:

  • Not relevant to most users. The default settings should give good general performance.

  • Only for advanced users with detailed knowledge of Fluentd configuration and performance.

  • Only for performance tuning. They have no effect on functional aspects of logging.

Table 1. Advanced Fluentd Configuration Parameters

Parameter: chunkLimitSize
Description: The maximum size of each chunk. Fluentd stops writing data to a chunk when it reaches this size. Then, Fluentd sends the chunk to the queue and opens a new chunk.
Default: 8m

Parameter: totalLimitSize
Description: The maximum size of the buffer, which is the total size of the stage and the queue. If the buffer size exceeds this value, Fluentd stops adding data to chunks and fails with an error. All data not in chunks is lost.
Default: Approximately 15% of the node disk distributed across all outputs.

Parameter: flushInterval
Description: The interval between chunk flushes. You can use s (seconds), m (minutes), h (hours), or d (days).
Default: 1s

Parameter: flushMode
Description: The method to perform flushes:
  • lazy: Flush chunks based on the timekey parameter. You cannot modify the timekey parameter.
  • interval: Flush chunks based on the flushInterval parameter.
  • immediate: Flush chunks immediately after data is added to a chunk.
Default: interval

Parameter: flushThreadCount
Description: The number of threads that perform chunk flushing. Increasing the number of threads improves the flush throughput, which hides network latency.
Default: 2

Parameter: overflowAction
Description: The chunking behavior when the queue is full:
  • throw_exception: Raise an exception to show in the log.
  • block: Stop data chunking until the full buffer issue is resolved.
  • drop_oldest_chunk: Drop the oldest chunk to accept new incoming chunks. Older chunks have less value than newer chunks.
Default: block

Parameter: retryMaxInterval
Description: The maximum time in seconds for the exponential_backoff retry method.
Default: 300s

Parameter: retryType
Description: The retry method when flushing fails:
  • exponential_backoff: Increase the time between flush retries. Fluentd doubles the time it waits until the next retry until the retry_max_interval parameter is reached.
  • periodic: Retries flushes periodically, based on the retryWait parameter.
Default: exponential_backoff

Parameter: retryTimeOut
Description: The maximum time interval to attempt retries before the record is discarded.
Default: 60m

Parameter: retryWait
Description: The time in seconds before the next chunk flush.
Default: 1s

For more information on the Fluentd chunk lifecycle, see Buffer Plugins in the Fluentd documentation.
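
To make the latency and throughput trade-off described earlier more concrete, the following sketch contrasts two hypothetical buffer configurations. All values are illustrative assumptions chosen to show the direction of each adjustment; they are not tested recommendations. The backoff arithmetic in the comments follows the doubling behavior described in this section and ignores any randomization that Fluentd might apply.

```yaml
# Sketch 1: favor throughput (larger chunks, delayed flushes, growing retry waits).
spec:
  collection:
    fluentd:
      buffer:
        chunkLimitSize: 16m              # illustrative; larger than the 8m default
        flushMode: interval
        flushInterval: 30s               # illustrative; the default is 1s
        retryType: exponential_backoff
        retryWait: 1s
        # Doubling from 1s gives waits of 1s, 2s, 4s, ... 256s;
        # the next doubling (512s) is capped at retryMaxInterval.
        retryMaxInterval: 300s
---
# Sketch 2: favor low latency (small chunks, immediate flushes, fixed short retry waits).
spec:
  collection:
    fluentd:
      buffer:
        chunkLimitSize: 1m               # illustrative; smaller than the 8m default
        flushMode: immediate
        retryType: periodic
        retryWait: 1s
```

The first fragment trades delivery delay for fewer, larger writes and needs more space on the node file system; the second sends data sooner at the cost of more connection requests to the destination.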

Procedure
  1. Edit the ClusterLogging custom resource (CR) in the openshift-logging project:

    $ oc -n openshift-logging edit ClusterLogging instance
  2. Add or modify any of the following parameters:

    apiVersion: logging.openshift.io/v1
    kind: ClusterLogging
    metadata:
      name: instance
      namespace: openshift-logging
    spec:
      collection:
        fluentd:
          buffer:
            chunkLimitSize: 8m (1)
            flushInterval: 5s (2)
            flushMode: interval (3)
            flushThreadCount: 3 (4)
            overflowAction: throw_exception (5)
            retryMaxInterval: "300s" (6)
            retryType: periodic (7)
            retryWait: 1s (8)
            totalLimitSize: 32m (9)
    # ...
    1 Specify the maximum size of each chunk before it is queued for flushing.
    2 Specify the interval between chunk flushes.
    3 Specify the method to perform chunk flushes: lazy, interval, or immediate.
    4 Specify the number of threads to use for chunk flushes.
    5 Specify the chunking behavior when the queue is full: throw_exception, block, or drop_oldest_chunk.
    6 Specify the maximum interval in seconds for the exponential_backoff chunk flushing method.
    7 Specify the retry type when chunk flushing fails: exponential_backoff or periodic.
    8 Specify the time in seconds before the next chunk flush.
    9 Specify the maximum size of the chunk buffer.
  3. Verify that the Fluentd pods are redeployed:

    $ oc get pods -l component=collector -n openshift-logging
  4. Check that the new values are in the Fluentd config map:

    $ oc extract configmap/collector-config --confirm -n openshift-logging
    Example fluentd.conf
    <buffer>
      @type file
      path '/var/lib/fluentd/default'
      flush_mode interval
      flush_interval 5s
      flush_thread_count 3
      retry_type periodic
      retry_wait 1s
      retry_max_interval 300s
      retry_timeout 60m
      queued_chunks_limit_size "#{ENV['BUFFER_QUEUE_LIMIT'] || '32'}"
      total_limit_size "#{ENV['TOTAL_LIMIT_SIZE_PER_BUFFER'] || '8589934592'}"
      chunk_limit_size 8m
      overflow_action throw_exception
      disable_chunk_backup true
    </buffer>