Logging 5.4.13

Bug fixes

  • Before this update, a problem with the Fluentd collector caused it to not capture OAuth login events stored in /var/log/auth-server/audit.log. This led to incomplete collection of login events from the OAuth service. With this update, the Fluentd collector now resolves this issue by capturing all login events from the OAuth service, including those stored in /var/log/auth-server/audit.log, as expected. (LOG-3731)

Logging 5.4.9

Bug fixes

  • Before this update, the Fluentd collector would warn of unused configuration parameters. This update removes those configuration parameters and their warning messages. (LOG-3074)

  • Before this update, Kibana had a fixed 24h OAuth cookie expiration time, which resulted in 401 errors in Kibana whenever the accessTokenInactivityTimeout field was set to a value lower than 24h. With this update, Kibana’s OAuth cookie expiration time synchronizes to the accessTokenInactivityTimeout, with a default value of 24h. (LOG-3306)

Logging 5.4.6

Bug fixes

  • Before this update, Fluentd would sometimes not recognize that the Kubernetes platform rotated the log file and would no longer read log messages. This update corrects that by setting the configuration parameter suggested by the upstream development team. (LOG-2792)

  • Before this update, each rollover job created empty indices when the ClusterLogForwarder custom resource had JSON parsing defined. With this update, new indices are not empty. (LOG-2823)

  • Before this update, if you deleted the Kibana Custom Resource, the OpenShift Container Platform web console continued displaying a link to Kibana. With this update, removing the Kibana Custom Resource also removes that link. (LOG-3054)

Logging 5.4.5

Bug fixes

  • Before this update, the Operator did not ensure that the pod was ready, which caused the cluster to reach an inoperable state during a cluster restart. With this update, the Operator marks new pods as ready before continuing to a new pod during a restart, which resolves the issue. (LOG-2881)

  • Before this update, the addition of multi-line error detection caused internal routing to change and forward records to the wrong destination. With this update, the internal routing is correct. (LOG-2946)

  • Before this update, the Operator could not decode index setting JSON responses with a quoted Boolean value and would result in an error. With this update, the Operator can properly decode this JSON response. (LOG-3009)

  • Before this update, Elasticsearch index templates defined the fields for labels with the wrong types. This change updates those templates to match the expected types forwarded by the log collector. (LOG-2972)

Logging 5.4.4

Bug fixes

  • Before this update, non-latin characters displayed incorrectly in Elasticsearch. With this update, Elasticsearch displays all valid UTF-8 symbols correctly. (LOG-2794)

  • Before this update, non-latin characters displayed incorrectly in Fluentd. With this update, Fluentd displays all valid UTF-8 symbols correctly. (LOG-2657)

  • Before this update, the metrics server for the collector attempted to bind to the address using a value exposed by an environment value. This change modifies the configuration to bind to any available interface. (LOG-2821)

  • Before this update, the cluster-logging Operator relied on the cluster to create a secret. This cluster behavior changed in OpenShift Container Platform 4.11, which caused logging deployments to fail. With this update, the cluster-logging Operator resolves the issue by creating the secret if needed. (LOG-2840)

Logging 5.4.3

Elasticsearch Operator deprecation notice

In logging 5.4.3 the Elasticsearch Operator is deprecated and is planned to be removed in a future release. Red Hat will provide bug fixes and support for this feature during the current release lifecycle, but this feature will no longer receive enhancements and will be removed. As an alternative to using the Elasticsearch Operator to manage the default log storage, you can use the Loki Operator.

Bug fixes

  • Before this update, the OpenShift Logging Dashboard showed the number of active primary shards instead of all active shards. With this update, the dashboard displays all active shards. (LOG-2781)

  • Before this update, a bug in a library used by elasticsearch-operator contained a denial of service attack vulnerability. With this update, the library has been updated to a version that does not contain this vulnerability. (LOG-2816)

  • Before this update, when configuring Vector to forward logs to Loki, it was not possible to set a custom bearer token or use the default token if Loki had TLS enabled. With this update, Vector can forward logs to Loki using tokens with TLS enabled. (LOG-2786

  • Before this update, the ElasticSearch Operator omitted the referencePolicy property of the ImageStream custom resource when selecting an oauth-proxy image. This omission caused the Kibana deployment to fail in specific environments. With this update, using referencePolicy resolves the issue, and the Operator can deploy Kibana successfully. (LOG-2791)

  • Before this update, alerting rules for the ClusterLogForwarder custom resource did not take multiple forward outputs into account. This update resolves the issue. (LOG-2640)

  • Before this update, clusters configured to forward logs to Amazon CloudWatch wrote rejected log files to temporary storage, causing cluster instability over time. With this update, chunk backup for CloudWatch has been disabled, resolving the issue. (LOG-2768)

Logging 5.4.2

Bug fixes

  • Before this update, editing the Collector configuration using oc edit was difficult because it had inconsistent use of white-space. This change introduces logic to normalize and format the configuration prior to any updates by the Operator so that it is easy to edit using oc edit. (LOG-2319)

  • Before this update, the FluentdNodeDown alert could not provide instance labels in the message section appropriately. This update resolves the issue by fixing the alert rule to provide instance labels in cases of partial instance failures. (LOG-2607)

  • Before this update, several log levels, such as`critical`, that were documented as supported by the product were not. This update fixes the discrepancy so the documented log levels are now supported by the product. (LOG-2033)

Logging 5.4.1

Bug fixes

  • Before this update, the log file metric exporter only reported logs created while the exporter was running, which resulted in inaccurate log growth data. This update resolves this issue by monitoring /var/log/pods. (LOG-2442)

  • Before this update, the collector would be blocked because it continually tried to use a stale connection when forwarding logs to fluentd forward receivers. With this release, the keepalive_timeout value has been set to 30 seconds (30s) so that the collector recycles the connection and re-attempts to send failed messages within a reasonable amount of time. (LOG-2534)

  • Before this update, an error in the gateway component enforcing tenancy for reading logs limited access to logs with a Kubernetes namespace causing "audit" and some "infrastructure" logs to be unreadable. With this update, the proxy correctly detects users with admin access and allows access to logs without a namespace. (LOG-2448)

  • Before this update, the system:serviceaccount:openshift-monitoring:prometheus-k8s service account had cluster level privileges as a clusterrole and clusterrolebinding. This update restricts the service account` to the openshift-logging namespace with a role and rolebinding. (LOG-2437)

  • Before this update, Linux audit log time parsing relied on an ordinal position of a key/value pair. This update changes the parsing to use a regular expression to find the time entry. (LOG-2321)

Logging 5.4

The following advisories are available for logging 5.4: {logging-title-uc} Release 5.4

Technology Previews

The following features are available a Technology Previews on OpenShift Container Platform.

Vector collector

Vector is a log collector offered as a Technology Preview alternative to the current default collector for the logging.

The Vector collector supports the following outputs:

  • elasticsearch. An external Elasticsearch instance. The elasticsearch output can use a TLS connection.

  • kafka. A Kafka broker. The kafka output can use an unsecured or TLS connection.

  • loki. Loki, a horizontally scalable, highly available, multi-tenant log aggregation system.

Vector does not support FIPS Enabled Clusters.

Vector is not enabled by default. To enable Vector on your OpenShift Container Platform cluster, you must add the logging.openshift.io/preview-vector-collector: enabled annotation to the ClusterLogging custom resource (CR), and add vector as a collection type:

Example ClusterLogging CR
  apiVersion: "logging.openshift.io/v1"
  kind: "ClusterLogging"
    name: "instance"
    namespace: "openshift-logging"
      logging.openshift.io/preview-vector-collector: enabled
        type: "vector"
        vector: {}

Loki log store

Loki is a horizontally scalable, highly available, multi-tenant log aggregation system currently offered as an alternative to Elasticsearch as a log store for the logging. See the "Logging using LokiStack" documentation for more information about installing Loki.

Bug fixes

  • Before this update, the cluster-logging-operator used cluster scoped roles and bindings to establish permissions for the Prometheus service account to scrape metrics. These permissions were created when deploying the Operator using the console interface but were missing when deploying from the command line. This update fixes the issue by making the roles and bindings namespace-scoped. (LOG-2286)

  • Before this update, a prior change to fix dashboard reconciliation introduced a ownerReferences field to the resource across namespaces. As a result, both the config map and dashboard were not created in the namespace. With this update, the removal of the ownerReferences field resolves the issue, and the OpenShift Logging dashboard is available in the console. (LOG-2163)

  • Before this update, changes to the metrics dashboards did not deploy because the cluster-logging-operator did not correctly compare existing and modified config maps that contain the dashboard. With this update, the addition of a unique hash value to object labels resolves the issue. (LOG-2071)

  • Before this update, the OpenShift Logging dashboard did not correctly display the pods and namespaces in the table, which displays the top producing containers collected over the last 24 hours. With this update, the pods and namespaces are displayed correctly. (LOG-2069)

  • Before this update, when the ClusterLogForwarder was set up with Elasticsearch OutputDefault and Elasticsearch outputs did not have structured keys, the generated configuration contained the incorrect values for authentication. This update corrects the secret and certificates used. (LOG-2056)

  • Before this update, the OpenShift Logging dashboard displayed an empty CPU graph because of a reference to an invalid metric. With this update, the correct data point has been selected, resolving the issue. (LOG-2026)

  • Before this update, the Fluentd container image included builder tools that were unnecessary at run time. This update removes those tools from the image.(LOG-1927)

  • Before this update, a name change of the deployed collector in the 5.3 release caused the logging collector to generate the FluentdNodeDown alert. This update resolves the issue by fixing the job name for the Prometheus alert. (LOG-1918)

  • Before this update, the log collector was collecting its own logs due to a refactoring of the component name change. This lead to a potential feedback loop of the collector processing its own log that might result in memory and log message size issues. This update resolves the issue by excluding the collector logs from the collection. (LOG-1774)

  • Before this update, Elasticsearch generated the error Unable to create PersistentVolumeClaim due to forbidden: exceeded quota: infra-storage-quota. if the PVC already existed. With this update, Elasticsearch checks for existing PVCs, resolving the issue. (LOG-2131)

  • Before this update, Elasticsearch was unable to return to the ready state when the elasticsearch-signing secret was removed. With this update, Elasticsearch is able to go back to the ready state after that secret is removed. (LOG-2171)

  • Before this update, the change of the path from which the collector reads container logs caused the collector to forward some records to the wrong indices. With this update, the collector now uses the correct configuration to resolve the issue. (LOG-2160)

  • Before this update, clusters with a large number of namespaces caused Elasticsearch to stop serving requests because the list of namespaces reached the maximum header size limit. With this update, headers only include a list of namespace names, resolving the issue. (LOG-1899)

  • Before this update, the OpenShift Container Platform Logging dashboard showed the number of shards 'x' times larger than the actual value when Elasticsearch had 'x' nodes. This issue occurred because it was printing all primary shards for each Elasticsearch pod and calculating a sum on it, although the output was always for the whole Elasticsearch cluster. With this update, the number of shards is now correctly calculated. (LOG-2156)

  • Before this update, the secrets kibana and kibana-proxy were not recreated if they were deleted manually. With this update, the elasticsearch-operator will watch the resources and automatically recreate them if deleted. (LOG-2250)

  • Before this update, tuning the buffer chunk size could cause the collector to generate a warning about the chunk size exceeding the byte limit for the event stream. With this update, you can also tune the read line limit, resolving the issue. (LOG-2379)

  • Before this update, the logging console link in OpenShift web console was not removed with the ClusterLogging CR. With this update, deleting the CR or uninstalling the Cluster Logging Operator removes the link. (LOG-2373)

  • Before this update, a change to the container logs path caused the collection metric to always be zero with older releases configured with the original path. With this update, the plugin which exposes metrics about collected logs supports reading from either path to resolve the issue. (LOG-2462)