Release notes for Red Hat OpenShift Logging 5.3
Making open source more inclusive
Red Hat is committed to replacing problematic language in our code, documentation, and web properties. We are beginning with these four terms: master, slave, blacklist, and whitelist. Because of the enormity of this endeavor, these changes will be implemented gradually over several upcoming releases. For more details, see our CTO Chris Wright’s message.
Supported Versions
OpenShift | 4.7 | 4.8 | 4.9 |
---|---|---|---|
RHOL 5.0 | X | X | |
RHOL 5.1 | X | X | |
RHOL 5.2 | X | X | X |
RHOL 5.3 | | | X |
OpenShift Logging 5.3.0
The following advisories are available for OpenShift Logging 5.3.x:
New features and enhancements
- With this update, authorization requirements for Log Forwarding have been relaxed. Outputs may now be configured with SASL, username/password, or TLS.
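For example, the following is a minimal, illustrative sketch of a TLS-secured output. It assumes the documented secret key names (tls.crt, tls.key, ca-bundle.crt); the output name, URL, and secret name are placeholders:
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
    # Placeholder external Elasticsearch output secured with TLS.
    # The referenced secret holds tls.crt, tls.key, and ca-bundle.crt;
    # a secret with username and password keys works the same way.
    - name: secure-elasticsearch
      type: elasticsearch
      url: https://elasticsearch.example.com:9200
      secret:
        name: es-tls-secret
  pipelines:
    - name: application-logs
      inputRefs:
        - application
      outputRefs:
        - secure-elasticsearch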
Bug fixes
Before this update, application logs were not correctly configured to forward to the proper CloudWatch stream when multi-line error detection was enabled. (LOG-1939)
Before this update, a name change of the deployed collector in the 5.3 release caused the alert ‘fluentnodedown’ to be generated. (LOG-1918)
Before this update, a regression introduced by a configuration change in a prior release caused the collector to flush its buffered messages before shutdown, creating a delay in the termination and restart of collector Pods. With this update, fluentd no longer flushes buffers at shutdown, resolving the issue. (LOG-1735)
Before this update, a regression introduced in a prior release intentionally disabled JSON message parsing. With this update, a log entry’s “level” value is set from the “level” field of a parsed JSON message, or by applying a regex to the message field and extracting a match. (LOG-1199)
Known issues
If you forward logs to an external Elasticsearch server and then change a configured value in the pipeline secret, such as the username and password, the Fluentd forwarder loads the new secret but uses the old value to connect to an external Elasticsearch server. This issue happens because the Red Hat OpenShift Logging Operator does not currently monitor secrets for content changes. (LOG-1652)
As a workaround, if you change the secret, you can force the Fluentd pods to redeploy by entering:
$ oc delete pod -l component=collector
Deprecated and removed features
Some features available in previous releases have been deprecated or removed.
Deprecated functionality is still included in OpenShift Logging and continues to be supported; however, it will be removed in a future release of this product and is not recommended for new deployments.
Forwarding logs using the legacy Fluentd and legacy syslog methods has been removed
In OpenShift Logging 5.3, the legacy methods of forwarding logs to syslog and Fluentd are removed. Bug fixes and support are provided through the end of the OpenShift Logging 5.2 life cycle, after which no new feature enhancements are made.
Instead, use the following non-legacy methods:
Forwarding logs using the Fluentd forward protocol
Forwarding logs using the syslog protocol
Configuration mechanisms for legacy forwarding methods have been removed
In OpenShift Logging 5.3, the legacy configuration mechanism for log forwarding is removed: you cannot forward logs using the legacy Fluentd method or the legacy syslog method. Use the standard log forwarding methods instead.
OpenShift Logging 5.2.0
The following advisories are available for OpenShift Logging 5.2.x:
New features and enhancements
With this update, you can forward log data to Amazon CloudWatch, which provides application and infrastructure monitoring. For more information, see Forwarding logs to Amazon CloudWatch. A sample output configuration covering this and the related forwarding enhancements appears after this list of enhancements. (LOG-1173)
With this update, you can forward log data to Loki, a horizontally scalable, highly available, multi-tenant log aggregation system. For more information, see Forwarding logs to Loki. (LOG-684)
With this update, if you use the Fluentd forward protocol to forward log data over a TLS-encrypted connection, now you can use a password-encrypted private key file and specify the passphrase in the Cluster Log Forwarder configuration. For more information, see Forwarding logs using the Fluentd forward protocol. (LOG-1525)
This enhancement enables you to use a username and password to authenticate a log forwarding connection to an external Elasticsearch instance. For example, if you cannot use mutual TLS (mTLS) because a third party operates the Elasticsearch instance, you can use HTTP or HTTPS and set a secret that contains the username and password. For more information, see Forwarding logs to an external Elasticsearch instance. (LOG-1022)
With this update, you can collect OVN network policy audit logs for forwarding to a logging server. For more information, see Collecting OVN network policy audit logs. (LOG-1526)
By default, the data model introduced in OKD 4.5 stored logs from different namespaces in a single, shared index. This made it harder to see which namespaces produced the most logs.
The current release adds namespace metrics to the Logging dashboard in the OKD console. With these metrics, you can see which namespaces produce logs and how many logs each namespace produces for a given timestamp.
To see these metrics, open the Administrator perspective in the OKD web console, and navigate to Observe → Dashboards → Logging/Elasticsearch. (LOG-1680)
The current release, OpenShift Logging 5.2, enables two new metrics: For a given timestamp or duration, you can see the total logs produced or logged by individual containers, and the total logs collected by the collector. These metrics are labeled by namespace, pod, and container name so that you can see how many logs each namespace and pod collects and produces. (LOG-1213)
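The CloudWatch (LOG-1173), Loki (LOG-684), and username/password Elasticsearch (LOG-1022) forwarding enhancements described in this list are all configured through ClusterLogForwarder outputs. The following is an illustrative sketch only: the region, group setting, URLs, and secret names are placeholders, and the referenced secrets are assumed to hold the documented keys (aws_access_key_id and aws_secret_access_key for CloudWatch, username and password for Elasticsearch):
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
    # Placeholder CloudWatch output: streams are grouped by log type in the named region.
    - name: cw
      type: cloudwatch
      cloudwatch:
        groupBy: logType
        region: us-east-1
      secret:
        name: cw-secret
    # Placeholder Loki output.
    - name: loki-server
      type: loki
      url: https://loki.example.com:3100
    # Placeholder external Elasticsearch output that authenticates with a username and password secret.
    - name: external-es
      type: elasticsearch
      url: https://elasticsearch.example.com:9200
      secret:
        name: es-user-credentials
  pipelines:
    - name: forward-app-logs
      inputRefs:
        - application
      outputRefs:
        - cw
        - loki-server
        - external-es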
Bug fixes
Before this update, when the OpenShift Elasticsearch Operator created index management cronjobs, it added the POLICY_MAPPING environment variable twice, which caused the apiserver to report the duplication. This update fixes the issue so that the POLICY_MAPPING environment variable is set only once per cronjob, and there is no duplication for the apiserver to report. (LOG-1130)
Before this update, suspending an Elasticsearch cluster to zero nodes did not suspend the index-management cronjobs, which put these cronjobs into maximum backoff. Then, after unsuspending the Elasticsearch cluster, these cronjobs stayed halted due to maximum backoff reached. This update resolves the issue by suspending the cronjobs and the cluster. (LOG-1268)
Before this update, in the Logging dashboard in the OKD console, the list of top 10 log-producing containers was missing the “chart namespace” label and provided the incorrect metric name, fluentd_input_status_total_bytes_logged. With this update, the chart shows the namespace label and the correct metric name, log_logged_bytes_total. (LOG-1271)
Before this update, if an index management cronjob terminated with an error, it did not report the error exit code: instead, its job status was “complete.” This update resolves the issue by reporting the error exit codes of index management cronjobs that terminate with errors. (LOG-1273)
The priorityclasses.v1beta1.scheduling.k8s.io API was removed in Kubernetes 1.22 and replaced by priorityclasses.v1.scheduling.k8s.io (v1beta1 was replaced by v1). Before this update, APIRemovedInNextReleaseInUse alerts were generated for priorityclasses because v1beta1 was still present. This update resolves the issue by replacing v1beta1 with v1. The alert is no longer generated. (LOG-1385)
Previously, the OpenShift Elasticsearch Operator and Red Hat OpenShift Logging Operator did not have the annotation that was required for them to appear in the OKD web console list of operators that can run in a disconnected environment. This update adds the operators.openshift.io/infrastructure-features: '["Disconnected"]' annotation to these two operators so that they appear in the list of operators that run in disconnected environments. (LOG-1420)
Before this update, Red Hat OpenShift Logging Operator pods were scheduled on CPU cores that were reserved for customer workloads on performance-optimized single-node clusters. With this update, cluster logging operator pods are scheduled on the correct CPU cores. (LOG-1440)
Before this update, some log entries had unrecognized UTF-8 bytes, which caused Elasticsearch to reject the messages and block the entire buffered payload. With this update, rejected payloads drop the invalid log entries and resubmit the remaining entries to resolve the issue. (LOG-1499)
Before this update, the kibana-proxy pod sometimes entered the CrashLoopBackoff state and logged the following message: Invalid configuration: cookie_secret must be 16, 24, or 32 bytes to create an AES cipher when pass_access_token == true or cookie_refresh != 0, but is 29 bytes. The actual number of bytes could vary. With this update, the generation of the Kibana session secret has been corrected, and the kibana-proxy pod no longer enters a CrashLoopBackoff state due to this error. (LOG-1446)
Before this update, the AWS CloudWatch Fluentd plug-in logged its AWS API calls to the Fluentd log at all log levels, consuming additional OKD node resources. With this update, the AWS CloudWatch Fluentd plug-in logs AWS API calls only at the “debug” and “trace” log levels. This way, at the default “warn” log level, Fluentd does not consume extra node resources. (LOG-1071)
Before this update, the Elasticsearch OpenDistro security plug-in caused user index migrations to fail. This update resolves the issue by providing a newer version of the plug-in. Now, index migrations proceed without errors. (LOG-1276)
Before this update, in the Logging dashboard in the OKD console, the list of top 10 log-producing containers lacked data points. This update resolves the issue, and the dashboard displays all data points. (LOG-1353)
Before this update, if you were tuning the performance of the Fluentd log forwarder by adjusting the chunkLimitSize and totalLimitSize values, the "Setting queued_chunks_limit_size for each buffer to" message reported values that were too low. The current update fixes this issue so that this message reports the correct values. (LOG-1411) (A buffer tuning sketch appears after this list of bug fixes.)
Before this update, the Kibana OpenDistro security plug-in caused user index migrations to fail. This update resolves the issue by providing a newer version of the plug-in. Now, index migrations proceed without errors. (LOG-1558)
Before this update, using a namespace input filter prevented logs in that namespace from appearing in other inputs. With this update, logs are sent to all inputs that can accept them. (LOG-1570)
Before this update, a missing license file for the viaq/logerr dependency caused license scanners to abort without success. With this update, the viaq/logerr dependency is licensed under Apache 2.0 and the license scanners run successfully. (LOG-1590)
Before this update, an incorrect brew tag for curator5 within the elasticsearch-operator-bundle build pipeline caused the pull of an image pinned to a dummy SHA1. With this update, the build pipeline uses the logging-curator5-rhel8 reference for curator5, enabling index management cronjobs to pull the correct image from registry.redhat.io. (LOG-1624)
Before this update, an issue with the ServiceAccount permissions caused errors such as no permissions for [indices:admin/aliases/get]. With this update, a permission fix resolves the issue. (LOG-1657)
Before this update, the Custom Resource Definition (CRD) for the Red Hat OpenShift Logging Operator was missing the Loki output type, which caused the admission controller to reject the ClusterLogForwarder custom resource object. With this update, the CRD includes Loki as an output type so that administrators can configure ClusterLogForwarder to send logs to a Loki server. (LOG-1683)
Before this update, OpenShift Elasticsearch Operator reconciliation of the ServiceAccounts overwrote third-party-owned fields that contained secrets. This issue caused memory and CPU spikes due to frequent recreation of secrets. This update resolves the issue. Now, the OpenShift Elasticsearch Operator does not overwrite third-party-owned fields. (LOG-1714)
Before this update, in the ClusterLogging custom resource (CR) definition, if you specified a flush_interval value but did not set flush_mode to interval, the Red Hat OpenShift Logging Operator generated a Fluentd configuration. However, the Fluentd collector generated an error at runtime. With this update, the Red Hat OpenShift Logging Operator validates the ClusterLogging CR definition and only generates the Fluentd configuration if both fields are specified. (LOG-1723)
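The buffer and flush settings referenced in LOG-1411 and LOG-1723 are tuned through the ClusterLogging custom resource. The following is a minimal sketch, assuming the forwarder tuning fields (chunkLimitSize, totalLimitSize, flushMode, flushInterval); the values shown are placeholders to adjust for your environment:
apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  name: instance
  namespace: openshift-logging
spec:
  # collection and logStore stanzas omitted for brevity
  forwarder:
    fluentd:
      buffer:
        # Placeholder values: size limits for each buffered chunk and for the total buffer.
        chunkLimitSize: 8m
        totalLimitSize: 1G
        # Set flushMode to interval when you also set flushInterval (see LOG-1723).
        flushMode: interval
        flushInterval: 5s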
Known issues
If you forward logs to an external Elasticsearch server and then change a configured value in the pipeline secret, such as the username and password, the Fluentd forwarder loads the new secret but uses the old value to connect to an external Elasticsearch server. This issue happens because the Red Hat OpenShift Logging Operator does not currently monitor secrets for content changes. (LOG-1652)
As a workaround, if you change the secret, you can force the Fluentd pods to redeploy by entering:
$ oc delete pod -l component=collector
Deprecated and removed features
Some features available in previous releases have been deprecated or removed.
Deprecated functionality is still included in OpenShift Logging and continues to be supported; however, it will be removed in a future release of this product and is not recommended for new deployments.
Forwarding logs using the legacy Fluentd and legacy syslog methods has been deprecated
From OKD 4.6 to the present, forwarding logs by using the following legacy methods has been deprecated and will be removed in a future release:
Forwarding logs using the legacy Fluentd method
Forwarding logs using the legacy syslog method
Instead, use the following non-legacy methods:
Forwarding logs using the Fluentd forward protocol
Forwarding logs using the syslog protocol
OpenShift Logging 5.1.0
The following advisories are available for OpenShift Logging 5.1.x:
RHBA-2021:3545 - Bug Fix Advisory. OpenShift Logging Bug Fix Release 5.1.2
RHBA-2021:2885 - Bug Fix Advisory. OpenShift Logging Bug Fix Release 5.1.1
RHBA-2021:2112 - Bug Fix Advisory. OpenShift Logging Bug Fix Release 5.1.0
New features and enhancements
OpenShift Logging 5.1 now supports OKD 4.7 and later running on:
IBM Power Systems
IBM Z and LinuxONE
This release adds improvements related to the following components and concepts.
As a cluster administrator, you can use Kubernetes pod labels to gather log data from an application and send it to a specific log store. You can gather log data by configuring the inputs[].application.selector.matchLabels element in the ClusterLogForwarder custom resource (CR) YAML file. You can also filter the gathered log data by namespace. A configuration sketch appears after this list of enhancements. (LOG-883)
This release adds the following new ElasticsearchNodeDiskWatermarkReached warnings to the OpenShift Elasticsearch Operator (EO):
Elasticsearch Node Disk Low Watermark Reached
Elasticsearch Node Disk High Watermark Reached
Elasticsearch Node Disk Flood Watermark Reached
These warnings apply when the EO predicts that an Elasticsearch node will reach the Disk Low Watermark, Disk High Watermark, or Disk Flood Stage Watermark thresholds in the next 6 hours. This warning period gives you time to respond before the node reaches the disk watermark thresholds. The warning messages also provide links to troubleshooting steps that you can follow to help mitigate the issue. The EO applies the past several hours of disk space data to a linear model to generate these warnings. (LOG-1100)
JSON logs can now be forwarded as JSON objects, rather than quoted strings, to either Red Hat’s managed Elasticsearch cluster or any of the other supported third-party systems. Additionally, you can now query individual fields from a JSON log message inside Kibana, which increases the discoverability of specific logs. (LOG-785, LOG-1148)
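The pod label selector described in LOG-883 is configured in the ClusterLogForwarder custom resource. The following is an illustrative sketch only; the input name, label, namespace, and pipeline name are placeholders:
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  inputs:
    # Placeholder input: gather application logs only from pods with the given label,
    # optionally restricted to selected namespaces.
    - name: selected-apps
      application:
        selector:
          matchLabels:
            app: my-payment-service
        namespaces:
          - my-project
  pipelines:
    - name: selected-apps-to-default
      inputRefs:
        - selected-apps
      outputRefs:
        - default
      # Forward JSON messages as JSON objects rather than quoted strings (LOG-785, LOG-1148).
      parse: json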
Deprecated and removed features
Some features available in previous releases have been deprecated or removed.
Deprecated functionality is still included in OpenShift Logging and continues to be supported; however, it will be removed in a future release of this product and is not recommended for new deployments.
Elasticsearch Curator has been removed
With this update, the Elasticsearch Curator has been removed and is no longer supported. Elasticsearch Curator helped you curate or manage your indices on OKD 4.4 and earlier. Instead of using Elasticsearch Curator, configure the log retention time.
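For example, the following is a minimal sketch of retention settings, assuming the retentionPolicy fields of the ClusterLogging log store; the maxAge values are placeholders, and other log store fields are omitted for brevity:
apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  name: instance
  namespace: openshift-logging
spec:
  logStore:
    type: elasticsearch
    retentionPolicy:
      # Placeholder retention periods for each log type.
      application:
        maxAge: 1d
      infra:
        maxAge: 7d
      audit:
        maxAge: 7d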
Forwarding logs using the legacy Fluentd and legacy syslog methods has been deprecated
From OKD version 4.6 to the present, forwarding logs by using the legacy Fluentd and legacy syslog methods has been deprecated and will be removed in a future release. Use the standard non-legacy methods instead.
Bug fixes
Before this update, the ClusterLogForwarder CR did not show the input[].selector element after it had been created. With this update, when you specify a selector in the ClusterLogForwarder CR, it remains. Fixing this bug was necessary for LOG-883, which enables using pod label selectors to forward application log data. (LOG-1338)
Before this update, an update in the cluster service version (CSV) accidentally introduced resources and limits for the OpenShift Elasticsearch Operator container. Under specific conditions, this caused an out-of-memory condition that terminated the Elasticsearch Operator pod. This update fixes the issue by removing the CSV resources and limits for the Operator container. The Operator gets scheduled without issues. (LOG-1254)
Before this update, forwarding logs to Kafka using chained certificates failed with the following error message:
state=error: certificate verify failed (unable to get local issuer certificate)
Logs could not be forwarded to a Kafka broker with a certificate signed by an intermediate CA. This happened because the Fluentd Kafka plug-in could only handle a single CA certificate supplied in the ca-bundle.crt entry of the corresponding secret. This update fixes the issue by enabling the Fluentd Kafka plug-in to handle multiple CA certificates supplied in the ca-bundle.crt entry of the corresponding secret. Now, logs can be forwarded to a Kafka broker with a certificate signed by an intermediate CA. (LOG-1218, LOG-1216)
Before this update, while under load, Elasticsearch responded to some requests with an HTTP 500 error, even though there was nothing wrong with the cluster. Retrying the request was successful. This update fixes the issue by updating the index management cron jobs to be more resilient when they encounter temporary HTTP 500 errors. The updated index management cron jobs will first retry a request multiple times before failing. (LOG-1215)
Before this update, if you did not set the .proxy value in the cluster installation configuration, and then configured a global proxy on the installed cluster, a bug prevented Fluentd from forwarding logs to Elasticsearch. To work around this issue, in the proxy or cluster configuration, set the no_proxy value to .svc.cluster.local so it skips internal traffic. This update fixes the proxy configuration issue. If you configure the global proxy after installing an OKD cluster, Fluentd forwards logs to Elasticsearch. (LOG-1187, BZ#1915448)
Before this update, the logging collector created more socket connections than necessary. With this update, the logging collector reuses the existing socket connection to send logs. (LOG-1186)
Before this update, if a cluster administrator tried to add or remove storage from an Elasticsearch cluster, the OpenShift Elasticsearch Operator (EO) incorrectly tried to upgrade the Elasticsearch cluster: it displayed scheduledUpgrade: "True" and shardAllocationEnabled: primaries in its status and tried to change the volumes. With this update, the EO does not try to upgrade the Elasticsearch cluster.
The EO status displays the following new status information to indicate when you have tried to make an unsupported change to the Elasticsearch storage that it has ignored:
StorageStructureChangeIgnored when you try to change between using ephemeral and persistent storage structures.
StorageClassNameChangeIgnored when you try to change the storage class name.
StorageSizeChangeIgnored when you try to change the storage size.
If you configure the ClusterLogging custom resource (CR) to switch from ephemeral to persistent storage, the EO creates a persistent volume claim (PVC) but does not create a persistent volume (PV). To clear the StorageStructureChangeIgnored status, you must revert the change to the ClusterLogging CR and delete the persistent volume claim (PVC). (LOG-1351)
Before this update, if you redeployed a full Elasticsearch cluster, it got stuck in an unhealthy state, with one non-data node running and all other data nodes shut down. This happened because new certificates prevented the Elasticsearch Operator from scaling down the non-data nodes of the Elasticsearch cluster. With this update, Elasticsearch Operator can scale all the data and non-data nodes down and then back up again, so they load the new certificates. The Elasticsearch Operator can reach the new nodes after they load the new certificates. (LOG-1536)