About log collection and forwarding

The Red Hat OpenShift Logging Operator deploys a collector based on the ClusterLogForwarder resource specification. There are two collector options supported by this Operator: the legacy Fluentd collector, and the Vector collector.

Fluentd is deprecated and is planned to be removed in a future release. Red Hat provides bug fixes and support for this feature during the current release lifecycle, but this feature no longer receives enhancements. As an alternative to Fluentd, you can use Vector instead.

Log collection

The log collector is a daemon set that deploys pods to each OKD node to collect container and node logs.

By default, the log collector uses the following sources:

  • System and infrastructure logs generated by journald log messages from the operating system, the container runtime, and OKD.

  • /var/log/containers/*.log for all container logs.

If you configure the log collector to collect audit logs, it collects them from /var/log/audit/audit.log.

The log collector collects the logs from these sources and forwards them internally or externally depending on your logging configuration.

Log collector types

Vector is a log collector offered as an alternative to Fluentd for the logging.

You can configure which logging collector type your cluster uses by modifying the ClusterLogging custom resource (CR) collection spec:

Example ClusterLogging CR that configures Vector as the collector

  1. apiVersion: logging.openshift.io/v1
  2. kind: ClusterLogging
  3. metadata:
  4. name: instance
  5. namespace: openshift-logging
  6. spec:
  7. collection:
  8. logs:
  9. type: vector
  10. vector: {}
  11. # ...

Log collection limitations

The container runtimes provide minimal information to identify the source of log messages: project, pod name, and container ID. This information is not sufficient to uniquely identify the source of the logs. If a pod with a given name and project is deleted before the log collector begins processing its logs, information from the API server, such as labels and annotations, might not be available. There might not be a way to distinguish the log messages from a similarly named pod and project or trace the logs to their source. This limitation means that log collection and normalization are considered best effort.

The available container runtimes provide minimal information to identify the source of log messages and do not guarantee unique individual log messages or that these messages can be traced to their source.

Log collector features by type

Table 1. Log Sources
FeatureFluentdVector

App container logs

App-specific routing

App-specific routing by namespace

Infra container logs

Infra journal logs

Kube API audit logs

OpenShift API audit logs

Open Virtual Network (OVN) audit logs

Table 2. Authorization and Authentication
FeatureFluentdVector

Elasticsearch certificates

Elasticsearch username / password

Amazon Cloudwatch keys

Amazon Cloudwatch STS

Kafka certificates

Kafka username / password

Kafka SASL

Loki bearer token

Table 3. Normalizations and Transformations
FeatureFluentdVector

Viaq data model - app

Viaq data model - infra

Viaq data model - infra(journal)

Viaq data model - Linux audit

Viaq data model - kube-apiserver audit

Viaq data model - OpenShift API audit

Viaq data model - OVN

Loglevel Normalization

JSON parsing

Structured Index

Multiline error detection

Multicontainer / split indices

Flatten labels

CLF static labels

Table 4. Tuning
FeatureFluentdVector

Fluentd readlinelimit

Fluentd buffer

- chunklimitsize

- totallimitsize

- overflowaction

- flushthreadcount

- flushmode

- flushinterval

- retrywait

- retrytype

- retrymaxinterval

- retrytimeout

Table 5. Visibility
FeatureFluentdVector

Metrics

Dashboard

Alerts

Table 6. Miscellaneous
FeatureFluentdVector

Global proxy support

x86 support

ARM support

IBM Power® support

IBM Z® support

IPv6 support

Log event buffering

Disconnected Cluster

Collector outputs

The following collector outputs are supported:

Table 7. Supported outputs
FeatureFluentdVector

Elasticsearch v6-v8

Fluent forward

Syslog RFC3164

✓ (Logging 5.7+)

Syslog RFC5424

✓ (Logging 5.7+)

Kafka

Amazon Cloudwatch

Amazon Cloudwatch STS

Loki

HTTP

✓ (Logging 5.7+)

Google Cloud Logging

Splunk

✓ (Logging 5.6+)

Log forwarding

Administrators can create ClusterLogForwarder resources that specify which logs are collected, how they are transformed, and where they are forwarded to.

ClusterLogForwarder resources can be used up to forward container, infrastructure, and audit logs to specific endpoints within or outside of a cluster. Transport Layer Security (TLS) is supported so that log forwarders can be configured to send logs securely.

Administrators can also authorize RBAC permissions that define which service accounts and users can access and forward which types of logs.

Log forwarding implementations

There are two log forwarding implementations available: the legacy implementation, and the multi log forwarder feature.

Only the Vector collector is supported for use with the multi log forwarder feature. The Fluentd collector can only be used with legacy implementations.

Legacy implementation

In legacy implementations, you can only use one log forwarder in your cluster. The ClusterLogForwarder resource in this mode must be named instance, and must be created in the openshift-logging namespace. The ClusterLogForwarder resource also requires a corresponding ClusterLogging resource named instance in the openshift-logging namespace.

Multi log forwarder feature

The multi log forwarder feature is available in logging 5.8 and later, and provides the following functionality:

  • Administrators can control which users are allowed to define log collection and which logs they are allowed to collect.

  • Users who have the required permissions are able to specify additional log collection configurations.

  • Administrators who are migrating from the deprecated Fluentd collector to the Vector collector can deploy a new log forwarder separately from their existing deployment. The existing and new log forwarders can operate simultaneously while workloads are being migrated.

In multi log forwarder implementations, you are not required to create a corresponding ClusterLogging resource for your ClusterLogForwarder resource. You can create multiple ClusterLogForwarder resources using any name, in any namespace, with the following exceptions:

  • You cannot create a ClusterLogForwarder resource named instance in the openshift-logging namespace, because this is reserved for a log forwarder that supports the legacy workflow using the Fluentd collector.

  • You cannot create a ClusterLogForwarder resource named collector in the openshift-logging namespace, because this is reserved for the collector.

Enabling the multi log forwarder feature for a cluster

To use the multi log forwarder feature, you must create a service account and cluster role bindings for that service account. You can then reference the service account in the ClusterLogForwarder resource to control access permissions.

In order to support multi log forwarding in additional namespaces other than the openshift-logging namespace, you must update the Red Hat OpenShift Logging Operator to watch all namespaces. This functionality is supported by default in new Red Hat OpenShift Logging Operator version 5.8 installations.

Authorizing log collection RBAC permissions

In logging 5.8 and later, the Red Hat OpenShift Logging Operator provides collect-audit-logs, collect-application-logs, and collect-infrastructure-logs cluster roles, which enable the collector to collect audit logs, application logs, and infrastructure logs respectively.

You can authorize RBAC permissions for log collection by binding the required cluster roles to a service account.

Prerequisites

  • The Red Hat OpenShift Logging Operator is installed in the openshift-logging namespace.

  • You have administrator permissions.

Procedure

  1. Create a service account for the collector. If you want to write logs to storage that requires a token for authentication, you must include a token in the service account.

  2. Bind the appropriate cluster roles to the service account:

    Example binding command

    1. $ oc adm policy add-cluster-role-to-user <cluster_role_name> system:serviceaccount:<namespace_name>:<service_account_name>

Additional resources