Configuring systemd-journald and Fluentd

Because Fluentd reads from the journal and the journal's default rate limits are very low, journal entries can be lost when the journal cannot keep up with the logging rate of system services.

We recommend setting RateLimitIntervalSec=30s and RateLimitBurst=10000 (or even higher if necessary) to prevent the journal from losing entries.
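
One way to check whether entries are being lost is to look for the "Suppressed N messages from ..." notices that journald writes when it rate-limits a service. The following is a minimal sketch rather than part of the documented procedure: the helper name sum_suppressed is hypothetical, and it assumes you can run journalctl in a shell on the node.

```shell
# Hypothetical helper: sum the counts from journald's
# "Suppressed N messages from ..." notices read on stdin.
sum_suppressed() {
  awk '/Suppressed [0-9]+ messages from/ {
         for (i = 1; i <= NF; i++)
           if ($i == "Suppressed") total += $(i + 1)
       }
       END { print total + 0 }'
}

# Assumed usage on a node:
#   journalctl -u systemd-journald --no-pager | sum_suppressed
```

A result that keeps growing between runs suggests the current RateLimitBurst value is still too low for the workload.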

Configuring systemd-journald for cluster logging

As you scale up your project, the default logging environment might need some adjustments.

For example, if you are missing logs, you might have to increase the rate limits for journald. You can adjust the number of messages to retain for a specified period of time so that cluster logging neither drops logs nor uses excessive resources.

You can also determine whether the logs are compressed, how long they are retained, whether and how they are stored, and other settings.

Procedure

  1. Create a journald.conf file with the required settings:

    Compress=yes (1)
    ForwardToConsole=no (2)
    ForwardToSyslog=no
    MaxRetentionSec=1month (3)
    RateLimitBurst=10000 (4)
    RateLimitIntervalSec=30s
    Storage=persistent (5)
    SyncIntervalSec=1s (6)
    SystemMaxUse=8g (7)
    SystemKeepFree=20% (8)
    SystemMaxFileSize=10M (9)

    (1) Specify whether you want logs compressed before they are written to the file system. Specify yes to compress the messages or no to leave them uncompressed. The default is yes.
    (2) Configure whether to forward log messages. The default is no for each option. Specify:
    • ForwardToConsole to forward logs to the system console.

    • ForwardToKMsg to forward logs to the kernel log buffer.

    • ForwardToSyslog to forward logs to a syslog daemon.

    • ForwardToWall to forward messages as wall messages to all logged-in users.

    (3) Specify the maximum time to store journal entries. Enter a number to specify seconds, or include a unit: “year”, “month”, “week”, “day”, “h” or “m”. Enter 0 to disable. The default is 1month.
    (4) Configure rate limiting. If more logs than specified in RateLimitBurst are received during the time interval defined by RateLimitIntervalSec, all further messages within the interval are dropped until the interval is over. It is recommended to set RateLimitIntervalSec=30s and RateLimitBurst=10000, which are the defaults.
    (5) Specify how logs are stored. The default is persistent:
    • volatile to store logs in memory in /run/log/journal/.

    • persistent to store logs to disk in /var/log/journal/. systemd creates the directory if it does not exist.

    • auto to store logs in /var/log/journal/ if the directory exists. If it does not exist, systemd temporarily stores logs in /run/log/journal/.

    • none to not store logs. systemd drops all logs.

    (6) Specify the timeout before synchronizing journal files to disk for ERR, WARNING, NOTICE, INFO, and DEBUG logs. systemd immediately syncs after receiving a CRIT, ALERT, or EMERG log. The default is 1s.
    (7) Specify the maximum size the journal can use. The default is 8g.
    (8) Specify how much disk space systemd must leave free. The default is 20%.
    (9) Specify the maximum size for individual journal files stored persistently in /var/log/journal. The default is 10M.

    If you remove the rate limit, you might see increased CPU utilization on the system logging daemons as they process messages that would previously have been throttled.

    For more information on systemd settings, see https://www.freedesktop.org/software/systemd/man/journald.conf.html. The default settings listed on that page might not apply to OKD.
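
The listing in this step carries callout markers such as (1), which must not end up in the file itself. As a minimal sketch, the settings could be written without the markers and sanity-checked as follows. The [Journal] section header is an assumption that does not appear in the listing above; it is included because the upstream journald.conf format places all options in that section. The variable name bad_lines is hypothetical.

```shell
# Write the settings from this step without the callout markers.
# Assumption: the [Journal] header follows the upstream journald.conf
# format; it is not part of the listing in this procedure.
cat > journald.conf << 'EOF'
[Journal]
Compress=yes
ForwardToConsole=no
ForwardToSyslog=no
MaxRetentionSec=1month
RateLimitBurst=10000
RateLimitIntervalSec=30s
Storage=persistent
SyncIntervalSec=1s
SystemMaxUse=8g
SystemKeepFree=20%
SystemMaxFileSize=10M
EOF

# Sanity check: every line is the section header or a Key=Value pair.
bad_lines=$(grep -Ev '^(\[Journal\]|[A-Za-z]+=[^[:space:]]+)$' journald.conf || true)
if [ -n "$bad_lines" ]; then
  echo "unexpected lines in journald.conf:" >&2
  echo "$bad_lines" >&2
fi
```

The check only catches stray text such as a leftover callout marker; it does not validate option names or values.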

  2. Convert the journald.conf file to base64 and store it in a variable named jrnl_cnf by running the following command:

    $ export jrnl_cnf=$( cat journald.conf | base64 -w0 )
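
Before embedding the encoded value, it can be worth confirming that it decodes back to the original file. The sketch below is illustrative only: it uses a throwaway stand-in file so that it runs anywhere, the names tmp_conf, encoded, and roundtrip are hypothetical, and it assumes GNU coreutils base64, where -w0 disables line wrapping.

```shell
# Stand-in file for illustration; use your real journald.conf instead.
tmp_conf=$(mktemp)
printf 'RateLimitBurst=10000\nRateLimitIntervalSec=30s\n' > "$tmp_conf"

# Encode on a single line, as in the step above.
encoded=$(base64 -w0 < "$tmp_conf")

# Decode and compare byte-for-byte with the original file.
if printf '%s' "$encoded" | base64 -d | cmp -s - "$tmp_conf"; then
  roundtrip=ok
else
  roundtrip=mismatch
fi
echo "round trip: $roundtrip"

rm -f "$tmp_conf"
```

A single-line encoding matters because the value is later placed inside a data URL in the MachineConfig object.
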
  3. Create a MachineConfig object that includes the jrnl_cnf variable, which was created in the previous step. The following sample command creates a MachineConfig object for the worker:

    $ cat << EOF > ./40-worker-custom-journald.yaml (1)
    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfig
    metadata:
      labels:
        machineconfiguration.openshift.io/role: worker (2)
      name: 40-worker-custom-journald (3)
    spec:
      config:
        ignition:
          config: {}
          security:
            tls: {}
          timeouts: {}
          version: 3.1.0
        networkd: {}
        passwd: {}
        storage:
          files:
          - contents:
              source: data:text/plain;charset=utf-8;base64,${jrnl_cnf} (4)
              verification: {}
            filesystem: root
            mode: 0644 (5)
            path: /etc/systemd/journald.conf.d/custom.conf
      osImageURL: ""
    EOF

    (1) Optional: For a control plane (also known as master) node, you can provide the file name as 40-master-custom-journald.yaml.
    (2) Optional: For a control plane (also known as master) node, provide the role as master.
    (3) Optional: For a control plane (also known as master) node, you can provide the name as 40-master-custom-journald.
    (4) Optional: To include a static copy of the parameters in the journald.conf file, replace ${jrnl_cnf} with the output of the echo $jrnl_cnf command.
    (5) Set the permissions for the journald.conf file. It is recommended to set 0644 permissions.
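
The sample command in this step relies on the here-document delimiter being unquoted, so the shell substitutes ${jrnl_cnf} while it writes the YAML file. The following sketch contrasts the two behaviors on a single line of the file. The payload and the variable names expanded, literal, and decoded are hypothetical and chosen only for illustration.

```shell
# Stand-in payload for illustration only.
jrnl_cnf=$(printf 'Storage=persistent\n' | base64 -w0)

# Unquoted delimiter: ${jrnl_cnf} is replaced by its value.
expanded=$(cat << EOF
source: data:text/plain;charset=utf-8;base64,${jrnl_cnf}
EOF
)

# Quoted delimiter: the placeholder is written literally.
literal=$(cat << 'EOF'
source: data:text/plain;charset=utf-8;base64,${jrnl_cnf}
EOF
)

echo "$expanded"
echo "$literal"

# Decode the part after the last comma to inspect what was embedded.
decoded=$(printf '%s' "${expanded##*,}" | base64 -d)
echo "$decoded"
```

If the generated YAML file still contains the text ${jrnl_cnf}, the variable was probably unset or the delimiter was quoted when the file was created.
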
  4. Create the machine config:

    $ oc apply -f <file_name>.yaml

    The controller detects the new MachineConfig object and generates a new rendered-worker-<hash> version.

  5. Monitor the status of the rollout of the new rendered configuration to each node:

    $ oc describe machineconfigpool/<node> (1)

    (1) Specify the node as master or worker.

    Example output for worker

    Name:         worker
    Namespace:
    Labels:       machineconfiguration.openshift.io/mco-built-in=
    Annotations:  <none>
    API Version:  machineconfiguration.openshift.io/v1
    Kind:         MachineConfigPool
    ...
    Conditions:
      Message:
      Reason:     All nodes are updating to rendered-worker-913514517bcea7c93bd446f4830bc64e
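
When you follow the rollout in a script, it can help to pull the rendered configuration name out of the describe output. This is a minimal sketch under stated assumptions: the helper name rendered_name is hypothetical, and it assumes the name follows the rendered-<role>-<hash> pattern shown in the example output, with a lowercase hexadecimal hash.

```shell
# Hypothetical helper: print the first rendered-<role>-<hash> name on stdin.
rendered_name() {
  grep -Eo 'rendered-(worker|master)-[0-9a-f]+' | head -n 1
}

# Assumed usage against a cluster:
#   oc describe machineconfigpool/worker | rendered_name
#
# To block until the pool reports the Updated condition, a command along
# these lines may be used (the timeout value is an arbitrary choice):
#   oc wait --for=condition=Updated machineconfigpool/worker --timeout=30m
```

Comparing the extracted name before and after applying the MachineConfig object shows whether a new rendered configuration was generated.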