Configuring the LokiStack log store

In logging subsystem documentation, LokiStack refers to the combination of Loki and a web proxy with OKD authentication integration that is supported by the logging subsystem. LokiStack’s proxy uses OKD authentication to enforce multi-tenancy. Loki refers to the log store as either the individual component or an external store.

Creating a new group for the cluster-admin user role

Querying application logs for multiple namespaces as a cluster-admin user, where the sum total of characters of all of the namespaces in the cluster is greater than 5120, results in the error Parse error: input size too long (XXXX > 5120). For better control over access to logs in LokiStack, make the cluster-admin user a member of the cluster-admin group. If the cluster-admin group does not exist, create it and add the desired users to it.

Use the following procedure to create a new group for users with cluster-admin permissions.

Procedure

  1. Enter the following command to create a new group:

     $ oc adm groups new cluster-admin

  2. Enter the following command to add the desired user to the cluster-admin group:

     $ oc adm groups add-users cluster-admin <username>

  3. Enter the following command to add the cluster-admin user role to the group:

     $ oc adm policy add-cluster-role-to-group cluster-admin cluster-admin
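
Verification

You can optionally confirm that the group exists and contains the expected users by listing it; the exact output depends on your cluster:

  $ oc get group cluster-admin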

LokiStack behavior during cluster restarts

In logging version 5.8 and newer versions, when an OKD cluster is restarted, LokiStack ingestion and the query path continue to operate within the CPU and memory resources available for the node. This means that there is no downtime for the LokiStack during OKD cluster updates. This behavior is achieved by using PodDisruptionBudget resources. The Loki Operator provisions PodDisruptionBudget resources for Loki, which determine the minimum number of pods that must be available per component to ensure normal operations under certain conditions.
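
You can inspect the budgets that the Loki Operator provisions by listing the PodDisruptionBudget resources in the logging namespace; the individual resource names depend on your deployment:

  $ oc get pdb -n openshift-logging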

Configuring Loki to tolerate node failure

In the logging subsystem 5.8 and later versions, the Loki Operator supports setting pod anti-affinity rules to request that pods of the same component are scheduled on different available nodes in the cluster.

Affinity is a property of pods that controls the nodes on which they prefer to be scheduled. Anti-affinity is a property of pods that prevents a pod from being scheduled on a node based on the labels of pods that are already running on that node.

In OKD, pod affinity and pod anti-affinity allow you to constrain which nodes your pod is eligible to be scheduled on based on the key-value labels on other pods.

The Operator sets default, preferred podAntiAffinity rules for all Loki components, which include the compactor, distributor, gateway, indexGateway, ingester, querier, queryFrontend, and ruler components.

You can override the preferred podAntiAffinity settings for Loki components by configuring required settings in the requiredDuringSchedulingIgnoredDuringExecution field:

Example user settings for the ingester component

  apiVersion: loki.grafana.com/v1
  kind: LokiStack
  metadata:
    name: logging-loki
    namespace: openshift-logging
  spec:
  # ...
    template:
      ingester:
        podAntiAffinity:
        # ...
          requiredDuringSchedulingIgnoredDuringExecution: (1)
          - labelSelector:
              matchLabels: (2)
                app.kubernetes.io/component: ingester
            topologyKey: kubernetes.io/hostname
  # ...
1 The stanza to define a required rule.
2 The key-value pair (label) that must be matched to apply the rule.
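
To check whether the rule has the intended effect, you can list the ingester pods together with the nodes they are scheduled on; the label selector matches the app.kubernetes.io/component label shown in the example:

  $ oc get pods -o wide -n openshift-logging -l app.kubernetes.io/component=ingester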

Loki pod placement

You can control which nodes the Loki pods run on, and prevent other workloads from using those nodes, by using tolerations or node selectors on the pods.

You can apply tolerations to the log store pods with the LokiStack custom resource (CR) and apply taints to a node with the node specification. A taint on a node is a key:value pair that instructs the node to repel all pods that do not tolerate the taint. Using a specific key:value pair that is not on other pods ensures that only the log store pods can run on that node.
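
For example, you might taint a node with the same key and value that the tolerations in the example later in this section use; <node_name> is a placeholder for an existing node:

  $ oc adm taint nodes <node_name> node-role.kubernetes.io/infra=reserved:NoSchedule
  $ oc adm taint nodes <node_name> node-role.kubernetes.io/infra=reserved:NoExecute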

Example LokiStack with node selectors

  apiVersion: loki.grafana.com/v1
  kind: LokiStack
  metadata:
    name: logging-loki
    namespace: openshift-logging
  spec:
  # ...
    template:
      compactor: (1)
        nodeSelector:
          node-role.kubernetes.io/infra: "" (2)
      distributor:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
      gateway:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
      indexGateway:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
      ingester:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
      querier:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
      queryFrontend:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
      ruler:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
  # ...
1 Specifies the component pod type that applies to the node selector.
2 Specifies the pods that are moved to nodes containing the defined label.

In the previous example configuration, all Loki pods are moved to nodes containing the node-role.kubernetes.io/infra: "" label.
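
If no nodes in your cluster carry this label yet, you can add it to a suitable node manually; <node_name> is a placeholder for an existing node:

  $ oc label node <node_name> node-role.kubernetes.io/infra=""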

Example LokiStack CR with node selectors and tolerations

  apiVersion: loki.grafana.com/v1
  kind: LokiStack
  metadata:
    name: logging-loki
    namespace: openshift-logging
  spec:
  # ...
    template:
      compactor:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          value: reserved
        - effect: NoExecute
          key: node-role.kubernetes.io/infra
          value: reserved
      distributor:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          value: reserved
        - effect: NoExecute
          key: node-role.kubernetes.io/infra
          value: reserved
      indexGateway:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          value: reserved
        - effect: NoExecute
          key: node-role.kubernetes.io/infra
          value: reserved
      ingester:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          value: reserved
        - effect: NoExecute
          key: node-role.kubernetes.io/infra
          value: reserved
      querier:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          value: reserved
        - effect: NoExecute
          key: node-role.kubernetes.io/infra
          value: reserved
      queryFrontend:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          value: reserved
        - effect: NoExecute
          key: node-role.kubernetes.io/infra
          value: reserved
      ruler:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          value: reserved
        - effect: NoExecute
          key: node-role.kubernetes.io/infra
          value: reserved
      gateway:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          value: reserved
        - effect: NoExecute
          key: node-role.kubernetes.io/infra
          value: reserved
  # ...

To configure the nodeSelector and tolerations fields of the LokiStack CR, you can use the oc explain command to view the description and fields for a particular resource:

  $ oc explain lokistack.spec.template

Example output

  KIND:     LokiStack
  VERSION:  loki.grafana.com/v1
  RESOURCE: template <Object>
  DESCRIPTION:
       Template defines the resource/limits/tolerations/nodeselectors per
       component
  FIELDS:
     compactor    <Object>
       Compactor defines the compaction component spec.
     distributor  <Object>
       Distributor defines the distributor component spec.
  ...

For more detailed information, you can add a specific field:

  $ oc explain lokistack.spec.template.compactor

Example output

  KIND:     LokiStack
  VERSION:  loki.grafana.com/v1
  RESOURCE: compactor <Object>
  DESCRIPTION:
       Compactor defines the compaction component spec.
  FIELDS:
     nodeSelector <map[string]string>
       NodeSelector defines the labels required by a node to schedule the
       component onto it.
  ...

Zone aware data replication

In the logging subsystem 5.8 and later versions, the Loki Operator offers support for zone-aware data replication through pod topology spread constraints. Enabling this feature enhances reliability and safeguards against log loss in the event of a single zone failure. When configuring the deployment size as 1x.extra.small, 1x.small, or 1x.medium, the replication.factor field is automatically set to 2.

To ensure proper replication, you need to have at least as many availability zones as the replication factor specifies. While it is possible to have more availability zones than the replication factor, having fewer zones can lead to write failures. Each zone should host an equal number of instances for optimal operation.
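
To check how many availability zones your nodes currently span, you can display the zone label for each node, for example:

  $ oc get nodes -L topology.kubernetes.io/zone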

Example LokiStack CR with zone replication enabled

  apiVersion: loki.grafana.com/v1
  kind: LokiStack
  metadata:
    name: logging-loki
    namespace: openshift-logging
  spec:
    replicationFactor: 2 (1)
    replication:
      factor: 2 (2)
      zones:
      - maxSkew: 1 (3)
        topologyKey: topology.kubernetes.io/zone (4)
1 Deprecated field; values entered are overwritten by replication.factor.
2 This value is automatically set when a deployment size is selected at setup.
3 The maximum difference in number of pods between any two topology domains. The default is 1, and you cannot specify a value of 0.
4 Defines zones in the form of a topology key that corresponds to a node label.

Recovering Loki pods from failed zones

In OKD, a zone failure happens when specific availability zone resources become inaccessible. Availability zones are isolated areas within a cloud provider’s data center, aimed at enhancing redundancy and fault tolerance. If your OKD cluster is not configured to handle this, a zone failure can lead to service or data loss.

Loki pods are part of a StatefulSet, and they come with Persistent Volume Claims (PVCs) provisioned by a StorageClass object. Each Loki pod and its PVCs reside in the same zone. When a zone failure occurs in a cluster, the StatefulSet controller automatically attempts to recover the affected pods in the failed zone.

The following procedure deletes the PVCs in the failed zone, and all data contained therein. To avoid complete data loss, the replication factor field of the LokiStack CR should always be set to a value greater than 1 to ensure that Loki is replicating.

Prerequisites

  • Logging version 5.8 or later.

  • Verify your LokiStack CR has a replication factor greater than 1.

  • Zone failure detected by the control plane, and nodes in the failed zone are marked by cloud provider integration.

The StatefulSet controller automatically attempts to reschedule pods in a failed zone. Because the associated PVCs are also in the failed zone, automatic rescheduling to a different zone does not work. You must manually delete the PVCs in the failed zone to allow successful re-creation of the stateful Loki Pod and its provisioned PVC in the new zone.

Procedure

  1. List the pods in Pending status by running the following command:

     $ oc get pods --field-selector status.phase==Pending -n openshift-logging

     Example oc get pods output

     NAME                           READY   STATUS    RESTARTS   AGE (1)
     logging-loki-index-gateway-1   0/1     Pending   0          17m
     logging-loki-ingester-1        0/1     Pending   0          16m
     logging-loki-ruler-1           0/1     Pending   0          16m
     1 These pods are in Pending status because their corresponding PVCs are in the failed zone.
  2. List the PVCs in Pending status by running the following command:

     $ oc get pvc -o=json -n openshift-logging | jq '.items[] | select(.status.phase == "Pending") | .metadata.name' -r

     Example oc get pvc output

     storage-logging-loki-index-gateway-1
     storage-logging-loki-ingester-1
     wal-logging-loki-ingester-1
     storage-logging-loki-ruler-1
     wal-logging-loki-ruler-1
  3. Delete the PVC(s) for a pod by running the following command:

     $ oc delete pvc <pvc_name> -n openshift-logging
  4. Then delete the pod(s) by running the following command:

     $ oc delete pod <pod_name> -n openshift-logging

Once these objects have been successfully deleted, they should automatically be rescheduled in an available zone.
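
To verify that the new pods were scheduled into a healthy zone, you can check their status and node placement, for example:

  $ oc get pods -n openshift-logging -o wide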

Troubleshooting PVC in a terminating state

The PVCs might hang in the Terminating state without being deleted if the kubernetes.io/pvc-protection finalizer is set in the PVC metadata. Removing the finalizers should allow the PVCs to be deleted successfully.

  1. Remove the finalizer for each PVC by running the command below, then retry deletion.

     $ oc patch pvc <pvc_name> -p '{"metadata":{"finalizers":null}}' -n openshift-logging

Fine-grained access for Loki logs

In logging subsystem 5.8 and later, the Red Hat OpenShift Logging Operator does not grant all users access to logs by default. As an administrator, you must configure your users’ access unless the Operator was upgraded and prior configurations are in place. Depending on your configuration and need, you can configure fine-grained access to logs using the following:

  • Cluster wide policies

  • Namespace scoped policies

  • Creation of custom admin groups

As an administrator, you need to create the role bindings and cluster role bindings appropriate for your deployment. The Red Hat OpenShift Logging Operator provides the following cluster roles:

  • cluster-logging-application-view grants permission to read application logs.

  • cluster-logging-infrastructure-view grants permission to read infrastructure logs.

  • cluster-logging-audit-view grants permission to read audit logs.

If you have upgraded from a prior version, an additional cluster role logging-application-logs-reader and associated cluster role binding logging-all-authenticated-application-logs-reader provide backward compatibility, allowing any authenticated user read access in their namespaces.

Users with access by namespace must provide a namespace when querying application logs.

Cluster wide access

ClusterRoleBinding resources reference cluster roles and set permissions cluster-wide.

Example ClusterRoleBinding

  kind: ClusterRoleBinding
  apiVersion: rbac.authorization.k8s.io/v1
  metadata:
    name: logging-all-application-logs-reader
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: cluster-logging-application-view (1)
  subjects: (2)
  - kind: Group
    name: system:authenticated
    apiGroup: rbac.authorization.k8s.io
1 Additional ClusterRoles are cluster-logging-infrastructure-view and cluster-logging-audit-view.
2 Specifies the users or groups this object applies to.
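
You can create an equivalent cluster-wide binding from the command line instead of applying a manifest; the role and group mirror the example above:

  $ oc adm policy add-cluster-role-to-group cluster-logging-application-view system:authenticated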

Namespaced access

You can use RoleBinding resources with ClusterRole objects to define the namespace for which a user or group can access logs.

Example RoleBinding

  kind: RoleBinding
  apiVersion: rbac.authorization.k8s.io/v1
  metadata:
    name: allow-read-logs
    namespace: log-test-0 (1)
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: cluster-logging-application-view
  subjects:
  - kind: User
    apiGroup: rbac.authorization.k8s.io
    name: testuser-0
1 Specifies the namespace this RoleBinding applies to.
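
An equivalent namespaced binding can also be created from the command line; the user and namespace mirror the example above:

  $ oc adm policy add-role-to-user cluster-logging-application-view testuser-0 -n log-test-0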

Custom admin group access

If you have a large deployment with a number of users who require broader permissions, you can create a custom group using the adminGroups field. Users who are members of any group specified in the adminGroups field of the LokiStack CR are considered admins. Admin users have access to all application logs in all namespaces, if they are also assigned the cluster-logging-application-view role.

Example LokiStack CR

  apiVersion: loki.grafana.com/v1
  kind: LokiStack
  metadata:
    name: logging-loki
    namespace: openshift-logging
  spec:
    tenants:
      mode: openshift-logging (1)
      openshift:
        adminGroups: (2)
        - cluster-admin
        - custom-admin-group (3)
1 Custom admin groups are only available in this mode.
2 Entering an empty list [] value for this field disables admin groups.
3 Overrides the default groups (system:cluster-admins, cluster-admin, dedicated-admin).

Enabling stream-based retention with Loki

With Logging version 5.6 and higher, you can configure retention policies based on log streams. Rules for these may be set globally, per tenant, or both. If you configure both, tenant rules apply before global rules.

  1. To enable stream-based retention, create a LokiStack custom resource (CR):

    Example global stream-based retention

    apiVersion: loki.grafana.com/v1
    kind: LokiStack
    metadata:
      name: logging-loki
      namespace: openshift-logging
    spec:
      limits:
        global: (1)
          retention: (2)
            days: 20
            streams:
            - days: 4
              priority: 1
              selector: '{kubernetes_namespace_name=~"test.+"}' (3)
            - days: 1
              priority: 1
              selector: '{log_type="infrastructure"}'
      managementState: Managed
      replicationFactor: 1
      size: 1x.small
      storage:
        schemas:
        - effectiveDate: "2020-10-11"
          version: v11
        secret:
          name: logging-loki-s3
          type: aws
      storageClassName: standard
      tenants:
        mode: openshift-logging
    1 Sets retention policy for all log streams. Note: This field does not impact the retention period for stored logs in object storage.
    2 Retention is enabled in the cluster when this block is added to the CR.
    3 Contains the LogQL query used to define the log stream.

    Example per-tenant stream-based retention

    apiVersion: loki.grafana.com/v1
    kind: LokiStack
    metadata:
      name: logging-loki
      namespace: openshift-logging
    spec:
      limits:
        global:
          retention:
            days: 20
        tenants: (1)
          application:
            retention:
              days: 1
              streams:
              - days: 4
                selector: '{kubernetes_namespace_name=~"test.+"}' (2)
          infrastructure:
            retention:
              days: 5
              streams:
              - days: 1
                selector: '{kubernetes_namespace_name=~"openshift-cluster.+"}'
      managementState: Managed
      replicationFactor: 1
      size: 1x.small
      storage:
        schemas:
        - effectiveDate: "2020-10-11"
          version: v11
        secret:
          name: logging-loki-s3
          type: aws
      storageClassName: standard
      tenants:
        mode: openshift-logging
    1 Sets retention policy by tenant. Valid tenant types are application, audit, and infrastructure.
    2 Contains the LogQL query used to define the log stream.
  2. Apply the LokiStack CR:

     $ oc apply -f <filename>.yaml

Stream-based retention is not for managing the retention of stored logs. Global retention periods for stored logs, up to a supported maximum of 30 days, are configured with your object storage.

Troubleshooting Loki rate limit errors

If the Log Forwarder API forwards a large block of messages that exceeds the rate limit to Loki, Loki generates rate limit (429) errors.

These errors can occur during normal operation. For example, when adding the logging subsystem to a cluster that already has some logs, rate limit errors might occur while the logging subsystem tries to ingest all of the existing log entries. In this case, if the rate of addition of new logs is less than the total rate limit, the historical data is eventually ingested, and the rate limit errors are resolved without requiring user intervention.

In cases where the rate limit errors continue to occur, you can fix the issue by modifying the LokiStack custom resource (CR).

The LokiStack CR is not available on Grafana-hosted Loki. This topic does not apply to Grafana-hosted Loki servers.

Conditions

  • The Log Forwarder API is configured to forward logs to Loki.

  • Your system sends a block of messages that is larger than 2 MB to Loki. For example:

    1. "values":[["1630410392689800468","{\"kind\":\"Event\",\"apiVersion\":\
    2. .......
    3. ......
    4. ......
    5. ......
    6. \"received_at\":\"2021-08-31T11:46:32.800278+00:00\",\"version\":\"1.7.4 1.6.0\"}},\"@timestamp\":\"2021-08-31T11:46:32.799692+00:00\",\"viaq_index_name\":\"audit-write\",\"viaq_msg_id\":\"MzFjYjJkZjItNjY0MC00YWU4LWIwMTEtNGNmM2E5ZmViMGU4\",\"log_type\":\"audit\"}"]]}]}
  • After you enter oc logs -n openshift-logging -l component=collector, the collector logs in your cluster show a line containing one of the following error messages:

    429 Too Many Requests Ingestion rate limit exceeded

    Example Vector error message

    2023-08-25T16:08:49.301780Z WARN sink{component_kind="sink" component_id=default_loki_infra component_type=loki component_name=default_loki_infra}: vector::sinks::util::retries: Retrying after error. error=Server responded with an error: 429 Too Many Requests internal_log_rate_limit=true

    Example Fluentd error message

    2023-08-30 14:52:15 +0000 [warn]: [default_loki_infra] failed to flush the buffer. retry_times=2 next_retry_time=2023-08-30 14:52:19 +0000 chunk="604251225bf5378ed1567231a1c03b8b" error_class=Fluent::Plugin::LokiOutput::LogPostError error="429 Too Many Requests Ingestion rate limit exceeded for user infrastructure (limit: 4194304 bytes/sec) while attempting to ingest '4082' lines totaling '7820025' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased\n"

    The error is also visible on the receiving end. For example, in the LokiStack ingester pod:

    Example Loki ingester error message

    level=warn ts=2023-08-30T14:57:34.155592243Z caller=grpc_logging.go:43 duration=1.434942ms method=/logproto.Pusher/Push err="rpc error: code = Code(429) desc = entry with timestamp 2023-08-30 14:57:32.012778399 +0000 UTC ignored, reason: 'Per stream rate limit exceeded (limit: 3MB/sec) while attempting to ingest for stream

Procedure

  • Update the ingestionBurstSize and ingestionRate fields in the LokiStack CR:

    apiVersion: loki.grafana.com/v1
    kind: LokiStack
    metadata:
      name: logging-loki
      namespace: openshift-logging
    spec:
      limits:
        global:
          ingestion:
            ingestionBurstSize: 16 (1)
            ingestionRate: 8 (2)
    # ...
    1 The ingestionBurstSize field defines the maximum local rate-limited sample size per distributor replica in MB. This value is a hard limit. Set this value to at least the maximum logs size expected in a single push request. Single requests that are larger than the ingestionBurstSize value are not permitted.
    2 The ingestionRate field is a soft limit on the maximum amount of ingested samples per second in MB. Rate limit errors occur if the rate of logs exceeds the limit, but the collector retries sending the logs. As long as the total average is lower than the limit, the system recovers and errors are resolved without user intervention.
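
    After updating the fields, apply the modified LokiStack CR so that the change takes effect; the file name is a placeholder:

    $ oc apply -f <filename>.yaml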
