Loggie’s Monitoring and Alarming

Loggie’s monitor eventbus follows a publish/subscribe model. Each component sends its metrics to a specific topic, and an independent listener consumes and processes them.

For example, the file source gathers metrics about the logs it collects and sends them to the filesource topic. The filesource listener aggregates and computes these metrics, then prints them to the log and exposes them as Prometheus metrics.

Components, topics, and listeners are loosely coupled. For example, the file source also periodically sends metrics from a full scan of all matching files to the filewatcher topic, where the filewatcher listener processes and exposes them.

Monitor Configuration

The monitor eventbus is configured in the global system configuration. For example:

loggie:
  monitor:
    logger:
      period: 30s
      enabled: true
    listeners:
      filesource: ~
      filewatcher: ~
      reload: ~
      queue: ~
      sink: ~
  http:
    enabled: true
    port: 9196

logger controls how metrics are printed to the log. Metrics generated by the enabled listeners are aggregated and printed to the Loggie log at a fixed interval (set by period), which helps with backtracking and troubleshooting.

listeners specifies which listeners are enabled.

Prometheus metrics are exposed at /metrics on http.port by default. You can curl <podIp>:9196/metrics to view the current metrics.
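
The /metrics output can be long, so it can help to filter it down to the sample lines of interest. The following is a minimal sketch, not part of Loggie: filter_metrics is a hypothetical helper that drops the Prometheus comment lines (# HELP and # TYPE) and keeps only samples whose name starts with a given prefix. The loggie_ prefix in the usage comment is an assumption; check the metric names in your own output.

```shell
# filter_metrics PREFIX
# Hypothetical helper: reads Prometheus text-format metrics on stdin and
# prints only sample lines whose metric name starts with PREFIX.
filter_metrics() {
  prefix="$1"
  # Drop comment lines (# HELP / # TYPE), then keep lines starting with PREFIX.
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -v '^#' | grep "^${prefix}" || true
}

# Usage (replace <podIp> with the Pod IP; the prefix is an assumption):
#   curl -s <podIp>:9196/metrics | filter_metrics loggie_
```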

Core Indicators of Log Collection

Currently there are the following listeners:

filesource: Metrics about current log collection, such as which files are being collected and their collection status.
filewatcher: Periodically traverses all files matching the path (every 5 minutes by default) to monitor the global collection status and detect files that have not been collected in time.
reload: Number of reloads.
queue: Queue status.
sink: Sending metrics, such as the number of successful and failed sends.

Deploy Prometheus and Grafana

You can use an existing Prometheus and Grafana deployment. If you need a new deployment, refer to https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack.

Deploy with Helm:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack -nprometheus --create-namespace

Note

Some of the k8s.gcr.io images used by the chart may not be downloadable in your environment. In that case, download the chart package, replace the affected images, and redeploy.

After confirming that the Pods are running normally, you can access Grafana. One way to reach it is through a port-forward proxy:

kubectl -nprometheus port-forward --address 0.0.0.0 service/prometheus-grafana 8181:80

The Grafana username and password are stored in the prometheus-grafana Secret. Decode the values with base64 -d.
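
Reading and decoding a Secret field can be done in one step. The following is a minimal sketch: grafana_secret_field is a hypothetical helper, and the key names admin-user and admin-password are assumptions based on the Grafana chart defaults. Run kubectl -nprometheus get secret prometheus-grafana -o yaml to confirm the keys in your cluster.

```shell
# grafana_secret_field KEY
# Hypothetical helper: prints the decoded value of KEY from the
# prometheus-grafana Secret in the prometheus namespace.
grafana_secret_field() {
  kubectl -nprometheus get secret prometheus-grafana \
    -o "jsonpath={.data.$1}" | base64 -d
}

# Usage (key names are assumptions; verify them in your cluster):
#   grafana_secret_field admin-user
#   grafana_secret_field admin-password
```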

Add Loggie Prometheus Monitoring

In the Kubernetes cluster where Loggie is deployed, create the following ServiceMonitor to allow Prometheus to collect Loggie Agent metrics.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: loggie
    release: prometheus
  name: loggie-agent
  namespace: prometheus
spec:
  namespaceSelector:
    matchNames:
    - loggie
  endpoints:
  - port: monitor
  selector:
    matchLabels:
      app: loggie
      instance: loggie
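
This ServiceMonitor only discovers targets if a Service in the loggie namespace carries the labels app=loggie and instance=loggie and exposes a port named monitor. The following sketch lists the matching Services and their port names so that this can be checked; loggie_service_ports is a hypothetical helper name, and the namespace and labels are taken from the manifest above.

```shell
# loggie_service_ports
# Hypothetical helper: lists the Services matched by the ServiceMonitor
# selector, one per line, followed by the names of their ports.
loggie_service_ports() {
  kubectl -n loggie get service -l app=loggie,instance=loggie \
    -o 'jsonpath={range .items[*]}{.metadata.name}{"\t"}{.spec.ports[*].name}{"\n"}{end}'
}

# Usage: every printed line should include a port named "monitor".
#   loggie_service_ports
```

If nothing is printed, or no port is named monitor, Prometheus has no endpoint to scrape for this ServiceMonitor.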

In addition, import the dashboard JSON from the install project into Grafana to display Loggie’s monitoring console.

Note

Your Kubernetes and Grafana versions may differ from those the dashboard was built for, so some charts may not display correctly. Adjust the dashboard as needed.

The imported Grafana dashboard is shown below:
