Log format and Meta Information Fields

It is recommended to first understand Loggie's internal log data schema design.

Loggie is deployed in different environments. If you need to add meta information to the original log data and want it to be compatible with existing formats, you can refer to the following methods.

Field Format Conversion

Use Schema Interceptor

Use the schema interceptor to add a time field, as well as pipelineName and sourceName fields. It can also rename fields, for example renaming body to message. Refer to the schema interceptor documentation.

In most cases, we need the configuration to take effect globally rather than adding the interceptor in a single pipeline. Therefore, it is recommended to add the schema interceptor to the defaults section of the system configuration, which avoids having to configure the interceptor for every pipeline.

loggie.yml

```yaml
loggie:
  defaults:
    interceptors:
      - type: schema
        name: global
        order: 700
        addMeta:
          timestamp:
            key: "@timestamp"
        remap:
          body:
            key: message
```

The name here identifies the interceptor, so that configuration validation does not fail when another schema interceptor is added in a pipeline. In addition, set the order field to a value smaller than the default of 900, so that the interceptor in defaults is executed before the other interceptors defined in the pipeline.
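To illustrate how the two can coexist, a pipeline may define its own schema interceptor as long as its name differs from the global one. This is a sketch only; the pipeline name `local` and the interceptor name `perPipeline` are illustrative, and the addMeta options shown should be checked against the schema interceptor documentation for your Loggie version:

```yaml
pipelines:
  - name: local
    interceptors:
      - type: schema
        name: perPipeline   # differs from "global", so validation passes
        addMeta:
          pipelineName:
            key: pipeline
```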

Use Transformer Interceptor

Transformer provides richer functionality and can handle complex log scenarios. Please refer to the transformer interceptor documentation.
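As a rough sketch of what a transformer configuration looks like (the action names `del` and `add` follow the transformer interceptor documentation, but availability and syntax may vary between Loggie versions, so treat this as an assumption to verify):

```yaml
interceptors:
  - type: transformer
    actions:
      # drop a field we do not want to send downstream
      - action: del(fields.topic)
      # add a constant key/value pair
      - action: add(env, production)
```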

Add Meta Information

Add Custom Meta Information to Fields

Suppose we configure some custom fields on the source:

pipelines.yml

```yaml
pipelines:
  - name: local
    sources:
      - type: file
        name: demo
        paths:
          - /tmp/log/*.log
        fields:
          topic: "loggie"
    sink:
      type: dev
      printEvents: true
      codec:
        pretty: true
```

Then the sink output is:

```json
{
  "fields": {
    "topic": "loggie"
  },
  "body": "01-Dec-2021 03:13:58.298 INFO [main] Starting service [Catalina]"
}
```

We can also configure fieldsUnderRoot: true, so that these key-value pairs are placed at the same level as body.

pipelines.yml

```yaml
pipelines:
  - name: local
    sources:
      - type: file
        fields:
          topic: "loggie"
        fieldsUnderRoot: true
        ...
```

The sink output then becomes:

```json
{
  "topic": "loggie",
  "body": "01-Dec-2021 03:13:58.298 INFO [main] Starting service [Catalina]"
}
```

Add Log Collection Status Information of File Source

When using the file source, we may want to automatically add some log collection state to the raw data, such as the name of the collected file, the current offset within it, and so on. The file source provides an addonMeta configuration that can quickly enable this.

Example: add addonMeta: true.

file source

```yaml
sources:
  - type: file
    paths:
      - /var/log/*.log
    addonMeta: true
```

At this point, the collected events will become similar to the following:

Example

```json
{
  "body": "this is test",
  "state": {
    "pipeline": "local",
    "source": "demo",
    "filename": "/var/log/a.log",
    "timestamp": "2006-01-02T15:04:05.000Z",
    "offset": 1024,
    "bytes": 4096,
    "hostname": "node-1"
  }
}
```

For the meaning of each field, please refer to the file source documentation.

Add Kubernetes Meta Information

In Kubernetes, in order to query collected container logs by namespace, podName, and other information, it is often necessary to attach the relevant metadata. We can configure additional k8s fields in discovery.kubernetes of the system configuration.

See discovery
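For instance, a system configuration adding namespace and pod metadata might look like the following. This is a sketch based on the `k8sFields` option and `${_k8s.*}` variables described in the Loggie discovery documentation; verify the exact field names against your version:

```yaml
loggie:
  discovery:
    enabled: true
    kubernetes:
      k8sFields:
        # each value is a template resolved from pod metadata
        namespace: "${_k8s.pod.namespace}"
        podname: "${_k8s.pod.name}"
        nodename: "${_k8s.node.name}"
        containername: "${_k8s.pod.container.name}"
```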

Add System Built-in Meta Information

The Loggie system has some built-in meta information that we may also want to send downstream. In this case, we need the addMeta processor in the normalize interceptor. (Note that this operation has a certain impact on collection and transmission performance, so it is generally not recommended.)

pipelines.yml

```yaml
pipelines:
  - name: local
    sources:
      - type: file
        name: demo
        paths:
          - /tmp/log/*.log
        fields:
          topic: "loggie"
    interceptors:
      - type: normalize
        processors:
          - addMeta: ~
    sink:
      type: dev
      printEvents: true
      codec:
        pretty: true
```

After the addMeta processor is configured, all of the system's built-in meta information is output by default.

An example of the default JSON output is as follows:

Example

```json
{
  "fields": {
    "topic": "loggie"
  },
  "meta": {
    "systemState": {
      "nextOffset": 720,
      "filename": "/tmp/log/a.log",
      "collectTime": "2022-03-08T11:33:47.369813+08:00",
      "contentBytes": 90,
      "jobUid": "43772050-16777231",
      "lineNumber": 8,
      "offset": 630
    },
    "systemProductTime": "2022-03-08T11:33:47.370166+08:00",
    "systemPipelineName": "local",
    "systemSourceName": "demo"
  },
  "body": "01-Dec-2021 03:13:58.298 INFO [main] Starting service [Catalina]"
}
```

If this is more data than we need, or we want to modify the fields, we can use the actions in the transformer interceptor.
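For example, we could keep only the collected filename and drop the rest of the meta object. The `copy` and `del` action names here are assumptions based on the transformer interceptor documentation and may differ by Loggie version:

```yaml
interceptors:
  - type: transformer
    actions:
      # promote the collected filename to a top-level field
      - action: copy(meta.systemState.filename, filename)
      # drop the bulky meta object before sending downstream
      - action: del(meta)
```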