Tail

The tail input plugin lets you monitor one or several text files. It behaves much like the tail -f shell command.

The plugin reads every file matched by the Path pattern and generates a new record for every new line found (lines are separated by \n). Optionally, a database file can be used so the plugin keeps a history of tracked files and their offsets. This is very useful for resuming from the previous state if the service is restarted.

Configuration Parameters

The plugin supports the following configuration parameters:

| Key | Description | Default |
| --- | --- | --- |
| Buffer_Chunk_Size | Set the initial buffer size used to read file data. This value is also used to increase the buffer size. The value must follow the Unit Size specification. | 32k |
| Buffer_Max_Size | Set the limit of the buffer size per monitored file. When a buffer needs to grow (e.g. for very long lines), this value restricts how much the memory buffer can grow. If reading a file exceeds this limit, the file is removed from the monitored file list. The value must follow the Unit Size specification. | Buffer_Chunk_Size |
| Path | Pattern specifying a specific log file or multiple ones through the use of common wildcards. Multiple patterns separated by commas are also allowed. | |
| Path_Key | If enabled, appends the name of the monitored file to the record. The value assigned becomes the key in the map. | |
| Exclude_Path | Set one or multiple shell patterns, separated by commas, to exclude files matching certain criteria, e.g. Exclude_Path *.gz,*.zip | |
| Read_from_Head | For files newly discovered on start (without a database offset/position), read the content from the head of the file rather than the tail. | Off |
| Refresh_Interval | The interval, in seconds, at which the list of watched files is refreshed. | 60 |
| Rotate_Wait | Specify the extra time, in seconds, to keep monitoring a file after it is rotated, in case some pending data still needs to be flushed. | 5 |
| Ignore_Older | Ignore records older than this time, in seconds. Supports the m, h, d (minutes, hours, days) syntax. The default behavior is to read all records from the specified files. Only available when a Parser is specified and it can parse the time of a record. | |
| Skip_Long_Lines | When a monitored file reaches its buffer capacity because of a very long line (Buffer_Max_Size), the default behavior is to stop monitoring that file. Skip_Long_Lines alters that behavior and instructs Fluent Bit to skip long lines and continue processing other lines that fit into the buffer. | Off |
| DB | Specify the database file used to keep track of monitored files and offsets. | |
| DB.sync | Set a default synchronization (I/O) method. Values: Extra, Full, Normal, Off. This flag affects how the internal SQLite engine synchronizes to disk; for more details about each option, refer to the SQLite documentation on the synchronous setting. Most workloads will be fine with normal mode, but if you really need full synchronization after every write operation, set full mode. Note that full has a high I/O performance cost. | normal |
| DB.locking | Specify that the database will be accessed only by Fluent Bit. Enabling this feature helps to increase performance when accessing the database, but it prevents any external tool from querying the content. | false |
| Mem_Buf_Limit | Set a limit on the memory the Tail plugin can use when appending data to the Engine. If the limit is reached, the plugin is paused; it resumes when the data is flushed. | |
| exit_on_eof | Exit Fluent Bit when reaching EOF of the monitored files. | false |
| Parser | Specify the name of a parser to interpret the entry as a structured message. | |
| Key | When a message is unstructured (no parser applied), it is appended as a string under the key name log. This option allows you to define an alternative name for that key. | log |
| Tag | Set a tag (with regex-extract fields) that will be placed on lines read, e.g. kube.<namespace_name>.<pod_name>.<container_name>. Note that “tag expansion” is supported: if the tag includes an asterisk (*), that asterisk will be replaced with the absolute path of the monitored file (also see Workflow of Tail + Kubernetes Filter). | |
| Tag_Regex | Set a regex to extract fields from the file, e.g. (?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)- | |
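
As an illustration of how Tag and Tag_Regex fit together, the following sketch reuses the two example values from the table. The Path value is an assumption (a common location for Kubernetes container logs), not something this page prescribes. Each named group captured by Tag_Regex is substituted into the placeholder of the same name in Tag.

```text
[INPUT]
    Name      tail
    # Hypothetical path; adjust it to where your container logs live
    Path      /var/log/containers/*.log
    # Placeholders in angle brackets are filled from the named groups below
    Tag       kube.<namespace_name>.<pod_name>.<container_name>
    Tag_Regex (?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-
```

Comments are placed on their own lines because the classic configuration format does not treat a trailing # as a comment.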

Note that if the database parameter DB is not specified, by default the plugin will start reading each target file from the beginning. This can also cause unwanted behaviour: for example, when a line is larger than Buffer_Chunk_Size and Skip_Long_Lines is not turned on, the file will be read from the beginning on every Refresh_Interval until the file is rotated.
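
One way to avoid the re-reading behaviour described in this note is to set DB and enable Skip_Long_Lines. The following is a minimal sketch that uses only parameters from the table; the paths and the Buffer_Max_Size value are illustrative assumptions, not recommended settings.

```text
[INPUT]
    Name              tail
    # Hypothetical log location
    Path              /var/log/app/*.log
    # Hypothetical database location; it must be writable by Fluent Bit
    DB                /var/lib/fluent-bit/tail.db
    # Default initial buffer size, shown explicitly
    Buffer_Chunk_Size 32k
    # Illustrative ceiling for the per-file buffer
    Buffer_Max_Size   256k
    # Skip lines that exceed Buffer_Max_Size instead of dropping the file
    Skip_Long_Lines   On
```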

Multiline Configuration Parameters

Additionally, the following options exist to configure the handling of multiline files:

| Key | Description | Default |
| --- | --- | --- |
| Multiline | If enabled, the plugin will try to discover multiline messages and use the proper parsers to compose the outgoing messages. Note that when this option is enabled the Parser option is not used. | Off |
| Multiline_Flush | Wait period, in seconds, to process queued multiline messages. | 4 |
| Parser_Firstline | Name of the parser that matches the beginning of a multiline message. Note that the regular expression defined in the parser must include a group name (named capture). | |
| Parser_N | Optional extra parser to interpret and structure multiline entries. This option can be used to define multiple parsers, e.g. Parser_1 ab1, Parser_2 ab2, Parser_N abN. | |

Docker Mode Configuration Parameters

Docker mode exists to recombine JSON log lines split by the Docker daemon due to its line length limit. To use this feature, configure the tail plugin with the corresponding parser and then enable Docker mode:

| Key | Description | Default |
| --- | --- | --- |
| Docker_Mode | If enabled, the plugin will recombine split Docker log lines before passing them to any parser as configured above. This mode cannot be used at the same time as Multiline. | Off |
| Docker_Mode_Flush | Wait period, in seconds, to flush queued unfinished split lines. | 4 |
| Docker_Mode_Parser | Specify an optional parser for the first line of the Docker multiline mode. The parser name must be registered in the parsers.conf file. | |
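
A minimal sketch of such a setup follows. It assumes that a JSON parser named docker is registered in parsers.conf and that the container logs live under /var/lib/docker/containers; both are assumptions to adapt to your environment.

```text
[INPUT]
    Name              tail
    # Assumed location of Docker's JSON log files
    Path              /var/lib/docker/containers/*/*.log
    # Assumed parser name; it must exist in parsers.conf
    Parser            docker
    # Recombine lines split by the Docker daemon before parsing
    Docker_Mode       On
    # Default flush period, shown explicitly
    Docker_Mode_Flush 4
```

Multiline is deliberately left out, since the two modes cannot be combined.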

Getting Started

In order to tail text or log files, you can run the plugin from the command line or through the configuration file:

Command Line

From the command line you can let Fluent Bit parse text files with the following options:

    $ fluent-bit -i tail -p path=/var/log/syslog -o stdout

Configuration File

In your main configuration file, append the following Input and Output sections:

    [INPUT]
        Name tail
        Path /var/log/syslog

    [OUTPUT]
        Name  stdout
        Match *


Multi-line example

When using a multi-line configuration, you first need to specify Multiline On in the configuration and use the Parser_Firstline parameter, plus additional Parser_N parameters if needed. Suppose we are trying to read the following Java stack trace as a single event:

    Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
        at com.myproject.module.MyProject.badMethod(MyProject.java:22)
        at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
        at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
        at com.myproject.module.MyProject.someMethod(MyProject.java:10)
        at com.myproject.module.MyProject.main(MyProject.java:6)

We need to specify a Parser_Firstline parameter that matches the first line of a multi-line event. Once a match is made, Fluent Bit reads all subsequent lines until another match with Parser_Firstline is made.

In the case above we can use the following parser, which extracts the timestamp as time and the remaining portion of the multiline event as message:

    [PARSER]
        Name        multiline
        Format      regex
        Regex       /(?<time>Dec \d+ \d+\:\d+\:\d+)(?<message>.*)/
        Time_Key    time
        Time_Format %b %d %H:%M:%S

If we want to further parse the entire event we can add additional parsers with Parser_N where N is an integer. The final Fluent Bit configuration looks like the following:

    # Note this is generally added to parsers.conf and referenced in [SERVICE]
    [PARSER]
        Name        multiline
        Format      regex
        Regex       /(?<time>Dec \d+ \d+\:\d+\:\d+)(?<message>.*)/
        Time_Key    time
        Time_Format %b %d %H:%M:%S

    [INPUT]
        Name             tail
        Multiline        On
        Parser_Firstline multiline
        Path             /var/log/java.log

    [OUTPUT]
        Name  stdout
        Match *

Our output will be as follows.

    [0] tail.0: [1607928428.466041977, {"message"=>"Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
        at com.myproject.module.MyProject.badMethod(MyProject.java:22)
        at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
        at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
        at com.myproject.module.MyProject.someMethod(MyProject.java:10)", "message"=>"at com.myproject.module.MyProject.main(MyProject.java:6)"}]

Tailing files keeping state

The tail input plugin has a feature to save the state of the tracked files, and it is strongly suggested that you enable it. For this purpose the db property is available, e.g.:

    $ fluent-bit -i tail -p path=/var/log/syslog -p db=/path/to/logs.db -o stdout
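
The same setting can be expressed in a configuration file. The sketch below mirrors the command line above; DB.sync is shown with its default value, and DB.locking is left at false so that external tools can still query the database.

```text
[INPUT]
    Name       tail
    Path       /var/log/syslog
    DB         /path/to/logs.db
    # Default synchronization mode, shown explicitly
    DB.sync    normal
    # Keep false if you want to inspect the database with external tools
    DB.locking false

[OUTPUT]
    Name  stdout
    Match *
```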

When running, the database file /path/to/logs.db will be created. This database is backed by SQLite3, so if you are interested in exploring its content, you can open it with the SQLite client tool, e.g.:

    $ sqlite3 tail.db
    -- Loading resources from /home/edsiper/.sqliterc
    SQLite version 3.14.1 2016-08-11 18:53:32
    Enter ".help" for usage hints.
    sqlite> SELECT * FROM in_tail_files;
    id     name                              offset        inode         created
    -----  --------------------------------  ------------  ------------  ----------
    1      /var/log/syslog                   73453145      23462108      1480371857
    sqlite>

Make sure to explore the database when Fluent Bit is not actively working on the file; otherwise you will see some Error: database is locked messages.

Formatting SQLite

By default, the SQLite client tool does not format the columns in a human-readable way. To explore the in_tail_files table, you can create a configuration file at ~/.sqliterc with the following content:

    .headers on
    .mode column
    .width 5 32 12 12 10

File Rotation

File rotation is properly handled, including logrotate’s copytruncate mode.

Note that the Path patterns must not match the rotated files. Otherwise, the rotated file would be read again, leading to duplicate records.
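
For example, assuming an application that writes app.log and rotates it to app.log.1, app.log.2.gz and so on (hypothetical names), a pattern ending in .log follows only the active file. This is a sketch under those assumptions, not a prescribed layout:

```text
[INPUT]
    Name         tail
    # Matches /var/log/app/app.log but not app.log.1 or app.log.2.gz
    Path         /var/log/app/*.log
    # Extra safety net for compressed archives
    Exclude_Path *.gz,*.zip
    # Default value; extra seconds to keep reading a rotated file
    Rotate_Wait  5
```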