Overview of Loki

Grafana Loki is a set of components that can be composed into a fully featured logging stack.

Unlike other logging systems, Loki is built around the idea of only indexing labels for logs and leaving the original log message unindexed. This means that Loki is cheaper to operate and can be orders of magnitude more efficient.

For a more detailed version of this same document, please read Architecture.

Multi Tenancy

Loki supports multi-tenancy so that data between tenants is completely separated. Multi-tenancy is achieved through a tenant ID (which is represented as an alphanumeric string). When multi-tenancy mode is disabled, all requests are internally given a tenant ID of “fake”.
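
In practice, the tenant ID travels with every request as the X-Scope-OrgID HTTP header. Below is a minimal Go sketch, assuming a local Loki listening on the default port 3100 and an empty placeholder payload (the exact push format is documented in the Loki API reference):

```go
package main

import (
	"bytes"
	"log"
	"net/http"
)

func main() {
	// The tenant is identified by the X-Scope-OrgID header on every request.
	// Endpoint and payload here are illustrative placeholders.
	body := bytes.NewBufferString(`{"streams": []}`)
	req, err := http.NewRequest("POST", "http://localhost:3100/loki/api/v1/push", body)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("X-Scope-OrgID", "tenant-a") // omitted when multi-tenancy is disabled
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
}
```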

Modes of Operation

Loki is optimized for both running locally (or at small scale) and for scaling horizontally: Loki comes with a single process mode that runs all of the required microservices in one process. The single process mode is great for testing Loki or for running it at a small scale. For horizontal scalability, the microservices of Loki can be broken out into separate processes, allowing them to scale independently of each other.

Components

Distributor

The distributor service is responsible for handling logs written by clients. It’s essentially the “first stop” in the write path for log data. Once the distributor receives log data, it splits the data into batches and sends them to multiple ingesters in parallel.

Distributors communicate with ingesters via gRPC. They are stateless and can be scaled up and down as needed.
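
A rough Go sketch of this fan-out, assuming an illustrative IngesterClient interface and request type rather than Loki's actual protobuf definitions:

```go
package distributor

import (
	"context"
	"sync"
)

// Illustrative types; Loki's real gRPC messages differ.
type PushRequest struct {
	Streams []string // batched log streams destined for one ingester
}

type IngesterClient interface {
	Push(ctx context.Context, req *PushRequest) error // gRPC call
}

// fanOut sends one pre-built batch to each target ingester in parallel and
// returns the first error encountered, if any.
func fanOut(ctx context.Context, batches map[IngesterClient]*PushRequest) error {
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	for client, batch := range batches {
		wg.Add(1)
		go func(c IngesterClient, b *PushRequest) {
			defer wg.Done()
			if err := c.Push(ctx, b); err != nil {
				mu.Lock()
				if firstErr == nil {
					firstErr = err
				}
				mu.Unlock()
			}
		}(client, batch)
	}
	wg.Wait()
	return firstErr
}
```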

Hashing

Distributors use consistent hashing in conjunction with a configurable replication factor to determine which instances of the ingester service should receive log data.

The hash is based on a combination of the log’s labels and the tenant ID.

A hash ring stored in Consul is used to achieve consistent hashing; all ingesters register themselves into the hash ring with a set of tokens they own. Distributors then find the token that most closely matches the value of the log’s hash and send data to that token’s owner.
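
A minimal Go sketch of the token lookup, assuming FNV as an illustrative hash function and a token list kept sorted by value:

```go
package ring

import (
	"hash/fnv"
	"sort"
)

type Token struct {
	Value uint32
	Owner string // the ingester that registered this token
}

// streamHash combines the tenant ID and the stream's label set, as the
// distributor does before consulting the ring (hash choice is illustrative).
func streamHash(tenantID, labels string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(tenantID))
	h.Write([]byte(labels))
	return h.Sum32()
}

// lookup returns the owner of the first token at or after the hash,
// wrapping around to the start of the ring if necessary.
func lookup(tokens []Token, hash uint32) string {
	i := sort.Search(len(tokens), func(i int) bool { return tokens[i].Value >= hash })
	if i == len(tokens) {
		i = 0 // wrap around the ring
	}
	return tokens[i].Owner
}
```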

Quorum consistency

Since all distributors share access to the same hash ring, write requests can be sent to any distributor.

To ensure consistent query results, Loki uses Dynamo-style quorum consistency on reads and writes. This means that the distributor will wait for a positive response from at least one half plus one of the ingesters it sends a sample to before responding to the user.
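
For a replication factor of n, the quorum is ⌊n/2⌋ + 1 (e.g. 2 of 3). A small Go sketch of the arithmetic and the wait loop, with the per-ingester result channel as an assumed interface:

```go
package distributor

// quorum returns how many positive replies are required for a replication
// factor n: one half plus one (e.g. 2 of 3).
func quorum(n int) int {
	return n/2 + 1
}

// waitForQuorum reads one result per replica and succeeds as soon as a
// quorum of ingesters has responded positively, failing as soon as too many
// have errored for a quorum to still be possible.
func waitForQuorum(results <-chan error, n int) error {
	needed := quorum(n)
	allowedFailures := n - needed
	var lastErr error
	for i := 0; i < n; i++ {
		err := <-results
		if err == nil {
			needed--
			if needed == 0 {
				return nil // quorum reached
			}
			continue
		}
		lastErr = err
		allowedFailures--
		if allowedFailures < 0 {
			return lastErr // quorum no longer reachable
		}
	}
	return lastErr // unreachable for n >= 1, kept for the compiler
}
```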

Ingester

The ingester service is responsible for writing log data to long-term storage backends (DynamoDB, S3, Cassandra, etc.).

The ingester validates that ingested log lines are received in timestamp-ascending order (i.e., each log has a timestamp that occurs at a later time than the log before it). When the ingester receives a log that does not follow this order, the log line is rejected and an error is returned.
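
A minimal Go sketch of that check, assuming one in-memory stream object per unique label set; the names are illustrative:

```go
package ingester

import (
	"errors"
	"time"
)

var errOutOfOrder = errors.New("entry out of order")

// stream holds in-memory state for one unique label set.
type stream struct {
	lastTimestamp time.Time
}

// append rejects any entry whose timestamp goes backwards in time,
// returning an error to the caller instead of buffering the line.
func (s *stream) append(ts time.Time, line string) error {
	if ts.Before(s.lastTimestamp) {
		return errOutOfOrder
	}
	s.lastTimestamp = ts
	// ... buffer (ts, line) into the stream's current chunk ...
	return nil
}
```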

Logs from each unique set of labels are built up into “chunks” in memory and then flushed to the backing storage backend.
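
An illustrative Go sketch of the per-label-set chunk map; the fingerprint type and the fixed "chunk full" threshold are assumptions, and a real ingester also flushes chunks on age and idleness:

```go
package ingester

type fingerprint uint64 // hash of a unique label set (assumed)

type chunk struct {
	lines []string
}

// instance accumulates one in-memory chunk per unique label set.
type instance struct {
	chunks map[fingerprint]*chunk
}

func (i *instance) append(fp fingerprint, line string) {
	c, ok := i.chunks[fp]
	if !ok {
		c = &chunk{}
		i.chunks[fp] = c
	}
	c.lines = append(c.lines, line)
	if len(c.lines) >= 4096 { // assumed "chunk full" threshold
		i.flush(fp, c)
	}
}

func (i *instance) flush(fp fingerprint, c *chunk) {
	// ... serialize the chunk and write it to the storage backend ...
	delete(i.chunks, fp)
}
```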

If an ingester process crashes or exits abruptly, all the data that has not yet been flushed will be lost. Loki is usually configured to hold multiple replicas (usually 3) of each log to mitigate this risk.

Handoff

By default, when an ingester is shutting down and tries to leave the hash ring, it will wait to see if a new ingester tries to enter before flushing, and will try to initiate a handoff. The handoff transfers all of the tokens and in-memory chunks owned by the leaving ingester to the new ingester.

This process is used to avoid flushing all chunks on shutdown, which is a slow process.
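
A condensed Go sketch of that shutdown decision; the lifecycler type and its three hooks are hypothetical stand-ins for the ring watch, the gRPC transfer, and the flush path:

```go
package ingester

import "context"

// Hypothetical hooks; a real ingester talks to the ring and to its peer.
type lifecycler struct {
	waitForPendingIngester func(context.Context) (string, error) // ring watch
	transferTo             func(context.Context, string) error   // gRPC handoff
	flushAll               func(context.Context) error           // write chunks to storage
}

// shutdown prefers handing tokens and in-memory chunks to a joining
// ingester (fast) over flushing everything to the backing store (slow).
func (l *lifecycler) shutdown(ctx context.Context) error {
	// Fast path: a new ingester entered the ring, hand everything to it.
	if target, err := l.waitForPendingIngester(ctx); err == nil {
		return l.transferTo(ctx, target)
	}
	// Slow path: nobody showed up in time, flush all chunks to storage.
	return l.flushAll(ctx)
}
```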

Filesystem Support

While ingesters do support writing to the filesystem through BoltDB, this only works in single-process mode, as queriers need access to the same back-end store and BoltDB only allows one process to hold a lock on the DB at a given time.

Querier

The querier service handles the actual LogQL evaluation of logs stored in long-term storage.

It first tries to query all ingesters for in-memory data before falling back to loading data from the backend store.
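
A simplified Go sketch of that read path; the Backend interface is an assumption used to treat ingesters and the chunk store uniformly, and deduplication is elided:

```go
package querier

import "context"

type Entry struct {
	Timestamp int64
	Line      string
}

// Backend abstracts anything a query can be evaluated against; here both
// the ingesters (recent, unflushed data) and the chunk store implement it.
type Backend interface {
	Query(ctx context.Context, query string) ([]Entry, error)
}

// run queries ingesters first for in-memory data, then the long-term store,
// and concatenates the results; a real querier also deduplicates overlap
// between chunks that were flushed but are still held in memory.
func run(ctx context.Context, ingesters, store Backend, query string) ([]Entry, error) {
	recent, err := ingesters.Query(ctx, query)
	if err != nil {
		return nil, err
	}
	flushed, err := store.Query(ctx, query)
	if err != nil {
		return nil, err
	}
	return append(recent, flushed...), nil
}
```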

Chunk Store

The chunk store is Loki’s long-term data store, designed to support interactive querying and sustained writing without the need for background maintenance tasks. It consists of:

  • An index for the chunks. This index can be backed by DynamoDB from Amazon Web Services, Bigtable from Google Cloud Platform, or Apache Cassandra.
  • A key-value (KV) store for the chunk data itself, which can be DynamoDB, Bigtable, Cassandra again, or an object store such as Amazon S3.

Unlike the other core components of Loki, the chunk store is not a separate service, job, or process, but rather a library embedded in the two services that need to access Loki data: the ingester and querier.

The chunk store relies on a unified interface to the “NoSQL” stores (DynamoDB, Bigtable, and Cassandra) that can be used to back the chunk store index. This interface assumes that the index is a collection of entries keyed by:

  • A hash key. This is required for all reads and writes.
  • A range key. This is required for writes and can be omitted for reads, which can be queried by prefix or range.
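
A hypothetical Go rendering of that contract (Loki's actual interface differs; this only captures the hash/range key model):

```go
package index

import "context"

// Entry is one index row in the unified model described above.
type Entry struct {
	HashKey  string // required for all reads and writes
	RangeKey []byte // required for writes; reads may query by prefix or range
	Value    []byte
}

// Client is an illustrative stand-in for the unified index interface.
type Client interface {
	// Write stores a batch of index entries.
	Write(ctx context.Context, entries []Entry) error
	// QueryPrefix returns every entry under hashKey whose range key begins
	// with rangePrefix; an empty prefix returns all entries for the hash key.
	QueryPrefix(ctx context.Context, hashKey string, rangePrefix []byte) ([]Entry, error)
}
```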

The interface works somewhat differently across the supported databases:

  • DynamoDB supports range and hash keys natively. Index entries are thus modelled directly as DynamoDB entries, with the hash key as the distribution key and the range as the range key.
  • For Bigtable and Cassandra, index entries are modelled as individual column values. The hash key becomes the row key and the range key becomes the column key.

A set of schemas are used to map the matchers and label sets used on reads and writes to the chunk store into appropriate operations on the index. Schemas have been added as Loki has evolved, mainly in an attempt to better load balance writes and improve query performance.

The current schema recommendation is the v10 schema.
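
For illustration, the schema version is chosen per time range in Loki's schema_config block; the snippet below is a sketch with placeholder dates, stores, and table prefixes, not a drop-in configuration:

```yaml
schema_config:
  configs:
    - from: 2020-01-01      # date this schema takes effect (placeholder)
      store: cassandra      # index backend: cassandra, bigtable, or dynamodb
      object_store: s3      # where the chunk data itself is written
      schema: v10           # the recommended schema version
      index:
        prefix: loki_index_ # index table name prefix (placeholder)
        period: 168h        # start a new index table weekly
```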