Query Logs

This document provides a guide on how to use GreptimeDB’s query language for effective searching and analysis of log data.

Overview

GreptimeDB allows for flexible querying of data using SQL statements. This section introduces specific search functions and query statements designed to enhance your log querying capabilities.

Full-Text Search Using the MATCHES Function

In SQL statements, you can use the MATCHES function to perform full-text searches, which is especially useful for log analysis. The MATCHES function supports full-text searches on String type columns. Here’s an example of how it can be used:

  1. SELECT * FROM logs WHERE MATCHES(message, 'error OR fail');

The MATCHES function is designed for full-text search and accepts two parameters:

  • column_name: The column to perform the full-text search on, which should contain textual data of type String. The full-text index must be created on this column to optimize queries.
  • search_query: A string containing query statement which you want to search for. See the Query Statements section below for more details.

Query Statements

Simple Term

Simple term searches are straightforward:

  1. SELECT * FROM logs WHERE MATCHES(message, 'Barack Obama');

The value Barack Obama in the search_query parameter of the MATCHES function will be considered as two separate terms: Barack and Obama. This means the query will match all rows containing either Barack or Obama, equivalent to using OR:

  1. SELECT * FROM logs WHERE MATCHES(message, 'Barack OR Obama');

Negative Term

Prefixing a term with - excludes rows containing that term. For instance, to find rows containing apple but not fruit:

  1. SELECT * FROM logs WHERE MATCHES(message, 'apple -fruit');

Must Term

Prefixing a term with + specifies that it must be included in the results. For example, to query rows containing both apple and fruit:

  1. SELECT * FROM logs WHERE MATCHES(message, '+apple +fruit');

Boolean Operators

Boolean operators can specify logical conditions for the search. For example, the AND operator requires all specified terms to be included, while the OR operator requires at least one term to be included. The AND operator takes precedence over OR, so the expression a AND b OR c is interpreted as (a AND b) OR c. For example:

  1. SELECT * FROM logs WHERE MATCHES(message, 'a AND b OR c');

This matches rows containing both a and b, or rows containing c. Equivalent to:

  1. SELECT * FROM logs WHERE MATCHES(message, '(+a +b) c');

Phrase Term

A phrase term is enclosed within quotes " " and matches the exact sequence of words. For example, to match rows containing Barack followed directly by Obama:

  1. SELECT * FROM logs WHERE MATCHES(message, '"Barack Obama"');

To include quotes within a phrase, use a backslash \ to escape them:

  1. SELECT * FROM logs WHERE MATCHES(message, '"He said \"hello\""');

A full-text index is essential for full-text search, especially when dealing with large datasets. Without a full-text index, the search operation could be very slow, impacting the overall query performance and user experience. You can configure a full-text index either directly via SQL during table creation or through the Pipeline configuration, ensuring that search operations are performed efficiently, even with significant data volumes.

Creating Full-Text Index via SQL

You can create a full-text index for a column by specifying the FULLTEXT option in the column definition. Below is an example of creating a table with a full-text index on the message column:

  1. CREATE TABLE `logs` (
  2. `message` STRING FULLTEXT,
  3. `time` TIMESTAMP TIME INDEX,
  4. ) WITH (
  5. append_mode = 'true'
  6. );

For more details, see the Fulltext Column Option.

Configuring Full-Text Index via Pipeline

In the Pipeline configuration, you can specify a column to use a full-text index. Below is a configuration example where the message column is set with a full-text index:

  1. processors:
  2. - date:
  3. field: time
  4. formats:
  5. - "%Y-%m-%d %H:%M:%S%.3f"
  6. ignore_missing: true
  7. transform:
  8. - field: message
  9. type: string
  10. index: fulltext
  11. - field: time
  12. type: time
  13. index: timestamp

Viewing Table Schema

After data is written, you can use an SQL statement to view the table schema and confirm that the message column is set for full-text indexing:

  1. SHOW CREATE TABLE logs\G
  2. *************************** 1. row ***************************
  3. Table: logs
  4. Create Table: CREATE TABLE IF NOT EXISTS `logs` (
  5. `message` STRING NULL FULLTEXT WITH(analyzer = 'English', case_sensitive = 'false'),
  6. `time` TIMESTAMP(9) NOT NULL,
  7. TIME INDEX (`time`),
  8. )
  9. ENGINE=mito
  10. WITH(
  11. append_mode = 'true'
  12. )