Common Table Expression (CTE)

CTEs are similar to Views in that they help you reduce the complexity of your queries, break down long and complex SQL statements, and improve readability and reusability.

You already read a CTE in the quickstart document.

What is a Common Table Expression (CTE)?

A Common Table Expression (CTE) is a temporary result set that you can reference within a SELECT, INSERT, UPDATE, or DELETE statement. CTEs help to break down complex queries into more readable parts and can be referenced multiple times within the same query.

Basic syntax of CTE

CTEs are typically defined using the WITH keyword. The basic syntax is as follows:

  1. WITH cte_name [(column1, column2, ...)] AS (
  2. QUERY
  3. )
  4. SELECT ...
  5. FROM cte_name;

Example explanation

Next, we’ll go through a complete example that demonstrates how to use CTEs, including data preparation, CTE creation, and usage.

Step 1: Create example data

Let’s assume we have the following two tables:

  • grpc_latencies: Contains gRPC request latency data.
  • app_logs: Contains application log information.
  1. CREATE TABLE grpc_latencies (
  2. ts TIMESTAMP TIME INDEX,
  3. host VARCHAR(255),
  4. latency FLOAT,
  5. PRIMARY KEY(host),
  6. );
  7. INSERT INTO grpc_latencies VALUES
  8. ('2023-10-01 10:00:00', 'host1', 120),
  9. ('2023-10-01 10:00:00', 'host2', 150),
  10. ('2023-10-01 10:00:05', 'host1', 130);
  11. CREATE TABLE app_logs (
  12. ts TIMESTAMP TIME INDEX,
  13. host VARCHAR(255),
  14. log TEXT,
  15. log_level VARCHAR(50),
  16. PRIMARY KEY(host, log_level),
  17. );
  18. INSERT INTO app_logs VALUES
  19. ('2023-10-01 10:00:00', 'host1', 'Error on service', 'ERROR'),
  20. ('2023-10-01 10:00:00', 'host2', 'All services OK', 'INFO'),
  21. ('2023-10-01 10:00:05', 'host1', 'Error connecting to DB', 'ERROR');

Step 2: Define and use CTEs

We will create two CTEs to calculate the 95th percentile latency and the number of error logs, respectively.

  1. WITH
  2. metrics AS (
  3. SELECT
  4. ts,
  5. host,
  6. approx_percentile_cont(latency, 0.95) RANGE '5s' AS p95_latency
  7. FROM
  8. grpc_latencies
  9. ALIGN '5s' FILL PREV
  10. ),
  11. logs AS (
  12. SELECT
  13. ts,
  14. host,
  15. COUNT(log) RANGE '5s' AS num_errors
  16. FROM
  17. app_logs
  18. WHERE
  19. log_level = 'ERROR'
  20. ALIGN '5s' BY (HOST)
  21. )
  22. SELECT
  23. metrics.ts,
  24. metrics.host,
  25. metrics.p95_latency,
  26. COALESCE(logs.num_errors, 0) AS num_errors
  27. FROM
  28. metrics
  29. LEFT JOIN logs ON metrics.host = logs.host AND metrics.ts = logs.ts
  30. ORDER BY
  31. metrics.ts;

Output:

  1. +---------------------+-------+-------------+------------+
  2. | ts | host | p95_latency | num_errors |
  3. +---------------------+-------+-------------+------------+
  4. | 2023-10-01 10:00:00 | host2 | 150 | 0 |
  5. | 2023-10-01 10:00:00 | host1 | 120 | 1 |
  6. | 2023-10-01 10:00:05 | host1 | 130 | 1 |
  7. +---------------------+-------+-------------+------------+

Detailed explanation

  1. Define CTEs:
  • metrics:

    1. WITH
    2. metrics AS (
    3. SELECT
    4. ts,
    5. host,
    6. approx_percentile_cont(latency, 0.95) RANGE '5s' AS p95_latency
    7. FROM
    8. grpc_latencies
    9. ALIGN '5s' FILL PREV
    10. ),

    Here we use a range query to calculate the 95th percentile latency for each host within every 5-second window.

  • logs:

    1. logs AS (
    2. SELECT
    3. ts,
    4. host,
    5. COUNT(log) RANGE '5s' AS num_errors
    6. FROM
    7. app_logs
    8. WHERE
    9. log_level = 'ERROR'
    10. ALIGN '5s' BY (HOST)
    11. )

    Similarly, we calculate the number of error logs for each host within every 5-second window.

  1. Use CTEs: The final query part:

    1. SELECT
    2. metrics.ts,
    3. metrics.host,
    4. metrics.p95_latency,
    5. COALESCE(logs.num_errors, 0) AS num_errors
    6. FROM
    7. metrics
    8. LEFT JOIN logs ON metrics.host = logs.host AND metrics.ts = logs.ts
    9. ORDER BY
    10. metrics.ts;

    We perform a left join on the two CTE result sets to get a comprehensive analysis result.

Summary

With CTEs, you can break down complex SQL queries into more manageable and understandable parts. In this example, we created two CTEs to calculate the 95th percentile latency and the number of error logs separately and then merged them into the final query for analysis. Read more WITH.