SQL 方言兼容 - 图1提示

从 2.1 版本开始,Doris 可以支持多种 SQL 方言,如 Presto、Trino、Hive、PostgreSQL、Spark、Clickhouse 等等。通过这个功能,用户可以直接使用对应的 SQL 方言查询 Doris 中的数据,方便用户将原先的业务平滑的迁移到 Doris 中。

SQL 方言兼容 - 图2警告

  1. 该功能目前是实验性功能,您在使用过程中如遇到任何问题,欢迎通过邮件组、GitHub Issue 等方式进行反馈。

部署服务

1. 下载最新版本的 SQL 方言转换工具

  1. > 注:
  2. >
  3. > SQL 方言转换工具基于开源的 [SQLGlot](https://github.com/tobymao/sqlglot) 二次开发,关于 SQLGlot 可参阅 [SQLGlot 官网](https://sqlglot.com/sqlglot.html)
  1. 在任意 FE 节点,通过以下命令启动服务:

    sh bin/start.sh

    SQL 方言兼容 - 图3提示

    1. 该服务是一个无状态的服务,可随时启停。

    1. 默认启动端口是5001,可在conf/config.conf中配置指定端口。

    2. 建议在每个 FE 节点都单独启动一个服务。

  2. 启动 Doris 集群(2.1 或更高版本)

  3. 通过以下命令,在 Doris 中设置 SQL 方言转换服务的 URL:

    MySQL> set global sql_converter_service_url = "http://127.0.0.1:5001/api/v1/convert"

    SQL 方言兼容 - 图4提示

    127.0.0.1:5001 是 SQL 方言转换服务的部署节点 ip 和端口。

使用 SQL 方言

目前支持的方言类型包括:

  • presto

  • trino

  • hive

  • spark

  • postgres

  • clickhouse

示例:

Presto

  1. mysql> CREATE TABLE test_sqlconvert (
  2. id int,
  3. start_time DateTime,
  4. value String,
  5. arr_int ARRAY<Int>,
  6. arr_str ARRAY<String>
  7. ) ENGINE=OLAP
  8. DUPLICATE KEY(`id`)
  9. COMMENT 'OLAP'
  10. DISTRIBUTED BY HASH(`id`) BUCKETS 1
  11. PROPERTIES (
  12. "replication_allocation" = "tag.location.default: 1"
  13. );
  14. Query OK, 0 rows affected (0.01 sec)
  15. mysql> INSERT INTO test_sqlconvert values(1, '2024-05-20 13:14:52', '2024-01-14',[1, 2, 3, 3], ['Hello', 'World']);
  16. Query OK, 1 row affected (0.08 sec)
  17. mysql> set sql_dialect=presto;
  18. Query OK, 0 rows affected (0.00 sec)
  19. mysql> SELECT cast(start_time as varchar(20)) as col1,
  20. array_distinct(arr_int) as col2,
  21. FILTER(arr_str, x -> x LIKE '%World%') as col3,
  22. to_date(value,'%Y-%m-%d') as col4,
  23. YEAR(start_time) as col5,
  24. date_add('month', 1, start_time) as col6,
  25. REGEXP_EXTRACT_ALL(value, '-.') as col7,
  26. JSON_EXTRACT('{"id": "33"}', '$.id')as col8,
  27. element_at(arr_int, 1) as col9,
  28. date_trunc('day',start_time) as col10
  29. FROM test_sqlconvert
  30. where date_trunc('day',start_time)= DATE'2024-05-20'
  31. order by id;
  32. +---------------------+-----------+-----------+------------+------+---------------------+-------------+------+------+---------------------+
  33. | col1 | col2 | col3 | col4 | col5 | col6 | col7 | col8 | col9 | col10 |
  34. +---------------------+-----------+-----------+------------+------+---------------------+-------------+------+------+---------------------+
  35. | 2024-05-20 13:14:52 | [1, 2, 3] | ["World"] | 2024-01-14 | 2024 | 2024-06-20 13:14:52 | ['-0','-1'] | "33" | 1 | 2024-05-20 00:00:00 |
  36. +---------------------+-----------+-----------+------------+------+---------------------+-------------+------+------+---------------------+
  37. 1 row in set (0.03 sec)

Clickhouse

  1. mysql> set sql_dialect=clickhouse;
  2. Query OK, 0 rows affected (0.00 sec)
  3. mysql> select toString(start_time) as col1,
  4. arrayCompact(arr_int) as col2,
  5. arrayFilter(x -> x like '%World%',arr_str)as col3,
  6. toDate(value) as col4,
  7. toYear(start_time)as col5,
  8. addMonths(start_time, 1)as col6,
  9. extractAll(value, '-.')as col7,
  10. JSONExtractString('{"id": "33"}' , 'id')as col8,
  11. arrayElement(arr_int, 1) as col9,
  12. date_trunc('day',start_time) as col10
  13. FROM test_sqlconvert
  14. where date_trunc('day',start_time)= '2024-05-20 00:00:00'
  15. order by id;
  16. +---------------------+-----------+-----------+------------+------+---------------------+-------------+------+------+---------------------+
  17. | col1 | col2 | col3 | col4 | col5 | col6 | col7 | col8 | col9 | col10 |
  18. +---------------------+-----------+-----------+------------+------+---------------------+-------------+------+------+---------------------+
  19. | 2024-05-20 13:14:52 | [1, 2, 3] | ["World"] | 2024-01-14 | 2024 | 2024-06-20 13:14:52 | ['-0','-1'] | "33" | 1 | 2024-05-20 00:00:00 |
  20. +---------------------+-----------+-----------+------------+------+---------------------+-------------+------+------+---------------------+
  21. 1 row in set (0.02 sec)