背景

我们时常遇到对应 SQL 执行时间不及预期的情况,为了优化 SQL 达到预期查询时延,通过 Profile 我们能够看出可以做哪些优化。现在说明在不同环境下应该如何拿到对应 Query 的 Profile。

请求

HTTP://FE_IP:HTTP_PORT GET /API/Profile

Doris 集群能够正常访问外网

  1. 开启 Profile 上报参数 enable_profile

    该参数开启的是 session 变量,此变量不建议全局开启。

    1. --开启变量
    2. mysql> set enable_profile=true;
    3. Query OK, 0 rows affected (0.00 sec)
    4. --确认变量是否正常开启
    5. mysql> show variables like '%profile%';
    6. +----------------+-------+---------------+---------+
    7. | Variable_name | Value | Default_Value | Changed |
    8. +----------------+-------+---------------+---------+
    9. | enable_profile | true | false | 1 |
    10. +----------------+-------+---------------+---------+
    11. 1 row in set (0.00 sec)
  2. 执行对应 Query

    集群在多个 FE 的情况下,需要到开启 Profile 上报参数的 FE 上执行对应 Query, 参数并没有全局生效。

    1. --开启变量
    2. mysql> set enable_profile=true;
    3. Query OK, 0 rows affected (0.00 sec)
    4. --确认变量是否正常开启
    5. mysql> show variables like '%profile%';
    6. +----------------+-------+---------------+---------+
    7. | Variable_name | Value | Default_Value | Changed |
    8. +----------------+-------+---------------+---------+
    9. | enable_profile | true | false | 1 |
    10. +----------------+-------+---------------+---------+
    11. 1 row in set (0.00 sec)
    12. --执行对应 Query
    13. mysql> select id,name from test.test where name like "%RuO%";
    14. +---------------------------+-----------------------------------------------------------------------------------------------------------------------------+
    15. | id | name |
    16. +---------------------------+-----------------------------------------------------------------------------------------------------------------------------+
    17. | 1ZWXYGbb8nr5Pi29J4cEMyEMb | ZN1nqzBRSl1rTrr99rnX1aplxhRuOUTLw6so7rzjlRQ317gTPxh0dHljmrARDJjH7FjRkJW9c7YuUBmWikq7eNgmFKJPreWirDrGrFzUYH4eP6kDtSA3UTnNIIj |
    18. +---------------------------+-----------------------------------------------------------------------------------------------------------------------------+
    19. 1 row in set (0.01 sec)
  3. 获取 Profile

    集群在多个 FE 的情况下,需要访问执行对应 Query 的 FE HTTP 界面 ( HTTP://FE_IP:HTTP_PORT ) 的 QueryProfile 页面,点击对应 Profile ID 查看对应 Profile,还可以在 Profile 界面下载对应 Profile。

    profile1.png profile2.png

Doris 集群访问外网受到限制

集群不能正常访问外网时,需要通过 API 的方式拿到对应 Profile (HTTP://FE_IP:HTTP_PORT/API/Profile?Query_ID=),IP 和端口是指执行对应 Query 的 FE 对应 IP 和端口。此时获取对应 Query 的 Profile 步骤前两步和正常访问外网时是一样的,第三步获取 Profile 时会有差别。

获取 Porfile

  • 找到对应 Query ID

    1. --根据对应 Query 找到 Profile ID
    2. mysql> show query profile "/";
    3. +-----------------------------------+-----------+---------------------+---------------------+-------+------------+------+------------+-------------------------------------------------------+
    4. | Profile ID | Task Type | Start Time | End Time | Total | Task State | User | Default Db | Sql Statement |
    5. +-----------------------------------+-----------+---------------------+---------------------+-------+------------+------+------------+-------------------------------------------------------+
    6. | 1b0bb22689734d30-bbe56e17c2ff21dc | QUERY | 2024-02-28 11:00:17 | 2024-02-28 11:00:17 | 7ms | EOF | root | | select id,name from test.test where name like "%RuO%" |
    7. | 202fb174510c4772-965289e8f7f0cf10 | QUERY | 2024-02-25 19:39:20 | 2024-02-25 19:39:20 | 19ms | EOF | root | | select id,name from test.test where name like "%KJ%" |
    8. +-----------------------------------+-----------+---------------------+---------------------+-------+------------+------+------------+-------------------------------------------------------+
    9. 2 rows in set (0.00 sec)
  • 查询 Profile 并将 Profile 重定向到一个文本中

    1. 模板:CURL -X GET -u user:password http://fe_ip:http_port/api/profile?query_id=1b0bb22689734d30-bbe56e17c2ff21dc > test.profile
    2. [user@VM-10-6-centos profile]$ curl -X GET -u root:root http://127.0.0.1:8030/api/profile?query_id=1b0bb22689734d30-bbe56e17c2ff21dc > test.profile
    3. % Total % Received % Xferd Average Speed Time Time Time Current
    4. Dload Upload Total Spent Left Speed
    5. 100 1211 0 1211 0 0 168k 0 --:--:-- --:--:-- --:--:-- 168k
  • 返回的 Profile 换行符为 \ \n 分析起来很不方便,可以在文本编辑工具中将 \ \n 替换为 \n

    1. [user@VM-10-6-centos profile]$ cat test.profile
    2. {"msg":"success","code":0,"data":{"profile":"Query:\n Summary:\n
    3. - Profile ID: 1b0bb22689734d30-bbe56e17c2ff21dc\n - Task Type: QUERY\n
    4. - Start Time: 2024-02-28 11:00:17\n - End Time: 2024-02-28 11:00:17\n
    5. - Total: 7ms\n - Task State: EOF\n - User: root\n - Default Db: \n
    6. - Sql Statement: select id,name from test.test where name like \"%RuO%\"\n Execution Summary:\n
    7. - Workload Group: \n - Analysis Time: 1ms\n
    8. - Plan Time: 2ms\n - JoinReorder Time: N/A\n
    9. - CreateSingleNode Time: N/A\n - QueryDistributed Time: N/A\n
    10. - Init Scan Node Time: N/A\n - Finalize Scan Node Time: N/A\n
    11. - Get Splits Time: N/A\n - Get PARTITIONS Time: N/A\n
    12. - Get PARTITION FILES Time: N/A\n - Create Scan Range Time: N/A\n
    13. - Schedule Time: N/A\n - Fetch Result Time: 0ms\n - Write Result Time: 0ms\n
    14. - Wait and Fetch Result Time: N/A\n - Doris Version: doris-2.0.4-rc06-003a815b63\n
    15. - Is Nereids: Yes\n - Is Pipeline: Yes\n - Is Cached: Yes\n
    16. - Total Instances Num: 0\n - Instances Num Per BE: \n
    17. - Parallel Fragment Exec Instance Num: 48\n - Trace ID: \n"},"count":0}
  • 替换后的效果如下

    1. {"msg":"success","code":0,"data":{"profile":"Query:
    2. Summary:
    3. - Profile ID: 1b0bb22689734d30-bbe56e17c2ff21dc
    4. - Task Type: QUERY
    5. - Start Time: 2024-02-28 11:00:17
    6. - End Time: 2024-02-28 11:00:17
    7. - Total: 7ms
    8. - Task State: EOF
    9. - User: root
    10. - Default Db:
    11. - Sql Statement: select id,name from test.test where name like \"%RuO%\"
    12. Execution Summary:
    13. - Workload Group:
    14. - Analysis Time: 1ms
    15. - Plan Time: 2ms
    16. - JoinReorder Time: N/A
    17. - CreateSingleNode Time: N/A
    18. - QueryDistributed Time: N/A
    19. - Init Scan Node Time: N/A
    20. - Finalize Scan Node Time: N/A
    21. - Get Splits Time: N/A
    22. - Get PARTITIONS Time: N/A
    23. - Get PARTITION FILES Time: N/A
    24. - Create Scan Range Time: N/A
    25. - Schedule Time: N/A
    26. - Fetch Result Time: 0ms
    27. - Write Result Time: 0ms
    28. - Wait and Fetch Result Time: N/A
    29. - Doris Version: doris-2.0.4-rc06-003a815b63
    30. - Is Nereids: Yes
    31. - Is Pipeline: Yes
    32. - Is Cached: Yes
    33. - Total Instances Num: 0
    34. - Instances Num Per BE:
    35. - Parallel Fragment Exec Instance Num: 48
    36. - Trace ID:
    37. "},"count":0}