背景

在数据库中,”Hint” 是一种用于指导查询优化器执行计划的指令。通过在 SQL 语句中嵌入 Hint,可以影响优化器的决策,以选中期望的执行路径。以下是一个使用 Hint 的背景示例: 假设有一个包含大量数据的表,而你知道在某些特定情况下,在一个查询中,表的连接顺序可能会影响查询性能。Leading Hint 允许你指定希望优化器遵循的表连接的顺序。 例如,考虑以下 SQL 查询:

  1. mysql> explain shape plan select * from t1 join t2 on t1.c1 = c2;
  2. +-------------------------------------------+
  3. | Explain String |
  4. +-------------------------------------------+
  5. | PhysicalResultSink |
  6. | --PhysicalDistribute |
  7. | ----PhysicalProject |
  8. | ------hashJoin[INNER_JOIN](t1.c1 = t2.c2) |
  9. | --------PhysicalOlapScan[t2] |
  10. | --------PhysicalDistribute |
  11. | ----------PhysicalOlapScan[t1] |
  12. +-------------------------------------------+
  13. 7 rows in set (0.06 sec)

在上述例子里面,在执行效率不理想的时候,我们希望调整下 join 顺序而不改变原始 sql 以免影响到用户原始场景且能达到调优的目的。我们可以使用 leading 任意改变 tableA 和 tableB 的 join 顺序。例如可以写成:

  1. mysql> explain shape plan select /*+ leading(t2 t1) */ * from t1 join t2 on c1 = c2;
  2. +-----------------------------------------------------------------------------------------------------+
  3. | Explain String(Nereids Planner) |
  4. +-----------------------------------------------------------------------------------------------------+
  5. | PhysicalResultSink |
  6. | --PhysicalDistribute |
  7. | ----PhysicalProject |
  8. | ------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() build RFs:RF0 c1->[c2] |
  9. | --------PhysicalOlapScan[t2] apply RFs: RF0 |
  10. | --------PhysicalDistribute |
  11. | ----------PhysicalOlapScan[t1] |
  12. | |
  13. | Hint log: |
  14. | Used: leading(t2 t1) |
  15. | UnUsed: |
  16. | SyntaxError: |
  17. +-----------------------------------------------------------------------------------------------------+
  18. 12 rows in set (0.06 sec)

在这个例子中,使用了 /+ leading(t2 t1) / 这个 Hint。这个 Hint 告诉优化器在执行计划中使用指定表(t2)作为驱动表,并置于 (t1) 之前。 本文主要阐述如何在 Doris 里面使用 join 相关的 hint:leading hint、ordered hint 和 distribute hint

Leading hint 使用说明

Leading Hint 用于指导优化器确定查询计划的连接顺序。在一个查询中,表的连接顺序可能会影响查询性能。Leading Hint 允许你指定希望优化器遵循的表连接的顺序。 在 doris 里面,其语法为 /+LEADING( tablespec [ tablespec ]… ) /,leading 由”/+”和”/“包围并置于 select 语句里面 select 的正后方。注意,leading 后方的 ‘/‘ 和 selectlist 需要隔开至少一个分割符例如空格。至少需要写两个以上的表才认为这个 leadinghint 是合理的。且任意的 join 里面可以用大括号括号起来,来显式地指定 joinTree 的形状。例:

  1. mysql> explain shape plan select /*+ leading(t2 t1) */ * from t1 join t2 on c1 = c2;
  2. +------------------------------------------------------------------------------+
  3. | Explain String(Nereids Planner) |
  4. +------------------------------------------------------------------------------+
  5. | PhysicalResultSink |
  6. | --PhysicalDistribute[DistributionSpecGather] |
  7. | ----PhysicalProject |
  8. | ------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
  9. | --------PhysicalOlapScan[t2] |
  10. | --------PhysicalDistribute[DistributionSpecHash] |
  11. | ----------PhysicalOlapScan[t1] |
  12. | |
  13. | Hint log: |
  14. | Used: leading(t2 t1) |
  15. | UnUsed: |
  16. | SyntaxError: |
  17. +------------------------------------------------------------------------------+
  18. 12 rows in set (0.01 sec)
  • 当 leadinghint 不生效的时候会走正常的流程生成计划,explain 会显示使用的 hint 是否生效,主要分三种来显示:
    • Used:leading hint 正常生效
    • Unused:这里不支持的情况包含 leading 指定的 join order 与原 sql 不等价或本版本暂不支持特性(详见限制)
    • SyntaxError:指 leading hint 语法错误,如找不到对应的表等
  • leading hint 语法默认造出来左深树,例:select /+ leading(t1 t2 t3) / * from t1 join t2 on…默认指定出来
  1. join
  2. / \
  3. join t3
  4. / \
  5. t1 t2
  6. mysql> explain shape plan select /*+ leading(t1 t2 t3) */ * from t1 join t2 on c1 = c2 join t3 on c2=c3;
  7. +--------------------------------------------------------------------------------+
  8. | Explain String(Nereids Planner) |
  9. +--------------------------------------------------------------------------------+
  10. | PhysicalResultSink |
  11. | --PhysicalDistribute[DistributionSpecGather] |
  12. | ----PhysicalProject |
  13. | ------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=() |
  14. | --------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
  15. | ----------PhysicalOlapScan[t1] |
  16. | ----------PhysicalDistribute[DistributionSpecHash] |
  17. | ------------PhysicalOlapScan[t2] |
  18. | --------PhysicalDistribute[DistributionSpecHash] |
  19. | ----------PhysicalOlapScan[t3] |
  20. | |
  21. | Hint log: |
  22. | Used: leading(t1 t2 t3) |
  23. | UnUsed: |
  24. | SyntaxError: |
  25. +--------------------------------------------------------------------------------+
  26. 15 rows in set (0.00 sec)
  • 同时允许使用大括号指定 join 树形状。例:/+ leading(t1 {t2 t3}) / join / \ t1 join / \ t2 t3
  1. mysql> explain shape plan select /*+ leading(t1 {t2 t3}) */ * from t1 join t2 on c1 = c2 join t3 on c2=c3;
  2. +----------------------------------------------------------------------------------+
  3. | Explain String(Nereids Planner) |
  4. +----------------------------------------------------------------------------------+
  5. | PhysicalResultSink |
  6. | --PhysicalDistribute[DistributionSpecGather] |
  7. | ----PhysicalProject |
  8. | ------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
  9. | --------PhysicalOlapScan[t1] |
  10. | --------PhysicalDistribute[DistributionSpecHash] |
  11. | ----------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=() |
  12. | ------------PhysicalOlapScan[t2] |
  13. | ------------PhysicalDistribute[DistributionSpecHash] |
  14. | --------------PhysicalOlapScan[t3] |
  15. | |
  16. | Hint log: |
  17. | Used: leading(t1 { t2 t3 }) |
  18. | UnUsed: |
  19. | SyntaxError: |
  20. +----------------------------------------------------------------------------------+
  21. 15 rows in set (0.02 sec)
  • 当有 view 作为别名参与 joinReorder 的时候可以指定对应的 view 作为 leading 的参数。例:
  1. mysql> explain shape plan select /*+ leading(alias t1) */ count(*) from t1 join (select c2 from t2 join t3 on t2.c2 = t3.c3) as alias on t1.c1 = alias.c2;
  2. +--------------------------------------------------------------------------------------+
  3. | Explain String(Nereids Planner) |
  4. +--------------------------------------------------------------------------------------+
  5. | PhysicalResultSink |
  6. | --hashAgg[GLOBAL] |
  7. | ----PhysicalDistribute[DistributionSpecGather] |
  8. | ------hashAgg[LOCAL] |
  9. | --------PhysicalProject |
  10. | ----------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = alias.c2)) otherCondition=() |
  11. | ------------PhysicalProject |
  12. | --------------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=() |
  13. | ----------------PhysicalProject |
  14. | ------------------PhysicalOlapScan[t2] |
  15. | ----------------PhysicalDistribute[DistributionSpecHash] |
  16. | ------------------PhysicalProject |
  17. | --------------------PhysicalOlapScan[t3] |
  18. | ------------PhysicalDistribute[DistributionSpecHash] |
  19. | --------------PhysicalProject |
  20. | ----------------PhysicalOlapScan[t1] |
  21. | |
  22. | Hint log: |
  23. | Used: leading(alias t1) |
  24. | UnUsed: |
  25. | SyntaxError: |
  26. +--------------------------------------------------------------------------------------+
  27. 21 rows in set (0.06 sec)

基本用例

(注意这里列命名和表命名相关,例:只有 t1 中有 c1 字段,后续例子为了简化会将 t1.c1 直接写成 c1)

  1. CREATE DATABASE testleading;
  2. USE testleading;
  3. create table t1 (c1 int, c11 int) distributed by hash(c1) buckets 3 properties('replication_num' = '1');
  4. create table t2 (c2 int, c22 int) distributed by hash(c2) buckets 3 properties('replication_num' = '1');
  5. create table t3 (c3 int, c33 int) distributed by hash(c3) buckets 3 properties('replication_num' = '1');
  6. create table t4 (c4 int, c44 int) distributed by hash(c4) buckets 3 properties('replication_num' = '1');

举个简单的例子,当我们需要交换 t1 和 t2 的 join 顺序的时候只需要在前面加上 leading(t2 t1) 即可,explain 的时候会 显示是否用上了这个 hint。 原始 plan

  1. mysql> explain shape plan select * from t1 join t2 on t1.c1 = c2;
  2. +-------------------------------------------+
  3. | Explain String |
  4. +-------------------------------------------+
  5. | PhysicalResultSink |
  6. | --PhysicalDistribute |
  7. | ----PhysicalProject |
  8. | ------hashJoin[INNER_JOIN](t1.c1 = t2.c2) |
  9. | --------PhysicalOlapScan[t2] |
  10. | --------PhysicalDistribute |
  11. | ----------PhysicalOlapScan[t1] |
  12. +-------------------------------------------+
  13. 7 rows in set (0.06 sec)

Leading plan

  1. mysql> explain shape plan select /*+ leading(t2 t1) */ * from t1 join t2 on c1 = c2;
  2. +------------------------------------------------------------------------------+
  3. | Explain String(Nereids Planner) |
  4. +------------------------------------------------------------------------------+
  5. | PhysicalResultSink |
  6. | --PhysicalDistribute[DistributionSpecGather] |
  7. | ----PhysicalProject |
  8. | ------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
  9. | --------PhysicalOlapScan[t2] |
  10. | --------PhysicalDistribute[DistributionSpecHash] |
  11. | ----------PhysicalOlapScan[t1] |
  12. | |
  13. | Hint log: |
  14. | Used: leading(t2 t1) |
  15. | UnUsed: |
  16. | SyntaxError: |
  17. +------------------------------------------------------------------------------+
  18. 12 rows in set (0.00 sec)

hint 效果展示 (Used unused) 若 leading hint 有语法错误,explain 的时候会在 syntax error 里面显示相应的信息,但是计划能照常生成,只不过没有使用 leading 而已

  1. mysql> explain shape plan select /*+ leading(t2 t3) */ * from t1 join t2 on t1.c1 = c2;
  2. +--------------------------------------------------------+
  3. | Explain String |
  4. +--------------------------------------------------------+
  5. | PhysicalResultSink |
  6. | --PhysicalDistribute |
  7. | ----PhysicalProject |
  8. | ------hashJoin[INNER_JOIN](t1.c1 = t2.c2) |
  9. | --------PhysicalOlapScan[t1] |
  10. | --------PhysicalDistribute |
  11. | ----------PhysicalOlapScan[t2] |
  12. | |
  13. | Used: |
  14. | UnUsed: |
  15. | SyntaxError: leading(t2 t3) Msg:can not find table: t3 |
  16. +--------------------------------------------------------+
  17. 11 rows in set (0.01 sec)

扩展场景

左深树

当我们不使用任何括号的情况下 leading 会默认生成左深树

  1. mysql> explain shape plan select /*+ leading(t1 t2 t3) */ * from t1 join t2 on t1.c1 = c2 join t3 on c2 = c3;
  2. +--------------------------------------------------------------------------------+
  3. | Explain String(Nereids Planner) |
  4. +--------------------------------------------------------------------------------+
  5. | PhysicalResultSink |
  6. | --PhysicalDistribute[DistributionSpecGather] |
  7. | ----PhysicalProject |
  8. | ------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=() |
  9. | --------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
  10. | ----------PhysicalOlapScan[t1] |
  11. | ----------PhysicalDistribute[DistributionSpecHash] |
  12. | ------------PhysicalOlapScan[t2] |
  13. | --------PhysicalDistribute[DistributionSpecHash] |
  14. | ----------PhysicalOlapScan[t3] |
  15. | |
  16. | Hint log: |
  17. | Used: leading(t1 t2 t3) |
  18. | UnUsed: |
  19. | SyntaxError: |
  20. +--------------------------------------------------------------------------------+
  21. 15 rows in set (0.10 sec)

右深树

当我们想将计划的形状做成右深树或者 bushy 树或者 zigzag 树的时候,只需要加上大括号来限制 plan 的形状即可,不需要像 oracle 一样用 swap 从左深树一步步调整。

  1. mysql> explain shape plan select /*+ leading(t1 {t2 t3}) */ * from t1 join t2 on t1.c1 = c2 join t3 on c2 = c3;
  2. +-----------------------------------------------+
  3. | Explain String |
  4. +-----------------------------------------------+
  5. | PhysicalResultSink |
  6. | --PhysicalDistribute |
  7. | ----PhysicalProject |
  8. | ------hashJoin[INNER_JOIN](t1.c1 = t2.c2) |
  9. | --------PhysicalOlapScan[t1] |
  10. | --------PhysicalDistribute |
  11. | ----------hashJoin[INNER_JOIN](t2.c2 = t3.c3) |
  12. | ------------PhysicalOlapScan[t2] |
  13. | ------------PhysicalDistribute |
  14. | --------------PhysicalOlapScan[t3] |
  15. | |
  16. | Used: leading(t1 { t2 t3 }) |
  17. | UnUsed: |
  18. | SyntaxError: |
  19. +-----------------------------------------------+
  20. 14 rows in set (0.02 sec)

Bushy 树

  1. mysql> explain shape plan select /*+ leading({t1 t2} {t3 t4}) */ * from t1 join t2 on t1.c1 = c2 join t3 on c2 = c3 join t4 on c3 = c4;
  2. +-----------------------------------------------+
  3. | Explain String |
  4. +-----------------------------------------------+
  5. | PhysicalResultSink |
  6. | --PhysicalDistribute |
  7. | ----PhysicalProject |
  8. | ------hashJoin[INNER_JOIN](t2.c2 = t3.c3) |
  9. | --------hashJoin[INNER_JOIN](t1.c1 = t2.c2) |
  10. | ----------PhysicalOlapScan[t1] |
  11. | ----------PhysicalDistribute |
  12. | ------------PhysicalOlapScan[t2] |
  13. | --------PhysicalDistribute |
  14. | ----------hashJoin[INNER_JOIN](t3.c3 = t4.c4) |
  15. | ------------PhysicalOlapScan[t3] |
  16. | ------------PhysicalDistribute |
  17. | --------------PhysicalOlapScan[t4] |
  18. | |
  19. | Used: leading({ t1 t2 } { t3 t4 }) |
  20. | UnUsed: |
  21. | SyntaxError: |
  22. +-----------------------------------------------+
  23. 17 rows in set (0.02 sec)

zig-zag 树

  1. mysql> explain shape plan select /*+ leading(t1 {t2 t3} t4) */ * from t1 join t2 on t1.c1 = c2 join t3 on c2 = c3 join t4 on c3 = c4;
  2. +--------------------------------------------------------------------------------------+
  3. | Explain String(Nereids Planner) |
  4. +--------------------------------------------------------------------------------------+
  5. | PhysicalResultSink |
  6. | --PhysicalDistribute[DistributionSpecGather] |
  7. | ----PhysicalProject |
  8. | ------hashJoin[INNER_JOIN] hashCondition=((t3.c3 = t4.c4)) otherCondition=() |
  9. | --------PhysicalDistribute[DistributionSpecHash] |
  10. | ----------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
  11. | ------------PhysicalOlapScan[t1] |
  12. | ------------PhysicalDistribute[DistributionSpecHash] |
  13. | --------------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=() |
  14. | ----------------PhysicalOlapScan[t2] |
  15. | ----------------PhysicalDistribute[DistributionSpecHash] |
  16. | ------------------PhysicalOlapScan[t3] |
  17. | --------PhysicalDistribute[DistributionSpecHash] |
  18. | ----------PhysicalOlapScan[t4] |
  19. | |
  20. | Hint log: |
  21. | Used: leading(t1 { t2 t3 } t4) |
  22. | UnUsed: |
  23. | SyntaxError: |
  24. +--------------------------------------------------------------------------------------+
  25. 19 rows in set (0.02 sec)

Non-inner join:

当遇到非 inner-join 的时候,例如 Outer join 或者 semi/anti join 的时候,leading hint 会根据原始 sql 语义自动推导各个 join 的 join 方式。若遇到与原始 sql 语义不同的 leading hint 或者生成不了的情况则会放到 unused 里面,但是不影响计划正常流程的生成。 下面是不能交换的例子:
-———- test outer join which can not swap
-- t1 leftjoin (t2 join t3 on (P23)) on (P12) != (t1 leftjoin t2 on (P12)) join t3 on (P23)

  1. mysql> explain shape plan select /*+ leading(t1 {t2 t3}) */ * from t1 left join t2 on c1 = c2 join t3 on c2 = c3;
  2. +--------------------------------------------------------------------------------+
  3. | Explain String(Nereids Planner) |
  4. +--------------------------------------------------------------------------------+
  5. | PhysicalResultSink |
  6. | --PhysicalDistribute[DistributionSpecGather] |
  7. | ----PhysicalProject |
  8. | ------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=() |
  9. | --------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
  10. | ----------PhysicalOlapScan[t1] |
  11. | ----------PhysicalDistribute[DistributionSpecHash] |
  12. | ------------PhysicalOlapScan[t2] |
  13. | --------PhysicalDistribute[DistributionSpecHash] |
  14. | ----------PhysicalOlapScan[t3] |
  15. | |
  16. | Hint log: |
  17. | Used: |
  18. | UnUsed: leading(t1 { t2 t3 }) |
  19. | SyntaxError: |
  20. +--------------------------------------------------------------------------------+
  21. 15 rows in set (0.01 sec)

下面是一些可以交换的例子和不能交换的例子,读者可自行验证

  1. -------- test outer join which can swap
  2. -- (t1 leftjoin t2 on (P12)) innerjoin t3 on (P13) = (t1 innerjoin t3 on (P13)) leftjoin t2 on (P12)
  3. explain shape plan select * from t1 left join t2 on c1 = c2 join t3 on c1 = c3;
  4. explain shape plan select /*+ leading(t1 t3 t2) */ * from t1 left join t2 on c1 = c2 join t3 on c1 = c3;
  5. -- (t1 leftjoin t2 on (P12)) leftjoin t3 on (P13) = (t1 leftjoin t3 on (P13)) leftjoin t2 on (P12)
  6. explain shape plan select * from t1 left join t2 on c1 = c2 left join t3 on c1 = c3;
  7. explain shape plan select /*+ leading(t1 t3 t2) */ * from t1 left join t2 on c1 = c2 left join t3 on c1 = c3;
  8. -- (t1 leftjoin t2 on (P12)) leftjoin t3 on (P23) = t1 leftjoin (t2 leftjoin t3 on (P23)) on (P12)
  9. select /*+ leading(t2 t3 t1) SWAP_INPUT(t1) */ * from t1 left join t2 on c1 = c2 left join t3 on c2 = c3;
  10. explain shape plan select /*+ leading(t1 {t2 t3}) */ * from t1 left join t2 on c1 = c2 left join t3 on c2 = c3;
  11. explain shape plan select /*+ leading(t1 {t2 t3}) */ * from t1 left join t2 on c1 = c2 left join t3 on c2 = c3;
  12. -------- test outer join which can not swap
  13. -- t1 leftjoin (t2 join t3 on (P23)) on (P12) != (t1 leftjoin t2 on (P12)) join t3 on (P23)
  14. -- eliminated to inner join
  15. explain shape plan select /*+ leading(t1 {t2 t3}) */ * from t1 left join t2 on c1 = c2 join t3 on c2 = c3;
  16. explain graph select /*+ leading(t1 t2 t3) */ * from t1 left join (select * from t2 join t3 on c2 = c3) on c1 = c2;
  17. -- test semi join
  18. explain shape plan select * from t1 where c1 in (select c2 from t2);
  19. explain shape plan select /*+ leading(t2 t1) */ * from t1 where c1 in (select c2 from t2);
  20. -- test anti join
  21. explain shape plan select * from t1 where exists (select c2 from t2);

View

遇到别名的情况,可以将别名作为一个完整的子树进行指定,子树里面的 joinOrder 由文本序生成。

  1. mysql> explain shape plan select /*+ leading(alias t1) */ count(*) from t1 join (select c2 from t2 join t3 on t2.c2 = t3.c3) as alias on t1.c1 = alias.c2;
  2. +--------------------------------------------------------------------------------------+
  3. | Explain String(Nereids Planner) |
  4. +--------------------------------------------------------------------------------------+
  5. | PhysicalResultSink |
  6. | --hashAgg[GLOBAL] |
  7. | ----PhysicalDistribute[DistributionSpecGather] |
  8. | ------hashAgg[LOCAL] |
  9. | --------PhysicalProject |
  10. | ----------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = alias.c2)) otherCondition=() |
  11. | ------------PhysicalProject |
  12. | --------------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=() |
  13. | ----------------PhysicalProject |
  14. | ------------------PhysicalOlapScan[t2] |
  15. | ----------------PhysicalDistribute[DistributionSpecHash] |
  16. | ------------------PhysicalProject |
  17. | --------------------PhysicalOlapScan[t3] |
  18. | ------------PhysicalDistribute[DistributionSpecHash] |
  19. | --------------PhysicalProject |
  20. | ----------------PhysicalOlapScan[t1] |
  21. | |
  22. | Hint log: |
  23. | Used: leading(alias t1) |
  24. | UnUsed: |
  25. | SyntaxError: |
  26. +--------------------------------------------------------------------------------------+
  27. 21 rows in set (0.02 sec)

与 ordered 混合使用

当与 ordered hint 混合使用的时候以 ordered hint 为主,即 ordered hint 生效优先级高于 leading hint。例:

  1. mysql> explain shape plan select /*+ ORDERED LEADING(t1 t2 t3) */ t1.c1 from t2 join t1 on t1.c1 = t2.c2 join t3 on c2 = c3;
  2. +--------------------------------------------------------------------------------+
  3. | Explain String(Nereids Planner) |
  4. +--------------------------------------------------------------------------------+
  5. | PhysicalResultSink |
  6. | --PhysicalDistribute[DistributionSpecGather] |
  7. | ----PhysicalProject |
  8. | ------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=() |
  9. | --------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
  10. | ----------PhysicalProject |
  11. | ------------PhysicalOlapScan[t2] |
  12. | ----------PhysicalDistribute[DistributionSpecHash] |
  13. | ------------PhysicalProject |
  14. | --------------PhysicalOlapScan[t1] |
  15. | --------PhysicalDistribute[DistributionSpecHash] |
  16. | ----------PhysicalProject |
  17. | ------------PhysicalOlapScan[t3] |
  18. | |
  19. | Hint log: |
  20. | Used: ORDERED |
  21. | UnUsed: leading(t1 t2 t3) |
  22. | SyntaxError: |
  23. +--------------------------------------------------------------------------------+
  24. 18 rows in set (0.02 sec)

OrderedHint 使用说明

  • 使用 ordered hint 会让 join tree 的形状固定下来,按照文本序来显示
  • 语法为 /+ ORDERED /,leading 由”/+”和”/“包围并置于 select 语句里面 select 的正后方,例: explain shape plan select /+ ORDERED / t1.c1 from t2 join t1 on t1.c1 = t2.c2 join t3 on c2 = c3; join / \ join t3 / \ t2 t1
  1. mysql> explain shape plan select /*+ ORDERED */ t1.c1 from t2 join t1 on t1.c1 = t2.c2 join t3 on c2 = c3;
  2. +--------------------------------------------------------------------------------+
  3. | Explain String(Nereids Planner) |
  4. +--------------------------------------------------------------------------------+
  5. | PhysicalResultSink |
  6. | --PhysicalDistribute[DistributionSpecGather] |
  7. | ----PhysicalProject |
  8. | ------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=() |
  9. | --------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
  10. | ----------PhysicalProject |
  11. | ------------PhysicalOlapScan[t2] |
  12. | ----------PhysicalDistribute[DistributionSpecHash] |
  13. | ------------PhysicalProject |
  14. | --------------PhysicalOlapScan[t1] |
  15. | --------PhysicalDistribute[DistributionSpecHash] |
  16. | ----------PhysicalProject |
  17. | ------------PhysicalOlapScan[t3] |
  18. | |
  19. | Hint log: |
  20. | Used: ORDERED |
  21. | UnUsed: |
  22. | SyntaxError: |
  23. +--------------------------------------------------------------------------------+
  24. 18 rows in set (0.02 sec)
  • 当 ordered hint 和 leading hint 同时使用时以 ordered hint 为准,leading hint 会失效
  1. mysql> explain shape plan select /*+ ORDERED LEADING(t1 t2 t3) */ t1.c1 from t2 join t1 on t1.c1 = t2.c2 join t3 on c2 = c3;
  2. +--------------------------------------------------------------------------------+
  3. | Explain String(Nereids Planner) |
  4. +--------------------------------------------------------------------------------+
  5. | PhysicalResultSink |
  6. | --PhysicalDistribute[DistributionSpecGather] |
  7. | ----PhysicalProject |
  8. | ------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=() |
  9. | --------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
  10. | ----------PhysicalProject |
  11. | ------------PhysicalOlapScan[t2] |
  12. | ----------PhysicalDistribute[DistributionSpecHash] |
  13. | ------------PhysicalProject |
  14. | --------------PhysicalOlapScan[t1] |
  15. | --------PhysicalDistribute[DistributionSpecHash] |
  16. | ----------PhysicalProject |
  17. | ------------PhysicalOlapScan[t3] |
  18. | |
  19. | Hint log: |
  20. | Used: ORDERED |
  21. | UnUsed: leading(t1 t2 t3) |
  22. | SyntaxError: |
  23. +--------------------------------------------------------------------------------+
  24. 18 rows in set (0.02 sec)

DistributeHint 使用说明

  • 目前只能指定右表的 distribute Type 而且只有[shuffle] 和 [broadcast]两种,写在 join 右表前面且允许中括号和/+ /两种写法
  • 目前能使用任意个 DistributeHint
  • 当遇到无法正确生成计划的 DistributeHint,没有显示,按最大努力生效,最后以 explain 显示的 distribute 方式为主
  • 当前版本暂不与 leading 混用,且当 distribute 指定的表位于 join 右边才可生效。
  • 多与 ordered 混用,利用文本序把 join 顺序固定下来,然后再指定相应的 join 里面我们预期使用什么样的 distribute 方式。例: 使用前:
  1. mysql> explain shape plan select count(*) from t1 join t2 on t1.c1 = t2.c2;
  2. +----------------------------------------------------------------------------------+
  3. | Explain String(Nereids Planner) |
  4. +----------------------------------------------------------------------------------+
  5. | PhysicalResultSink |
  6. | --hashAgg[GLOBAL] |
  7. | ----PhysicalDistribute[DistributionSpecGather] |
  8. | ------hashAgg[LOCAL] |
  9. | --------PhysicalProject |
  10. | ----------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
  11. | ------------PhysicalProject |
  12. | --------------PhysicalOlapScan[t1] |
  13. | ------------PhysicalDistribute[DistributionSpecHash] |
  14. | --------------PhysicalProject |
  15. | ----------------PhysicalOlapScan[t2] |
  16. +----------------------------------------------------------------------------------+
  17. 11 rows in set (0.01 sec)

使用后:

  1. mysql> explain shape plan select /*+ ordered */ count(*) from t2 join[broadcast] t1 on t1.c1 = t2.c2;
  2. +----------------------------------------------------------------------------------+
  3. | Explain String(Nereids Planner) |
  4. +----------------------------------------------------------------------------------+
  5. | PhysicalResultSink |
  6. | --hashAgg[GLOBAL] |
  7. | ----PhysicalDistribute[DistributionSpecGather] |
  8. | ------hashAgg[LOCAL] |
  9. | --------PhysicalProject |
  10. | ----------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
  11. | ------------PhysicalProject |
  12. | --------------PhysicalOlapScan[t2] |
  13. | ------------PhysicalDistribute[DistributionSpecReplicated] |
  14. | --------------PhysicalProject |
  15. | ----------------PhysicalOlapScan[t1] |
  16. | |
  17. | Hint log: |
  18. | Used: ORDERED |
  19. | UnUsed: |
  20. | SyntaxError: |
  21. +----------------------------------------------------------------------------------+
  22. 16 rows in set (0.01 sec)
  • Explain shape plan 里面会显示 distribute 算子相关的信息,其中 DistributionSpecReplicated 表示该算子将对应的数据变成所有 be 节点复制一份,DistributionSpecGather 表示将数据 gather 到 fe 节点,DistributionSpecHash 表示将数据按照特定的 hashKey 以及算法打散到不同的 be 节点。

待支持

  • leadingHint 待支持子查询解嵌套指定,当前和子查询提升以后不能混用,需要有 hint 来控制是否可以解嵌套
  • 需要新的 distributeHint 来更好且更全面地控制 distribute 算子
  • 混合使用 leadingHint 与 distributeHint 来共同确定 join 的形状