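The simple scheduling model for a distributed execution plan works as follows. In the final stage of plan generation, the plan is split at its exchange nodes into multiple sub-plans, and each sub-plan is encapsulated as a DFO. When the degree of parallelism is greater than 1, two DFOs are scheduled at a time, and the DFO tree is traversed and executed step by step. When the degree of parallelism equals 1, each DFO stores the data it produces in the intermediate result manager, and the whole DFO tree is traversed and executed in post-order.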
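The splitting step can be illustrated with a minimal sketch. It is not the actual implementation: the `Op` and `DFO` classes and the `split_into_dfos` function are hypothetical names introduced here, and the sketch assumes that every `EXCHANGE OUT` operator becomes the top of a new DFO while the matching `EXCHANGE IN` (or coordinator) stays in the parent DFO.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    """Hypothetical plan operator: a name plus child operators."""
    name: str
    children: List["Op"] = field(default_factory=list)


@dataclass
class DFO:
    """Hypothetical DFO: the operators of one sub-plan plus child DFOs."""
    ops: List[str] = field(default_factory=list)
    children: List["DFO"] = field(default_factory=list)


def split_into_dfos(root: Op) -> DFO:
    """Cut the plan at every EXCHANGE OUT operator; each cut opens a new DFO."""
    def visit(op: Op, current: DFO) -> None:
        if op.name.startswith("EXCHANGE OUT"):
            child = DFO()
            current.children.append(child)
            current = child  # operators below this point belong to the new DFO
        current.ops.append(op.name)
        for c in op.children:
            visit(c, current)

    top = DFO()  # the ROOT DFO, which holds the coordinator
    visit(root, top)
    return top


# A toy plan, much smaller than the plans shown in this section.
plan = Op("PX COORDINATOR", [
    Op("EXCHANGE OUT DISTR", [
        Op("HASH JOIN", [
            Op("EXCHANGE IN DISTR", [
                Op("EXCHANGE OUT DISTR (PKEY)", [Op("TABLE SCAN A")]),
            ]),
            Op("TABLE SCAN B"),
        ]),
    ]),
])

root_dfo = split_into_dfos(plan)
```

In this toy plan the result is a ROOT DFO containing only the coordinator, one child DFO containing the join, and one grandchild DFO containing the scan of table A.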

Single-DFO scheduling example

When the degree of parallelism is 1, single-DFO scheduling is used for the following query plan.

======================================================================================
|ID|OPERATOR                             |NAME                 |EST. ROWS |COST      |
--------------------------------------------------------------------------------------
|0 |LIMIT                                |                     |10        |6956829987|
|1 | PX COORDINATOR MERGE SORT           |                     |10        |6956829985|
|2 |  EXCHANGE OUT DISTR                 |:EX10002             |10        |6956829976|
|3 |   LIMIT                             |                     |10        |6956829976|
|4 |    TOP-N SORT                       |                     |10        |6956829975|
|5 |     HASH GROUP BY                   |                     |454381562 |5815592885|
|6 |      HASH JOIN                      |                     |500918979 |5299414557|
|7 |       EXCHANGE IN DISTR             |                     |225943610 |2081426759|
|8 |        EXCHANGE OUT DISTR (PKEY)    |:EX10001             |225943610 |1958446695|
|9 |         MATERIAL                    |                     |225943610 |1958446695|
|10|          HASH JOIN                  |                     |225943610 |1480989849|
|11|           JOIN FILTER CREATE        |                     |30142669  |122441311 |
|12|            PX PARTITION ITERATOR    |                     |30142669  |122441311 |
|13|             TABLE SCAN              |CUSTOMER             |30142669  |122441311 |
|14|           EXCHANGE IN DISTR         |                     |731011898 |900388059 |
|15|            EXCHANGE OUT DISTR (PKEY)|:EX10000             |731011898 |614947815 |
|16|             JOIN FILTER USE         |                     |731011898 |614947815 |
|17|              PX BLOCK ITERATOR      |                     |731011898 |614947815 |
|18|               TABLE SCAN            |ORDERS               |731011898 |614947815 |
|19|       PX PARTITION ITERATOR         |                     |3243094528|1040696710|
|20|        TABLE SCAN                   |LINEITEM(I_L_Q06_001)|3243094528|1040696710|
======================================================================================

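As shown in the figure below, the DFO tree, excluding the ROOT DFO, is divided vertically into three DFOs numbered 0, 1, and 2. The post-order scheduling sequence is therefore 0->1->2, which completes the iteration over the whole plan tree.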

[Figure: single-DFO scheduling]
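The post-order traversal can be sketched as follows. This is only an illustration under stated assumptions: `DfoNode` and `schedule_single` are hypothetical names, and a plain dictionary stands in for the intermediate result manager. Each DFO runs only after all of its children have run, and it stores its output so that its parent can read it later.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DfoNode:
    """Hypothetical DFO tree node identified by its DFO number."""
    dfo_id: int
    children: List["DfoNode"] = field(default_factory=list)


def schedule_single(top: DfoNode) -> List[int]:
    """Run one DFO at a time in post-order and return the execution order."""
    results: Dict[int, str] = {}  # stand-in for the intermediate result manager
    order: List[int] = []

    def run(dfo: DfoNode) -> None:
        for child in dfo.children:
            run(child)  # every producer finishes before its consumer starts
        # Read the children's stored output, then store this DFO's own output.
        inputs = [results[c.dfo_id] for c in dfo.children]
        results[dfo.dfo_id] = f"rows of DFO {dfo.dfo_id} (from {len(inputs)} input(s))"
        order.append(dfo.dfo_id)

    run(top)
    return order


# The three DFOs of the example plan form a chain: 2 consumes 1, 1 consumes 0.
order = schedule_single(DfoNode(2, [DfoNode(1, [DfoNode(0)])]))
print(order)  # [0, 1, 2]
```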

Two-DFO scheduling example

Plans with a degree of parallelism greater than 1 use two-DFO scheduling, as in the following query plan:

Query Plan
=============================================================================
|ID|OPERATOR                                   |NAME    |EST. ROWS|COST     |
-----------------------------------------------------------------------------
|0 |PX COORDINATOR MERGE SORT                  |        |9873917  |692436562|
|1 | EXCHANGE OUT DISTR                        |:EX10002|9873917  |689632565|
|2 |  SORT                                     |        |9873917  |689632565|
|3 |   SUBPLAN SCAN                            |VIEW5   |9873917  |636493382|
|4 |    WINDOW FUNCTION                        |        |29621749 |629924873|
|5 |     HASH GROUP BY                         |        |29621749 |624266752|
|6 |      HASH JOIN                            |        |31521003 |591048941|
|7 |       JOIN FILTER CREATE                  |        |407573   |7476793  |
|8 |        EXCHANGE IN DISTR                  |        |407573   |7476793  |
|9 |         EXCHANGE OUT DISTR (BROADCAST)    |:EX10001|407573   |7303180  |
|10|          HASH JOIN                        |        |407573   |7303180  |
|11|           JOIN FILTER CREATE              |        |1        |53       |
|12|            EXCHANGE IN DISTR              |        |1        |53       |
|13|             EXCHANGE OUT DISTR (BROADCAST)|:EX10000|1        |53       |
|14|              PX BLOCK ITERATOR            |        |1        |53       |
|15|               TABLE SCAN                  |NATION  |1        |53       |
|16|           JOIN FILTER USE                 |        |10189312 |3417602  |
|17|            PX BLOCK ITERATOR              |        |10189312 |3417602  |
|18|             TABLE SCAN                    |SUPPLIER|10189312 |3417602  |
|19|       JOIN FILTER USE                     |        |803481600|276540086|
|20|        PX PARTITION ITERATOR              |        |803481600|276540086|
|21|         TABLE SCAN                        |PARTSUPP|803481600|276540086|
=============================================================================

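As shown in the figure below, the DFO tree, excluding the ROOT DFO, is divided into three DFOs. Scheduling starts with DFOs 0 and 1. After DFO 0 finishes, DFOs 1 and 2 are scheduled, and execution continues iteratively in this way until the plan completes.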

[Figure: two-DFO scheduling]
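The pairing described above can be sketched as a sliding window over the DFOs. This is a simplified illustration, not the scheduler itself: `two_dfo_schedule` is a hypothetical name, and the sketch assumes the DFOs form a linear chain (as they do in this example) and leaves out the ROOT DFO.

```python
from typing import List, Tuple


def two_dfo_schedule(chain: List[int]) -> List[Tuple[int, int]]:
    """Return the (producer, consumer) DFO pairs in the order they are scheduled.

    `chain` lists DFO numbers from the deepest producer upward,
    excluding the ROOT DFO.
    """
    pairs: List[Tuple[int, int]] = []
    for producer, consumer in zip(chain, chain[1:]):
        # The two DFOs of a pair are scheduled together; once the producer
        # finishes, the window slides up by one DFO.
        pairs.append((producer, consumer))
    return pairs


pairs = two_dfo_schedule([0, 1, 2])
print(pairs)  # [(0, 1), (1, 2)]
```

DFO 1 appears in both pairs: it is first a consumer of DFO 0 and then a producer for DFO 2.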