聚合操作指的是将多行结果按照指定方式组合成一行。常见的聚合有 GROUP BY 语句,聚合函数和 DISTINCT 操作。聚合算子主要包括:GROUP BY,DISTINCT 和 WINDOW FUNCTION。本节只介绍 GROUP BY 和 DISTINCT。
GROUP BY 的算法
GROUP BY 的逻辑算子有三种不同的算法,分别是 SCALAR GROUP BY,MERGE GROUP BY 和 HASH GROUP BY。
SCALAR GROUP BY
SCALAR GROUP BY 是指聚合后生成的结果是一个标量的算法。
如下例所示,SELECT 语句中有聚合函数,但是没有 GROUP BY 语句,则语义为:对 t1 的 a 列取和,结果为标量,优化器决定使用 SCALAR GROUP BY。
obclient> explain select sum(a) from t1\G
*************************** 1. row ***************************
Query Plan:
========================================
|ID|OPERATOR |NAME|EST. ROWS|COST|
----------------------------------------
|0 |SCALAR GROUP BY| |1 |624 |
|1 | TABLE SCAN |t1 |1000 |433 |
========================================
Outputs & filters:
-------------------------------------
0 - output([T_FUN_SUM(t1.a)]), filter(nil),
group(nil), agg_func([T_FUN_SUM(t1.a)])
1 - output([t1.a]), filter(nil),
access([t1.a]), partitions(p0)
1 row in set (0.01 sec)
MERGE GROUP BY
当语句包含 GROUP BY 语句时,优化器可能会生成 MERGE GROUP BY 算子。MERGE GROUP BY 要求下层必须按照 GROUP BY 的列有序,否则 MERGE GROUP BY 会自行分配 SORT 算子。
如下例所示,t1 主键列为 a,GROUP BY 的表达式同为 a,MERGE GROUP BY 可以使用下层的序。
obclient> explain extended_noaddr select sum(a) from t1 group by a\G
*************************** 1. row ***************************
Query Plan: =======================================
|ID|OPERATOR |NAME|EST. ROWS|COST|
---------------------------------------
|0 |MERGE GROUP BY| |250 |686 |
|1 | TABLE SCAN |t1 |1000 |411 |
=======================================
Outputs & filters:
-------------------------------------
0 - output([T_FUN_SUM(t1.a)]), filter(nil),
group([t1.a]), agg_func([T_FUN_SUM(t1.a)])
1 - output([t1.a]), filter(nil),
access([t1.a]), partitions(p0),
is_index_back=false,
range_key([t1.a]), range(MIN ; MAX)always true
1 row in set (0.01 sec)
而当 ORDER BY 表达式为 b 时,不能直接利用下层 TABLE SCAN 算子结果的序,所以优化器会在 MERGE GROUP BY 下分配 SORT 算子。
obclient> explain extended_noaddr select /*+ NO_USE_HASH_AGGREGATION */ sum(a) from t1 group by b\G
*************************** 1. row ***************************
Query Plan:
=======================================
|ID|OPERATOR |NAME|EST. ROWS|COST|
---------------------------------------
|0 |MERGE GROUP BY| |250 |2256|
|1 | SORT | |1000 |1981|
|2 | TABLE SCAN |t1 |1000 |433 |
=======================================
Outputs & filters:
-------------------------------------
0 - output([T_FUN_SUM(t1.a)]), filter(nil),
group([t1.b]), agg_func([T_FUN_SUM(t1.a)])
1 - output([t1.a], [t1.b]), filter(nil), sort_keys([t1.b, ASC])
2 - output([t1.a], [t1.b]), filter(nil),
access([t1.a], [t1.b]), partitions(p0),
is_index_back=false,
range_key([t1.a]), range(MIN ; MAX)always true
1 row in set (0.01 sec)
另外,MERGE GROUP BY 生成的结果是按照 GROUP BY 表达式有序的。
如下例所示,语句输出结果需要按照 a 排序,而因为 MERGE GROUP BY 的结果已经有序,所以优化器会把最上层的 ORDER BY 消除。
obclient> explain extended_noaddr select sum(a) from t1 group by a order by a\G
*************************** 1. row ***************************
Query Plan:
=======================================
|ID|OPERATOR |NAME|EST. ROWS|COST|
---------------------------------------
|0 |MERGE GROUP BY| |250 |686 |
|1 | TABLE SCAN |t1 |1000 |411 |
=======================================
Outputs & filters:
-------------------------------------
0 - output([T_FUN_SUM(t1.a)]), filter(nil),
group([t1.a]), agg_func([T_FUN_SUM(t1.a)])
1 - output([t1.a]), filter(nil),
access([t1.a]), partitions(p0),
is_index_back=false,
range_key([t1.a]), range(MIN ; MAX)always true
1 row in set (0.01 sec)
HASH GROUP BY
当语句包含 GROUP BY 语句时,优化器可能会生成 HASH GROUP BY 算子。HASH GROUP BY 内部实现使用了 HASH 算法,所以该算子对下层算子结果的序没有依赖,产生的结果是无序的。
obclient> explain extended_noaddr select sum(a) from t1 group by b order by b\G
*************************** 1. row ***************************
Query Plan:
=======================================
|ID|OPERATOR |NAME|EST. ROWS|COST|
---------------------------------------
|0 |SORT | |250 |1338|
|1 | HASH GROUP BY| |250 |993 |
|2 | TABLE SCAN |t1 |1000 |433 |
=======================================
Outputs & filters:
-------------------------------------
0 - output([T_FUN_SUM(t1.a)]), filter(nil), sort_keys([t1.b, ASC])
1 - output([T_FUN_SUM(t1.a)], [t1.b]), filter(nil),
group([t1.b]), agg_func([T_FUN_SUM(t1.a)])
2 - output([t1.a], [t1.b]), filter(nil),
access([t1.a], [t1.b]), partitions(p0),
is_index_back=false,
range_key([t1.a]), range(MIN ; MAX)always true
1 row in set (0.02 sec)
MERGE GROUP BY 与 HASH GROUP BY 的选择
当一个语句中有存在 GROUP BY 语句时,优化器可能会根据代价选择 MERGE/HASH GROUP BY。通常来说,HASH GROUP BY 的代价比 MERGE GROUP BY 要大,但是如果 MERGE GROUP BY 算子要求下层分配 SORT 算子时,聚合的整体代价应该视为MERGE GROUP BY 和下层 SORT 算子的代价之和,此时,HASH GROUP BY 的代价反而可能更低。通俗来讲也就是,如果 GROUP BY 算子下层有可以利用的序时通常会走 MERGE GROUP BY,否则优化器有可能会选择 HASH GROUP BY。
注意
用户可以通过 HINT(NO_USE_HASH_AGGREGATION和USE_HASH_AGGREGATION) 指定用哪种聚合算法。
DISTINCT 的算法
DISTINCT 内部算法的实现其实与 GROUP BY 类似,种类有 MERGE DISTINCT 与 HASH DISTINCT。优化器在选择DISTINCT 算法时的过程也与 GROUP BY 类似,同时也可以使用 HINT(NO_USE_HASH_AGGREGATION 和USE_HASH_AGGREGATION) 控制。
MERGE DISTINCT
MERGE DISTINCT 算子与 MERGE GROUP BY 类似,也要求下层有序,算子输出的结果也是有序的。
如下例所示, t1 主键为 a。
obclient> explain extended_noaddr select distinct a from t1 order by a\G
*************************** 1. row ***************************
Query Plan: ===================================
|ID|OPERATOR |NAME|EST. ROWS|COST|
-----------------------------------
|0 |TABLE SCAN|t1 |1000 |411 |
===================================
Outputs & filters:
-------------------------------------
0 - output([t1.a]), filter(nil),
access([t1.a]), partitions(p0),
is_index_back=false,
range_key([t1.a]), range(MIN ; MAX)always true
1 row in set (0.02 sec)
HASH DISTINCT
HASH DISTINCT 算子与 HASH GROUP BY 类似,不要求下层有序,输出的结果无序。
如下例所示,t1 主键为 a。
obclient> explain extended_noaddr select distinct b from t1 order by b\G
*************************** 1. row ***************************
Query Plan:
=======================================
|ID|OPERATOR |NAME|EST. ROWS|COST|
---------------------------------------
|0 |SORT | |250 |1237|
|1 | HASH DISTINCT| |250 |940 |
|2 | TABLE SCAN |t1 |1000 |433 |
=======================================
Outputs & filters:
-------------------------------------
0 - output([t1.b]), filter(nil), sort_keys([t1.b, ASC])
1 - output([t1.b]), filter(nil),
distinct([t1.b])
2 - output([t1.b]), filter(nil),
access([t1.b]), partitions(p0),
is_index_back=false,
range_key([t1.a]), range(MIN ; MAX)always true
1 row in set (0.01 sec)