GROUP BY

description

GROUP BY GROUPING SETSCUBEROLLUP 是对 GROUP BY 子句的扩展,它能够在一个 GROUP BY 子句中实现多个集合的分组的聚合。其结果等价于将多个相应 GROUP BY 子句进行 UNION 操作。

GROUP BY 子句是只含有一个元素的 GROUP BY GROUPING SETS 的特例。 例如,GROUPING SETS 语句:

  1. SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a, b), (a), (b), ( ) );

其查询结果等价于:

  1. SELECT a, b, SUM( c ) FROM tab1 GROUP BY a, b
  2. UNION
  3. SELECT a, null, SUM( c ) FROM tab1 GROUP BY a
  4. UNION
  5. SELECT null, b, SUM( c ) FROM tab1 GROUP BY b
  6. UNION
  7. SELECT null, null, SUM( c ) FROM tab1

GROUPING(expr) 指示一个列是否为聚合列,如果是聚合列为0,否则为1

GROUPING_ID(expr [ , expr [ , ... ] ]) 与GROUPING 类似, GROUPING_ID根据指定的column 顺序,计算出一个列列表的 bitmap 值,每一位为GROUPING的值. GROUPING_ID()函数返回位向量的十进制值。

Syntax

  1. SELECT ...
  2. FROM ...
  3. [ ... ]
  4. GROUP BY [
  5. , ... |
  6. GROUPING SETS [, ...] ( groupSet [ , groupSet [ , ... ] ] ) |
  7. ROLLUP(expr [ , expr [ , ... ] ]) |
  8. expr [ , expr [ , ... ] ] WITH ROLLUP |
  9. CUBE(expr [ , expr [ , ... ] ]) |
  10. expr [ , expr [ , ... ] ] WITH CUBE
  11. ]
  12. [ ... ]

Parameters

groupSet 表示 select list 中的列,别名或者表达式组成的集合 groupSet ::= { ( expr [ , expr [ , ... ] ] )}

expr 表示 select list 中的列,别名或者表达式

Note

doris 支持类似PostgreSQL 语法, 语法实例如下

  1. SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a, b), (a), (b), ( ) );
  2. SELECT a, b,c, SUM( d ) FROM tab1 GROUP BY ROLLUP(a,b,c)
  3. SELECT a, b,c, SUM( d ) FROM tab1 GROUP BY CUBE(a,b,c)

ROLLUP(a,b,c) 等价于如下GROUPING SETS 语句

  1. GROUPING SETS (
  2. (a,b,c),
  3. ( a, b ),
  4. ( a),
  5. ( )
  6. )

CUBE ( a, b, c ) 等价于如下GROUPING SETS 语句

  1. GROUPING SETS (
  2. ( a, b, c ),
  3. ( a, b ),
  4. ( a, c ),
  5. ( a ),
  6. ( b, c ),
  7. ( b ),
  8. ( c ),
  9. ( )
  10. )

example

下面是一个实际数据的例子

  1. > SELECT * FROM t;
  2. +------+------+------+
  3. | k1 | k2 | k3 |
  4. +------+------+------+
  5. | a | A | 1 |
  6. | a | A | 2 |
  7. | a | B | 1 |
  8. | a | B | 3 |
  9. | b | A | 1 |
  10. | b | A | 4 |
  11. | b | B | 1 |
  12. | b | B | 5 |
  13. +------+------+------+
  14. 8 rows in set (0.01 sec)
  15. > SELECT k1, k2, SUM(k3) FROM t GROUP BY GROUPING SETS ( (k1, k2), (k2), (k1), ( ) );
  16. +------+------+-----------+
  17. | k1 | k2 | sum(`k3`) |
  18. +------+------+-----------+
  19. | b | B | 6 |
  20. | a | B | 4 |
  21. | a | A | 3 |
  22. | b | A | 5 |
  23. | NULL | B | 10 |
  24. | NULL | A | 8 |
  25. | a | NULL | 7 |
  26. | b | NULL | 11 |
  27. | NULL | NULL | 18 |
  28. +------+------+-----------+
  29. 9 rows in set (0.06 sec)
  30. > SELECT k1, k2, GROUPING_ID(k1,k2), SUM(k3) FROM t GROUP BY GROUPING SETS ((k1, k2), (k1), (k2), ());
  31. +------+------+---------------+----------------+
  32. | k1 | k2 | grouping_id(k1,k2) | sum(`k3`) |
  33. +------+------+---------------+----------------+
  34. | a | A | 0 | 3 |
  35. | a | B | 0 | 4 |
  36. | a | NULL | 1 | 7 |
  37. | b | A | 0 | 5 |
  38. | b | B | 0 | 6 |
  39. | b | NULL | 1 | 11 |
  40. | NULL | A | 2 | 8 |
  41. | NULL | B | 2 | 10 |
  42. | NULL | NULL | 3 | 18 |
  43. +------+------+---------------+----------------+
  44. 9 rows in set (0.02 sec)

keyword

GROUP, GROUPING, GROUPING_ID, GROUPING_SETS, GROUPING SETS, CUBE, ROLLUP