异步物化视图

物化视图的构建和维护

创建物化视图

准备两张表和数据

  1. use tpch;
  2. CREATE TABLE IF NOT EXISTS orders (
  3. o_orderkey integer not null,
  4. o_custkey integer not null,
  5. o_orderstatus char(1) not null,
  6. o_totalprice decimalv3(15,2) not null,
  7. o_orderdate date not null,
  8. o_orderpriority char(15) not null,
  9. o_clerk char(15) not null,
  10. o_shippriority integer not null,
  11. o_comment varchar(79) not null
  12. )
  13. DUPLICATE KEY(o_orderkey, o_custkey)
  14. PARTITION BY RANGE(o_orderdate)(
  15. FROM ('2023-10-17') TO ('2023-10-20') INTERVAL 1 DAY)
  16. DISTRIBUTED BY HASH(o_orderkey) BUCKETS 3
  17. PROPERTIES ("replication_num" = "1");
  18. insert into orders values
  19. (1, 1, 'ok', 99.5, '2023-10-17', 'a', 'b', 1, 'yy'),
  20. (2, 2, 'ok', 109.2, '2023-10-18', 'c','d',2, 'mm'),
  21. (3, 3, 'ok', 99.5, '2023-10-19', 'a', 'b', 1, 'yy');
  22. CREATE TABLE IF NOT EXISTS lineitem (
  23. l_orderkey integer not null,
  24. l_partkey integer not null,
  25. l_suppkey integer not null,
  26. l_linenumber integer not null,
  27. l_quantity decimalv3(15,2) not null,
  28. l_extendedprice decimalv3(15,2) not null,
  29. l_discount decimalv3(15,2) not null,
  30. l_tax decimalv3(15,2) not null,
  31. l_returnflag char(1) not null,
  32. l_linestatus char(1) not null,
  33. l_shipdate date not null,
  34. l_commitdate date not null,
  35. l_receiptdate date not null,
  36. l_shipinstruct char(25) not null,
  37. l_shipmode char(10) not null,
  38. l_comment varchar(44) not null
  39. )
  40. DUPLICATE KEY(l_orderkey, l_partkey, l_suppkey, l_linenumber)
  41. PARTITION BY RANGE(l_shipdate)
  42. (FROM ('2023-10-17') TO ('2023-10-20') INTERVAL 1 DAY)
  43. DISTRIBUTED BY HASH(l_orderkey) BUCKETS 3
  44. PROPERTIES ("replication_num" = "1");
  45. insert into lineitem values
  46. (1, 2, 3, 4, 5.5, 6.5, 7.5, 8.5, 'o', 'k', '2023-10-17', '2023-10-17', '2023-10-17', 'a', 'b', 'yyyyyyyyy'),
  47. (2, 2, 3, 4, 5.5, 6.5, 7.5, 8.5, 'o', 'k', '2023-10-18', '2023-10-18', '2023-10-18', 'a', 'b', 'yyyyyyyyy'),
  48. (3, 2, 3, 6, 7.5, 8.5, 9.5, 10.5, 'k', 'o', '2023-10-19', '2023-10-19', '2023-10-19', 'c', 'd', 'xxxxxxxxx');

创建物化视图

  1. CREATE MATERIALIZED VIEW mv1
  2. BUILD DEFERRED REFRESH AUTO ON MANUAL
  3. partition by(l_shipdate)
  4. DISTRIBUTED BY RANDOM BUCKETS 2
  5. PROPERTIES ('replication_num' = '1')
  6. AS
  7. select l_shipdate, o_orderdate, l_partkey, l_suppkey, sum(o_totalprice) as sum_total
  8. from lineitem
  9. left join orders on lineitem.l_orderkey = orders.o_orderkey and l_shipdate = o_orderdate
  10. group by
  11. l_shipdate,
  12. o_orderdate,
  13. l_partkey,
  14. l_suppkey;

具体的语法可查看CREATE MATERIALIZED VIEW

查看物化视图元信息

  1. select * from mv_infos("database"="tpch") where Name="mv1";

物化视图独有的特性可以通过mv_infos()查看

和table相关的属性,仍通过SHOW TABLES来查看

刷新物化视图

物化视图支持不同刷新策略,如定时刷新和手动刷新。也支持不同的刷新粒度,如全量刷新,分区粒度的增量刷新等。这里我们以手动刷新物化视图的部分分区为例。

首先查看物化视图分区列表

  1. SHOW PARTITIONS FROM mv1;

刷新名字为p_20231017_20231018的分区

  1. REFRESH MATERIALIZED VIEW mv1 partitions(p_20231017_20231018);

具体的语法可查看REFRESH MATERIALIZED VIEW

任务管理

每个物化视图都会默认有一个job负责刷新数据,job用来描述物化视图的刷新策略等信息,每次触发刷新,都会产生一个task, task用来描述具体的一次刷新信息,例如刷新用的时间,刷新了哪些分区等

查看物化视图的job

  1. select * from jobs("type"="mv") order by CreateTime;

具体的语法可查看jobs(“type”=”mv”)

暂停物化视图job定时调度

  1. PAUSE MATERIALIZED VIEW JOB ON mv1;

可以暂停物化视图的定时调度

具体的语法可查看PAUSE MATERIALIZED VIEW JOB

恢复物化视图job定时调度

  1. RESUME MATERIALIZED VIEW JOB ON mv1;

可以恢复物化视图的定时调度

具体的语法可查看RESUME MATERIALIZED VIEW JOB

查看物化视图的task

  1. select * from tasks("type"="mv");

具体的语法可查看tasks(“type”=”mv”)

取消物化视图的task

  1. CANCEL MATERIALIZED VIEW TASK realTaskId on mv1;

可以取消本次task的运行

具体的语法可查看CANCEL MATERIALIZED VIEW TASK

修改物化视图

修改物化视图的属性

  1. ALTER MATERIALIZED VIEW mv1 set("grace_period"="3333");

修改物化视图的名字,物化视图的刷新方式及物化视图特有的property可通过ALTER MATERIALIZED VIEW来修改

物化视图本身也是一个 Table,所以 Table 相关的属性,例如副本数,仍通过ALTER TABLE相关的语法来修改。

删除物化视图

  1. DROP MATERIALIZED VIEW mv1;

物化视图有专门的删除语法,不能通过drop table来删除,

具体的语法可查看DROP MATERIALIZED VIEW

物化视图的使用

请参阅 查询异步物化视图

注意事项

  • 异步物化视图仅支持在Nereids优化器使用,Nereids优化器
  • 当前判断物化视图和基表是否同步仅支持OlapTable。对于其它外表,会直接认为是同步的。例如,物化视图的基表全是外表。在查询mv_infos()时,SyncWithBaseTables会永远为1(true)。在刷新物化视图时需要手动刷新指定的分区或指定complete刷新全部分区