Case: Rewriting SQL to Eliminate Subqueries (Case 1)

Symptom

  select
      1,
      (select count(*) from normal_date n where n.id = a.id) as GZCS
  from normal_date a;

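This SQL statement performs poorly. Its execution plan contains a SubPlan: the correlated subquery is re-executed for every row of the outer scan (loops=10000), and each execution scans normal_date in full. The plan is as follows: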

  QUERY PLAN
  ---------------------------------------------------------------------------------------------------------------------------------------
  Seq Scan on normal_date a (cost=0.00..888118.42 rows=5129 width=4) (actual time=2.394..22194.907 rows=10000 loops=1)
    SubPlan 1
      -> Aggregate (cost=173.12..173.12 rows=1 width=8) (actual time=22179.496..22179.942 rows=10000 loops=10000)
           -> Seq Scan on normal_date n (cost=0.00..173.11 rows=1 width=0) (actual time=11279.349..22159.608 rows=10000 loops=10000)
                Filter: (id = a.id)
                Rows Removed by Filter: 99990000
  Total runtime: 22196.415 ms
  (7 rows)
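
The "actual time" and "loops" fields in the plan above are only reported when the statement is really executed, so the plan was presumably collected with EXPLAIN ANALYZE. That is an assumption, since the source does not show the command. A minimal sketch for reproducing such a plan:

```sql
-- Assumption: the plan was gathered with EXPLAIN ANALYZE, which runs the
-- statement and reports actual times, row counts, and loops per plan node.
explain analyze
select
    1,
    (select count(*) from normal_date n where n.id = a.id) as GZCS
from normal_date a;
```

Because EXPLAIN ANALYZE executes the statement, it takes about as long as the query itself. In the output, a sequential scan under a SubPlan with loops equal to the outer row count indicates that the inner table is rescanned once per outer row.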

Optimization

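The key to this optimization is eliminating the subquery. Analysis of the business scenario shows that a.id is never NULL. This matters because, for a row whose a.id is NULL, the subquery would return a count of 0, whereas an inner join would drop the row altogether. With NULL values ruled out, the SQL semantics allow the statement to be rewritten equivalently as: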

  select
      count(*)
  from normal_date n, normal_date a
  where n.id = a.id
  group by a.id;

The plan is as follows:

  QUERY PLAN
  ----------------------------------------------------------------------------------------------------------------------------------
  HashAggregate (cost=480.86..532.15 rows=5129 width=12) (actual time=21.539..24.356 rows=10000 loops=1)
    Group By Key: a.id
    -> Hash Join (cost=224.40..455.22 rows=5129 width=4) (actual time=6.402..13.484 rows=10000 loops=1)
         Hash Cond: (n.id = a.id)
         -> Seq Scan on normal_date n (cost=0.00..160.29 rows=5129 width=4) (actual time=0.087..1.459 rows=10000 loops=1)
         -> Hash (cost=160.29..160.29 rows=5129 width=4) (actual time=6.065..6.065 rows=10000 loops=1)
              Buckets: 32768 Batches: 1 Memory Usage: 352kB
              -> Seq Scan on normal_date a (cost=0.00..160.29 rows=5129 width=4) (actual time=0.046..2.738 rows=10000 loops=1)
  Total runtime: 26.844 ms
  (9 rows)

Note:

To guarantee that the rewrite is equivalent, a NOT NULL constraint was added to normal_date.id.
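
As a rough illustration, the constraint could be declared with PostgreSQL-style syntax as sketched below. The exact ALTER TABLE form is an assumption and should be checked against your database's reference. Note also that the join rewrite returns one row per distinct id and omits the constant column, so it matches the original row for row only when id values are unique, as they are in this data (10000 rows scanned, 10000 groups returned). The second statement is a hypothetical alternative, not taken from the source, that keeps the original select list and stays correct when NULL or duplicate id values cannot be ruled out:

```sql
-- Assumed PostgreSQL-style syntax. The statement fails if the column
-- already contains NULL values.
alter table normal_date alter column id set not null;

-- Hypothetical alternative rewrite: aggregate once, then left join.
-- Rows of a with no match (including NULL ids) get a count of 0, and
-- duplicate ids in a each keep their own output row.
select
    1,
    coalesce(c.cnt, 0) as GZCS
from normal_date a
left join (
    select id, count(*) as cnt
    from normal_date
    group by id
) c on c.id = a.id;
```

Whether the optimizer turns this form into a hash join as well depends on the database version and statistics, so its plan should be compared with the ones shown above before adopting it.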