背景

很多人小时候都有一个武侠梦,独孤求败更是金庸武侠小说里的一位传奇人物。

纵横江湖三十馀载,杀尽仇寇奸人,败尽英雄豪杰,天下更无抗手,无可奈何,惟隐居深谷,以雕为友。 呜呼,生平求一敌手而不可得,诚寂寥难堪也。

独孤老前辈的佩剑描写非常有意思,从使用的佩剑,可以看出一个人的武功修为。

第一柄是一柄青光闪闪的无名利剑。「凌厉刚猛,无坚不摧,弱冠前以之与河朔群雄争锋。」

第二柄是紫薇软剑,「三十岁前所用,误伤义士不祥,乃弃之深谷。」

第三柄是玄铁重剑,「重剑无锋,大巧不工,四十岁之前恃之横行天下。」

第四柄是柄已腐朽的木剑,原因是独孤求败「四十岁后,不滞于物,草木竹石均可为剑。」

IT行业也很相似,开发语言有C, python, java, ….各式各样的可选,数据库有Oracle, PostgreSQL, MySQL等等,也是各式各样的产品可选。

宝剑赠英雄,美玉配佳人。

不管你是李逍遥、还是杨过、小龙女,希望搞IT的你也能找到合适你的武器。

pic

pic

进入正题。

多列的组合查询在实际的应用中较为常见,比如淘宝购物网站的商品搜索

pic

有发货地、分类、是否包邮、是否货到付款、是否天猫、二手、等等许多选项,这些选项在设计时可能使用多个字段来表示(当然,有些可能会使用BIT合并成单个字段来表述)。

举个非常简单的例子

  1. CREATE TABLE test2 (
  2. major int,
  3. minor int,
  4. name varchar
  5. );

查询条件中,包含test2表的两个字段的检索

  1. SELECT name FROM test2 WHERE major = constant AND minor = constant;

这种情况下,我们可以使用两个索引,也可以使用一个复合索引(多列索引)。

  1. CREATE INDEX test2_mm_idx ON test2 (major, minor);

以上例子可以转化为对多个字段的任意组合查询需求(任意单一、任意两个、任意三个,任意若干个字段的查询需求)。

看过我写的文档的童鞋,可能会联想到我在之前写过一篇文档

《PostgreSQL 9.6 黑科技 bloom 算法索引,一个索引支撑任意列组合查询》

不过本文并不是要讲bloom,本文要讲一讲另一种技术(gin索引的暗藏功能,多列展开式B树)。

在开始正文之前,大家有没有想过这些问题呢?

1. 哪些索引方法支持多列索引?

PostgreSQL目前支持btree, hash, gin, gist, sp-gist, brin, bloom, rum等众多索引访问方法,哪些访问方法支持多列索引呢?

2. 多列索引的内部存储结构如何?

比如b-tree单列索引很好理解,就是以被索引列值为KEY的类B-Tree(nbtree)结构。但是当使用多列索引时,内部是如何组织的呢?

不同的索引方法,内部组织有什么差异呢?

3. 多列索引支持哪些查询组合

比如index on (a,b,c)三列,那么哪些查询条件能用上多列索引呢?比如where a=? and b>?

不同的索引方法,适用的查询条件是不是都一样呢?

4. 不同的查询组合,使用多列索引的效率如何,效率是否一样(是否与索引访问方法有关?)

比如b-tree index on (a,b,c)三列,where a=? and b>? 以及 where b>? and c=? 效率一样吗?

5. 多列索引,每个列的顺序是否可以指定

比如,是不是所有的索引方法都可创建这样的索引 index on (a,b desc, c desc nulls first)

6. 同样的查询组合,使用什么索引方法更高效

比如 where a=? and b=? and c=? 这样的查询,适用gin好还是b-tree好呢?

7. 如何选择合适的索引方法,与查询条件,数据分布有关系吗?

要回答这些问题,需要对索引方法,内部存储结构有一定的了解。

本文将以gin和btree为例,讲解一下multi column index,它们的内部存储结构,适应的场景。

一 单列索引的内部结构

建议读者先了解一下单列索引的内部结构,本文就不展开了,可以参考我之前写的一些文章。

《深入浅出PostgreSQL B-Tree索引结构》

《B-Tree和B+Tree》

《PostgreSQL GIN索引实现原理》

《从难缠的模糊查询聊开 - PostgreSQL独门绝招之一 GIN , GiST , SP-GiST , RUM 索引原理与技术背景》

二 PostgreSQL use bitmap combine single column indexs

在没有multi column index时,如果我们有多个列的查询条件,通常可以使用选择性好的列,或者多个列索引的组合。

PostgreSQL 使用多个列索引组合查询时,可以使用多列查询结果的ctid bitmap and or ,筛选出最终符合多列条件的ctid。

不仅适用于多列的查询条件,也适用于单列的多个查询条件。

例如

  1. WHERE x = 42 OR x = 47 OR x = 53 OR x = 99
  2. WHERE x = 5 AND y = 6

https://www.postgresql.org/docs/9.6/static/indexes-bitmap-scans.html

Combining Multiple Indexes

  1. A single index scan can only use query clauses that use the index's columns with operators of its operator class and are joined with AND.
  2. For example, given an index on (a, b) a query condition like WHERE a = 5 AND b = 6 could use the index, but a query like WHERE a = 5 OR b = 6 could not directly use the index.
  3. Fortunately, PostgreSQL has the ability to combine multiple indexes (including multiple uses of the same index) to handle cases that cannot be implemented by single index scans.
  4. The system can form AND and OR conditions across several index scans.
  5. For example, a query like WHERE x = 42 OR x = 47 OR x = 53 OR x = 99 could be broken down into four separate scans of an index on x, each scan using one of the query clauses.
  6. The results of these scans are then ORed together to produce the result.
  7. Another example is that if we have separate indexes on x and y,
  8. one possible implementation of a query like WHERE x = 5 AND y = 6 is to use each index with the appropriate query clause and then AND together the index results to identify the result rows.
  9. 支持单列组合条件,也支持多列组合条件。
  10. To combine multiple indexes, the system scans each needed index and prepares a bitmap in memory giving the locations of table rows that are reported as matching that index's conditions.
  11. The bitmaps are then ANDed and ORed together as needed by the query. Finally, the actual table rows are visited and returned.
  12. 注意bitmap扫描是按CTID顺序输出,而非KEY的顺序输出,所以如果对列有ORDER BY的需求,那么会需要额外的sort,优化器会根据实际情况选择合适的执行计划(bitmap 单个索引filter但是不用sort
  13. The table rows are visited in physical order, because that is how the bitmap is laid out;
  14. this means that any ordering of the original indexes is lost, and so a separate sort step will be needed if the query has an ORDER BY clause.
  15. For this reason, and because each additional index scan adds extra time, the planner will sometimes choose to use a simple index scan even though additional indexes are available that could have been used as well.
  16. In all but the simplest applications, there are various combinations of indexes that might be useful, and the database developer must make trade-offs to decide which indexes to provide.
  17. Sometimes multicolumn indexes are best, but sometimes it's better to create separate indexes and rely on the index-combination feature.
  18. For example, if your workload includes a mix of queries that sometimes involve only column x, sometimes only column y, and sometimes both columns,
  19. you might choose to create two separate indexes on x and y, relying on index combination to process the queries that use both columns.
  20. You could also create a multicolumn index on (x, y).
  21. This index would typically be more efficient than index combination for queries involving both columns,
  22. but as discussed in Section 11.3, it would be almost useless for queries involving only y, so it should not be the only index.
  23. A combination of the multicolumn index and a separate index on y would serve reasonably well. For queries involving only x, the multicolumn index could be used,
  24. though it would be larger and hence slower than an index on x alone. The last alternative is to create all three indexes,
  25. but this is probably only reasonable if the table is searched much more often than it is updated and all three types of query are common.
  26. If one of the types of query is much less common than the others, you'd probably settle for creating just the two indexes that best match the common types.
  27. 在选择单列还是多列索引时,请根据查询需求选择。如果查询很多,更新插入删除很少,那么如果查询条件有a, b, a and b这类的,可以建立3个索引,分别是(a), (b), (a,b)。

三 哪些索引方法支持multi column index

Currently, only the B-tree, GiST, GIN, and BRIN index types support multicolumn indexes.

Up to 32 columns can be specified. (This limit can be altered when building PostgreSQL; see the file pg_config_manual.h.)

目前PostgreSQL的B-tree, GiST, GIN, and BRIN索引方法,支持多列索引。

目前支持最多32个列的多列索引,实际上可以更大(通过调整pg_config_manual.h可以做到更大,但是还有另一个限制,indextuple不能超过约1/4的数据块大小,也就是说复合索引列很多的情况下,可能会触发这个限制)。

四 multi column index的查询组合与效率解读

Multicolumn Indexes

https://www.postgresql.org/docs/current/static/indexes-multicolumn.html

由于b-tree, gin , gist, brin都支持multi column索引,但是这几种索引的内部存储方式不一样,所以不同的组合查询的效率也不一样。

例如a,b,c三列的组合索引,select * from tbl where a=? and b>? 以及 where b=?,这两种查询组合,哪个效率高?和索引方法有大大的关系。

btree 查询组合与效率

b-tree多列索引支持任意列的组合查询

A multicolumn B-tree index can be used with query conditions that involve any subset of the index’s columns, but the index is most efficient when there are constraints on the leading (leftmost) columns.
虽然b-tree多列索引支持任意列的组合查询,但是最有效的查询还是包含驱动列条件的查询。

The exact rule is that equality constraints on leading columns, plus any inequality constraints on the first column that does not have an equality constraint, will be used to limit the portion of the index that is scanned.
对于b-tree的多列索引来说,一个查询要扫描索引的哪些部分呢?

从驱动列开始算,按索引列的顺序算到非驱动列的第一个不相等条件为止(没有任何条件也算)。

Constraints on columns to the right of these columns are checked in the index, so they save visits to the table proper, but they do not reduce the portion of the index that has to be scanned.

For example, given an index on (a, b, c) and a query condition WHERE a = 5 AND b >= 42 AND c < 77, the index would have to be scanned from the first entry with a = 5 and b = 42 up through the last entry with a = 5.

(WHERE a = 5 AND b >= 42 AND c < 77),从a=5, b=42开始的所有索引条目,都会被扫描。

Index entries with c >= 77 would be skipped, but they’d still have to be scanned through.

其他例子

(WHERE b >= 42 AND c < 77),所有索引条目,都会被扫描。只要不包含驱动列,则扫描所有索引条目。

(WHERE a = 5 AND c < 77),a=5的所有索引条目,都会被扫描。

(WHERE a >= 5 AND b=1 and c < 77),从a=5开始的所有索引条目,都会被扫描。

This index could in principle be used for queries that have constraints on b and/or c with no constraint on a — but the entire index would have to be scanned, so in most cases the planner would prefer a sequential table scan over using the index.

建议有频繁的复合查询,并且复合查询带有驱动列以及其他列的查询时,可以考虑使用多列索引。

gist 查询组合与效率

gist多列索引支持任意列的组合查询。

A multicolumn GiST index can be used with query conditions that involve any subset of the index’s columns.

Conditions on additional columns restrict the entries returned by the index, but the condition on the first column is the most important one for determining how much of the index needs to be scanned.

注意与b-tree不一样的地方,驱动列的选择性决定了需要扫描多少索引条目,扫描多少条目与非驱动列无关(而b-tree是与非驱动列也有关的)。

A GiST index will be relatively ineffective if its first column has only a few distinct values, even if there are many distinct values in additional columns.

如果驱动列的选择性不好、其他列的选择性很好,即使查询条件同时包含了 驱动列以及其他列 ,也需要扫描很多索引条目,因为扫描多少索引条目和其他列无关。

这么说,并不建议使用gist多列索引。

如果一定要使用GIST多列索引,请一定要把选择性好的列作为驱动列。

gin 查询组合与效率

gin多列索引支持任意列的组合查询。

并且任意查询条件的查询效率都是一样的。

A multicolumn GIN index can be used with query conditions that involve any subset of the index’s columns.

Unlike B-tree or GiST, index search effectiveness is the same regardless of which index column(s) the query conditions use.

brin 查询组合与效率

brin多列索引支持任意列的组合查询。

并且任意查询条件的查询效率都是一样的。

如果有brin组合查询的必要(比如多个与ctid线性相关的列的范围查询,无所谓线性的方向),任何时候都建议使用BRIN的multi column index,除非想针对不同的列使用不同的pages_per_range(比如有些列10个块的范围和另外一些列100个块的范围覆盖差不多,那么建议它们使用不同的pages_per_range)

A multicolumn BRIN index can be used with query conditions that involve any subset of the index’s columns.

Like GIN and unlike B-tree or GiST, index search effectiveness is the same regardless of which index column(s) the query conditions use.

The only reason to have multiple BRIN indexes instead of one multicolumn BRIN index on a single table is to have a different pages_per_range storage parameter.

五 multi column使用建议

多列索引每个列的operator class必须和实际查询匹配,在创建索引时可以指定。

Of course, each column must be used with operators appropriate to the index type; clauses that involve other operators will not be considered.

Multicolumn indexes should be used sparingly.

In most situations, an index on a single column is sufficient and saves space and time.

Indexes with more than three columns are unlikely to be helpful unless the usage of the table is extremely stylized.

六 多列索引,在不同组合下的查询效率差异,原理剖析 - 从索引内部结构说起

前面分析了b-tree, gin都支持任意组合查询。

但是b-tree推荐使用包含驱动列的查询条件,如果查询条件未包含驱动列,则需要扫描整个复合索引。

而gin则通吃,可以输入任意组合列作为查询条件,并且效率一致。

例如

index on (a,b,c)

b-tree 对于包含驱动列a查询条件的SQL,效率可能比较好,不包括a查询条件的SQL,即使走索引,也要扫描整个索引的所有条目。

而gin 则无论任何查询条件,效果都一样。

这是为什么呢?必须从索引的内部存储组织结构来分析。

b-tree multi column index 剖析

btree 对被索引列按创建索引时指定的顺序排序,然后建立B树。

如create index idx on tbl using btree (a,b desc nulls first,c desc, e);

所以B树中的KEY实际上就是被索引列的组合对象,这个结构决定了什么查询能用上这个复合索引。

  1. (a,b,c), row?
  2. (a,b,c), row?
  3. (a,b,c), row?
  4. ....

要达到最高效的使用这种复合索引,必须带上驱动列的条件。

如果order by要用上索引,那么必须order by的写法要与创建索引时指定的顺序一致。

例如select * from tbl where a=? order by a,b desc nulls first;

gin multi column index 剖析

gin 的复合索引很有趣,它将所有列展开,然后将展开后的数据(列ID+值)排序并建立B树。

因此在gin的复合索引中,B树的KEY实际上是列ID+值。

  1. (column_a, v1), row?
  2. (column_b, v1), row?
  3. (column_b, v2), row?
  4. (column_c, v2), row?
  5. ....

这样的树,以任意组合进行查询,效率都一样。

where a=? 与 where b=? 效率一样,而且和B-tree的单列索引的效率几乎一致(当索引层级一致时)。

where a=? and b=? 与 where b=? and c=? 效率一样(复合索引查两次,在内部使用bitmapAnd取得结果)。

仅仅当多列组合查询时,gin效率可能比不上b-tree的带驱动列的查询(因为b-tree索引不需要bitmapAnd,而gin需要内部bitmapAnd)。

七 例子

创建一个测试表,包含3个字段

  1. postgres=# create table t3(c1 int, c2 text, c3 int);
  2. CREATE TABLE

插入100万记录,其中c2,c3的值固定

  1. postgres=# postgres=# insert into t3 select generate_series(1,1000000),'test',1;
  2. INSERT 0 1000000

创建gin复合索引

  1. postgres=# create index idx_t3_1 on t3 using gin(c1,c2,c3);
  2. CREATE INDEX

查询c1=1,效率与单列索引一致

这个查询结果也可以说明另一个问题,不同列并不是单纯展开后直接构建B树,它依旧添加了列ID进来,所以即使c3=1有100万记录,并不影响c1=1的扫描PAGE数。

  1. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t3 where c1=1;
  2. QUERY PLAN
  3. -----------------------------------------------------------------------------------------------------------------
  4. Bitmap Heap Scan on public.t3 (cost=5.01..6.02 rows=1 width=13) (actual time=0.021..0.021 rows=1 loops=1)
  5. Output: c1, c2, c3
  6. Recheck Cond: (t3.c1 = 1)
  7. Heap Blocks: exact=1
  8. Buffers: shared hit=5
  9. -> Bitmap Index Scan on idx_t3_1 (cost=0.00..5.01 rows=1 width=0) (actual time=0.016..0.016 rows=1 loops=1)
  10. Index Cond: (t3.c1 = 1)
  11. Buffers: shared hit=4
  12. Planning time: 0.076 ms
  13. Execution time: 0.047 ms
  14. (10 rows)

查询c2=?,效率与单列索引一致

  1. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t3 where c2='test';
  2. QUERY PLAN
  3. ----------------------------------------------------------------------------------------------------------------------------------
  4. Bitmap Heap Scan on public.t3 (cost=8121.00..26027.00 rows=1000000 width=13) (actual time=74.467..179.603 rows=1000000 loops=1)
  5. Output: c1, c2, c3
  6. Recheck Cond: (t3.c2 = 'test'::text)
  7. Heap Blocks: exact=5406
  8. Buffers: shared hit=5542
  9. -> Bitmap Index Scan on idx_t3_1 (cost=0.00..7871.00 rows=1000000 width=0) (actual time=73.640..73.640 rows=1000000 loops=1)
  10. Index Cond: (t3.c2 = 'test'::text)
  11. Buffers: shared hit=136
  12. Planning time: 0.130 ms
  13. Execution time: 230.770 ms
  14. (10 rows)
  15. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t3 where c2='t';
  16. QUERY PLAN
  17. -----------------------------------------------------------------------------------------------------------------
  18. Bitmap Heap Scan on public.t3 (cost=5.00..6.01 rows=1 width=13) (actual time=0.014..0.014 rows=0 loops=1)
  19. Output: c1, c2, c3
  20. Recheck Cond: (t3.c2 = 't'::text)
  21. Buffers: shared hit=4
  22. -> Bitmap Index Scan on idx_t3_1 (cost=0.00..5.00 rows=1 width=0) (actual time=0.013..0.013 rows=0 loops=1)
  23. Index Cond: (t3.c2 = 't'::text)
  24. Buffers: shared hit=4
  25. Planning time: 0.081 ms
  26. Execution time: 0.039 ms
  27. (9 rows)

查询c3=?,效率与单列索引一致

  1. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t3 where c3=1;
  2. QUERY PLAN
  3. ----------------------------------------------------------------------------------------------------------------------------------
  4. Bitmap Heap Scan on public.t3 (cost=8121.00..26027.00 rows=1000000 width=13) (actual time=77.949..182.939 rows=1000000 loops=1)
  5. Output: c1, c2, c3
  6. Recheck Cond: (t3.c3 = 1)
  7. Heap Blocks: exact=5406
  8. Buffers: shared hit=5542
  9. -> Bitmap Index Scan on idx_t3_1 (cost=0.00..7871.00 rows=1000000 width=0) (actual time=77.116..77.116 rows=1000000 loops=1)
  10. Index Cond: (t3.c3 = 1)
  11. Buffers: shared hit=136
  12. Planning time: 0.083 ms
  13. Execution time: 234.558 ms
  14. (10 rows)
  15. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t3 where c3=2;
  16. QUERY PLAN
  17. -----------------------------------------------------------------------------------------------------------------
  18. Bitmap Heap Scan on public.t3 (cost=5.00..6.01 rows=1 width=13) (actual time=0.015..0.015 rows=0 loops=1)
  19. Output: c1, c2, c3
  20. Recheck Cond: (t3.c3 = 2)
  21. Buffers: shared hit=4
  22. -> Bitmap Index Scan on idx_t3_1 (cost=0.00..5.00 rows=1 width=0) (actual time=0.014..0.014 rows=0 loops=1)
  23. Index Cond: (t3.c3 = 2)
  24. Buffers: shared hit=4
  25. Planning time: 0.081 ms
  26. Execution time: 0.040 ms
  27. (9 rows)

gin任意组合(不需要限定驱动列)多列查询的隐含bitmapAnd, bitmapOr操作

  1. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t3 where c1=2 and c3=1;
  2. QUERY PLAN
  3. -----------------------------------------------------------------------------------------------------------------
  4. Bitmap Heap Scan on public.t3 (cost=9.01..10.03 rows=1 width=13) (actual time=0.044..0.044 rows=1 loops=1)
  5. Output: c1, c2, c3
  6. Recheck Cond: ((t3.c1 = 2) AND (t3.c3 = 1))
  7. Heap Blocks: exact=1
  8. Buffers: shared hit=10
  9. -> Bitmap Index Scan on idx_t3_1 (cost=0.00..9.01 rows=1 width=0) (actual time=0.040..0.040 rows=1 loops=1)
  10. Index Cond: ((t3.c1 = 2) AND (t3.c3 = 1))
  11. Buffers: shared hit=9
  12. Planning time: 0.061 ms
  13. Execution time: 0.063 ms
  14. (10 rows)

没有驱动列,一样高效无比

  1. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t3 where c2='test' and c3=2;
  2. QUERY PLAN
  3. -----------------------------------------------------------------------------------------------------------------
  4. Bitmap Heap Scan on public.t3 (cost=9.00..10.02 rows=1 width=13) (actual time=0.052..0.052 rows=0 loops=1)
  5. Output: c1, c2, c3
  6. Recheck Cond: ((t3.c2 = 'test'::text) AND (t3.c3 = 2))
  7. Buffers: shared hit=9
  8. -> Bitmap Index Scan on idx_t3_1 (cost=0.00..9.00 rows=1 width=0) (actual time=0.051..0.051 rows=0 loops=1)
  9. Index Cond: ((t3.c2 = 'test'::text) AND (t3.c3 = 2))
  10. Buffers: shared hit=9
  11. Planning time: 0.086 ms
  12. Execution time: 0.075 ms
  13. (9 rows)

gin复合索引的展开式B树决定了不能按单列设置顺序

  1. postgres=# create index idx_t1_0 on t1 using gin (c1, c2 desc);
  2. ERROR: 0A000: access method "gin" does not support ASC/DESC options
  3. LOCATION: ComputeIndexAttrs, indexcmds.c:1248

八 btree vs gin 多列索引

1. 由于btree index, 多列值根据创建索引的DDL指定顺序sort后,多列的值组合后作为一个KEY存储在B树中。

例如4条记录如下

  1. 1,1,2;
  2. 1,100,2;
  3. 2,1,10;
  4. 1,1,3;

btree 中的key排序后分布(有多少条记录,就有多少KEY)

  1. 1,1,2; 1,1,3; 1,100,2; 2,1,10;

2. GIN MULTI COLUMN INDEX 构建了一个包含多种数据类型的B-TREE , 将多列的数据展开后,排序后分布 (key的数量为每列的count distinct总和)

  1. column1,1; column2,1; column1,2; column3,2; column3,3; column3,10; column2,100;

更形象的比喻

比如有三幅扑克牌(每幅54张牌),每一幅代表一列,如果要创建3列的复合索引,那么B-TREE会创建出54个条目的B树,而GIN会创建出包含162个条目的B树。

请看这个例子,可以说明这个情况

  1. postgres=# create table t2(c1 int2, c2 int4, c3 int8, c4 numeric, c5 text, c6 timestamp);
  2. CREATE TABLE
  3. postgres=# insert into t2 select c1,c1,c1,c1,c5,c6 from (select trunc(random()*1000) c1, md5(random()::text) c5, now()+(random()*10000||' sec')::interval c6 from generate_series(1,100000)) t;
  4. INSERT 0 100000
  5. postgres=# create index idx_t2_1 on t2 using gin (c1,c2,c3,c4,c5,c6);
  6. CREATE INDEX
  7. postgres=# select count(distinct c4) from t2;
  8. count
  9. -------
  10. 1000
  11. (1 row)
  12. postgres=# select count(distinct c1) from t2;
  13. count
  14. -------
  15. 1000
  16. (1 row)
  17. postgres=# select count(distinct c2) from t2;
  18. count
  19. -------
  20. 1000
  21. (1 row)
  22. postgres=# select count(distinct c3) from t2;
  23. count
  24. -------
  25. 1000
  26. (1 row)
  27. postgres=# select count(distinct c5) from t2;
  28. count
  29. -------
  30. 99996
  31. (1 row)
  32. postgres=# select count(distinct c6) from t2;
  33. count
  34. --------
  35. 100000
  36. (1 row)
  37. postgres=# select 99996+100000+4000;
  38. ?column?
  39. ----------
  40. 203996
  41. (1 row)
  42. postgres=# select * from gin_metapage_info(get_raw_page('idx_t2_1',0));
  43. pending_head | pending_tail | tail_free_size | n_pending_pages | n_pending_tuples | n_total_pages | n_entry_pages | n_data_pages | n_entries | version
  44. --------------+--------------+----------------+-----------------+------------------+---------------+---------------+--------------+-----------+---------
  45. 4294967295 | 4294967295 | 0 | 0 | 0 | 2611 | 2610 | 0 | 203996 | 2
  46. (1 row)
  47. n_entries = count(distinct indexed column1) + ....

b-tree 要高效的使用复合索引,必须带驱动列的查询条件。

GIN 使用复合索引,可以任意组合查询条件,当有多列查询条件时,使用隐含的bitmapAnd or bitmapOr。

什么情况下,gin比btree慢?

通常,多列查询时,如果使用了驱动列,那么B-TREE索引会更快一些,因为它不需要使用bitmapAnd or bitmapOr,而GIN需要。

其他情况,gin都比btree的复合查询快。

九 gin 多列索引的独门秘籍

通过前面的分析,我们已经摸清楚了GIN的复合索引结构(展开式B树),并且也知道GIN是key+ctid list的类b+tree树组织形式,它可以有很高的压缩比,也可以高效的查询单KEY。

那么GIN具备这些特性后,有哪些独门秘籍呢?

1. 任意组合查询,都可以达到很高效,你不需要创建多个b-tree索引了,一个GIN搞定它(不必担心GIN的IO,有fastupdate技术支撑,并且读写(entry合并)不堵塞)。

比如淘宝的搜索页面,用户可能根据任意属性,勾选任意条件进行查询。

创建一张测试表, 6个字段

  1. postgres=# create table taobao(c1 int, c2 int, c3 timestamp, c4 text, c5 numeric, c6 text);
  2. CREATE TABLE

插入1000万随机记录

  1. postgres=# insert into taobao select random()*2000, random()*3000, now()+((50000-100000*random())||' sec')::interval , md5(random()::text), round((random()*1000000)::numeric,2), md5(random()::text) from generate_series(1,10000000);
  2. INSERT 0 10000000

创建GIN多列索引

  1. postgres=# create index idx_taobao_gin on taobao using gin(c1,c2,c3,c4,c5,c6);

数据样例

  1. postgres=# select * from taobao limit 10;
  2. c1 | c2 | c3 | c4 | c5 | c6
  3. ------+------+----------------------------+----------------------------------+-----------+----------------------------------
  4. 1405 | 882 | 2017-02-06 09:41:24.985878 | 49982683517aab7d194f3affe74ba827 | 65157.79 | 256ad6d098a6536e3548b5af91a26557
  5. 1277 | 1269 | 2017-02-06 09:52:16.313212 | b40c1febdb7f62c916d3632a03092261 | 940751.10 | 9911b203e38b57c55a769312c2aaaeba
  6. 870 | 159 | 2017-02-06 05:59:46.853421 | 96a0f84d9f9381d77364d93ca2d7aa6f | 419618.52 | 1e716d90055d3b32027a5e80a19e2f4f
  7. 1990 | 1100 | 2017-02-07 00:35:26.849744 | 684b66b25eb57d97f604eb9d92dfc8b0 | 764940.62 | eea82a253995a70da23a9f9ba3015175
  8. 625 | 1076 | 2017-02-06 01:48:13.929789 | 7fe094f548cffa367ebeb28d4f188875 | 482201.22 | 81ba27a8123dbd3e3a741c5984368709
  9. 968 | 554 | 2017-02-06 06:05:29.971936 | 6c8b34e4eb7c5eca3a8ad6131c5aecd3 | 583617.05 | b5cfc01c845cc87a4eb8a425ac9b2e01
  10. 1795 | 667 | 2017-02-06 20:18:46.027376 | b4ef2282064ef3e7a4eb3c624ba42334 | 911819.49 | a794b4635bb314972d891c7218063642
  11. 222 | 1041 | 2017-02-06 20:29:15.686187 | f3a6bc2b683e272c293d09353fb2465b | 722814.10 | 22b3b06a69b134dee9e3236907637683
  12. 1596 | 2153 | 2017-02-05 22:38:54.795333 | 1455f7f7f198d09e380e680321a31968 | 672431.00 | efac7b0a2154e25321a522b98903df41
  13. 313 | 2955 | 2017-02-06 22:59:49.117719 | 7840bbd2be443904b094b9f8919730c0 | 525295.10 | b5feb900191fd404f3a3de407fa62c93
  14. (10 rows)
  15. public | taobao | table | postgres | 1202 MB |
  16. public | idx_taobao_gin | index | postgres | taobao | 3761 MB |

查询测试

任意单一字段

  1. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from taobao where c1=1;
  2. QUERY PLAN
  3. ------------------------------------------------------------------------------------------------------------------------------
  4. Bitmap Heap Scan on public.taobao (cost=46.51..4873.01 rows=4840 width=90) (actual time=1.602..10.038 rows=5043 loops=1)
  5. Output: c1, c2, c3, c4, c5, c6
  6. Recheck Cond: (taobao.c1 = 1)
  7. Heap Blocks: exact=4958
  8. Buffers: shared hit=4966
  9. -> Bitmap Index Scan on idx_taobao_gin (cost=0.00..45.30 rows=4840 width=0) (actual time=0.849..0.849 rows=5043 loops=1)
  10. Index Cond: (taobao.c1 = 1)
  11. Buffers: shared hit=8
  12. Planning time: 0.319 ms
  13. Execution time: 10.346 ms
  14. (10 rows)
  15. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from taobao where c2=882;
  16. QUERY PLAN
  17. ------------------------------------------------------------------------------------------------------------------------------
  18. Bitmap Heap Scan on public.taobao (cost=34.16..3287.73 rows=3246 width=90) (actual time=1.175..7.024 rows=3373 loops=1)
  19. Output: c1, c2, c3, c4, c5, c6
  20. Recheck Cond: (taobao.c2 = 882)
  21. Heap Blocks: exact=3350
  22. Buffers: shared hit=3358
  23. -> Bitmap Index Scan on idx_taobao_gin (cost=0.00..33.35 rows=3246 width=0) (actual time=0.651..0.651 rows=3373 loops=1)
  24. Index Cond: (taobao.c2 = 882)
  25. Buffers: shared hit=8
  26. Planning time: 0.094 ms
  27. Execution time: 7.236 ms
  28. (10 rows)
  29. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from taobao where c3='2017-02-06 09:41:24.985878';
  30. QUERY PLAN
  31. -----------------------------------------------------------------------------------------------------------------------
  32. Bitmap Heap Scan on public.taobao (cost=8.01..9.02 rows=1 width=90) (actual time=0.025..0.025 rows=1 loops=1)
  33. Output: c1, c2, c3, c4, c5, c6
  34. Recheck Cond: (taobao.c3 = '2017-02-06 09:41:24.985878'::timestamp without time zone)
  35. Heap Blocks: exact=1
  36. Buffers: shared hit=6
  37. -> Bitmap Index Scan on idx_taobao_gin (cost=0.00..8.01 rows=1 width=0) (actual time=0.019..0.019 rows=1 loops=1)
  38. Index Cond: (taobao.c3 = '2017-02-06 09:41:24.985878'::timestamp without time zone)
  39. Buffers: shared hit=5
  40. Planning time: 0.125 ms
  41. Execution time: 0.057 ms
  42. (10 rows)
  43. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from taobao where c4='b40c1febdb7f62c916d3632a03092261';
  44. QUERY PLAN
  45. -----------------------------------------------------------------------------------------------------------------------
  46. Bitmap Heap Scan on public.taobao (cost=8.01..9.02 rows=1 width=90) (actual time=0.028..0.028 rows=1 loops=1)
  47. Output: c1, c2, c3, c4, c5, c6
  48. Recheck Cond: (taobao.c4 = 'b40c1febdb7f62c916d3632a03092261'::text)
  49. Heap Blocks: exact=1
  50. Buffers: shared hit=6
  51. -> Bitmap Index Scan on idx_taobao_gin (cost=0.00..8.01 rows=1 width=0) (actual time=0.023..0.023 rows=1 loops=1)
  52. Index Cond: (taobao.c4 = 'b40c1febdb7f62c916d3632a03092261'::text)
  53. Buffers: shared hit=5
  54. Planning time: 0.101 ms
  55. Execution time: 0.058 ms
  56. (10 rows)
  57. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from taobao where c5='764940.62';
  58. QUERY PLAN
  59. -----------------------------------------------------------------------------------------------------------------------
  60. Bitmap Heap Scan on public.taobao (cost=8.01..9.02 rows=1 width=90) (actual time=0.028..0.028 rows=1 loops=1)
  61. Output: c1, c2, c3, c4, c5, c6
  62. Recheck Cond: (taobao.c5 = 764940.62)
  63. Heap Blocks: exact=1
  64. Buffers: shared hit=6
  65. -> Bitmap Index Scan on idx_taobao_gin (cost=0.00..8.01 rows=1 width=0) (actual time=0.023..0.023 rows=1 loops=1)
  66. Index Cond: (taobao.c5 = 764940.62)
  67. Buffers: shared hit=5
  68. Planning time: 0.127 ms
  69. Execution time: 0.069 ms
  70. (10 rows)
  71. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from taobao where c6='test';
  72. QUERY PLAN
  73. -----------------------------------------------------------------------------------------------------------------------
  74. Bitmap Heap Scan on public.taobao (cost=8.01..9.02 rows=1 width=90) (actual time=0.023..0.023 rows=0 loops=1)
  75. Output: c1, c2, c3, c4, c5, c6
  76. Recheck Cond: (taobao.c6 = 'test'::text)
  77. Buffers: shared hit=5
  78. -> Bitmap Index Scan on idx_taobao_gin (cost=0.00..8.01 rows=1 width=0) (actual time=0.020..0.020 rows=0 loops=1)
  79. Index Cond: (taobao.c6 = 'test'::text)
  80. Buffers: shared hit=5
  81. Planning time: 0.088 ms
  82. Execution time: 0.054 ms
  83. (9 rows)

任意2字段

  1. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from taobao where c2=1 and c3='2017-02-06 09:41:24.985878';
  2. QUERY PLAN
  3. ------------------------------------------------------------------------------------------------------------------------
  4. Bitmap Heap Scan on public.taobao (cost=15.00..16.02 rows=1 width=90) (actual time=0.060..0.060 rows=0 loops=1)
  5. Output: c1, c2, c3, c4, c5, c6
  6. Recheck Cond: ((taobao.c2 = 1) AND (taobao.c3 = '2017-02-06 09:41:24.985878'::timestamp without time zone))
  7. Buffers: shared hit=11
  8. -> Bitmap Index Scan on idx_taobao_gin (cost=0.00..15.00 rows=1 width=0) (actual time=0.057..0.057 rows=0 loops=1)
  9. Index Cond: ((taobao.c2 = 1) AND (taobao.c3 = '2017-02-06 09:41:24.985878'::timestamp without time zone))
  10. Buffers: shared hit=11
  11. Planning time: 0.100 ms
  12. Execution time: 0.093 ms
  13. (9 rows)
  14. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from taobao where c2=1 or c3='2017-02-06 09:41:24.985878';
  15. QUERY PLAN
  16. ------------------------------------------------------------------------------------------------------------------------------------
  17. Bitmap Heap Scan on public.taobao (cost=42.98..3305.68 rows=3247 width=90) (actual time=1.165..6.947 rows=3330 loops=1)
  18. Output: c1, c2, c3, c4, c5, c6
  19. Recheck Cond: ((taobao.c2 = 1) OR (taobao.c3 = '2017-02-06 09:41:24.985878'::timestamp without time zone))
  20. Heap Blocks: exact=3295
  21. Buffers: shared hit=3308
  22. -> BitmapOr (cost=42.98..42.98 rows=3247 width=0) (actual time=0.650..0.650 rows=0 loops=1)
  23. Buffers: shared hit=13
  24. -> Bitmap Index Scan on idx_taobao_gin (cost=0.00..33.35 rows=3246 width=0) (actual time=0.638..0.638 rows=3329 loops=1)
  25. Index Cond: (taobao.c2 = 1)
  26. Buffers: shared hit=8
  27. -> Bitmap Index Scan on idx_taobao_gin (cost=0.00..8.01 rows=1 width=0) (actual time=0.011..0.011 rows=1 loops=1)
  28. Index Cond: (taobao.c3 = '2017-02-06 09:41:24.985878'::timestamp without time zone)
  29. Buffers: shared hit=5
  30. Planning time: 0.099 ms
  31. Execution time: 7.161 ms
  32. (15 rows)

任意3字段

  1. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from taobao where c4='b40c1febdb7f62c916d3632a03092261' and c5=1 and c6='test';
  2. QUERY PLAN
  3. --------------------------------------------------------------------------------------------------------------------------------------------
  4. Bitmap Heap Scan on public.taobao (cost=22.00..23.02 rows=1 width=90) (actual time=0.051..0.051 rows=0 loops=1)
  5. Output: c1, c2, c3, c4, c5, c6
  6. Recheck Cond: ((taobao.c4 = 'b40c1febdb7f62c916d3632a03092261'::text) AND (taobao.c5 = '1'::numeric) AND (taobao.c6 = 'test'::text))
  7. Buffers: shared hit=13
  8. -> Bitmap Index Scan on idx_taobao_gin (cost=0.00..22.00 rows=1 width=0) (actual time=0.048..0.048 rows=0 loops=1)
  9. Index Cond: ((taobao.c4 = 'b40c1febdb7f62c916d3632a03092261'::text) AND (taobao.c5 = '1'::numeric) AND (taobao.c6 = 'test'::text))
  10. Buffers: shared hit=13
  11. Planning time: 0.115 ms
  12. Execution time: 0.084 ms
  13. (9 rows)
  14. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from taobao where c4='b40c1febdb7f62c916d3632a03092261' or c5=1 or c6='test';
  15. QUERY PLAN
  16. --------------------------------------------------------------------------------------------------------------------------------------
  17. Bitmap Heap Scan on public.taobao (cost=24.03..27.08 rows=3 width=90) (actual time=0.061..0.062 rows=1 loops=1)
  18. Output: c1, c2, c3, c4, c5, c6
  19. Recheck Cond: ((taobao.c4 = 'b40c1febdb7f62c916d3632a03092261'::text) OR (taobao.c5 = '1'::numeric) OR (taobao.c6 = 'test'::text))
  20. Heap Blocks: exact=1
  21. Buffers: shared hit=16
  22. -> BitmapOr (cost=24.03..24.03 rows=3 width=0) (actual time=0.055..0.055 rows=0 loops=1)
  23. Buffers: shared hit=15
  24. -> Bitmap Index Scan on idx_taobao_gin (cost=0.00..8.01 rows=1 width=0) (actual time=0.025..0.025 rows=1 loops=1)
  25. Index Cond: (taobao.c4 = 'b40c1febdb7f62c916d3632a03092261'::text)
  26. Buffers: shared hit=5
  27. -> Bitmap Index Scan on idx_taobao_gin (cost=0.00..8.01 rows=1 width=0) (actual time=0.016..0.016 rows=0 loops=1)
  28. Index Cond: (taobao.c5 = '1'::numeric)
  29. Buffers: shared hit=5
  30. -> Bitmap Index Scan on idx_taobao_gin (cost=0.00..8.01 rows=1 width=0) (actual time=0.012..0.012 rows=0 loops=1)
  31. Index Cond: (taobao.c6 = 'test'::text)
  32. Buffers: shared hit=5
  33. Planning time: 0.110 ms
  34. Execution time: 0.122 ms
  35. (18 rows)

任意4字段

不再举例,都能用上索引

2. 由于ctid list组织,所以可以做到很好的压缩,前面讲的posting list compress就是这个功效。所以它节约空间,提升效率。

  1. postgres=# create table t4(c1 int,c2 int);
  2. CREATE TABLE
  3. postgres=# insert into t4 select generate_series(1,10000000),1;
  4. INSERT 0 10000000
  5. postgres=# create index btree_t4_c1 on t4 using btree(c1);
  6. CREATE INDEX
  7. postgres=# create index btree_t4_c2 on t4 using btree(c2);
  8. CREATE INDEX
  9. postgres=# create index btree_t4_c1 on t4 using btree(c1);
  10. CREATE INDEX
  11. postgres=# create index btree_t4_c2 on t4 using btree(c2);
  12. CREATE INDEX
  13. postgres=# create index gin_t4_c1 on t4 using gin(c1);
  14. CREATE INDEX
  15. postgres=# create index gin_t4_c2 on t4 using gin(c2);
  16. CREATE INDEX
  17. postgres=# \di+
  18. List of relations
  19. Schema | Name | Type | Owner | Table | Size | Description
  20. --------+-------------+-------+----------+-------+---------+-------------
  21. public | btree_t4_c1 | index | postgres | t4 | 214 MB | // btree为全entry索引,
  22. public | btree_t4_c2 | index | postgres | t4 | 214 MB | // 所以即使c2字段1000万全重复值,存储空间也一样
  23. public | gin_t4_c1 | index | postgres | t4 | 534 MB | // 原本存储heap ctid的,被拆分为pos 2 bytes、blkid 4 bytes来存储posting list的长度、posting tree root的blkid,所以比btree多了一点
  24. public | gin_t4_c2 | index | postgres | t4 | 10 MB | // ctid list(posting list)自动压缩,压缩比很高

3. 使用btree_gin插件,可以对任意标量数据类型创建GIN索引,前面已有例子。

4. gin对多值类型(如数组、文本、全文检索)的支持就不多说了,那是GIN的发源地,支持非常棒。

注意,目前gin还不支持sort,所以如果你有大数据量的ORDER BY limit 小数据输出需求,建议还是使用b-tree。

  1. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t4 where c2=1 order by c2 limit 1;
  2. QUERY PLAN
  3. ---------------------------------------------------------------------------------------------------------------------------------------------
  4. Limit (cost=81163.88..81163.90 rows=1 width=8) (actual time=754.234..754.235 rows=1 loops=1)
  5. Output: c1, c2
  6. Buffers: shared hit=1307
  7. -> Bitmap Heap Scan on public.t4 (cost=81163.88..250411.70 rows=9999985 width=8) (actual time=754.234..754.234 rows=1 loops=1)
  8. Output: c1, c2
  9. Recheck Cond: (t4.c2 = 1)
  10. Heap Blocks: exact=1
  11. Buffers: shared hit=1307
  12. -> Bitmap Index Scan on gin_t4_c2 (cost=0.00..78663.89 rows=9999985 width=0) (actual time=745.651..745.651 rows=10000000 loops=1)
  13. // gin还不适合limit输出,但是可以通过修改内核来改进
  14. Index Cond: (t4.c2 = 1)
  15. Buffers: shared hit=1306
  16. Planning time: 0.091 ms
  17. Execution time: 754.261 ms
  18. (13 rows)
  19. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t4 order by c2 limit 1;
  20. QUERY PLAN
  21. -----------------------------------------------------------------------------------------------------------------------------------------
  22. Limit (cost=0.43..0.46 rows=1 width=8) (actual time=0.031..0.031 rows=1 loops=1)
  23. Output: c1, c2
  24. Buffers: shared hit=1 read=3
  25. -> Index Scan using btree_t4_c2 on public.t4 (cost=0.43..221670.43 rows=10000000 width=8) (actual time=0.030..0.030 rows=1 loops=1)
  26. Output: c1, c2
  27. Buffers: shared hit=1 read=3
  28. Planning time: 0.141 ms
  29. Execution time: 0.048 ms
  30. (8 rows)
  31. postgres=# drop index btree_t4_c2;
  32. // 不限制c2任何条件的话,不能使用gin,order by也不能使用gin
  33. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t4 order by c2 limit 1;
  34. QUERY PLAN
  35. --------------------------------------------------------------------------------------------------------------------------------------------------
  36. Limit (cost=10000194247.77..10000194247.78 rows=1 width=8) (actual time=1719.522..1719.523 rows=1 loops=1)
  37. Output: c1, c2
  38. Buffers: shared hit=44248
  39. -> Sort (cost=10000194247.77..10000219247.74 rows=9999985 width=8) (actual time=1719.520..1719.520 rows=1 loops=1)
  40. Output: c1, c2
  41. Sort Key: t4.c2
  42. Sort Method: top-N heapsort Memory: 25kB
  43. Buffers: shared hit=44248
  44. -> Seq Scan on public.t4 (cost=10000000000.00..10000144247.85 rows=9999985 width=8) (actual time=0.009..754.991 rows=10000000 loops=1)
  45. Output: c1, c2
  46. Buffers: shared hit=44248
  47. Planning time: 0.084 ms
  48. Execution time: 1719.543 ms
  49. (13 rows)
  50. // 限制c2任何条件的话,可以使用gin,但是order by依旧不能使用gin
  51. postgres=# set enable_sort=off;
  52. SET
  53. Time: 0.112 ms
  54. postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t4 where c2>0 order by c2 limit 1;
  55. QUERY PLAN
  56. ----------------------------------------------------------------------------------------------------------------------------------------------------
  57. Limit (cost=10000301719.00..10000301719.00 rows=1 width=8) (actual time=2801.874..2801.875 rows=1 loops=1)
  58. Output: c1, c2
  59. Buffers: shared hit=45554
  60. -> Sort (cost=10000301719.00..10000326719.00 rows=10000000 width=8) (actual time=2801.873..2801.873 rows=1 loops=1)
  61. Output: c1, c2
  62. Sort Key: t4.c2
  63. Sort Method: top-N heapsort Memory: 25kB
  64. Buffers: shared hit=45554
  65. -> Bitmap Heap Scan on public.t4 (cost=82471.00..251719.00 rows=10000000 width=8) (actual time=817.773..1905.348 rows=10000000 loops=1)
  66. Output: c1, c2
  67. Recheck Cond: (t4.c2 > 0)
  68. Heap Blocks: exact=44248
  69. Buffers: shared hit=45554
  70. -> Bitmap Index Scan on gin_t4_c2 (cost=0.00..79971.00 rows=10000000 width=0) (actual time=809.234..809.234 rows=10000000 loops=1)
  71. Index Cond: (t4.c2 > 0)
  72. Buffers: shared hit=1306
  73. Planning time: 0.103 ms
  74. Execution time: 2801.909 ms
  75. (18 rows)

5. btree_gin目前的查询输入必须和类型完全匹配,例如int4与int8不能匹配,timestamp与date不能匹配。而btree暂时没有这个问题。

原因与btree_gin插件创建的operator class有关

  1. Apparently you're using contrib/btree_gin, because in the core system
  2. that would just fail. btree_gin lacks any support for cross-type
  3. operators, so it can't index "timestamp > date" comparisons.

contrib/btree_gin/btree_gin–1.0.sql

  1. CREATE OPERATOR CLASS int4_ops
  2. DEFAULT FOR TYPE int4 USING gin
  3. AS
  4. OPERATOR 1 <,
  5. OPERATOR 2 <=,
  6. OPERATOR 3 =,
  7. OPERATOR 4 >=,
  8. OPERATOR 5 >,
  9. FUNCTION 1 btint4cmp(int4,int4),
  10. FUNCTION 2 gin_extract_value_int4(int4, internal),
  11. FUNCTION 3 gin_extract_query_int4(int4, internal, int2, internal, internal),
  12. FUNCTION 4 gin_btree_consistent(internal, int2, anyelement, int4, internal, internal),
  13. FUNCTION 5 gin_compare_prefix_int4(int4,int4,int2, internal),
  14. STORAGE int4;
  15. postgres=# \df *.*cmp*
  16. List of functions
  17. Schema | Name | Result data type | Argument data types | Type
  18. ------------+---------------------------+------------------+----------------------------------------------------------+--------
  19. ...
  20. pg_catalog | btint24cmp | integer | smallint, integer | normal
  21. pg_catalog | btint28cmp | integer | smallint, bigint | normal
  22. pg_catalog | btint2cmp | integer | smallint, smallint | normal
  23. pg_catalog | btint42cmp | integer | integer, smallint | normal
  24. pg_catalog | btint48cmp | integer | integer, bigint | normal
  25. pg_catalog | btint4cmp | integer | integer, integer | normal
  26. pg_catalog | btint82cmp | integer | bigint, smallint | normal
  27. pg_catalog | btint84cmp | integer | bigint, integer | normal
  28. pg_catalog | btint8cmp | integer | bigint, bigint | normal
  29. ...

例子

  1. postgres=# create table t1(id int, info text, crt_time timestamp);
  2. CREATE TABLE
  3. postgres=# create table t2(id int8, info text, crt_time timestamp);
  4. CREATE TABLE
  5. postgres=# create table t3(id int8, info text, crt_time timestamp);
  6. CREATE TABLE
  7. postgres=# create index idx_t1_id on t1(id);
  8. CREATE INDEX
  9. postgres=# create index idx_t1_crt_time on t1(crt_time);
  10. CREATE INDEX
  11. postgres=# create index idx_t2_id on t2(id);
  12. CREATE INDEX
  13. postgres=# create index idx_t3 on t3 using gin (id,info,crt_time);
  14. CREATE INDEX
  1. postgres=# set enable_seqscan=off;
  2. SET
  3. btree索引,可以使用隐式类型转换,不需要显示转换
  4. intint2,int8的隐式转换, timestampdate的隐式转换
  5. postgres=# explain select * from t1 where id=1::int8;
  6. QUERY PLAN
  7. ------------------------------------------------------------------------
  8. Bitmap Heap Scan on t1 (cost=1.50..7.01 rows=6 width=44)
  9. Recheck Cond: (id = '1'::bigint)
  10. -> Bitmap Index Scan on idx_t1_id (cost=0.00..1.50 rows=6 width=0)
  11. Index Cond: (id = '1'::bigint)
  12. (4 rows)
  13. postgres=# explain select * from t1 where id=1::int2;
  14. QUERY PLAN
  15. ------------------------------------------------------------------------
  16. Bitmap Heap Scan on t1 (cost=1.50..7.01 rows=6 width=44)
  17. Recheck Cond: (id = '1'::smallint)
  18. -> Bitmap Index Scan on idx_t1_id (cost=0.00..1.50 rows=6 width=0)
  19. Index Cond: (id = '1'::smallint)
  20. (4 rows)
  21. postgres=# explain select * from t1 where crt_time=now();
  22. QUERY PLAN
  23. ------------------------------------------------------------------------------
  24. Bitmap Heap Scan on t1 (cost=1.50..7.03 rows=6 width=44)
  25. Recheck Cond: (crt_time = now())
  26. -> Bitmap Index Scan on idx_t1_crt_time (cost=0.00..1.50 rows=6 width=0)
  27. Index Cond: (crt_time = now())
  28. (4 rows)
  29. postgres=# explain select * from t1 where crt_time=current_date;
  30. QUERY PLAN
  31. ------------------------------------------------------------------------------
  32. Bitmap Heap Scan on t1 (cost=1.50..7.05 rows=6 width=44)
  33. Recheck Cond: (crt_time = ('now'::cstring)::date)
  34. -> Bitmap Index Scan on idx_t1_crt_time (cost=0.00..1.50 rows=6 width=0)
  35. Index Cond: (crt_time = ('now'::cstring)::date)
  36. (4 rows)
  37. 插件 btree_gin 索引,暂时不支持跨类型的COMPARE,必须显示
  38. postgres=# explain select * from t3 where crt_time=current_date;
  39. QUERY PLAN
  40. -----------------------------------------------------------------------
  41. Seq Scan on t3 (cost=10000000000.00..10000000028.73 rows=5 width=48)
  42. Filter: (crt_time = ('now'::cstring)::date)
  43. (2 rows)
  44. postgres=# explain select * from t3 where crt_time=current_date::timestamp;
  45. QUERY PLAN
  46. ----------------------------------------------------------------------------------------
  47. Bitmap Heap Scan on t3 (cost=2.65..7.19 rows=5 width=48)
  48. Recheck Cond: (crt_time = (('now'::cstring)::date)::timestamp without time zone)
  49. -> Bitmap Index Scan on idx_t3 (cost=0.00..2.65 rows=5 width=0)
  50. Index Cond: (crt_time = (('now'::cstring)::date)::timestamp without time zone)
  51. (4 rows)
  52. postgres=# explain select * from t3 where crt_time=current_date::timestamptz;
  53. QUERY PLAN
  54. ---------------------------------------------------------------------------
  55. Seq Scan on t3 (cost=10000000000.00..10000000031.40 rows=5 width=48)
  56. Filter: (crt_time = (('now'::cstring)::date)::timestamp with time zone)
  57. (2 rows)
  58. postgres=# explain select * from t3 where id=1;
  59. QUERY PLAN
  60. -----------------------------------------------------------------------
  61. Seq Scan on t3 (cost=10000000000.00..10000000023.38 rows=5 width=48)
  62. Filter: (id = 1)
  63. (2 rows)
  64. postgres=# explain select * from t3 where id=1::int8;
  65. QUERY PLAN
  66. ---------------------------------------------------------------------
  67. Bitmap Heap Scan on t3 (cost=2.64..7.14 rows=5 width=48)
  68. Recheck Cond: (id = '1'::bigint)
  69. -> Bitmap Index Scan on idx_t3 (cost=0.00..2.64 rows=5 width=0)
  70. Index Cond: (id = '1'::bigint)
  71. (4 rows)
  72. JOIN也是如此
  73. btree 支持自动隐式转换
  74. postgres=# explain select * from t1,t2 where t1.id=t2.id ;
  75. QUERY PLAN
  76. -------------------------------------------------------------------------------------
  77. Merge Join (cost=0.30..155.27 rows=6046 width=92)
  78. Merge Cond: (t2.id = t1.id)
  79. -> Index Scan using idx_t2_id on t2 (cost=0.15..30.50 rows=1070 width=48)
  80. -> Materialize (cost=0.15..34.23 rows=1130 width=44)
  81. -> Index Scan using idx_t1_id on t1 (cost=0.15..31.40 rows=1130 width=44)
  82. (5 rows)
  83. 插件 btree_gin 暂时不支持跨类型join
  84. postgres=# explain select * from t1,t3 where t1.id=t3.id;
  85. QUERY PLAN
  86. --------------------------------------------------------------------------------------
  87. Merge Join (cost=10000000074.69..10000000199.46 rows=6046 width=92)
  88. Merge Cond: (t1.id = t3.id)
  89. -> Index Scan using idx_t1_id on t1 (cost=0.15..31.40 rows=1130 width=44)
  90. -> Sort (cost=10000000074.54..10000000077.21 rows=1070 width=48)
  91. Sort Key: t3.id
  92. -> Seq Scan on t3 (cost=10000000000.00..10000000020.70 rows=1070 width=48)
  93. (6 rows)
  94. postgres=# explain select * from t2,t3 where t2.id=t3.id;
  95. QUERY PLAN
  96. -------------------------------------------------------------------------------
  97. Nested Loop (cost=0.21..1599.15 rows=5724 width=96)
  98. -> Index Scan using idx_t2_id on t2 (cost=0.15..30.50 rows=1070 width=48)
  99. -> Bitmap Heap Scan on t3 (cost=0.05..1.42 rows=5 width=48)
  100. Recheck Cond: (id = t2.id)
  101. -> Bitmap Index Scan on idx_t3 (cost=0.00..0.05 rows=5 width=0)
  102. Index Cond: (id = t2.id)
  103. (6 rows)
  104. postgres=# explain select * from t1,t3 where (t1.id)::int8=t3.id;
  105. QUERY PLAN
  106. --------------------------------------------------------------------------------
  107. Nested Loop (cost=10000000000.06..10000001694.13 rows=6046 width=92)
  108. -> Seq Scan on t1 (cost=10000000000.00..10000000021.30 rows=1130 width=44)
  109. -> Bitmap Heap Scan on t3 (cost=0.06..1.43 rows=5 width=48)
  110. Recheck Cond: (id = (t1.id)::bigint)
  111. -> Bitmap Index Scan on idx_t3 (cost=0.00..0.05 rows=5 width=0)
  112. Index Cond: (id = (t1.id)::bigint)
  113. (6 rows)

绑定变量不受影响,因为绑定变量的类型是指定的。

  1. postgres=# prepare p (int8) as select * from t3 where id=$1;
  2. PREPARE
  3. postgres=# explain execute p(1::int4);
  4. QUERY PLAN
  5. ---------------------------------------------------------------------
  6. Bitmap Heap Scan on t3 (cost=2.64..7.14 rows=5 width=48)
  7. Recheck Cond: (id = '1'::bigint)
  8. -> Bitmap Index Scan on idx_t3 (cost=0.00..2.64 rows=5 width=0)
  9. Index Cond: (id = '1'::bigint)
  10. (4 rows)
  11. postgres=# explain execute p(1::int2);
  12. QUERY PLAN
  13. ---------------------------------------------------------------------
  14. Bitmap Heap Scan on t3 (cost=2.64..7.14 rows=5 width=48)
  15. Recheck Cond: (id = '1'::bigint)
  16. -> Bitmap Index Scan on idx_t3 (cost=0.00..2.64 rows=5 width=0)
  17. Index Cond: (id = '1'::bigint)
  18. (4 rows)
  19. postgres=# explain execute p1(current_date);
  20. QUERY PLAN
  21. -------------------------------------------------------------------------------------
  22. Bitmap Heap Scan on t3 (cost=2.64..7.14 rows=5 width=48)
  23. Recheck Cond: (crt_time = '2017-02-08 00:00:00'::timestamp without time zone)
  24. -> Bitmap Index Scan on idx_t3 (cost=0.00..2.64 rows=5 width=0)
  25. Index Cond: (crt_time = '2017-02-08 00:00:00'::timestamp without time zone)
  26. (4 rows)

6. btree_gin暂时还不可用于group by

  1. postgres=# \d t3
  2. Table "public.t3"
  3. Column | Type | Modifiers
  4. ----------+-----------------------------+-----------
  5. id | bigint |
  6. info | text |
  7. crt_time | timestamp without time zone |
  8. Indexes:
  9. "idx_t3" gin (id, info, crt_time)
  10. postgres=# explain select id from t3 group by id,info,crt_time;
  11. QUERY PLAN
  12. --------------------------------------------------------------------------------------
  13. Group (cost=10000000074.54..10000000085.24 rows=200 width=48)
  14. Group Key: id, info, crt_time
  15. -> Sort (cost=10000000074.54..10000000077.21 rows=1070 width=48)
  16. Sort Key: id, info, crt_time
  17. -> Seq Scan on t3 (cost=10000000000.00..10000000020.70 rows=1070 width=48)
  18. (5 rows)
  19. postgres=# explain select id from t3 group by id;
  20. QUERY PLAN
  21. -------------------------------------------------------------------------------------
  22. Group (cost=10000000074.54..10000000079.89 rows=200 width=8)
  23. Group Key: id
  24. -> Sort (cost=10000000074.54..10000000077.21 rows=1070 width=8)
  25. Sort Key: id
  26. -> Seq Scan on t3 (cost=10000000000.00..10000000020.70 rows=1070 width=8)
  27. (5 rows)
  28. btree 可以
  29. postgres=# \d t1
  30. Table "public.t1"
  31. Column | Type | Modifiers
  32. ----------+-----------------------------+-----------
  33. id | integer |
  34. info | text |
  35. crt_time | timestamp without time zone |
  36. Indexes:
  37. "idx_t1_crt_time" btree (crt_time)
  38. "idx_t1_id" btree (id)
  39. postgres=# explain select id from t1 group by id;
  40. QUERY PLAN
  41. -----------------------------------------------------------------------------------
  42. Group (cost=0.15..34.23 rows=200 width=4)
  43. Group Key: id
  44. -> Index Only Scan using idx_t1_id on t1 (cost=0.15..31.40 rows=1130 width=4)
  45. (3 rows)

十 小结

宝剑赠英雄,美玉配佳人。

PostgreSQL 数据库,了解越多,才会更加随心所欲,驾驭自如。

参考

https://www.postgresql.org/docs/9.6/static/indexes-bitmap-scans.html

https://www.postgresql.org/docs/current/static/indexes-multicolumn.html

https://www.postgresql.org/docs/9.6/static/btree-gin.html

https://www.postgresql.org/docs/9.6/static/btree-gist.html

https://www.postgresql.org/docs/9.6/static/pageinspect.html

《深入浅出PostgreSQL B-Tree索引结构》

《B-Tree和B+Tree》

《PostgreSQL GIN索引实现原理》

《从难缠的模糊查询聊开 - PostgreSQL独门绝招之一 GIN , GiST , SP-GiST , RUM 索引原理与技术背景》

原文:http://mysql.taobao.org/monthly/2017/02/10/