CRDB使用CREATE INDEX创建索引,索引用来帮助SQL查询数据而不用进行全表扫描,从而提高性能。表的字段如果被定义为PRIMARY KEY或UNIQUE,会自动创建索引。如果想在JSONB或ARRAY上面创建索引,可以创建GIN索引。
CRDB的索引包括以下几类:
即索引中只包含一个字段,如
root@10.10.12.87:20158/defaultdb> create index idx_user on users(id);
CREATE INDEX
root@10.10.12.87:20158/defaultdb> show create table users;
table_name | create_statement
-------------+--------------------------------------------------------------
users | CREATE TABLE public.users (
| id INT8 NULL,
| name VARCHAR(10) NULL,
| rowid INT8 NOT VISIBLE NOT NULL DEFAULT unique_rowid(),
| CONSTRAINT "primary" PRIMARY KEY (rowid ASC),
| INDEX idx_user (id ASC),
| FAMILY "primary" (id, name, rowid)
| )
(1 row)
即索引中有多个字段,如
root@10.10.12.87:20158/defaultdb> create index idx2_user on users(id,name);
CREATE INDEX
root@10.10.12.87:20158/defaultdb> show create table users;
table_name | create_statement
-------------+--------------------------------------------------------------
users | CREATE TABLE public.users (
| id INT8 NULL,
| name VARCHAR(10) NULL,
| rowid INT8 NOT VISIBLE NOT NULL DEFAULT unique_rowid(),
| CONSTRAINT "primary" PRIMARY KEY (rowid ASC),
| INDEX idx_user (id ASC),
| INDEX idx2_user (id ASC, name ASC),
| FAMILY "primary" (id, name, rowid)
| )
(1 row)
即创建的索引具有唯一性,如
root@10.10.12.87:20158/defaultdb> create unique index idx3_user on users(name);
CREATE INDEX
root@10.10.12.87:20158/defaultdb> show create table users;
table_name | create_statement
-------------+--------------------------------------------------------------
users | CREATE TABLE public.users (
| id INT8 NULL,
| name VARCHAR(10) NULL,
| rowid INT8 NOT VISIBLE NOT NULL DEFAULT unique_rowid(),
| CONSTRAINT "primary" PRIMARY KEY (rowid ASC),
| INDEX idx_user (id ASC),
| INDEX idx2_user (id ASC, name ASC),
| UNIQUE INDEX idx3_user (name ASC),
| FAMILY "primary" (id, name, rowid)
| )
(1 row)
倒排索引(GIN索引)存储从容器列(如JSONB文档)中的值到保存该值的行的映射。它们用于加速包含搜索,例如,“显示该表中所有具有包含键-值对{“location”:“NYC”}的JSON列的行”。GIN索引是文献检索系统中常用的索引。
cockachdb在GIN索引中存储以下数据类型的内容:
root@10.10.12.87:20158/defaultdb> create table promo_codes(id int, rules jsonb);
CREATE TABLE
root@10.10.12.87:20158/defaultdb> create inverted index gin_promo_codes on promo_codes(rules);
CREATE INDEX
root@10.10.12.87:20158/defaultdb> create index gin2_promo_codes on promo_codes using gin(rules);
CREATE INDEX
root@10.10.12.87:20158/defaultdb> show create table promo_codes;
table_name | create_statement
--------------+--------------------------------------------------------------
promo_codes | CREATE TABLE public.promo_codes (
| id INT8 NULL,
| rules JSONB NULL,
| rowid INT8 NOT VISIBLE NOT NULL DEFAULT unique_rowid(),
| CONSTRAINT "primary" PRIMARY KEY (rowid ASC),
| INVERTED INDEX gin_promo_codes (rules),
| INVERTED INDEX gin2_promo_codes (rules),
| FAMILY "primary" (id, rules, rowid)
| )
(1 row)
空间索引存储空间对象的信息,但它们与任何其他索引类型用于相同的任务,即,
空间索引与其他类型的索引有如下区别:
root@10.10.12.87:20158/defaultdb> CREATE INDEX geom_idx_1 ON tmp_viz USING GIST(geom);
CREATE INDEX
root@10.10.12.87:20158/defaultdb> show create table tmp_viz;
table_name | create_statement
-------------+--------------------------------------------------------------
tmp_viz | CREATE TABLE public.tmp_viz (
| id INT8 NULL,
| geom GEOMETRY NULL,
| rowid INT8 NOT VISIBLE NOT NULL DEFAULT unique_rowid(),
| CONSTRAINT "primary" PRIMARY KEY (rowid ASC),
| INVERTED INDEX geom_idx_1 (geom),
| FAMILY "primary" (id, geom, rowid)
| )
(1 row)
覆盖索引是为了避免按索引查询回表,而把查询中除索引以外的字段也存储在索引里面,如对于查询select city from user1 where name=‘ABC’;
root@10.10.12.87:20158/defaultdb> create table user1(id int, name varchar(10), city varchar(10));
CREATE TABLE
root@10.10.12.87:20158/defaultdb> create index idx_user1 on user1(name) storing (city);
CREATE INDEX
root@10.10.12.87:20158/defaultdb> show create table user1;
table_name | create_statement
-------------+--------------------------------------------------------------
user1 | CREATE TABLE public.user1 (
| id INT8 NULL,
| name VARCHAR(10) NULL,
| city VARCHAR(10) NULL,
| rowid INT8 NOT VISIBLE NOT NULL DEFAULT unique_rowid(),
| CONSTRAINT "primary" PRIMARY KEY (rowid ASC),
| INDEX idx_user1 (name ASC) STORING (city),
| FAMILY "primary" (id, name, city, rowid)
| )
(1 row)
如果索引是按照某个递增的字段创建,那么随着数据的增加,索引的数据写入也会造成一个热点写的问题,此时我们希望把索引建成HASH分片索引,以此来解决写入的热点问题。
root@10.10.12.87:20158/defaultdb> create table events(id int, ts timestamp);
CREATE TABLE
root@10.10.12.87:20158/defaultdb> create index idx_events on events(ts) using hash WITH BUCKET_COUNT=4;
CREATE INDEX
root@10.10.12.87:20158/defaultdb> show create table events;
table_name | create_statement
-------------+------------------------------------------------------------------------------------------------------------------------------
events | CREATE TABLE public.events (
| id INT8 NULL,
| ts TIMESTAMP NULL,
| rowid INT8 NOT VISIBLE NOT NULL DEFAULT unique_rowid(),
| dbms_internal_ts_shard_4 INT4 NOT VISIBLE NOT NULL AS (mod(fnv32(dbms_internal.datums_to_bytes(ts)), 4:::INT8)) STORED,
| CONSTRAINT "primary" PRIMARY KEY (rowid ASC),
| INDEX idx_events (ts ASC) USING HASH WITH BUCKET_COUNT = 4,
| FAMILY "primary" (id, ts, rowid, dbms_internal_ts_shard_4),
| CONSTRAINT check_dbms_internal_ts_shard_4 CHECK (dbms_internal_ts_shard_4 IN (0:::INT8, 1:::INT8, 2:::INT8, 3:::INT8))
| )
(1 row)
部分索引允许指定要添加到索引中的行和列的子集。部分索引包括表中对创建索引时定义的布尔谓词表达式(即WHERE过滤器)求值为true的行子集。
root@10.10.12.87:20158/defaultdb> create index idx2_user1 on user1(name) where city='SH';
CREATE INDEX
root@10.10.12.87:20158/defaultdb> show create table user1;
table_name | create_statement
-------------+--------------------------------------------------------------
user1 | CREATE TABLE public.user1 (
| id INT8 NULL,
| name VARCHAR(10) NULL,
| city VARCHAR(10) NULL,
| rowid INT8 NOT VISIBLE NOT NULL DEFAULT unique_rowid(),
| CONSTRAINT "primary" PRIMARY KEY (rowid ASC),
| INDEX idx_user1 (name ASC) STORING (city),
| INDEX idx2_user1 (name ASC) WHERE city = 'SH':::STRING,
| FAMILY "primary" (id, name, city, rowid)
| )
(1 row)
表达式索引又称为函数索引,是针对字段的特定函数增加的索引,如substr、instr后的索引。
root@10.10.12.87:20158/defaultdb> create index idx3_users on user1(substr(city,1,5));
CREATE INDEX
root@10.10.12.87:20158/defaultdb> show create table user1;
table_name | create_statement
-------------+---------------------------------------------------------------
user1 | CREATE TABLE public.user1 (
| id INT8 NULL,
| name VARCHAR(10) NULL,
| city VARCHAR(10) NULL,
| rowid INT8 NOT VISIBLE NOT NULL DEFAULT unique_rowid(),
| CONSTRAINT "primary" PRIMARY KEY (rowid ASC),
| INDEX idx_user1 (name ASC) STORING (city),
| INDEX idx2_user1 (name ASC) WHERE city = 'SH':::STRING,
| INDEX idx3_users (substr(city, 1:::INT8, 5:::INT8) ASC),
| FAMILY "primary" (id, name, city, rowid)
| )
(1 row)