当前位置: 首页 > 工具软件 > pg_pathman > 使用案例 >

pg_pathman插件安装使用

仉姚石
2023-12-01

pg_pathman是postgresql的一个分区插件,支持PostgreSQL 9.5, 9.6, 10, 11, 12; 仅支持hash和range分区,一些分区的特性可以参考:https://github.com/postgrespro/pg_pathman的
Feature highlights段落

  1. 下载
    https://github.com/postgrespro/pg_pathman
  2. 安装
unzip pg_pathman-master.zip  #解压
cd pg_pathman-master/ #进入安装路径
make USE_PGXS=1  #编译
make USE_PGXS=1 install  #安装
shared_preload_libraries = 'pg_pathman'  #修改postgresql.conf配置参数,注意如果有其他插件,要注意顺序,如shared_preload_libraries = 'pg_stat_statements, pg_pathman'。
pg_ctl restart  #重启数据库
psql登陆要安装的库,执行以下
create extension pg_pathman ;
  1. 数据库自身实现分区示例:
    PG版本小于10的版本可以使用继承关系,每一个子表必须使用一个check约束
CREATE TABLE test (id SERIAL PRIMARY KEY, title TEXT);
CREATE TABLE test_1 (CHECK ( id >= 100 AND id < 200 )) INHERITS (test);
CREATE TABLE test_2 (CHECK ( id >= 200 AND id < 300 )) INHERITS (test);

PG版本10以上的可以使用以下本地分区语法:

CREATE TABLE test(id int4, value text) PARTITION BY RANGE(id);
CREATE TABLE test_1 PARTITION OF test FOR VALUES FROM (1) TO (10);
CREATE TABLE test_2 PARTITION OF test FOR VALUES FROM (10) TO (20);
  1. 通过pg_pathman创建一个hash分区表,并将原主表数据利用非阻塞插入到子表中
select pathman_version();  #查看版本
 pathman_version 
-----------------
 1.5.11
hank=# create table part_test(id int, info text, crt_time timestamp not null);    -- 分区列必须有not null约束  
CREATE TABLE

插入一批测试数据,模拟已经有数据了的主表
hank=# insert into part_test select id,md5(random()::text),clock_timestamp() + (id||' hour')::interval from generate_series(1,10000) t(id); 
INSERT 0 10000
hank=# select * from part_test limit 10;   
 id |               info               |          crt_time          
----+----------------------------------+----------------------------
  1 | 29ce4edc70dbfbe78912beb7c4cc95c2 | 2016-10-25 10:47:32.873879
  2 | e0990a6fb5826409667c9eb150fef386 | 2016-10-25 11:47:32.874048
  3 | d25f577a01013925c203910e34470695 | 2016-10-25 12:47:32.874059
  4 | 501419c3f7c218e562b324a1bebfe0ad | 2016-10-25 13:47:32.874065
  5 | 5e5e22bdf110d66a5224a657955ba158 | 2016-10-25 14:47:32.87407
  6 | 55d2d4fd5229a6595e0dd56e13d32be4 | 2016-10-25 15:47:32.874076
  7 | 1dfb9a783af55b123c7a888afe1eb950 | 2016-10-25 16:47:32.874081
  8 | 41eeb0bf395a4ab1e08691125ae74bff | 2016-10-25 17:47:32.874087
  9 | 83783d69cc4f9bb41a3978fe9e13d7fa | 2016-10-25 18:47:32.874092
 10 | affc9406d5b3412ae31f7d7283cda0dd | 2016-10-25 19:47:32.874097
(10 rows)
hank=# select create_hash_partitions('part_test'::regclass,'crt_time',20,false); 
hank=# \d+ part_test
                                             Table "hank.part_test"
  Column  |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description 
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
 id       | integer                     |           |          |         | plain    |              | 
 info     | text                        |           |          |         | extended |              | 
 crt_time | timestamp without time zone |           | not null |         | plain    |              | 
Child tables: part_test_0,
              part_test_1,
              part_test_10,
              part_test_11,
              part_test_12,
              part_test_13,
              part_test_14,
              part_test_15,
              part_test_16,
              part_test_17,
              part_test_18,
              part_test_19,
              part_test_2,
              part_test_3,
              part_test_4,
              part_test_5,
              part_test_6,
              part_test_7,
              part_test_8,
              part_test_9

hank=# select partition_table_concurrently('part_test'::regclass,
hank(#                              10000,
hank(#                              1.0);
NOTICE:  worker started, you can stop it with the following command: select public.stop_concurrent_part_task('part_test');
 partition_table_concurrently 
------------------------------
 
(1 row)

hank=# select count(*) from only part_test;   #迁移结束,主表无数据
 count 
-------
     0

禁用主表写入数据
hank=# select set_enable_parent('part_test'::regclass, false);
 set_enable_parent 
-------------------
 
(1 row)

查看执行计划,已经只扫描子表part_test_13
hank=# explain select * from part_test where crt_time = '2020-06-08 00:00:00'::timestamp;    
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Append  (cost=0.00..11.19 rows=1 width=45)
   ->  Seq Scan on part_test_13  (cost=0.00..11.19 rows=1 width=45)
         Filter: (crt_time = '2020-06-08 00:00:00'::timestamp without time zone)
(3 rows)

hank=# \d part_test_13
                        Table "hank.part_test_13"
  Column  |            Type             | Collation | Nullable | Default 
----------+-----------------------------+-----------+----------+---------
 id       | integer                     |           |          | 
 info     | text                        |           |          | 
 crt_time | timestamp without time zone |           | not null | 
Check constraints:
    "pathman_part_test_13_check" CHECK (get_hash_part_idx(timestamp_hash(crt_time), 20) = 13)
Inherits: part_test

由于篇幅原因,其他一些函数的使用示例见德哥博客:
https://github.com/digoal/blog/blob/master/201610/20161024_01.md
也可以参考官网做示例
https://github.com/postgrespro/pg_pathman
这里不再做相关实验。

 类似资料: