当前位置: 首页 > 工具软件 > cstore_fdw > 使用案例 >

cstore_fdw 列式存储

澹台冯浩
2023-12-01

os: ubuntu 16.04
db: postgresql 10.6

Cstore_fdw is developed by Citus Data and can be used in combination with Citus

版本

# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.5 LTS
Release:        16.04
Codename:       xenial
# 
# 
# su - postgres
$ psql -c "select version();"
                                                     version
-----------------------------------------------------------------------------------------------------------------
 PostgreSQL 10.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit
(1 row)

下载安装

安装 cstore_fdw 依赖

# apt-get install protobuf-c-compiler libprotobuf-c0-dev

由于本机 postgresql 10.6 是编译安装的,所以 cstore_fdw 的相关编译在 postgres 用户下进行即可

# su - postgres
$ wget https://github.com/citusdata/cstore_fdw/archive/v1.6.2.zip
$ unzip ./v1.6.2.zip
$ cd cstore_fdw-1.6.2/
$ 
$ 
$ make
protoc-c --c_out=. cstore.proto
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -fPIC --std=c99 -I. -I./ -I/usr/pgsql-10/include/server -I/usr/pgsql-10/include/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o cstore.pb-c.o cstore.pb-c.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -fPIC --std=c99 -I. -I./ -I/usr/pgsql-10/include/server -I/usr/pgsql-10/include/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o cstore_fdw.o cstore_fdw.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -fPIC --std=c99 -I. -I./ -I/usr/pgsql-10/include/server -I/usr/pgsql-10/include/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o cstore_writer.o cstore_writer.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -fPIC --std=c99 -I. -I./ -I/usr/pgsql-10/include/server -I/usr/pgsql-10/include/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o cstore_reader.o cstore_reader.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -fPIC --std=c99 -I. -I./ -I/usr/pgsql-10/include/server -I/usr/pgsql-10/include/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o cstore_metadata_serialization.o cstore_metadata_serialization.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -fPIC --std=c99 -I. -I./ -I/usr/pgsql-10/include/server -I/usr/pgsql-10/include/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o cstore_compression.o cstore_compression.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -fPIC -shared -o cstore_fdw.so cstore.pb-c.o cstore_fdw.o cstore_writer.o cstore_reader.o cstore_metadata_serialization.o cstore_compression.o -L/usr/pgsql-10/lib    -Wl,--as-needed -Wl,-rpath,'/usr/pgsql-10/lib',--enable-new-dtags  -lprotobuf-c
$ 
$ 
$ make install
/bin/mkdir -p '/usr/pgsql-10/lib'
/bin/mkdir -p '/usr/pgsql-10/share/extension'
/bin/mkdir -p '/usr/pgsql-10/share/extension'
/usr/bin/install -c -m 755  cstore_fdw.so '/usr/pgsql-10/lib/cstore_fdw.so'
/usr/bin/install -c -m 644 .//cstore_fdw.control '/usr/pgsql-10/share/extension/'
/usr/bin/install -c -m 644 .//cstore_fdw--1.6.sql .//cstore_fdw--1.5--1.6.sql .//cstore_fdw--1.4--1.5.sql .//cstore_fdw--1.3--1.4.sql .//cstore_fdw--1.2--1.3.sql .//cstore_fdw--1.1--1.2.sql .//cstore_fdw--1.0--1.1.sql  '/usr/pgsql-10/share/extension/'

修改配置文件,重启 postgresql

$ vi $PGDATA/postgresql.conf
shared_preload_libraries = 'cstore_fdw'

$ pg_ctl stop -m fast -D /data/pgsql-10/data
$ 
$ 
$ pg_ctl start -D /data/pgsql-10/data

$ psql
psql (10.6)
Type "help" for help.

postgres=# select * from pg_available_extensions where name like '%cstore%' order by name;

    name    | default_version | installed_version |                   comment
------------+-----------------+-------------------+---------------------------------------------
 cstore_fdw | 1.6             |                   | foreign-data wrapper for flat cstore access
(1 row)

使用

cstore_fdw 页面给出了一个例子.

# su - postgres
$ wget http://examples.citusdata.com/customer_reviews_1998.csv.gz;
wget http://examples.citusdata.com/customer_reviews_1999.csv.gz;

gzip -d customer_reviews_1998.csv.gz;
gzip -d customer_reviews_1999.csv.gz;

$ du -sh *.csv
97M     customer_reviews_1998.csv
190M    customer_reviews_1999.csv

$ psql
psql (10.6)
Type "help" for help.

postgres=# CREATE EXTENSION cstore_fdw;

postgres=# CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;

postgres=# CREATE FOREIGN TABLE customer_reviews
(
    customer_id TEXT,
    review_date DATE,
    review_rating INTEGER,
    review_votes INTEGER,
    review_helpful_votes INTEGER,
    product_id CHAR(10),
    product_title TEXT,
    product_sales_rank BIGINT,
    product_group TEXT,
    product_category TEXT,
    product_subcategory TEXT,
    similar_product_ids CHAR(10)[]
)
SERVER cstore_server
OPTIONS(compression 'pglz');

postgres=# \d+ customer_reviews
                                            Foreign table "public.customer_reviews"
        Column        |      Type       | Collation | Nullable | Default | FDW options | Storage  | Stats target | Description 
----------------------+-----------------+-----------+----------+---------+-------------+----------+--------------+-------------
 customer_id          | text            |           |          |         |             | extended |              | 
 review_date          | date            |           |          |         |             | plain    |              | 
 review_rating        | integer         |           |          |         |             | plain    |              | 
 review_votes         | integer         |           |          |         |             | plain    |              | 
 review_helpful_votes | integer         |           |          |         |             | plain    |              | 
 product_id           | character(10)   |           |          |         |             | extended |              | 
 product_title        | text            |           |          |         |             | extended |              | 
 product_sales_rank   | bigint          |           |          |         |             | plain    |              | 
 product_group        | text            |           |          |         |             | extended |              | 
 product_category     | text            |           |          |         |             | extended |              | 
 product_subcategory  | text            |           |          |         |             | extended |              | 
 similar_product_ids  | character(10)[] |           |          |         |             | extended |              | 
Server: cstore_server
FDW options: (compression 'pglz')

加载数据

postgres=# COPY customer_reviews FROM '/home/postgres/customer_reviews_1998.csv' WITH CSV;
postgres=# COPY customer_reviews FROM '/home/postgres/customer_reviews_1999.csv' WITH CSV;

postgres=# ANALYZE customer_reviews;

postgres=# SELECT
    customer_id, review_date, review_rating, product_id, product_title
FROM
    customer_reviews
WHERE
    customer_id ='A27T7HVDXA3K2A' AND
    product_title LIKE '%Dune%' AND
    review_date >= '1998-01-01' AND
    review_date <= '1998-12-31';

postgres=# 
postgres=# 
postgres=# 
postgres=# SELECT
    width_bucket(length(product_title), 1, 50, 5) title_length_bucket,
    round(avg(review_rating), 2) AS review_average,
    count(*)
FROM
   customer_reviews
WHERE
    product_group = 'Book'
GROUP BY
    title_length_bucket
ORDER BY
    title_length_bucket;

postgres=# 
postgres=# explain verbose select count(1) from customer_reviews;
                                                                                                      QUERY PLAN                                                                                            
           
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=22031.30..22031.31 rows=1 width=8)
   Output: count(1)
   ->  Foreign Scan on public.customer_reviews  (cost=0.00..17625.04 rows=1762504 width=0)
         Output: customer_id, review_date, review_rating, review_votes, review_helpful_votes, product_id, product_title, product_sales_rank, product_group, product_category, product_subcategory, similar_product_ids
         CStore File: /data/pgsql-10/data/cstore_fdw/12328/74320
         CStore File Size: 105383393
(6 rows)


postgres=# \x
Expanded display is on.
postgres=# 
postgres=# 
postgres=# select * from pg_class pc where relname='customer_reviews';
-[ RECORD 1 ]-------+-----------------
relname             | customer_reviews
relnamespace        | 2200
reltype             | 74322
reloftype           | 0
relowner            | 10
relam               | 0
relfilenode         | 74320
reltablespace       | 0
relpages            | 12865
reltuples           | 1.7625e+06
relallvisible       | 0
reltoastrelid       | 0
relhasindex         | f
relisshared         | f
relpersistence      | p
relkind             | f
relnatts            | 12
relchecks           | 0
relhasoids          | f
relhaspkey          | f
relhasrules         | f
relhastriggers      | f
relhassubclass      | f
relrowsecurity      | f
relforcerowsecurity | f
relispopulated      | t
relreplident        | n
relispartition      | f
relfrozenxid        | 0
relminmxid          | 0
relacl              | 
reloptions          | 
relpartbound        | 

$ cd $PGDATA
$ tree cstore_fdw
cstore_fdw
└── 12328
    ├── 74320
    └── 74320.footer

1 directory, 2 files
$ ls -l ./cstore_fdw/12328/
total 102924
-rw------- 1 postgres postgres 105383393 Sep 10 15:13 74320
-rw------- 1 postgres postgres       239 Sep 10 15:13 74320.footer

参考:
https://github.com/citusdata/cstore_fdw

 类似资料: