http://blog.163.com/digoal@126/blog/static/163877040201062944945126/
http://blog.163.com/digoal@126/blog/static/16387704020122198214650/
http://blog.163.com/digoal@126/blog/static/1638770402011630102117658/
http://francs3.blog.163.com/blog/static/4057672720107541611270/
http://blog.163.com/dazuiba_008/blog/static/36334981201293052730729/
http://git.postgresql.org/gitweb/
pgfincore:预先将 表/索引 刷入/刷出os cache的规则。
[digoal@localhost pgfincore-0c89186]$ pg_config -V
pg_config: invalid argument: -V
Try "pg_config --help" for more information.
[digoal@localhost pgfincore-0c89186]$ which pg_config
/opt/pgsql_934/bin/pg_config
[digoal@localhost ~]$ cd
[digoal@localhost ~]$ vi .bash_profile
......
export PATH=/opt/pgsql_934/bin:$PATH
......
查看pg_config是否配置成功:
[digoal@localhost pgfincore-0c89186]$ pg_config
BINDIR = /opt/pgsql_934/bin
DOCDIR = /opt/pgsql_934/share/doc
HTMLDIR = /opt/pgsql_934/share/doc
INCLUDEDIR = /opt/pgsql_934/include
PKGINCLUDEDIR = /opt/pgsql_934/include
INCLUDEDIR-SERVER = /opt/pgsql_934/include/server
LIBDIR = /opt/pgsql_934/lib
PKGLIBDIR = /opt/pgsql_934/lib
LOCALEDIR = /opt/pgsql_934/share/locale
MANDIR = /opt/pgsql_934/share/man
SHAREDIR = /opt/pgsql_934/share
SYSCONFDIR = /opt/pgsql_934/etc
PGXS = /opt/pgsql_934/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--prefix=/opt/pgsql934'
CC = gcc
CPPFLAGS = -D_GNU_SOURCE
CFLAGS = -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv
CFLAGS_SL = -fpic
LDFLAGS = -L../../../src/common -Wl,--as-needed -Wl,-rpath,'/opt/pgsql934/lib',--enable-new-dtags
LDFLAGS_EX =
LDFLAGS_SL =
LIBS = -lpgport -lpgcommon -lz -lreadline -lcrypt -ldl -lm
VERSION = PostgreSQL 9.3.4
[root@localhost pgfincore-0c89186]# export PATH=/opt/pgsql_934/bin:$PATH
[digoal@localhost pgfincore-0c89186]$ cd /usr/local/rj/postgresql-9.3.4/contrib/pgfincore-0c89186
[digoal@localhost pgfincore-0c89186]$ cd /usr/local/rj/postgresql-9.3.4/contrib/pgfincore-0c89186
[digoal@localhost pgfincore-0c89186]$ gmake clean
root@localhost pgfincore-0c89186]# gmake
[root@localhost pgfincore-0c89186]# gmake install
[root@localhost pgfincore-0c89186]# cd /opt/pgsql_934/lib
[root@localhost lib]# ll
total 7132
......
-rwxr-xr-x. 1 root root 70205 Jun 11 13:29 pgfincore.so
......
digoal=# create extension pgfincore;
CREATE EXTENSION
查看pgfincore的函数都有哪些
postgres=# \dx+ pgfincore
Objects in extension "pgfincore"
Object Description
------------------------------------------------------------------------------
function pgfadvise(regclass,text,integer)
function pgfadvise_dontneed(regclass)
function pgfadvise_loader(regclass,integer,boolean,boolean,bit varying)
function pgfadvise_loader(regclass,text,integer,boolean,boolean,bit varying)
function pgfadvise_normal(regclass)
function pgfadvise_random(regclass)
function pgfadvise_sequential(regclass)
function pgfadvise_willneed(regclass)
function pgfincore(regclass)
function pgfincore(regclass,boolean)
function pgfincore(regclass,text,boolean)
function pgfincore_drawer(bit varying)
function pgsysconf()
function pgsysconf_pretty()
(14 rows)
查看系统的块大小。
digoal=# select * from pgsysconf();
os_page_size | os_pages_free | os_total_pages
--------------+---------------+----------------
4096 | 16871 | 253087
(1 row)
如果没有配置pg_config的话,可能会安装失败,在使用select pgfincore('test')的时候会报错“ERROR: fincore(7, 0, 8192, 0x1cac350): Function not implemented”。
postgres=# create table test (id int, info text);
CREATE TABLE
postgres=# insert into test select generate_series(1,10000), now();
INSERT 0 10000
postgres=# select * from pgfincore('test');
relpath | segment | os_page_size | rel_os_pages | pages_mem | group_mem | os_pages_free | databit | pages_dirty | group_dirty
------------------+---------+--------------+--------------+-----------+-----------+---------------+---------+-------------+-------------
base/12896/33248 | 0 | 4096 | 168 | 169 | 1 | 17982 | | |
(1 row)
告知系统,尽可能多的将表的数据文件刷入/刷出到内存中,但不是一下子就全部(不比pgfadvise_loader)。
注意page_mem这列:表示有多少系统文件会存放到os page cache中。
postgres=# create table pgtest(id bigint,firstname text,lastname text,corp text,post text,age int,crt_time timestamp without time zone,comment text);
CREATE TABLE
Time: 442.189 ms
postgres=# insert into pgtest(id,firstname,lastname,corp,post,age,crt_time,comment) select generate_series(1,10000000),'zhou','digoal'||generate_series(2,10000001),'sky-mobi','dba team leader',28,clock_timestamp(),'abcdefg'||generate_series(3,10000002);
INSERT 0 10000000
Time: 222696.545 ms
刚插入的表中会有一部分在shared buffer中:
postgres=# select * from pgfincore('pgtest'::regclass);
relpath | segment | os_page_size | rel_os_pages | pages_mem | group_mem | os_pages_free | databit | pages_dirty | group_dirty
--------------------+---------+--------------+--------------+-----------+-----------+---------------+---------+-------------+-------------
base/12896/33260 | 0 | 4096 | 262144 | 100812 | 15218 | 16483 | | |
base/12896/33260.1 | 1 | 4096 | 4524 | 67 | 4 | 16483 | | |
(2 rows)
刷出:
postgres=# select * from pgfadvise_dontneed('pgtest'::regclass);
relpath | os_page_size | rel_os_pages | os_pages_free
--------------------+--------------+--------------+---------------
base/12896/33260 | 4096 | 262144 | 119922
base/12896/33260.1 | 4096 | 4524 | 120019
(2 rows)
Time: 305.337 ms
postgres=# select * from pgfincore('pgtest'::regclass);
relpath | segment | os_page_size | rel_os_pages | pages_mem | group_mem | os_pages_free | databit | pages_dirty | group_dirty
--------------------+---------+--------------+--------------+-----------+-----------+---------------+---------+-------------+-------------
base/12896/33260 | 0 | 4096 | 262144 | 412 | 142 | 119494 | | |
base/12896/33260.1 | 1 | 4096 | 4524 | 0 | 0 | 119494 | | |
(2 rows)
Time: 10.001 ms
注意在使用pgfadvise_dontneed再刷出之后此时的pages_mem还剩下142,并没有全部刷出(与pgfadvise_loader刷出有所不同)。
查看select count(*)所需时间:
postgres=# select count(*) from pgtest;
count
----------
10000000
(1 row)
Time: 47356.265 ms
查看在运行select count(*)期间io使用情况,可以发现与磁盘有大量的io操作(bi, bo):
[pg93@localhost ~]$ vmstat -n 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 1 251072 223176 2592 650488 0 10 47 56 27 40 0 0 99 0 0
0 1 251072 219688 2592 654032 0 0 3512 0 354 579 0 4 77 19 0
0 1 251072 216216 2592 657356 0 0 3328 0 350 555 0 5 78 17 0
2 0 251072 212372 2600 661148 0 0 3832 12 341 557 0 5 77 18 0
2 0 251072 207784 2600 665548 0 0 4400 0 416 643 1 5 93 1 0
0 0 251072 203444 2600 669944 0 0 4360 0 420 640 0 6 94 0 0
0 0 251072 199476 2600 673744 0 0 3816 0 428 657 0 7 92 1 0
0 0 251072 194888 2600 678352 0 0 4616 0 413 633 0 6 93 1 0
1 1 251032 181992 2696 690800 140 0 12644 12 532 717 1 11 70 18 0
0 2 251032 131524 2696 740304 0 0 49528 0 946 940 10 39 0 51 0
0 2 251032 106724 2696 764272 0 0 23976 8408 607 450 4 21 0 75 0
0 2 251032 86760 2696 783884 0 0 19608 12288 821 878 4 16 0 79 0
0 2 251032 67664 2696 802600 0 0 18728 16384 768 377 2 19 0 79 0
0 3 251032 66820 2704 803564 0 0 15168 13872 526 333 2 14 0 84 0
0 3 251032 62976 2704 807428 0 0 3848 10000 351 325 1 2 0 97 0
0 3 251032 70788 2704 799768 0 0 3840 10768 359 241 0 2 0 98 0
0 3 251032 67068 2704 803604 0 0 3832 15672 457 303 1 2 0 97 0
2 3 251032 63348 2704 807040 0 0 3448 14424 457 481 1 3 0 96 0
0 3 251032 71532 2712 798576 0 0 3056 13216 664 702 1 6 0 93 0
0 3 251032 67812 2712 802952 0 0 4376 3616 343 153 1 2 0 97 0
0 3 251032 63720 2712 806788 0 0 3864 4480 385 224 2 1 0 97 0
0 3 251032 71408 2712 798832 0 0 3568 4696 289 164 1 3 0 96 0
0 3 251032 68184 2712 802472 0 0 3608 3600 421 523 0 3 0 97 0
0 4 251032 64340 2720 806172 24 0 3720 3880 439 399 2 1 0 97 0
0 4 251032 71152 2720 798244 0 0 3600 3012 407 443 2 3 0 95 0
0 3 251032 68556 2720 801936 0 0 3688 5388 395 157 1 5 0 94 0
0 3 251032 64712 2720 805620 0 0 3688 3824 423 540 1 1 0 98 0
0 2 251032 61116 2720 809176 0 0 3560 2556 361 333 1 1 0 98 0
1 0 251032 70912 2728 799576 0 0 13032 52 319 301 3 10 0 87 0
1 1 251032 71532 2728 798508 0 0 30056 0 605 566 9 21 0 70 0
0 2 251032 71408 2728 798756 0 0 22896 0 457 446 4 16 0 80 0
0 2 251032 71408 2728 798408 0 0 20120 0 433 422 4 16 0 80 0
0 2 251032 69052 2728 800824 0 0 5000 2900 350 237 2 1 0 97 0
1 2 251032 68680 2736 800960 0 0 34416 1068 778 677 8 28 0 64 0
0 2 251032 68928 2620 800692 0 0 14760 560 394 341 4 12 0 84 0
1 1 251032 68928 2620 800672 0 0 20968 504 445 428 3 16 0 80 0
0 1 251032 71284 2620 798044 0 0 30680 1064 743 708 5 29 0 66 0
0 3 251032 68308 2620 800928 0 0 8232 22732 580 218 2 9 0 89 0
0 4 251032 64092 2628 804992 0 0 4080 23476 623 175 1 6 0 93 0
1 3 251032 68556 2628 799956 0 0 3676 10124 358 367 0 4 0 96 0
0 3 251028 68680 2628 800312 0 0 3400 13280 456 700 1 2 0 97 0
0 3 251028 64464 2628 804440 0 0 4136 16272 486 369 0 3 0 97 0
1 3 251028 60744 2628 808124 0 0 3680 17160 452 197 1 2 0 97 0
0 3 251028 69672 2632 799416 0 0 3064 13800 457 581 0 2 0 98 0
1 2 251028 65084 2632 803776 0 0 4400 14224 492 290 1 3 0 96 0
1 3 251028 61240 2632 807616 0 0 3832 17104 495 306 0 2 0 98 0
0 3 251028 69300 2620 799500 0 0 3592 14464 449 555 1 3 0 96 0
0 3 251028 65208 2620 803488 0 0 4016 15072 468 449 0 2 0 98 0
0 3 251028 61612 2628 807088 0 0 3608 18444 478 146 1 2 0 97 0
0 3 251028 69300 2628 799508 0 0 4200 7976 420 454 2 4 0 94 0
3 1 251028 70912 2628 797856 0 0 19832 92 551 592 7 11 0 82 0
0 2 251028 71408 2628 796976 0 0 32144 0 653 614 11 22 0 67 0
0 2 251028 71408 2628 796664 0 0 26808 0 568 545 8 20 0 72 0
1 0 251028 71532 2636 795980 0 0 24888 12 654 581 10 20 0 70 0
0 3 251028 69292 2628 797288 0 0 7888 3348 682 546 4 13 0 83 0
1 1 251028 61240 2628 806288 0 0 9040 476 652 512 6 11 0 83 0
0 2 251028 71408 2628 796004 0 0 13256 600 641 536 6 17 0 77 0
2 1 251028 68060 2616 799044 0 0 10168 528 506 452 6 13 0 81 0
0 1 251028 71284 908 797132 0 0 21776 1316 1118 962 8 30 0 61 0
0 3 251028 71160 908 796732 0 0 17016 15884 1097 758 6 23 0 71 0
0 3 251028 68308 908 799460 0 0 2752 19464 649 674 0 4 0 96 0
0 3 251028 65084 908 802508 0 0 3048 17640 591 464 1 2 0 97 0
0 3 251028 63348 908 804236 0 0 1736 13632 469 569 0 2 0 98 0
0 3 251028 61364 916 806152 0 0 1880 23564 654 421 1 2 0 97 0
0 3 251028 70292 900 797148 0 0 2768 22424 751 390 1 6 0 93 0
0 3 251028 67688 900 799740 0 0 2592 18464 537 519 1 1 0 98 0
0 3 251028 65828 900 801624 0 0 1872 16928 573 838 0 3 0 97 0
0 3 251028 63472 900 803868 0 0 2280 13288 466 434 0 1 0 99 0
0 3 251028 61364 908 805900 0 0 2008 15612 474 532 0 2 0 98 0
0 2 251028 71160 892 796224 0 0 2096 10908 577 973 1 2 0 97 0
0 2 251028 71408 884 795048 0 0 18272 0 823 699 4 25 0 71 0
0 1 251028 71532 884 794464 0 0 15808 0 679 657 2 24 0 74 0
3 1 251028 69300 388 797480 0 0 10184 0 473 426 2 17 0 81 0
2 0 251028 71408 176 795316 0 0 9392 12 459 444 2 16 0 82 0
0 2 251028 71160 160 795504 0 0 13200 0 647 605 5 19 0 76 0
0 2 251028 69920 160 796696 0 0 12680 4096 576 542 4 26 0 70 0
1 1 251028 68680 148 797568 0 0 28032 24576 711 543 9 27 0 65 0
2 0 251028 70168 148 796040 0 0 23296 24604 609 526 11 25 0 65 0
0 1 251028 70416 156 795796 0 0 25856 24628 630 552 13 24 0 63 0
0 2 251028 71532 156 793244 0 0 21248 41240 561 470 10 21 0 69 0
0 1 251028 71408 148 794380 0 0 22656 0 495 476 9 19 0 72 0
2 2 251028 71036 148 794924 0 0 23552 24576 571 490 9 19 0 72 0
0 2 251028 70540 140 795352 0 0 23424 24576 551 483 9 20 0 71 0
1 0 251024 66696 140 798888 24 0 12532 12288 763 366 48 15 0 37 0
0 0 251016 69928 416 801204 40 0 2032 68 623 278 36 9 18 37 0
0 0 251016 69060 416 802252 0 0 0 0 38 71 1 0 99 0 0
0 0 251012 68068 416 803332 0 0 0 0 38 73 0 1 99 0 0
0 0 251012 66208 420 804420 0 0 4 24184 123 86 0 4 93 3 0
0 0 251012 65092 420 805500 0 0 0 0 38 72 0 0 100 0 0
0 0 251012 64100 428 806580 0 0 0 68 44 90 0 1 99 0 0
0 0 251012 62984 428 807660 0 0 0 0 39 74 0 1 99 0 0
postgres=# select * from pgfadvise_willneed('pgtest'::regclass);
relpath | os_page_size | rel_os_pages | os_pages_free
--------------------+--------------+--------------+---------------
base/12896/33260 | 4096 | 262144 | 15306
base/12896/33260.1 | 4096 | 4524 | 15801
(2 rows)
Time: 2344.132 ms
postgres=# select * from pgfincore('pgtest'::regclass);
relpath | segment | os_page_size | rel_os_pages | pages_mem | group_mem | os_pages_free | databit | pages_dirty | group_dirty
--------------------+---------+--------------+--------------+-----------+-----------+---------------+---------+-------------+-------------
base/12896/33260 | 0 | 4096 | 262144 | 157035 | 9064 | 17971 | | |
base/12896/33260.1 | 1 | 4096 | 4524 | 4524 | 1 | 17971 | | |
(2 rows)
Time: 55.911 ms
查看select count(*)所需时间:
postgres=# select count(*) from pgtest;
count
----------
10000000
(1 row)
Time: 7500.932 ms
[pg93@localhost ~]$ vmstat -n 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 250848 72440 944 802808 0 10 49 57 27 40 0 0 99 0 0
0 0 250844 72300 944 802908 0 0 0 0 51 94 1 0 99 0 0
0 0 250844 71664 944 803040 0 0 0 24 88 152 0 1 99 0 0
0 0 250844 71672 944 803228 0 0 0 0 34 94 0 0 100 0 0
0 0 250844 71672 952 803364 0 0 0 56 54 126 0 1 99 0 0
0 0 250844 71672 952 803396 0 0 0 0 22 84 0 0 100 0 0
0 1 250844 68944 944 806676 0 0 14820 0 716 293 40 18 26 15 0
1 0 250844 67340 724 810168 0 0 83072 0 1210 1286 17 40 0 43 0
0 1 250844 67712 724 809872 0 0 91008 0 1299 1418 21 38 0 40 0
0 1 250844 70192 732 807452 0 0 83328 16 1206 1255 15 39 0 46 0
1 1 250844 70812 200 807372 0 0 91776 0 1351 1394 13 49 0 39 0
1 1 250844 71556 200 806116 0 0 34416 0 2097 2948 8 54 0 38 0
0 1 250844 71556 200 806232 0 0 17048 0 2131 3112 12 47 0 41 0
0 0 250844 71308 200 806900 0 0 568 0 746 104 62 10 24 3 0
0 0 250844 71060 208 807072 0 0 0 16 40 95 0 0 100 0 0
0 0 250844 70936 208 807216 0 0 0 0 31 92 0 0 100 0 0
0 0 250844 70424 208 807308 0 0 0 24 71 127 1 2 97 0 0
查看当前pgtest表的page_mem:(163570+3)
postgres=# select * from pgfincore('pgtest'::regclass);
relpath | segment | os_page_size | rel_os_pages | pages_mem | group_mem | os_pages_free | databit | pages_dirty | group_dirty
--------------------+---------+--------------+--------------+-----------+-----------+---------------+---------+-------------+-------------
base/12896/33260 | 0 | 4096 | 262144 | 163570 | 32 | 16855 | | |
base/12896/33260.1 | 1 | 4096 | 4524 | 3 | 1 | 16855 | | |
(2 rows)
Time: 37.738 ms
postgres=# create table pgfincore_pgtest_snapshot as select * from pgfincore('pgtest'::regclass,true);
SELECT 2
Time: 705.283 ms
postgres=# select pgfadvise_loader('pgtest'::regclass,segment,false,true,bitnot(databit)) from (select segment,databit from pgfincore('pgtest'::regclass,true)) as t; pgfadvise_loader
-----------------------------------------
(base/12896/33260,4096,181633,0,163570)
(base/12896/33260.1,4096,181636,0,3)
(2 rows)
Time: 250.228 ms
postgres=# select * from pgfincore('pgtest'::regclass);
relpath | segment | os_page_size | rel_os_pages | pages_mem | group_mem | os_pages_free | databit | pages_dirty | group_dirty
--------------------+---------+--------------+--------------+-----------+-----------+---------------+---------+-------------+-------------
base/12896/33260 | 0 | 4096 | 262144 | 0 | 0 | 181638 | | |
base/12896/33260.1 | 1 | 4096 | 4524 | 0 | 0 | 181638 | | |
(2 rows)
Time: 9.127 ms
restore还原snapshot的os cache:
postgres=# select pgfadvise_loader('pgtest'::regclass,segment,true,true,databit) from pgfincore_pgtest_snapshot;
pgfadvise_loader
--------------------------------------------
(base/12896/33260,4096,17942,163570,98574)
(base/12896/33260.1,4096,17973,3,4521)
(2 rows)
Time: 7236.190 ms
postgres=# select * from pgfincore('pgtest'::regclass);
relpath | segment | os_page_size | rel_os_pages | pages_mem | group_mem | os_pages_free | databit | pages_dirty | group_dirty
--------------------+---------+--------------+--------------+-----------+-----------+---------------+---------+-------------+-------------
base/12896/33260 | 0 | 4096 | 262144 | 163570 | 32 | 18565 | | |
base/12896/33260.1 | 1 | 4096 | 4524 | 3 | 1 | 18565 | | |
(2 rows)
Time: 38.081 ms
===============附录====================================================================
===========
PgFincore
===========
--------------------------------------------------------------
A set of functions to manage pages in memory from PostgreSQL
--------------------------------------------------------------
A set of functions to handle low-level management of relations using mincore to
explore cache memory.
DESCRIPTION
===========
With PostgreSQL, each Table or Index is splitted in segments of (usually) 1GB,
and each segment is splitted in pages in memory then in blocks for the
filesystem.
Those functions let you know which and how many disk block from a relation are
in the page cache of the operating system. It can provide the result as a VarBit
and can be stored in a table. Then using this table, it is possible to restore
the page cache state for each block of the relation, even in another server,
thanks to Streaming Replication.
Other functions are used to set a *POSIX_FADVISE* flag on the entire relation
(each segment). The more usefull are probably *WILLNEED* and *DONTNEED* which
push and pop blocks of each segments of a relation from page cache,
respectively.
Each functions are call with at least a table name or an index name (or oid)
as a parameter and walk each segment of the relation.
DOWNLOAD
========
You can grab the latest code with git::
git clone git://git.postgresql.org/git/pgfincore.git
or
git://github.com/klando/pgfincore.git
And the project is on pgfoundry : http://pgfoundry.org/projects/pgfincore
INSTALL
=======
From source code::
make clean
make
su
make install
For PostgreSQL >= 9.1, log in your database and::
mydb=# CREATE EXTENSION pgfincore;
For other release, create the functions from the sql script (it should be in
your contrib directory)::
psql mydb -f pgfincore.sql
PgFincore is also shipped with Debian scripts to build your own package::
aptitude install debhelper postgresql-server-dev-all postgresql-server-dev-9.1
# or postgresql-server-dev-8.4|postgresql-server-dev-9.0
make deb
dpkg -i ../postgresql-9.1-pgfincore_1.1.1-1_amd64.deb
PgFincore is packaged for *RPM* at http://yum.postgresql.org/
PgFincore is packaged for *debian* at http://pgapt.debian.net/
EXAMPLES
========
Here are some examples of usage. If you want more details go to Documentation_
Get current state of a relation
-------------------------------
May be useful::
cedric=# select * from pgfincore('pgbench_accounts');
relpath | segment | os_page_size | rel_os_pages | pages_mem | group_mem | os_pages_free | databit
--------------------+---------+--------------+--------------+-----------+-----------+---------------+---------
base/11874/16447 | 0 | 4096 | 262144 | 262144 | 1 | 81016 |
base/11874/16447.1 | 1 | 4096 | 65726 | 65726 | 1 | 81016 |
(2 rows)
Time: 31.563 ms
Load a table or an index in OS Page Buffer
------------------------------------------
You may want to try to keep a table or an index into the OS Page Cache, or
preload a table before your well know big query is executed (reducing the query
time).
To do so, just execute the following query::
cedric=# select * from pgfadvise_willneed('pgbench_accounts');
relpath | os_page_size | rel_os_pages | os_pages_free
--------------------+--------------+--------------+---------------
base/11874/16447 | 4096 | 262144 | 169138
base/11874/16447.1 | 4096 | 65726 | 103352
(2 rows)
Time: 4462,936 ms
* The column *os_page_size* report that page size is 4KB.
* The column *rel_os_pages* is the number of pages of the specified file.
* The column *os_pages_free* is the number of free pages in memory (for caching).
Snapshot and Restore the OS Page Buffer state of a table or an index (or more)
------------------------------------------------------------------------------
You may want to restore a table or an index into the OS Page Cache as it was
while you did the snapshot. For example if you have to reboot your server, then
when PostgreSQL start up the first queries might be slower because neither
PostgreSQL or the OS have pages in their respective cache about the relations
involved in those first queries.
Executing a snapshot and a restore is very simple::
-- Snapshot
cedric=# create table pgfincore_snapshot as
cedric-# select 'pgbench_accounts'::text as relname,*,now() as date_snapshot
cedric-# from pgfincore('pgbench_accounts',true);
-- Restore
cedric=# select * from pgfadvise_loader('pgbench_accounts', 0, true, true,
(select databit from pgfincore_snapshot
where relname='pgbench_accounts' and segment = 0));
relpath | os_page_size | os_pages_free | pages_loaded | pages_unloaded
------------------+--------------+---------------+--------------+----------------
base/11874/16447 | 4096 | 80867 | 262144 | 0
(1 row)
Time: 35.349 ms
* The column *pages_loaded* report how many pages have been read to memory
(they may have already been in memoy)
* The column *pages_unloaded* report how many pages have been removed from
memory (they may not have already been in memoy);
SYNOPSIS
========
::
pgsysconf(OUT os_page_size bigint, OUT os_pages_free bigint,
OUT os_total_pages bigint)
RETURNS record
pgsysconf_pretty(OUT os_page_size text, OUT os_pages_free text,
OUT os_total_pages text)
RETURNS record
pgfadvise(IN relname regclass, IN fork text, IN action int,
OUT relpath text, OUT os_page_size bigint,
OUT rel_os_pages bigint, OUT os_pages_free bigint)
RETURNS setof record
pgfadvise_willneed(IN relname regclass,
OUT relpath text, OUT os_page_size bigint,
OUT rel_os_pages bigint, OUT os_pages_free bigint)
RETURNS setof record
pgfadvise_dontneed(IN relname regclass,
OUT relpath text, OUT os_page_size bigint,
OUT rel_os_pages bigint, OUT os_pages_free bigint)
RETURNS setof record
pgfadvise_normal(IN relname regclass,
OUT relpath text, OUT os_page_size bigint,
OUT rel_os_pages bigint, OUT os_pages_free bigint)
RETURNS setof record
pgfadvise_sequential(IN relname regclass,
OUT relpath text, OUT os_page_size bigint,
OUT rel_os_pages bigint, OUT os_pages_free bigint)
RETURNS setof record
pgfadvise_random(IN relname regclass,
OUT relpath text, OUT os_page_size bigint,
OUT rel_os_pages bigint, OUT os_pages_free bigint)
RETURNS setof record
pgfadvise_loader(IN relname regclass, IN fork text, IN segment int,
IN load bool, IN unload bool, IN databit varbit,
OUT relpath text, OUT os_page_size bigint,
OUT os_pages_free bigint, OUT pages_loaded bigint,
OUT pages_unloaded bigint)
RETURNS setof record
pgfadvise_loader(IN relname regclass, IN segment int,
IN load bool, IN unload bool, IN databit varbit,
OUT relpath text, OUT os_page_size bigint,
OUT os_pages_free bigint, OUT pages_loaded bigint,
OUT pages_unloaded bigint)
RETURNS setof record
pgfincore(IN relname regclass, IN fork text, IN getdatabit bool,
OUT relpath text, OUT segment int, OUT os_page_size bigint,
OUT rel_os_pages bigint, OUT pages_mem bigint,
OUT group_mem bigint, OUT os_pages_free bigint,
OUT databit varbit)
RETURNS setof record
pgfincore(IN relname regclass, IN getdatabit bool,
OUT relpath text, OUT segment int, OUT os_page_size bigint,
OUT rel_os_pages bigint, OUT pages_mem bigint,
OUT group_mem bigint, OUT os_pages_free bigint,
OUT databit varbit)
RETURNS setof record
pgfincore(IN relname regclass,
OUT relpath text, OUT segment int, OUT os_page_size bigint,
OUT rel_os_pages bigint, OUT pages_mem bigint,
OUT group_mem bigint, OUT os_pages_free bigint,
OUT databit varbit)
RETURNS setof record
DOCUMENTATION
=============
pgsysconf
---------
This function output size of OS blocks, number of free page in the OS Page Buffer.
::
cedric=# select * from pgsysconf();
os_page_size | os_pages_free | os_total_pages
--------------+---------------+----------------
4096 | 80431 | 4094174
pgsysconf_pretty
----------------
The same as above, but with pretty output.
::
cedric=# select * from pgsysconf_pretty();
os_page_size | os_pages_free | os_total_pages
--------------+---------------+----------------
4096 bytes | 314 MB | 16 GB
pgfadvise_WILLNEED
------------------
This function set *WILLNEED* flag on the current relation. It means that the
Operating Sytem will try to load as much pages as possible of the relation.
Main idea is to preload files on server startup, perhaps using cache hit/miss
ratio or most required relations/indexes.
::
cedric=# select * from pgfadvise_willneed('pgbench_accounts');
relpath | os_page_size | rel_os_pages | os_pages_free
--------------------+--------------+--------------+---------------
base/11874/16447 | 4096 | 262144 | 80650
base/11874/16447.1 | 4096 | 65726 | 80650
pgfadvise_DONTNEED
------------------
This function set *DONTNEED* flag on the current relation. It means that the
Operating System will first unload pages of the file if it need to free some
memory. Main idea is to unload files when they are not usefull anymore (instead
of perhaps more interesting pages)
::
cedric=# select * from pgfadvise_dontneed('pgbench_accounts');
relpath | os_page_size | rel_os_pages | os_pages_free
--------------------+--------------+--------------+---------------
base/11874/16447 | 4096 | 262144 | 342071
base/11874/16447.1 | 4096 | 65726 | 408103
pgfadvise_NORMAL
----------------
This function set *NORMAL* flag on the current relation.
pgfadvise_SEQUENTIAL
--------------------
This function set *SEQUENTIAL* flag on the current relation.
pgfadvise_RANDOM
----------------
This function set *RANDOM* flag on the current relation.
pgfadvise_loader
----------------
This function allow to interact directly with the Page Cache.
It can be used to load and/or unload page from memory based on a varbit
representing the map of the pages to load/unload accordingly.
Work with relation pgbench_accounts, segment 0, arbitrary varbit map::
-- Loading and Unloading
cedric=# select * from pgfadvise_loader('pgbench_accounts', 0, true, true, B'111000');
relpath | os_page_size | os_pages_free | pages_loaded | pages_unloaded
------------------+--------------+---------------+--------------+----------------
base/11874/16447 | 4096 | 408376 | 3 | 3
-- Loading
cedric=# select * from pgfadvise_loader('pgbench_accounts', 0, true, false, B'111000');
relpath | os_page_size | os_pages_free | pages_loaded | pages_unloaded
------------------+--------------+---------------+--------------+----------------
base/11874/16447 | 4096 | 408370 | 3 | 0
-- Unloading
cedric=# select * from pgfadvise_loader('pgbench_accounts', 0, false, true, B'111000');
relpath | os_page_size | os_pages_free | pages_loaded | pages_unloaded
------------------+--------------+---------------+--------------+----------------
base/11874/16447 | 4096 | 408370 | 0 | 3
pgfincore
---------
This function provide information about the file system cache (page cache).
::
cedric=# select * from pgfincore('pgbench_accounts');
relpath | segment | os_page_size | rel_os_pages | pages_mem | group_mem | os_pages_free | databit
--------------------+---------+--------------+--------------+-----------+-----------+---------------+---------
base/11874/16447 | 0 | 4096 | 262144 | 3 | 1 | 408444 |
base/11874/16447.1 | 1 | 4096 | 65726 | 0 | 0 | 408444 |
For the specified relation it returns:
* relpath : the relation path
* segment : the segment number analyzed
* os_page_size : the size of one page
* rel_os_pages : the total number of pages of the relation
* pages_mem : the total number of relation's pages in page cache.
(not the shared buffers from PostgreSQL but the OS cache)
* group_mem : the number of groups of adjacent pages_mem
* os_page_free : the number of free page in the OS page cache
* databit : the varbit map of the file, because of its size it is useless to output
Use pgfincore('pgbench_accounts',true) to activate it.
DEBUG
=====
You can debug the PgFincore with the following error level: *DEBUG1* and
*DEBUG5*.
For example::
set client_min_messages TO debug1; -- debug5 is only usefull to trace each block
LIMITATIONS
===========
* PgFincore needs mincore() and POSIX_FADVISE.
* PgFincore has a limited mode when POSIX_FADVISE is not provided by the platform.
* PgFincore needs PostgreSQL >= 8.3
* PgFincore does not work on windows.
SEE ALSO
========
2ndQuadrant, PostgreSQL Expertise, developement, training and 24x7 support:
http://2ndQuadrant.fr