需要准备三台Centos7
的服务、和一个虚拟
IP。
三台服务器都已经配置好静态IP。
没有配的可以参考:https://blog.csdn.net/qq_37481017/article/details/118732932
主机名 | IP | PostgreSQL版本 | Pgpool2版本 |
---|---|---|---|
server1 | 192.168.159.181 | 13.6 | 4.2.8 |
server2 | 192.168.159.182 | 13.6 | 4.2.8 |
server3 | 192.168.159.183 | 13.6 | 4.2.8 |
VIP | 192.168.159.184 | / | / |
以下所有操作需要在所有机器上执行、操作用户是root
修改机器主机名。
hostnamectl set-hostname server1
修改hosts文件、增加以下配置
vim /etc/hosts
# 增加以下配置
192.168.159.181 server1
192.168.159.182 server2
192.168.159.183 server3
yum install libtermcap-devel ncurses-devel libevent-devel readline-devel pcre-devel gcc openssl openssl-devel per perl wget -y
useradd -m -U postgres
# 修改postgres用户密码
passwd postgres
# 查看防火墙状态
firewall-cmd --state
# 临时停止防火墙
systemctl stop firewalld.service
# 禁止防火墙开机启动
systemctl disable firewalld.service
mkdir -p /app/software
reboot
PostgreSQL官网:https://www.postgresql.org/
从官网下载指定版本安装包,上传到指定位置。这里以/app/software目录为例。
以下操作需要在所有机器上执行、操作用户是root
PostgreSQL13.6版本构建需要make版本是3.80或者更高版本。先检查机器make版本是否符合要求
make --version
tar -zxvf postgresql-13.6.tar.gz
cd postgresql-13.6
# prefix指定安装到哪里
./configure --prefix=/app/postgresql-13.6
make world && make install-world
vim /etc/profile
# 增加以下配置
LD_LIBRARY_PATH=/app/postgresql-13.6/lib
export LD_LIBRARY_PATH
PATH=/app/postgresql-13.6/bin:$PATH
export PATH
MANPATH=/app/postgresql-13.6/share/man:$MANPATH
export MANPATH
# 刷新环境变量
source /etc/profile
# 验证环境变量配置是否成功
pg_config
chown -R postgres:postgres /app
以下操作只需要在
server1
机器执行、以postgres用户
su postgres
[postgres@server1 bin]$ /app/postgresql-13.6/bin/initdb -D /app/postgresql-13.6/data -W
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "zh_CN.UTF-8".
The default database encoding has accordingly been set to "UTF8".
initdb: could not find suitable text search configuration for locale "zh_CN.UTF-8"
The default text search configuration will be set to "simple".
Data page checksums are disabled.
# 输入超级用户密码
Enter new superuser password:
# 确认密码
Enter it again:
creating directory /app/postgresql-13.6/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... PRC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
/app/postgresql-13.6/bin/pg_ctl -D /app/postgresql-13.6/data -l logfile start
vim /app/postgresql-13.6/data/postgresql.conf
# 修改以下配置
listen_addresses = '*'
port = 5432
wal_level = replica
archive_mode = on
archive_command = 'cp "%p" "/app/postgresql-13.6/archivedir/%f"'
max_wal_senders = 10
max_replication_slots = 10
hot_standby = on
wal_log_hints = on
vim /app/postgresql-13.6/data/pg_hba.conf
# 修改以下配置
host all all 0.0.0.0/0 trust
host replication all 0.0.0.0/0 trust
mkdir -p /app/postgresql-13.6/archivedir && mkdir -p /app/postgresql-13.6/logs
# -D 指定数据存放目录 -l 指定日志目录文件
[postgres@server1 bin]$ /app/postgresql-13.6/bin/pg_ctl start -D /app/postgresql-13.6/data/ -l /app/postgresql-13.6/logs/postgres.log
waiting for server to start.... done
server started
[postgres@server1 bin]$ /app/postgresql-13.6/bin/psql -U postgres -p 5432
psql (13.6)
Type "help" for help.
postgres=# SET password_encryption = 'md5';
SET
postgres=# CREATE ROLE pgpool WITH LOGIN;
CREATE ROLE
postgres=# CREATE ROLE repl WITH REPLICATION LOGIN;
CREATE ROLE
postgres=# \password pgpool
Enter new password for user "pgpool":
Enter it again:
postgres=# \password repl
Enter new password for user "repl":
Enter it again:
postgres=# \password postgres
Enter new password for user "postgres":
Enter it again:
postgres=# GRANT pg_monitor TO pgpool;
GRANT ROLE
postgres=# \q
官网地址:https://pgpool.net/mediawiki/index.php/Main_Page
从官网下载指定版本安装包,上传到指定位置。这里以/app/software目录为例。
以下操作需要在所有机器上执行、操作用户
postgres
# 在postgres用户目录下创建
mkdir ~/.ssh && chmod 700 ~/.ssh && cd ~/.ssh
# 生成公钥和私钥
ssh-keygen -t rsa -f id_rsa_pgpool
# 远程拷贝给对方机器
ssh-copy-id -i id_rsa_pgpool.pub postgres@server1
ssh-copy-id -i id_rsa_pgpool.pub postgres@server2
ssh-copy-id -i id_rsa_pgpool.pub postgres@server3
vim /app/postgresql-13.6/.pgpass
# 主机名:端口:库名:用户名:密码
# 增加以下配置
server1:5432:replication:repl:repl
server2:5432:replication:repl:repl
server3:5432:replication:repl:repl
server1:5432:postgres:postgres:postgres
server2:5432:postgres:postgres:postgres
server3:5432:postgres:postgres:postgres
chmod 600 /app/postgresql-13.6/.pgpass
tar -zxvf pgpool-II-4.2.8.tar.gz
cd /app/software/pgpool-II-4.2.8
# --prefix指定安装目录 --with-pgsql指定postgresql的目录
./configure --prefix=/app/pgpool2-4.2.8 --with-pgsql=/app/postgresql-13.6/
make && make install
安装pgpool-recovery、在使用在线恢复时需要
cd /app/software/pgpool-II-4.2.8/src/sql/pgpool-recovery
make && make install
配置pgpool_node_id文件、server1是0、server2是1、server3是2
vim /app/pgpool2-4.2.8/etc/pgpool_node_id
0
以下操作全部在server1机器上执行、postgres用户
# 复制failover故障转移和follow_primary故障转移后备机和新的主同步
cp /app/pgpool2-4.2.8/etc/failover.sh.sample /app/pgpool2-4.2.8/etc/failover.sh && \
cp /app/pgpool2-4.2.8/etc/follow_primary.sh.sample /app/pgpool2-4.2.8/etc/follow_primary.sh && \
cd /app/pgpool2-4.2.8/etc/ && \
chmod 700 failover.sh follow_primary.sh
vim /app/pgpool2-4.2.8/etc/failover.sh
# 修改以下配置
PGHOME=/app/postgresql-13.6
vim /app/pgpool2-4.2.8/etc/follow_primary.sh
# 修改以下配置
PGHOME=/app/postgresql-13.6
ARCHIVEDIR=/app/postgresql-13.6/archivedir
REPLUSER=repl
PCP_USER=pgpool
PGPOOL_PATH=/app/pgpool2-4.2.8/bin
primary_conninfo = 'host=${NEW_PRIMARY_NODE_HOST} port=${NEW_PRIMARY_NODE_PORT} user=${REPLUSER} application_name=${NODE_HOST} passfile=''/app/postgresql-13.6/.pgpass'''
cp /app/pgpool2-4.2.8/etc/pcp.conf.sample /app/pgpool2-4.2.8/etc/pcp.conf
# 生成pgpool用户和密码
echo 'pgpool:'`/app/pgpool2-4.2.8/bin/pg_md5 pgpool` >> /app/pgpool2-4.2.8/etc/pcp.conf
echo 'localhost:9898:pgpool:pgpool' > ~/.pcppass && chmod 600 ~/.pcppass
# 复制在线恢复脚本
cp /app/pgpool2-4.2.8/etc/recovery_1st_stage.sample /app/postgresql-13.6/data/recovery_1st_stage && cp /app/pgpool2-4.2.8/etc/pgpool_remote_start.sample /app/postgresql-13.6/data/pgpool_remote_start && chmod 700 /app/postgresql-13.6/data/recovery_1st_stage /app/postgresql-13.6/data/pgpool_remote_start
vim /app/postgresql-13.6/data/recovery_1st_stage
# 修改以下配置
PGHOME=/app/postgresql-13.6
ARCHIVEDIR=/app/postgresql-13.6/archivedir
primary_conninfo = 'host=${PRIMARY_NODE_HOST} port=${PRIMARY_NODE_PORT} user=${REPLUSER} application_name=${DEST_NODE_HOST} passfile=''/app/postgresql-13.6/.pgpass'''
vim /app/postgresql-13.6/data/pgpool_remote_start
# 修改以下配置
PGHOME=/app/postgresql-13.6
# 执行
psql template1 -c "CREATE EXTENSION pgpool_recovery"
cp /app/pgpool2-4.2.8/etc/pgpool.conf.sample-stream /app/pgpool2-4.2.8/etc/pgpool.conf
vim /app/pgpool2-4.2.8/etc/pgpool.conf
# 修改以下配置
listen_addresses = '*'
port = 9999
sr_check_user = 'pgpool'
sr_check_password = ''
health_check_period = 5
health_check_timeout = 30
health_check_user = 'pgpool'
health_check_password = ''
health_check_max_retries = 3
backend_hostname0 = 'server1'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/app/postgresql-13.6/data'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_application_name0 = 'server1'
backend_hostname1 = 'server2'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/app/postgresql-13.6/data'
backend_flag1 = 'ALLOW_TO_FAILOVER'
backend_application_name1 = 'server2'
backend_hostname2 = 'server3'
backend_port2 = 5432
backend_weight2 = 1
backend_data_directory2 = '/app/postgresql-13.6/data'
backend_flag2 = 'ALLOW_TO_FAILOVER'
backend_application_name2 = 'server3'
failover_command = '/app/pgpool2-4.2.8/etc/failover.sh %d %h %p %D %m %H %M %P %r %R %N %S'
follow_primary_command = '/app/pgpool2-4.2.8/etc/follow_primary.sh %d %h %p %D %m %H %M %P %r %R'
recovery_user = 'postgres'
recovery_password = ''
recovery_1st_stage_command = 'recovery_1st_stage'
pid_file_name = '/app/pgpool2-4.2.8/pgpool.pid'
enable_pool_hba = on
use_watchdog = on
delegate_IP = '192.168.159.184'
# ens33是你的网卡名
if_up_cmd = '/usr/bin/sudo /sbin/ip addr add $_IP_$/24 dev ens33 label ens33:0'
if_down_cmd = '/usr/bin/sudo /sbin/ip addr del $_IP_$/24 dev ens33'
arping_cmd = '/usr/bin/sudo /usr/sbin/arping -U $_IP_$ -w 1 -I ens33'
if_cmd_path = '/sbin'
arping_path = '/usr/sbin'
hostname0 = 'server1'
wd_port0 = 9000
pgpool_port0 = 9999
hostname1 = 'server2'
wd_port1 = 9000
pgpool_port1 = 9999
hostname2 = 'server3'
wd_port2 = 9000
pgpool_port2 = 9999
wd_lifecheck_method = 'heartbeat'
wd_interval = 10
heartbeat_hostname0 = 'server1'
heartbeat_port0 = 9694
heartbeat_device0 = ''
heartbeat_hostname1 = 'server2'
heartbeat_port1 = 9694
heartbeat_device1 = ''
heartbeat_hostname2 = 'server3'
heartbeat_port2 = 9694
heartbeat_device2 = ''
wd_heartbeat_keepalive = 2
wd_heartbeat_deadtime = 30
wd_escalation_command = '/app/pgpool2-4.2.8/etc/escalation.sh'
log_destination = 'stderr'
logging_collector = on
log_directory = '/app/pgpool2-4.2.8/logs'
log_filename = 'pgpool-%Y-%m-%d_%H%M%S.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 10MB
cp /app/pgpool2-4.2.8/etc/pool_hba.conf.sample /app/pgpool2-4.2.8/etc/pool_hba.conf
vim /app/pgpool2-4.2.8/etc/pool_hba.conf
# 修改以下配置
host replication all 0.0.0.0/0 md5
host all all 0.0.0.0/0 md5
# 生成pgpool和postgres密码文件
/app/pgpool2-4.2.8/bin/pg_md5 -f /app/pgpool2-4.2.8/etc/pgpool.conf -m -u pgpool pgpool
/app/pgpool2-4.2.8/bin/pg_md5 -f /app/pgpool2-4.2.8/etc/pgpool.conf -m -u postgres postgres
# 复制看门狗的脚本
cp /app/pgpool2-4.2.8/etc/escalation.sh.sample /app/pgpool2-4.2.8/etc/escalation.sh && chmod 700 escalation.sh
vim /app/pgpool2-4.2.8/etc/escalation.sh
# 修改以下配置
VIP=192.168.159.184
DEVICE=ens33
需要所有机器都执行
mkdir -p /app/pgpool2-4.2.8/logs
cd /app/pgpool2-4.2.8/etc
scp escalation.sh failover.sh follow_primary.sh pgpool.conf pcp.conf pool_hba.conf pool_passwd postgres@server2:/app/pgpool2-4.2.8/etc/
cd ~
scp .pcppass postgres@server2:/home/postgres/
切换成root、需要每台机器都修改
su root
vim /etc/sudoers
postgres ALL=NOPASSWD: /sbin/ip
postgres ALL=NOPASSWD:/usr/sbin/arping
需要每台机器都修改、操作用户是root
vim /usr/lib/systemd/system/pgpool.service
[Unit]
Description=Pgpool-II
After=syslog.target network.target
[Service]
Type=forking
User=postgres
Group=postgres
PIDFile=/app/pgpool2-4.2.8/pgpool.pid
ExecStart=/app/pgpool2-4.2.8/bin/pgpool -f /app/pgpool2-4.2.8/etc/pgpool.conf -F /app/pgpool2-4.2.8/etc/pcp.conf
ExecStop=/app/pgpool2-4.2.8/bin/pgpool -f /app/pgpool2-4.2.8/etc/pgpool.conf -F /app/pgpool2-4.2.8/etc/pcp.conf -m fast stop
ExecReload=/app/pgpool2-4.2.8/bin/pgpool -f /app/pgpool2-4.2.8/etc/pgpool.conf -F /app/pgpool2-4.2.8/etc/pcp.conf reload
[Install]
WantedBy=multi-user.target
# 启动服务
systemctl start pgpool
# 设置开机自启
systemctl enable pgpool
[root@server1 system]# systemctl status pgpool
● pgpool.service - Pgpool-II
Loaded: loaded (/usr/lib/systemd/system/pgpool.service; disabled; vendor preset: disabled)
Active: active (running) since 一 2022-05-23 19:17:08 CST; 4s ago
Main PID: 30782 (pgpool)
Memory: 131.2M
CGroup: /system.slice/pgpool.service
├─30782 /app/pgpool2-4.2.8/bin/pgpool -f /app/pgpool2-4.2.8/etc/pgpool.conf -F /app/pgpoo...
├─30783 pgpool: PgpoolLogger
└─30785 pgpool: watchdog
5月 23 19:17:08 server1 systemd[1]: Started Pgpool-II.
5月 23 19:17:08 server1 systemd[1]: Starting Pgpool-II...
su postgres
cd /app/pgpool2-4.2.8/bin
# 使用在线恢复脚本加入postgres备机 后面的1和2就是你的pgpool_node_id
./pcp_recovery_node -h 192.168.159.184 -p 9898 -U pgpool -n 1
Password:
pcp_recovery_node -- Command Successful
[postgres@server1 bin]$ ./pcp_recovery_node -h 192.168.159.184 -p 9898 -U pgpool -n 2
Password:
pcp_recovery_node -- Command Successful
psql -h 192.168.159.184 -p 9999 -U pgpool postgres -c "show pool_nodes"
# 可以看到现在server1是postgres主服务、2和3同步1的数据
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+----------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
0 | server1 | 5432 | up | 0.333333 | primary | 0 | true | 0 | | | 2022-05-23 19:21:47
1 | server2 | 5432 | up | 0.333333 | standby | 0 | false | 0 | streaming | async | 2022-05-23 19:21:47
2 | server3 | 5432 | up | 0.333333 | standby | 0 | false | 0 | streaming | async | 2022-05-23 19:21:47
(3 rows)
/app/pgpool2-4.2.8/bin/pcp_watchdog_info -h 192.168.159.184 -p 9898 -U pgpool
# 可以看到现在server1是主。2和3是备
server1:9999 Linux server1 server1 9999 9000 4 LEADER
server2:9999 Linux server2 server2 9999 9000 7 STANDBY
server3:9999 Linux server3 server3 9999 9000 7 STANDBY
停掉server1服务、然后再查看watchdog状态。
systemctl stop pgpool
# 现在是server2是主,3是备。1是宕机
server2:9999 Linux server2 server2 9999 9000 4 LEADER
server1:9999 Linux server1 server1 9999 9000 10 SHUTDOWN
server3:9999 Linux server3 server3 9999 9000 7 STANDBY
再启动server1服务、查看状态。
# server2是主,1和3是备
server2:9999 Linux server2 server2 9999 9000 4 LEADER
server1:9999 Linux server1 server1 9999 9000 7 STANDBY
server3:9999 Linux server3 server3 9999 9000 7 STANDBY
psql -h 192.168.159.184 -p 9999 -U pgpool postgres -c "show pool_nodes"
# 现在server1是主
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+----------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
0 | server1 | 5432 | up | 0.333333 | primary | 0 | true | 0 | | | 2022-05-23 19:26:31
1 | server2 | 5432 | up | 0.333333 | standby | 0 | false | 0 | streaming | async | 2022-05-23 19:26:31
2 | server3 | 5432 | up | 0.333333 | standby | 0 | false | 0 | streaming | async | 2022-05-23 19:26:31
停掉server1的postgresql服务。
/app/postgresql-13.6/bin/pg_ctl stop -D /app/postgresql-13.6/data/
# 可以看到现在server2是主,server3是备。1是宕机
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+----------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
0 | server1 | 5432 | down | 0.333333 | standby | 0 | false | 0 | | | 2022-05-23 19:27:42
1 | server2 | 5432 | up | 0.333333 | primary | 0 | false | 0 | | | 2022-05-23 19:27:42
2 | server3 | 5432 | up | 0.333333 | standby | 0 | true | 0 | streaming | async | 2022-05-23 19:27:47
# 查看单节点信息。server3是t、备状态
[postgres@server1 bin]$ psql -h server3 -p 5432 -U pgpool postgres -c "select pg_is_in_recovery()"
pg_is_in_recovery
-------------------
t
(1 row)
# server2是f、主节点
[postgres@server1 bin]$ psql -h server2 -p 5432 -U pgpool postgres -c "select pg_is_in_recovery()"
pg_is_in_recovery
-------------------
f
(1 row)
# 查看server2的同步信息
[postgres@server1 bin]$ psql -h server2 -p 5432 -U pgpool postgres -c "select * from pg_stat_replication" -x
-[ RECORD 1 ]----+------------------------------
pid | 31103
usesysid | 16385
usename | repl
application_name | server3
client_addr | 192.168.159.183
client_hostname |
client_port | 47512
backend_start | 2022-05-23 19:27:43.700107+08
backend_xmin |
state | streaming
sent_lsn | 0/60001B8
write_lsn | 0/60001B8
flush_lsn | 0/60001B8
replay_lsn | 0/60001B8
write_lag |
flush_lag |
replay_lag |
sync_priority | 0
sync_state | async
reply_time | 2022-05-23 19:29:34.022772+08
# 使用在线恢复脚本恢复server1
[postgres@server1 bin]$ ./pcp_recovery_node -h 192.168.159.184 -p 9898 -U pgpool -n 0
Password:
pcp_recovery_node -- Command Successful
# 可以看到server2是主,1和3是备
[postgres@server1 bin]$ psql -h 192.168.159.184 -p 9999 -U pgpool postgres -c "show pool_nodes"
Password for user pgpool:
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+----------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
0 | server1 | 5432 | up | 0.333333 | standby | 0 | false | 0 | streaming | async | 2022-05-23 19:31:02
1 | server2 | 5432 | up | 0.333333 | primary | 0 | false | 0 | | | 2022-05-23 19:27:42
2 | server3 | 5432 | up | 0.333333 | standby | 0 | true | 0 | streaming | async | 2022-05-23 19:27:47
(3 rows)