KingbaseES: handling common errors when initializing the configuration with sys_backup.sh init

法和硕
2023-12-01

The sys_backup.sh script looks for its initialization configuration file in the following order:

[kingbase@postgres ~]$ sh -x sys_backup.sh init
+++ readlink -f sys_backup.sh
++ dirname /home/kingbase/sys_backup.sh
+ script_locate_folder=/home/kingbase
+ '[' '!' -f /home/kingbase/sys_backup.conf ']'
+ '[' '!' -f /home/kingbase/../share/sys_backup.conf ']'
+ /bin/echo 'ERROR: sys_backup.conf does not exist'
ERROR: sys_backup.conf does not exist
+ exit 1

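Running the script with sh -x shows that sys_backup.sh first looks for sys_backup.conf in the directory that contains the script itself (resolved with readlink -f and dirname; in this trace that is also the current directory, /home/kingbase), and then falls back to ../share/sys_backup.conf relative to that directory.
The sys_backup.conf found in the first location takes precedence.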
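
The lookup order visible in the trace can be sketched as a small shell function. This is only an illustration reconstructed from the sh -x output, not the script's actual source; the function name find_backup_conf is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: mirrors the two "-f" tests seen in the sh -x trace.
# find_backup_conf is a hypothetical helper, not part of sys_backup.sh.
find_backup_conf() {
    local script_path="$1"
    local script_dir candidate

    # Same resolution as in the trace: readlink -f, then dirname.
    script_dir="$(dirname "$(readlink -f "$script_path")")"

    # First the script's own directory, then ../share next to it.
    for candidate in "$script_dir/sys_backup.conf" \
                     "$script_dir/../share/sys_backup.conf"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done

    echo 'ERROR: sys_backup.conf does not exist' >&2
    return 1
}
```

Calling find_backup_conf with the path of sys_backup.sh prints whichever file would be picked up, which helps when copies exist in both bin and share.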

[kingbase@postgres share]$ ls -l /home/kingbase/
总用量 2343320
-rw-rw-r--  1 kingbase kingbase        529 1月  10 15:28 ltrace.out
-rw-rw-r--  1 kingbase kingbase        252 12月  6 16:48 my_test.c
-rw-rw-r--  1 kingbase kingbase      10403 12月 13 18:21 nohup.out
drwx------ 23 kingbase kingbase       4096 1月  19 13:38 ora_data
-rw-rw-r--  1 kingbase kingbase       2201 1月  10 15:26 strace.out
-rw-------  1 kingbase kingbase       8192 11月 10 19:05 sys_control
-rw-rw-r--  1 kingbase kingbase     774611 12月 13 18:14 sys_menu.sql
drwxrwxr-x  3 kingbase kingbase         16 11月 10 01:12 V8R6C6B21

[kingbase@postgres share]$ pwd
/home/kingbase/V8R6C6B21/ES/V8/Server/share
[kingbase@postgres share]$ ls -l sys_backup.conf 
-rw-rw-r-- 1 kingbase kingbase 1957 1月  16 18:32 sys_backup.conf
[kingbase@postgres share]$

About the sys_backup.conf configuration file:

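Parameter                   Description
_target_db_style            Mode selector: single, cluster, or single-pro. single targets a standalone database instance, cluster targets a clustered instance, and single-pro backs up each DB node of a cluster independently.
_one_db_ip                  IP or hostname of one database node; hostnames, IPv4 and IPv6 addresses are supported.
_repo_ip                    IP or hostname of the REPO backup node; hostnames, IPv4 and IPv6 addresses are supported.
_stanza_name                Label of the backup server; used only for physical backups.
_os_user_name               Operating system user name.
_repo_path                  Directory where the backup sets are actually stored.
_repo_retention_full_count  Number of full backups to keep; full backups beyond this count are removed automatically.
_crond_full_days            Interval in days between automatic full backups; 0 disables them.
_crond_diff_days            Interval in days between automatic differential backups; 0 disables them.
_crond_incr_days            Interval in days between automatic incremental backups; 0 disables them.
_crond_full_hour            Hour at which the automatic full backup runs; 2 means 2 a.m.
_crond_diff_hour            Hour at which the automatic differential backup runs; 3 means 3 a.m.
_crond_incr_hour            Hour at which the automatic incremental backup runs; 4 means 4 a.m.
_band_width                 Network bandwidth limit, always in MB/s. The default 0 means unlimited. The configuration file accepts digits only.
_os_ip_cmd                  Full path of the operating system's ip command.
_os_rm_cmd                  Full path of the operating system's rm command.
_os_sed_cmd                 Full path of the operating system's sed command.
_os_grep_cmd                Full path of the operating system's grep command.
_single_data_dir            Data directory of the standalone database node.
_single_bin_dir             Binary directory of the standalone database node.
_single_db_user             Database login user of the standalone database node.
_single_db_port             Port of the standalone database node.
_use_scmd                   Communication protocol; securecmdd by default, ssh optionally.
_start_fast                 Whether to start the backup quickly by forcing an immediate checkpoint; default y.
_compress_type              Whether to compress backups; the default none means no compression.
_non_archived_space         During init, the size of WAL that has not yet been archived is checked; if it exceeds this value, init reports an error and exits. Always in MB, allowed range 128 to 1024.

# Example sys_backup.conf configuration file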

_target_db_style="single"
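_one_db_ip="192.168.57.10"             # IP address of the database node to back up.
_repo_ip="192.168.57.10"               # IP address where the backups are stored; may be a different host.
_stanza_name="kingbase"                # Label for physical backups.
_os_user_name="kingbase"               # OS user that installed the database.
_repo_path="/home/kingbase/kbbr_repo"  # Directory where the backups are stored.
_repo_retention_full_count=5           # Number of full backups to keep (default 5); the 5 most recent are retained.
_crond_full_days=7                     # Interval in days between full backups; 0 disables them.
_crond_diff_days=0                     # Interval in days between differential backups; 0 disables them.
_crond_incr_days=1                     # Interval in days between incremental backups; 0 disables them.
_crond_full_hour=2                     # Hour of the full backup; 2 means it runs at 2 a.m.
_crond_diff_hour=3                     # Hour of the differential backup; 3 means it runs at 3 a.m.
_crond_incr_hour=4                     # Hour of the incremental backup; 4 means it runs at 4 a.m.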
_band_width=0                     
_os_ip_cmd="/sbin/ip"
_os_rm_cmd="/bin/rm"
_os_sed_cmd="/bin/sed"
_os_grep_cmd="/bin/grep"
_single_data_dir="/home/kingbase/ora_data"
_single_bin_dir="/home/kingbase/V8R6C6B21/ES/V8/Server/bin"
_single_db_user="system"
_single_db_port="54321"
_use_scmd=off
_start_fast=y
_compress_type=none
_non_archived_space=1024
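
Before running init, a few of the constraints from the parameter table can be checked by hand. The sketch below assumes the configuration file is plain shell variable assignments (as the example above suggests) and can therefore be sourced; check_backup_conf is a hypothetical helper, and sys_backup.sh may well perform its own, different validation.

```bash
#!/usr/bin/env bash
# Sketch only: check_backup_conf is a hypothetical pre-flight check.
# Assumption: sys_backup.conf consists of shell variable assignments.
check_backup_conf() {
    local conf_file="$1"
    (
        # Subshell, so the sourced variables do not leak into the caller.
        _target_db_style="" _band_width="" _non_archived_space=""
        . "$conf_file" || exit 1
        rc=0

        # Only the three documented modes are accepted.
        case "$_target_db_style" in
            single|cluster|single-pro) ;;
            *) echo "invalid _target_db_style: '$_target_db_style'" >&2; rc=1 ;;
        esac

        # _band_width must be digits only (an unset value is left alone).
        case "$_band_width" in
            *[!0-9]*) echo "_band_width must be digits only" >&2; rc=1 ;;
        esac

        # _non_archived_space, when set, must be a number from 128 to 1024.
        case "$_non_archived_space" in
            '') ;;
            *[!0-9]*) echo "_non_archived_space must be digits only" >&2; rc=1 ;;
            *)
                if [ "$_non_archived_space" -lt 128 ] || [ "$_non_archived_space" -gt 1024 ]; then
                    echo "_non_archived_space must be within 128..1024" >&2
                    rc=1
                fi
                ;;
        esac
        exit "$rc"
    )
}
```

The function returns 0 when all three checks pass, so it can be chained as check_backup_conf /path/to/sys_backup.conf && sys_backup.sh init.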

Running sys_backup.sh init to initialize the configuration:

Scenario 1: ERROR: [087]: archive_mode must be enabled

The archive_mode=on and archive_command='' parameters have not been configured.

[kingbase@postgres ~]$ sys_backup.sh init
# pre-condition: check the non-archived WAL files
The authenticity of host '192.168.57.10 (192.168.57.10)' can't be established.
ECDSA key fingerprint is SHA256:jbiSTLyWOSvk9P6Rrum0V/H5PE52Fz48bO69ttVGvjU.
ECDSA key fingerprint is MD5:87:25:68:6f:0d:d4:b3:30:33:f1:2c:94:26:f3:5f:be.
Are you sure you want to continue connecting (yes/no)? yes
root@192.168.57.10's password: 
# generate single sys_rman.conf...DONE
# update single archive_command with sys_rman.archive-push...DONE
# create stanza and check...(maybe 60+ seconds)
ERROR: check stanza failed, check log file /home/kingbase/V8R6C6B21/ES/V8/Server/log/sys_rman_check.log

[kingbase@postgres ~]$ cat /home/kingbase/V8R6C6B21/ES/V8/Server/log/sys_rman_check.log
2023-01-19 16:01:57.397 P00   INFO: check command begin 2.27: --archive-timeout=600 --band-width=0 --config=/home/kingbase/kbbr_repo/sys_rman.conf --exec-id=17446-c4a40c02 --log-level-console=info --log-level-file=info --log-path=/home/kingbase/V8R6C6B21/ES/V8/Server/log --log-subprocess --kb1-path=/home/kingbase/ora_data --kb1-port=54321 --kb1-user=system --process-max=4 --repo1-path=/home/kingbase/kbbr_repo --stanza=kingbase
ERROR: [087]: archive_mode must be enabled
2023-01-19 16:02:09.841 P00   INFO: check command end: aborted with exception [087]

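Solution:

# Set the following in kingbase.conf, then restart the database (archive_mode only takes effect at server start).
# archive_command can stay empty: init rewrites it in its "update single archive_command" step.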
archive_mode=on
archive_command=''
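
Whether the setting is actually present in the file can be checked with a short shell sketch. It assumes kingbase.conf uses the usual "name = value" syntax with # comments and that the last uncommented assignment wins; get_conf_value and archive_mode_enabled are hypothetical helpers. The sketch reads only the one file passed to it, so values coming from other files or from the running server are not seen.

```bash
#!/usr/bin/env bash
# Sketch only: get_conf_value and archive_mode_enabled are hypothetical helpers.
# Assumption: "name = value" lines, optional single quotes, "#" starts a comment.

# Print the value of the last uncommented assignment of a parameter.
get_conf_value() {
    local conf_file="$1" key="$2"
    sed -n -E "s/^[[:space:]]*${key}([[:space:]]*=[[:space:]]*|[[:space:]]+)'?([^'#[:space:]]*)'?.*$/\2/p" \
        "$conf_file" | tail -n 1
}

# Succeed when archive_mode is set to on (or always) in the given file.
archive_mode_enabled() {
    local mode
    mode="$(get_conf_value "$1" archive_mode)"
    [ "$mode" = "on" ] || [ "$mode" = "always" ]
}
```

For example, archive_mode_enabled /home/kingbase/ora_data/kingbase.conf || echo "archive_mode is not enabled" gives an early warning before init is attempted.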

Scenario 2: ERROR: Configured repo-path [/home/kingbase/kbbr_repo] already exists

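Solution:

The directory configured as _repo_path (/home/kingbase/kbbr_repo) already exists. Choose one of the following:
1. Change the setting to _repo_path='new directory'.
2. Delete the existing directory _repo_path='/home/kingbase/kbbr_repo' and run init again.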

Scenario 3: HINT: the kb1-path and kb1-port settings likely reference different clusters.

[kingbase@postgres ~]$ sys_backup.sh init
# pre-condition: check the non-archived WAL files
root@192.168.57.10's password: 
# generate single sys_rman.conf...DONE
# update single archive_command with sys_rman.archive-push...DONE
# create stanza and check...(maybe 60+ seconds)
ERROR: create stanza failed, check log file /home/kingbase/V8R6C6B21/ES/V8/Server/log/sys_rman_stanza-create.log
[kingbase@postgres ~]$ cat /home/kingbase/V8R6C6B21/ES/V8/Server/log/sys_rman_stanza-create.log
2023-01-19 16:01:48.647 P00   INFO: stanza-create command begin 2.27: --band-width=0 --config=/home/kingbase/kbbr_repo/sys_rman.conf --exec-id=17435-9cf03a4d --log-level-console=info --log-level-file=info --log-path=/home/kingbase/V8R6C6B21/ES/V8/Server/log --log-subprocess --kb1-path=/home/kingbase/ora_data --kb1-port=54321 --kb1-user=system --process-max=4 --repo1-path=/home/kingbase/kbbr_repo --stanza=kingbase
2023-01-19 16:01:57.158 P00   INFO: stanza-create for stanza 'kingbase' on repo1
2023-01-19 16:01:57.351 P00   INFO: stanza-create command end: completed successfully (8751ms)
2023-01-19 16:15:19.487 P00   INFO: stanza-create command begin 2.27: --band-width=0 --config=/home/kingbase/kbbr_repo/sys_rman.conf --exec-id=18215-51df97d6 --log-level-console=info --log-level-file=info --log-path=/home/kingbase/V8R6C6B21/ES/V8/Server/log --log-subprocess --kb1-path=/home/kingbase/ora_data --kb1-port=54321 --kb1-user=system --process-max=4 --repo1-path=/home/kingbase/kbbr_repo --stanza=kingbase
2023-01-19 16:15:30.568 P00   INFO: stanza-create for stanza 'kingbase' on repo1
2023-01-19 16:15:31.069 P00   INFO: stanza-create command end: completed successfully (11587ms)
2023-01-19 16:24:43.158 P00   INFO: stanza-create command begin 2.27: --band-width=0 --config=/home/kingbase/kbbr_repo/sys_rman.conf --exec-id=18872-d06fff97 --log-level-console=info --log-level-file=info --log-path=/home/kingbase/V8R6C6B21/ES/V8/Server/log --log-subprocess --kb1-path=/home/kingbase/ora_data --kb1-port=54321 --kb1-user=system --process-max=4 --repo1-path=/home/kingbase/kbbr_repo --stanza=kingbase
2023-01-19 16:24:43.599 P00   INFO: stanza-create for stanza 'kingbase' on repo1
2023-01-19 16:24:43.703 P00   INFO: stanza-create command end: completed successfully (602ms)
2023-01-19 16:38:40.530 P00   INFO: stanza-create command begin 2.27: --band-width=0 --config=/home/kingbase/kbbr_repo/sys_rman.conf --exec-id=19774-3cfc3082 --log-level-console=info --log-level-file=info --log-path=/home/kingbase/V8R6C6B21/ES/V8/Server/log --log-subprocess --kb1-path=/home/kingbase/ora_data --kb1-port=54321 --kb1-user=system --process-max=4 --repo1-path=/home/kingbase/kbbr_repo --stanza=kingbase
2023-01-19 16:38:52.681 P00   INFO: stanza-create for stanza 'kingbase' on repo1
2023-01-19 16:38:53.040 P00   INFO: stanza-create command end: completed successfully (12555ms)
2023-01-19 16:42:55.417 P00   INFO: stanza-create command begin 2.27: --band-width=0 --config=/home/kingbase/kbbr_repo/sys_rman.conf --exec-id=20238-e619c305 --log-level-console=info --log-level-file=info --log-path=/home/kingbase/V8R6C6B21/ES/V8/Server/log --log-subprocess --kb1-path=/home/kingbase/ora_data --kb1-port=54321 --kb1-user=system --process-max=4 --repo1-path=/home/kingbase/kbbr_repo --stanza=kingbase
2023-01-19 16:43:00.698 P00   INFO: stanza-create for stanza 'kingbase' on repo1
2023-01-19 16:43:00.841 P00   INFO: stanza-create command end: completed successfully (5430ms)
2023-01-19 16:51:57.076 P00   INFO: stanza-create command begin 2.27: --band-width=0 --config=/home/kingbase/kbbr_repo/sys_rman.conf --exec-id=20987-a35b7df2 --log-level-console=info --log-level-file=info --log-path=/home/kingbase/V8R6C6B21/ES/V8/Server/log --log-subprocess --kb1-path=/home/kingbase/ora_data --kb1-port=54321 --kb1-user=system --process-max=4 --repo1-path=/home/kingbase/kbbr_repo --stanza=kingbase
ERROR: [058]: version '12' and path '/home/kingbase/./ora_data' queried from cluster do not match version '12' and '/home/kingbase/ora_data' read from '/home/kingbase/ora_data/global/sys_control'
       HINT: the kb1-path and kb1-port settings likely reference different clusters.
2023-01-19 16:52:11.082 P00   INFO: stanza-create command end: aborted with exception [058]

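Solution:

The database was started with a relative data directory (-D ora_data), so the path reported by the running server (/home/kingbase/./ora_data) does not match the path recorded in sys_control (/home/kingbase/ora_data). Restart the database using an absolute path.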

[kingbase@postgres ~]$ ps -ef|grep kingbase| grep -v grep
root     10486 10288  0 13:38 pts/0    00:00:00 su - kingbase
kingbase 10487 10486  0 13:38 pts/0    00:00:01 -bash
root     15073 10325  0 15:30 pts/2    00:00:00 su - kingbase
kingbase 15074 15073  0 15:30 pts/2    00:00:00 -bash
kingbase 18637     1  0 16:24 ?        00:00:00 /home/kingbase/V8R6C6B21/ES/V8/KESRealPro/V008R006C006B0021/Server/bin/kingbase -D ora_data
kingbase 18638 18637  0 16:24 ?        00:00:00 kingbase: logger   
kingbase 18641 18637  0 16:24 ?        00:00:00 kingbase: checkpointer   
kingbase 18642 18637  0 16:24 ?        00:00:02 kingbase: background writer   
kingbase 18643 18637  0 16:24 ?        00:00:00 kingbase: walwriter   
kingbase 18644 18637  0 16:24 ?        00:00:00 kingbase: autovacuum launcher   
kingbase 18645 18637  0 16:24 ?        00:00:00 kingbase: archiver   last was 0000000100000000000000E8.00000028.backup
kingbase 18646 18637  0 16:24 ?        00:00:00 kingbase: stats collector   
kingbase 18647 18637  0 16:24 ?        00:00:00 kingbase: ksh writer   
kingbase 18648 18637  0 16:24 ?        00:00:02 kingbase: ksh collector   
kingbase 18649 18637  0 16:24 ?        00:00:00 kingbase: kwr collector   
kingbase 18650 18637  0 16:24 ?        00:00:00 kingbase: job bgworker   
kingbase 18651 18637  0 16:24 ?        00:00:00 kingbase: logical replication launcher   
kingbase 19322 10487  0 16:34 pts/0    00:00:00 ps -ef
[kingbase@postgres ~]$ 

# Start the database using an absolute path
[kingbase@postgres ~]$ sys_ctl -D /home/kingbase/ora_data/ start
waiting for server to start....2023-01-19 16:35:00.108 CST [19369] LOG:  sepapower extension initialized
2023-01-19 16:35:00.253 CST [19369] LOG:  starting KingbaseES V008R006C006B0021 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
2023-01-19 16:35:00.254 CST [19369] LOG:  listening on IPv4 address "0.0.0.0", port 54321
2023-01-19 16:35:00.254 CST [19369] LOG:  listening on IPv6 address "::", port 54321
2023-01-19 16:35:00.263 CST [19369] LOG:  listening on Unix socket "/tmp/.s.KINGBASE.54321"
2023-01-19 16:35:00.511 CST [19369] LOG:  redirecting log output to logging collector process
2023-01-19 16:35:00.511 CST [19369] HINT:  Future log output will appear in directory "sys_log".
 done
server started
[kingbase@postgres ~]$ ps -ef|grep kingbase| grep -v grep
root     10486 10288  0 13:38 pts/0    00:00:00 su - kingbase
kingbase 10487 10486  0 13:38 pts/0    00:00:01 -bash
root     15073 10325  0 15:30 pts/2    00:00:00 su - kingbase
kingbase 15074 15073  0 15:30 pts/2    00:00:00 -bash
kingbase 19369     1 11 16:34 ?        00:00:00 /home/kingbase/V8R6C6B21/ES/V8/KESRealPro/V008R006C006B0021/Server/bin/kingbase -D /home/kingbase/ora_data
kingbase 19370 19369  0 16:34 ?        00:00:00 kingbase: logger   
kingbase 19373 19369  0 16:34 ?        00:00:00 kingbase: checkpointer   
kingbase 19374 19369  0 16:34 ?        00:00:00 kingbase: background writer   
kingbase 19375 19369  0 16:34 ?        00:00:00 kingbase: walwriter   
kingbase 19376 19369  0 16:34 ?        00:00:00 kingbase: autovacuum launcher   
kingbase 19377 19369  0 16:34 ?        00:00:00 kingbase: archiver   
kingbase 19378 19369  0 16:34 ?        00:00:00 kingbase: stats collector   
kingbase 19379 19369  0 16:34 ?        00:00:00 kingbase: ksh writer   
kingbase 19380 19369  0 16:34 ?        00:00:00 kingbase: ksh collector   
kingbase 19381 19369  0 16:34 ?        00:00:00 kingbase: kwr collector   
kingbase 19382 19369  1 16:34 ?        00:00:00 kingbase: job bgworker   
kingbase 19383 19369  0 16:34 ?        00:00:00 kingbase: logical replication launcher   
kingbase 19389 10487  0 16:35 pts/0    00:00:00 ps -ef

# Run init again; initialization now succeeds
[kingbase@postgres ~]$ sys_backup.sh init
# pre-condition: check the non-archived WAL files
root@192.168.57.10's password: 
# generate single sys_rman.conf...DONE
# update single archive_command with sys_rman.archive-push...DONE
# create stanza and check...(maybe 60+ seconds)
# create stanza and check...DONE
# initial first full backup...(maybe several minutes)
# initial first full backup...DONE
# Initial sys_rman OK.
'sys_backup.sh start' should be executed when need back-rest feature.
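
The relative-path condition behind scenario 3 can be spotted from the server's command line before running init. The following is a rough sketch: datadir_from_cmdline and is_absolute_path are hypothetical helpers, and the sketch assumes the data directory is passed as a separate argument after -D, as in the ps output above.

```bash
#!/usr/bin/env bash
# Sketch only: both helpers are hypothetical.
# Assumption: the data directory follows "-D" as its own argument.

# Print the argument that follows -D in a kingbase command line.
datadir_from_cmdline() {
    local prev="" arg
    for arg in "$@"; do
        if [ "$prev" = "-D" ]; then
            printf '%s\n' "$arg"
            return 0
        fi
        prev="$arg"
    done
    return 1
}

# Succeed only for paths that start with a slash.
is_absolute_path() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}
```

With the command line from the first ps listing, datadir_from_cmdline prints ora_data and is_absolute_path fails for it. With the second listing it prints /home/kingbase/ora_data, which passes.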

Scenario 4: ERROR: can not access the host /tmp/sys_rman by root

[kingbase@postgres ~]$ sys_backup.sh init
# pre-condition: check the non-archived WAL files
ERROR: can not access the 192.168.57.10 /tmp/sys_rman by root

Solution: set PermitRootLogin yes in the sshd_config file and restart the sshd service.

# Check the PermitRootLogin parameter in sshd_config
[root@postgres ~]# cat /etc/ssh/sshd_config | grep PermitRootLogin 
PermitRootLogin no
# the setting of "PermitRootLogin without-password".
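
As the grep output shows, comment lines that merely mention PermitRootLogin are matched too. A slightly stricter check is sketched below. permit_root_login_value is a hypothetical helper that reads a single file and takes the first uncommented occurrence of the keyword; it does not follow Include directives or Match blocks, so treat the result as a hint only.

```bash
#!/usr/bin/env bash
# Sketch only: permit_root_login_value is a hypothetical helper.
# It prints the first uncommented PermitRootLogin value found in the file.
permit_root_login_value() {
    local config_file="${1:-/etc/ssh/sshd_config}"
    # Keywords are case-insensitive; commented lines have "#" in field 1.
    awk 'tolower($1) == "permitrootlogin" { print $2; exit }' "$config_file"
}
```

If the function prints anything other than yes, the password-based root access that init prompts for is likely to be refused.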

# Change the sshd_config setting to PermitRootLogin yes
# Restart the sshd service: systemctl restart sshd.service
Running sys_backup.sh init again now succeeds:

[kingbase@postgres ~]$ sys_backup.sh init
# pre-condition: check the non-archived WAL files
root@192.168.57.10's password: 
# generate single sys_rman.conf...DONE
# update single archive_command with sys_rman.archive-push...DONE
# create stanza and check...(maybe 60+ seconds)
# create stanza and check...DONE
# initial first full backup...(maybe several minutes)
# initial first full backup...DONE
# Initial sys_rman OK.
'sys_backup.sh start' should be executed when need back-rest feature.

[kingbase@postgres ~]$ sys_backup.sh start
# pre-condition: check the non-archived WAL files
Enable some sys_rman in crontab-daemon
0 2 */7 * * kingbase /home/kingbase/V8R6C6B21/ES/V8/Server/bin/sys_rman --config=/home/kingbase/kbbr_repo/sys_rman.conf --stanza=kingbase --archive-copy --type=full backup >> /home/kingbase/V8R6C6B21/ES/V8/Server/log/sys_rman_backup_full.log 2>&1
0 4 */1 * * kingbase /home/kingbase/V8R6C6B21/ES/V8/Server/bin/sys_rman --config=/home/kingbase/kbbr_repo/sys_rman.conf --stanza=kingbase --archive-copy --type=incr backup >> /home/kingbase/V8R6C6B21/ES/V8/Server/log/sys_rman_backup_incr.log 2>&1
[kingbase@postgres ~]$
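
The two crontab entries correspond to the _crond_*_days and _crond_*_hour values from the example configuration (7 and 2 for full, 1 and 4 for incremental; the differential interval is 0, so no entry appears). The mapping can be sketched as follows. cron_schedule is a hypothetical helper inferred from this output, not taken from the script.

```bash
#!/usr/bin/env bash
# Sketch only: cron_schedule is a hypothetical helper inferred from the
# crontab lines printed by "sys_backup.sh start".
# Usage: cron_schedule <interval_days> <hour>
cron_schedule() {
    local interval_days="$1" hour="$2"
    # An interval of 0 means the backup type is disabled: emit nothing.
    [ "$interval_days" -gt 0 ] || return 1
    # minute=0, hour=<hour>, day-of-month=*/<interval>, any month, any weekday
    printf '0 %s */%s * *\n' "$hour" "$interval_days"
}
```

Note that in standard cron syntax */7 in the day-of-month field matches days 1, 8, 15, 22 and 29 of each month, so the spacing restarts at every month boundary instead of being a strict 7-day interval.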

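Beyond the common errors above, if you run into a new error, execute the script with sh -x to print a detailed trace, which makes the problem easier to locate and fix.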

About the PermitRootLogin parameter in sshd_config:

Note: the default value of PermitRootLogin in sshd_config differs between operating systems.

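Value                 Root SSH login  Login method                                    Interactive shell
yes                   Allowed         No restriction                                  No restriction
without-password      Allowed         Any method except password                      No restriction
forced-commands-only  Allowed         Key-based only                                  Authorized commands only
no                    Not allowed     N/A                                             N/A
prohibit-password     Allowed         Password login disabled; key-based login works  No restriction (current name for without-password)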