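A developer told me that the Greenplum instance in the test environment had suddenly become unreachable. I logged in to the server and found no Greenplum processes running. I asked the developers whether anyone had changed anything in Greenplum, and they said nobody had touched it. Strange. So what happened? I tried starting the database with gpstart: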
[gpadmin@00_mdw ~]$ gpstart
20170517:10:53:59:017586 gpstart:00_mdw:gpadmin-[INFO]:-Starting gpstart with args:
20170517:10:53:59:017586 gpstart:00_mdw:gpadmin-[INFO]:-Gathering information and validating the environment...
20170517:10:53:59:017586 gpstart:00_mdw:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 4.3.10.0 build commit: f413ff3b006655f14b6b9aa217495ec94da5c96c'
20170517:10:53:59:017586 gpstart:00_mdw:gpadmin-[INFO]:-Greenplum Catalog Version: '201310150'
20170517:10:53:59:017586 gpstart:00_mdw:gpadmin-[INFO]:-Starting Master instance in admin mode
20170517:10:54:01:017586 gpstart:00_mdw:gpadmin-[CRITICAL]:-Failed to start Master instance in admin mode
20170517:10:54:01:017586 gpstart:00_mdw:gpadmin-[CRITICAL]:-Error occurred: non-zero rc: 1
Command was: 'env GPSESSID=0000000000 GPERA=None $GPHOME/bin/pg_ctl -D /home/gpadmin/gpdata/gpmaster/gpseg-1 -l /home/gpadmin/gpdata/gpmaster/gpseg-1/pg_log/startup.log -w -t 600 -o " -p 5432 -b 1 -z 0 --silent-mode=true -i -M master -C -1 -x 0 -c gp_role=utility " start'
rc=1, stdout='waiting for server to start...... stopped waiting
', stderr='pg_ctl: PID file "/home/gpadmin/gpdata/gpmaster/gpseg-1/postmaster.pid" does not exist
pg_ctl: could not start server
Examine the log output.
'
[gpadmin@00_mdw ~]$
The master failed to start, and pg_ctl says to examine the log output. The first entries I looked at showed only routine startup messages:
2017-05-16 11:18:20.666964 CST,,,p16542,th251283232,,,,0,,,seg-1,,,,,"LOG","00000","removing all temporary files",,,,,,,,"RemovePgTempFiles","fd.c",1873,
2017-05-16 11:18:20.692596 CST,,,p16542,th251283232,,,,0,,,seg-1,,,,,"LOG","00000","temporary files using default filespace",,,,,,,,"primaryMirrorPopulateFilespaceInfo","primary_mirror_mode.c",2569,
2017-05-16 11:18:20.693209 CST,,,p16542,th251283232,,,,0,,,seg-1,,,,,"LOG","00000","transaction files using default pg_system filespace",,,,,,,,"primaryMirrorPopulateFilespaceInfo","primary_mirror_mode.c",2629,
2017-05-16 13:27:17.059691 CST,,,p16630,th930637600,,,,0,,,seg-1,,,,,"LOG","00000","removing all temporary files",,,,,,,,"RemovePgTempFiles","fd.c",1873,
2017-05-16 13:27:17.062897 CST,,,p16630,th930637600,,,,0,,,seg-1,,,,,"LOG","00000","temporary files using default filespace",,,,,,,,"primaryMirrorPopulateFilespaceInfo","primary_mirror_mode.c",2569,
2017-05-16 13:27:17.063528 CST,,,p16630,th930637600,,,,0,,,seg-1,,,,,"LOG","00000","transaction files using default pg_system filespace",,,,,,,,"primaryMirrorPopulateFilespaceInfo","primary_mirror_mode.c",2629,
2017-05-17 10:53:59.610428 CST,,,p17597,th695740192,,,,0,,,seg-1,,,,,"LOG","00000","removing all temporary files",,,,,,,,"RemovePgTempFiles","fd.c",1873,
2017-05-17 10:53:59.643630 CST,,,p17597,th695740192,,,,0,,,seg-1,,,,,"LOG","00000","temporary files using default filespace",,,,,,,,"primaryMirrorPopulateFilespaceInfo","primary_mirror_mode.c",2569,
2017-05-17 10:53:59.644220 CST,,,p17597,th695740192,,,,0,,,seg-1,,,,,"LOG","00000","transaction files using default pg_system filespace",,,,,,,,"primaryMirrorPopulateFilespaceInfo","primary_mirror_mode.c",2629,
Blog source: http://blog.csdn.net/mchdba/article/details/72383684, by mchdba (黄杉). Please do not repost.
[gpadmin@00_mdw pg_log]$ ll -t
total 740
-rw-------. 1 gpadmin gpadmin 386 May 17 11:24 gpdb-2017-05-17_112454.csv
-rw-------. 1 gpadmin gpadmin 3951 May 17 11:24 startup.log
-rw-------. 1 gpadmin gpadmin 384 May 17 10:53 gpdb-2017-05-17_105359.csv
-rw-------. 1 gpadmin gpadmin 384 May 16 13:27 gpdb-2017-05-16_132717.csv
-rw-------. 1 gpadmin gpadmin 384 May 16 11:18 gpdb-2017-05-16_111820.csv
-rw-------. 1 gpadmin gpadmin 30004 May 16 11:17 gpdb-2017-05-16_000000.csv
-rw-------. 1 gpadmin gpadmin 0 May 15 00:00 gpdb-2017-05-15_000000.csv
-rw-------. 1 gpadmin gpadmin 0 May 14 00:00 gpdb-2017-05-14_000000.csv
-rw-------. 1 gpadmin gpadmin 0 May 13 00:00 gpdb-2017-05-13_000000.csv
-rw-------. 1 gpadmin gpadmin 0 May 12 00:00 gpdb-2017-05-12_000000.csv
-rw-------. 1 gpadmin gpadmin 0 May 11 00:00 gpdb-2017-05-11_000000.csv
-rw-------. 1 gpadmin gpadmin 0 May 10 00:00 gpdb-2017-05-10_000000.csv
-rw-------. 1 gpadmin gpadmin 13073 May 9 21:14 gpdb-2017-05-09_000000.csv
-rw-------. 1 gpadmin gpadmin 18458 May 8 11:38 gpdb-2017-05-08_000000.csv
-rw-------. 1 gpadmin gpadmin 0 May 7 00:00 gpdb-2017-05-07_000000.csv
[gpadmin@00_mdw pg_log]$ more gpdb-2017-05-17_112454.csv
2017-05-17 11:24:54.936656 CST,,,p17681,th-400611552,,,,0,,,seg-1,,,,,"LOG","F0000","invalid authentication method ""127.0.0.1/28""",,,,,"line 87 of configuration file ""/home/gpadmin/gpdata/gpmaster/gpseg-1/pg_hba.conf""",,0,,"hba.c",1095,
2017-05-17 11:24:54.936871 CST,,,p17681,th-400611552,,,,0,,,seg-1,,,,,"FATAL","XX000","could not load pg_hba.conf",,,,,,,0,,"postmaster.c",1529,
[gpadmin@00_mdw pg_log]$
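Reading a two-line CSV log by eye is easy, but the larger daily files are harder to scan. Here is a minimal Python sketch that pulls out the suspicious rows. The function name `find_problems` is hypothetical, and the column positions are an assumption taken from the log lines shown above (severity in the 17th field, SQLSTATE in the 18th, message in the 19th, context in the 24th); verify them against your own Greenplum version before relying on this.

```python
import csv

# 0-based column positions, as observed in the log lines above (an assumption).
SEVERITY, SQLSTATE, MESSAGE, CONTEXT = 16, 17, 18, 23


def find_problems(lines):
    """Return (time, severity, sqlstate, message, context) for suspicious rows.

    `lines` is any iterable of CSV text lines, for example an open file.
    """
    problems = []
    for row in csv.reader(lines):
        if len(row) <= CONTEXT:
            continue  # skip blank or truncated rows
        severity, sqlstate = row[SEVERITY], row[SQLSTATE]
        # Flag rows that are errors by severity, or that carry a
        # SQLSTATE other than the success code "00000".
        if severity in {"ERROR", "FATAL", "PANIC"} or sqlstate != "00000":
            problems.append(
                (row[0], severity, sqlstate, row[MESSAGE], row[CONTEXT])
            )
    return problems
```

To use it on a real file, open the log with `open(path, newline="")` and pass the file object to `find_problems`. Checking the SQLSTATE as well as the severity matters here: the line that names the bad pg_hba.conf entry is logged at severity LOG, not FATAL.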
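The file gpdb-2017-05-17_112454.csv states the problem clearly: the pg_hba.conf configuration file is invalid. A "local" (Unix-socket) record takes no address field, so the server read "127.0.0.1/28" as the authentication method and refused to load the file. If TCP access from that range was the intent, the record type should have been "host" instead of "local". I opened /home/gpadmin/gpdata/gpmaster/gpseg-1/pg_hba.conf and commented out the offending line (line 87 of the configuration file), the one containing "127.0.0.1/28":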
#local all all 127.0.0.1/28 trust
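To catch this kind of mistake before restarting, you can check the file for "local" records that have an address where the authentication method belongs. The following is a rough sketch and not a full pg_hba.conf validator: the function names are hypothetical, it detects only this one error pattern, and it assumes simple whitespace-separated fields with no quoted values.

```python
import ipaddress


def looks_like_address(token):
    """Return True if token parses as an IP address or CIDR block."""
    try:
        ipaddress.ip_network(token, strict=False)
        return True
    except ValueError:
        return False


def check_local_lines(text):
    """Return (line_number, line) for 'local' records whose fourth field,
    where the auth method belongs, is actually an address."""
    bad = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        fields = line.split()
        # A 'local' record is: local  database  user  auth-method
        if (
            len(fields) >= 4
            and fields[0] == "local"
            and looks_like_address(fields[3])
        ):
            bad.append((number, raw.strip()))
    return bad
```

Run against the contents of pg_hba.conf, it returns the line numbers to inspect. An empty list means only that this particular error is absent.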
[gpadmin@00_mdw pg_log]$ gpstart
20170517:11:28:20:017745 gpstart:00_mdw:gpadmin-[INFO]:-Starting gpstart with args:
20170517:11:28:20:017745 gpstart:00_mdw:gpadmin-[INFO]:-Gathering information and validating the environment...
20170517:11:28:20:017745 gpstart:00_mdw:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 4.3.10.0 build commit: f413ff3b006655f14b6b9aa217495ec94da5c96c'
20170517:11:28:20:017745 gpstart:00_mdw:gpadmin-[INFO]:-Greenplum Catalog Version: '201310150'
20170517:11:28:20:017745 gpstart:00_mdw:gpadmin-[INFO]:-Starting Master instance in admin mode
20170517:11:28:21:017745 gpstart:00_mdw:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20170517:11:28:21:017745 gpstart:00_mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20170517:11:28:21:017745 gpstart:00_mdw:gpadmin-[INFO]:-Setting new master era
20170517:11:28:21:017745 gpstart:00_mdw:gpadmin-[INFO]:-Master Started...
20170517:11:28:21:017745 gpstart:00_mdw:gpadmin-[INFO]:-Shutting down master
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[WARNING]:-Skipping startup of segment marked down in configuration: on 02_sdw directory /home/gpadmin/gpdata/gpdatam1/gpseg0 <<<<<
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[WARNING]:-Skipping startup of segment marked down in configuration: on 02_sdw directory /home/gpadmin/gpdata/gpdatam2/gpseg1 <<<<<
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[WARNING]:-Skipping startup of segment marked down in configuration: on 01_sdw directory /home/gpadmin/gpdata/gpdatam1/gpseg4 <<<<<
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[WARNING]:-Skipping startup of segment marked down in configuration: on 01_sdw directory /home/gpadmin/gpdata/gpdatam2/gpseg5 <<<<<
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:---------------------------
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:-Master instance parameters
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:---------------------------
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:-Database = template1
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:-Master Port = 5432
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:-Master directory = /home/gpadmin/gpdata/gpmaster/gpseg-1
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:-Timeout = 600 seconds
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:-Master standby = Off
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:---------------------------------------
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:-Segment instances that will be started
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:---------------------------------------
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- Host Datadir Port Role
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- 01_sdw /home/gpadmin/gpdata/gpdatap1/gpseg0 40000 Primary
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- 01_sdw /home/gpadmin/gpdata/gpdatap2/gpseg1 40001 Primary
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- 02_sdw /home/gpadmin/gpdata/gpdatap1/gpseg2 40000 Primary
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- 03_sdwm /home/gpadmin/gpdata/gpdatam1/gpseg2 50000 Mirror
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- 02_sdw /home/gpadmin/gpdata/gpdatap2/gpseg3 40001 Primary
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- 03_sdwm /home/gpadmin/gpdata/gpdatam2/gpseg3 50001 Mirror
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- 03_sdwm /home/gpadmin/gpdata/gpdatap1/gpseg4 40000 Primary
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- 03_sdwm /home/gpadmin/gpdata/gpdatap2/gpseg5 40001 Primary
Continue with Greenplum instance startup Yy|Nn (default=N):
> y
20170517:11:28:25:017745 gpstart:00_mdw:gpadmin-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait...
...
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:-Process results...
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:-----------------------------------------------------
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:- Successful segment starts = 8
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:- Failed segment starts = 0
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[WARNING]:-Skipped segment starts (segments are marked down in configuration) = 4 <<<<<<<<
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:-----------------------------------------------------
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:-
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:-Successfully started 8 of 8 segment instances, skipped 4 other segments
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:-----------------------------------------------------
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[WARNING]:-****************************************************************************
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[WARNING]:-There are 4 segment(s) marked down in the database
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[WARNING]:-To recover from this current state, review usage of the gprecoverseg
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[WARNING]:-management utility which will recover failed segment instance databases.
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[WARNING]:-****************************************************************************
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:-Starting Master instance 00_mdw directory /home/gpadmin/gpdata/gpmaster/gpseg-1
20170517:11:28:29:017745 gpstart:00_mdw:gpadmin-[INFO]:-Command pg_ctl reports Master 00_mdw instance active
20170517:11:28:30:017745 gpstart:00_mdw:gpadmin-[INFO]:-No standby master configured. skipping...
20170517:11:28:30:017745 gpstart:00_mdw:gpadmin-[WARNING]:-Number of segments not attempted to start: 4
20170517:11:28:30:017745 gpstart:00_mdw:gpadmin-[INFO]:-Check status of database with gpstate utility
[gpadmin@00_mdw pg_log]$
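The database is up again, but gpstart skipped four segments that are marked down, and its own warning points to gprecoverseg and gpstate as the next step. To keep a list of the affected segments, you can extract them from the saved gpstart output. This is a small sketch: the function name is hypothetical, and the pattern assumes the warning wording shown in the log above, which may differ in other Greenplum versions.

```python
import re

# Matches the gpstart warning for segments marked down, as it appears above.
SKIPPED = re.compile(
    r"Skipping startup of segment marked down in configuration:"
    r"\s+on\s+(?P<host>\S+)\s+directory\s+(?P<datadir>\S+)"
)


def skipped_segments(output):
    """Return (host, data_directory) for each segment gpstart skipped."""
    return [(m["host"], m["datadir"]) for m in SKIPPED.finditer(output)]
```

Given the output above, this returns the four mirror directories on 01_sdw and 02_sdw, which are the segments to check with gpstate before attempting recovery.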