当前位置: 首页 > 工具软件 > GAP system > 使用案例 >

DataGuard问题集2 -- MRP0一直显示WAIT_FOR_GAP

荣德厚
2023-12-01

DataGuard问题集2
– MRP0一直显示WAIT_FOR_GAP

现象:
MRP0一直显示WAIT_FOR_GAP的状态。

有了上次的教训,这次看到MRP0一直是WAIT_FOR_GAP的状态,先看看StandbyREDO是否正确构建了,答案是完全没问题。

接下来确认各种信息:
Standby端的结果:

SQL> select process,status from gv$managed_standby where process like 'MRP%';

PROCESS    STATUS
--------------------------- ------------------------------------
MRP0    WAIT_FOR_GAP★★


SQL> SELECT PROCESS,CLIENT_PROCESS C_PROCESS,STATUS,THREAD#,SEQUENCE#,BLOCK#,BLOCKS FROM GV$MANAGED_S

PROCESS    C_PROCESS  STATUS    THREAD#  SEQUENCE#   BLOCK#     BLOCKS
---------- ---------- ---------- ---------- ---------- ---------- ----------
ARCH   ARCH       CLOSING  2  2009 1  23
DGRD   N/A      ALLOCATED  0     0 0   0
・・・
RFS   LGWR       IDLE  2  2010   745932   1
MRP0   N/A      WAIT_FOR_G  2  2002 0   0★★

===>MRP0在等待应用Thread2Seq 2002.

SQL> select inst_id,thread#,sequence#,status from gv$managed_standby where process='MRP0'; 
INST_ID THREAD# SEQUENCE# STATUS 2  2  2002  WAIT_FOR_GAP  ★ 

===>MRP0在等待应用Thread2Seq 2002.

SQL> SELECT al.thrd "Thread", almax "Last Seq Received", lhmax "Last Seq Applied" FROM (select thread# thrd, MAX(sequence#) almax FROM v$archived_log WHERE resetlogs_change#=(SELECT resetlogs_change# FROM v$database) GROUP BY thread#) al, (SELECT thread# thrd, MAX(sequence#) lhmax FROM v$log_history WHERE resetlogs_change#=(SELECT resetlogs_change# FROM v$database) GROUP BY thread#) lh WHERE al.thrd = lh.thrd; 

Thread Last Seq Received Last Seq Applied 
1  2127  2117  
2  2015  2006  ★ 
3  1980  1971

===>
从上面的command结果,可以确认Thread 2 Sequence 2006已经应用上了。

Standby altert log

2021-03-05T17:25:16.476492+09:00
TT02 (PID:142753): LOG_FILE_NAME_CONVERT is not defined, stop clearing ORLs
2021-03-05T17:25:16.501732+09:00
PR00 (PID:142592): Media Recovery Waiting for T-2.S-2002
PR00 (PID:142592): Fetching gap from T-2.S-2002 to T-2.S-2003 ★
2021-03-05T17:25:17.005665+09:00
Completed: ALTER DATABASE RECOVER  managed standby database instances all disconnect  ★
2021-03-05T17:27:10.969534+09:00
PR00 (PID:142592): FAL: Failed to request gap sequence
PR00 (PID:142592):  GAP - thread 2 sequence 2002-2003 ★
PR00 (PID:142592):  DBID 4274345291 branch 1012493323
PR00 (PID:142592): FAL: All defined FAL servers have been attempted
PR00 (PID:142592): -------------------------------------------------------------------------
PR00 (PID:142592): Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
PR00 (PID:142592): parameter is defined to a value that's sufficiently large
PR00 (PID:142592): enough to maintain adequate log switch information to resolve
PR00 (PID:142592): archived redo log gaps.
PR00 (PID:142592): -------------------------------------------------------------------------

===>MRP在等待Thread2 Seq2002,2003

我们知道Oracle数据库中,Standby端的REDO应用是按照SCN顺序应用的,所以已经应用完了Thread2 Seq2006,接下来应该应用2007,但是却在等待应用旧的REDO Thread2 Seq2002,这两者是矛盾的。这种情况跟数据库本身的应用逻辑是矛盾的,可以判断出是BUG。

STANDBY: Datafiles Checkpoint not Updated at Standby Database when Media Recover is running (Doc ID 29056767.8)
原因:18c开始,把log切换的检查点(checkpoint)更新改换成基于时间的检查点更新。time-based checkpoint的处理有问题,产生本次操作。这次的bug还会产生下面的现象。
A level 1 incremental backup on the standby may skip all the datafiles.

问题发生前的回避方法:
在Standby端设置:
set: alter system set “_time_based_rcv_ckpt_target”=0;
alter system set “_time_based_rcv_ckpt_target”=0 sid=’*’; (RAC)
设置后,MRP需要重启。

这个设置是改换成12.2及之前版本的checkpoint动作。

这个设置只会避免这个BUG的发生。但是已经发生了,就需要ROLL FORWARD这个Standby,跳过有问题的这段。或是重新建Standby。

谢谢

 类似资料: