当前位置: 首页 > 工具软件 > GAP system > 使用案例 >

DataGuard利用backup incremental from scn解决gap,11.2.0.4 v$archive_gap不显示存在的gap

萧懿轩
2023-12-01

1.问题现象

客户一套双节点primary 11.2.0.4库,备库为单实例11.2.0.4

备库alert中有如下gap报错,但是查询v$archive_gap视图并没有条目输出:

FAL[client]: Failed to request gap sequence
 GAP - thread 1 sequence 365-368
 DBID 353046371 branch 984508005
FAL[client]: All defined FAL servers have been attempted.
------------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
parameter is defined to a value that's sufficiently large
enough to maintain adequate log switch information to resolve
archivelog gaps.
------------------------------------------------------------

经过查询Mos,此为命中如下bug,在11.2.0.1确认受影响,受影响版本为12.1.0.1以下版本,在12.1.0.1中被修复。

Bug 10072528 - V$ARCHIVE_GAP may not detect archive gap when physical standby is open read only (Doc ID 10072528.8)

确认显示为bug之后,我们可以通过v$managed_standby视图来确认mrp进程是否在等待gap,是否真的存在gap,查询如下:

SQL> select process,pid,status,thread#,sequence# from v$managed_standby;

PROC      PID STATUS       THREAD# SEQUENCE#
---- -------- ------------ ------- ---------
ARCH     9906 CONNECTED          0         0
ARCH     9908 CONNECTED          0         0
ARCH     9910 CONNECTED          0         0
ARCH     9912 CONNECTED          0         0
MRP0    11068 WAIT_FOR_GAP       1       365
RFS     11118 IDLE               0         0
RFS     11100 IDLE               1       371
RFS     11102 IDLE               0         0

确认确实存在gap,接下来解决gap问题。

2.问题解决

经过查询确认,主库序号365归档已经删除,无法恢复,所以需要通过增量备份来解决standby gap问题。

(1)取消备库日志应用

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

(2)设置源端log_archive_dest_state_2为defer

SQL> ALTER SYSTEM SET log_archive_dest_state_2=defer scope=both;

设置该参数,停止主库发送日志给备库,防止由于在恢复期间备库启停导致主库异常。

(3)physical standby确认应用日志scn

SQL> SELECT TO_CHAR(CURRENT_SCN) FROM V$DATABASE;

(4)使用上一步SCN源端增量备份数据库

RMAN> backup incremental from scn 'scn number' database format 'backup path' tag 'tag';

(5)使用ftp/scp等方式传输增量备份到备库

(6)备库重启数据库至nomount状态,恢复standby控制文件

RMAN> shutdown immediate;
RMAN> startup nomount;
RMAN> restore standby controlfile from 'incremental backup path';
RMAN> alter database mount;

(7)catalog注册增量备份文件

RMAN> catalog start with '/rman/';

(8)确认数据文件位置正确

发现数据文件Size均为0M,这是由于控制文件记录的数据文件为primary数据文件名称,由于data_file_name_convert参数,所以路径显示是正确的,但是datafile name均为primary名称,所以此时无法进行recover database,将会报错RMAN-06094

RMAN> recover database noredo;

Starting recover at 12-SEP-19
using channel ORA_DISK_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 09/12/2019 23:40:46
RMAN-06094: datafile 1 must be restored
RMAN> report schema;

RMAN-06139: WARNING: control file is not current for REPORT SCHEMA
Report of database schema for database with db_unique_name HONORDG

List of Permanent Datafiles
===========================
File Size(MB) Tablespace           RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1    0        SYSTEM               ***     +DGDATA/honordg/datafile/system.256.984507931
2    0        SYSAUX               ***     +DGDATA/honordg/datafile/sysaux.257.984507931
3    0        UNDOTBS1             ***     +DGDATA/honordg/datafile/undotbs1.258.984507931
4    0        USERS                ***     +DGDATA/honordg/datafile/users.259.984507931
5    0        EXAMPLE              ***     +DGDATA/honordg/datafile/example.265.984508009
7    0        DUPTEST              ***     +DGDATA/honordg/datafile/duptest.287.995727159
8    0        MANTBS               ***     +DGDATA/honordg/datafile/mantbs.292.996075985

List of Temporary Files
=======================
File Size(MB) Tablespace           Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
1    20       TEMP                 32767       +DGDATA/honor/tempfile/temp.264.984508007

(9)catalog注册正确的standby数据库数据文件

RMAN> catalog start with '+dgdata/honordg/datafile';

searching for all files that match the pattern +dgdata/honordg/datafile

List of Files Unknown to the Database
=====================================
File Name: +dgdata/HONORDG/DATAFILE/SYSTEM.259.1013873459
File Name: +dgdata/HONORDG/DATAFILE/SYSAUX.260.1013873473
File Name: +dgdata/HONORDG/DATAFILE/UNDOTBS1.261.1013873489
File Name: +dgdata/HONORDG/DATAFILE/EXAMPLE.262.1013873495
File Name: +dgdata/HONORDG/DATAFILE/MANTBS.263.1013873503
File Name: +dgdata/HONORDG/DATAFILE/USERS.264.1013873511
File Name: +dgdata/HONORDG/DATAFILE/DUPTEST.265.1013873511

Do you really want to catalog the above files (enter YES or NO)? yes
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: +dgdata/HONORDG/DATAFILE/SYSTEM.259.1013873459
File Name: +dgdata/HONORDG/DATAFILE/SYSAUX.260.1013873473
File Name: +dgdata/HONORDG/DATAFILE/UNDOTBS1.261.1013873489
File Name: +dgdata/HONORDG/DATAFILE/EXAMPLE.262.1013873495
File Name: +dgdata/HONORDG/DATAFILE/MANTBS.263.1013873503
File Name: +dgdata/HONORDG/DATAFILE/USERS.264.1013873511
File Name: +dgdata/HONORDG/DATAFILE/DUPTEST.265.1013873511

(10)切换standby数据文件至正确位置

RMAN> list copy of database;

List of Datafile Copies
=======================

Key     File S Completion Time Ckp SCN    Ckp Time       
------- ---- - --------------- ---------- ---------------
22      1    A 12-SEP-19       10621997   22-NOV-19      
        Name: +DGDATA/honordg/datafile/system.259.1013873459
        Tag: TAG20190717T153043

23      2    A 12-SEP-19       10621997   22-NOV-19      
        Name: +DGDATA/honordg/datafile/sysaux.260.1013873473
        Tag: TAG20190717T153043

24      3    A 12-SEP-19       10621997   22-NOV-19      
        Name: +DGDATA/honordg/datafile/undotbs1.261.1013873489
        Tag: TAG20190717T153043

27      4    A 12-SEP-19       10621997   22-NOV-19      
        Name: +DGDATA/honordg/datafile/users.264.1013873511
        Tag: TAG20190717T153043

3       4    X 22-DEC-18       1729750    22-DEC-18      
        Name: /rman/users01.dbf
        Tag: TAG20181222T155659

25      5    A 12-SEP-19       10621997   22-NOV-19      
        Name: +DGDATA/honordg/datafile/example.262.1013873495
        Tag: TAG20190717T153043

28      7    A 12-SEP-19       10621997   22-NOV-19      
        Name: +DGDATA/honordg/datafile/duptest.265.1013873511
        Tag: TAG20190717T153043

26      8    A 12-SEP-19       10621997   22-NOV-19      
        Name: +DGDATA/honordg/datafile/mantbs.263.1013873503
        Tag: TAG20190717T153043


RMAN> switch database to copy;

datafile 1 switched to datafile copy "+DGDATA/honordg/datafile/system.259.1013873459"
datafile 2 switched to datafile copy "+DGDATA/honordg/datafile/sysaux.260.1013873473"
datafile 3 switched to datafile copy "+DGDATA/honordg/datafile/undotbs1.261.1013873489"
datafile 4 switched to datafile copy "+DGDATA/honordg/datafile/users.264.1013873511"
datafile 5 switched to datafile copy "+DGDATA/honordg/datafile/example.262.1013873495"
datafile 7 switched to datafile copy "+DGDATA/honordg/datafile/duptest.265.1013873511"
datafile 8 switched to datafile copy "+DGDATA/honordg/datafile/mantbs.263.1013873503"


--如果未指定noredo,则以open read only打开数据之前需要primary设置参数
--log_archive_dest_state_2为enable,使standby数据库可以读取源端联机日志情况,打开数据库
--否则打开时将报ORA-10458 ORA-01152 ORA-01110 ORA-10458
RMAN> recover database noredo;

Starting recover at 12-SEP-19
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=32 device type=DISK

Finished recover at 12-SEP-19

(11)只读打开standby数据库,开启应用,primary设置log_archive_dest_state_2为enable

Primary SQL> ALTER SYSTEM SET log_archive_dest_state_2=enable scope=both sid='*';
Standby SQL> ALTER DATABASE OPEN READ ONLY;
Standby SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

3.参考文档

RMAN-06094 or RMAN-06571 During Recovery or Switch to Copy at Standby Site (Doc ID 1339439.1)
Bug 10072528 - V$ARCHIVE_GAP may not detect archive gap when physical standby is open read only (Doc ID 10072528.8)

 

 类似资料: