Oracle 11.2.0.4.0, RAC ADG standby: the MRP process hangs while applying redo

Root cause analysis:

1. Check the current state of the MRP process. MRP0 is running with OS PID 14450.

SYS@hddbsz2> select process,pid from V$managed_standby where process like '%MRP%';

PROCESS PID
--------- ----------
MRP0 14450
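
The PID by itself does not show whether redo apply is advancing. As a minimal sketch, assuming the standard V$MANAGED_STANDBY columns in 11.2, the same view can be queried a few times to see whether MRP0 is stuck on one position:

```sql
-- Run on the standby instance that hosts MRP0, a few times about a minute apart.
-- If THREAD#/SEQUENCE#/BLOCK# stay the same for MRP0 between runs,
-- redo apply is not advancing.
select process, pid, status, thread#, sequence#, block#
  from v$managed_standby
 where process like 'MRP%';
```

A STATUS of APPLYING_LOG with an unchanging BLOCK# is consistent with the hang described here, but it is only a hint, not proof of this specific bug.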

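2. Run a 10046 trace on the MRP process

The trace shows MRP waiting continuously on "parallel recovery slave next change", that is, waiting for its parallel recovery slaves to respond.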

oradebug setospid 14450
oradebug unlimit
oradebug event 10046 trace name context forever, level 12
-- let the trace run for a few minutes while MRP is stuck, then turn it off
oradebug event 10046 trace name context off
oradebug tracefile_name
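
Alongside the 10046 trace, the current wait of MRP0 and its recovery slaves can also be read from the dynamic views. This is a sketch, assuming the background processes are named MRP0 and PR0x (as in the hanganalyze excerpt from the MOS note) and that V$PROCESS.PNAME is available, which is the case from 11g onward:

```sql
-- Show what the recovery coordinator (MRP0) and its parallel slaves (PR0x)
-- are currently waiting on, on the local standby instance.
select p.pname, p.spid, s.sid, s.event, s.state, s.seconds_in_wait
  from v$process p
  join v$session s on s.paddr = p.addr
 where p.pname = 'MRP0' or p.pname like 'PR0%'
 order by p.pname;
```

If the slaves sit in a global cache wait for a long time while MRP0 waits on the slaves, that matches the pattern described in the MOS note.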

3. Check the trace file

/u01/app/oracle/diag/rdbms/hddbsz/hddbsz2/trace/hddbsz2_mrp0_14450.trc

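4. The hang is caused by Bug 17695685. After the patch was applied on the ADG standby and the redo apply process was restarted, redo apply proceeded normally.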
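
Restarting redo apply after patching can be sketched as follows. This assumes the standby is managed manually from SQL*Plus rather than through the Data Guard broker, and that standby redo logs are configured so that real-time apply is possible; neither detail is stated in the original notes.

```sql
-- Stop managed recovery (run on the standby instance that hosts MRP0).
alter database recover managed standby database cancel;

-- Restart real-time apply in the background.
-- "using current logfile" assumes standby redo logs exist.
alter database recover managed standby database using current logfile disconnect from session;

-- Confirm that MRP0 is back and that SEQUENCE#/BLOCK# advance between runs.
select process, status, thread#, sequence#, block#
  from v$managed_standby
 where process like 'MRP%';
```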

MOS reference:
Bug 17695685 - Hang in Active Dataguard Database with RAC (Doc ID 17695685.8)

Bug 17695685 Hang in Active Dataguard Database with RAC
This note gives a brief overview of bug 17695685.
The content was last updated on: 08-MAR-2022

Affects:

Product (Component)   Oracle Server (Rdbms)
Range of versions believed to be affected   Versions >= 11.2
Versions confirmed as being affected
11.2.0.4
11.2.0.3
Platforms affected   Generic (all / most platforms affected)

Fixed:

The fix for 17695685 is first included in
12.1.0.1 (Base Release)
11.2.0.4.190716 (Jul 2019) Database Patch Set Update (DB PSU)
11.2.0.4.190716 Exadata Database Bundle Patch (Jul 2019)
11.2.0.4.190716 (Jul 2019) Bundle Patch for Windows Platforms

Description

This bug is only relevant when using Real Application Clusters (RAC)
Rediscovery:
- There is hang in Active Dataguard Database (ADG)
- MRP or its slave waits for a buffer with state: MEDIA_RCV

For example a hanganalyze trace might show something like:

Oracle session identified by:
{
instance: 1
os id: 31520
process id: 82, <.....> (PR0J)
session id: 982
session serial #: 1157
}
which is waiting for 'gc buffer busy release' with wait info:
{
p1: 'file#'=0x3
p2: 'block#'=0x1399a
p3: 'class#'=0x1c
time in wait: 0.567673 sec
heur. time in wait: 40 min 7 sec
timeout after: 0.432327 sec
wait id: 35008931
blocking: 2737 sessions
current sql: <none>
short stack:
ksedsts()+465<- ...
<-semtimedop()+10<-skgpwwait()+160<-ksliwat()+2022<-kslwaitctx()+163<-kslwait()+141
<-kclwlr()+535<-kcbzfc()+656<-kcbr_media_apply()+1782<-krp_slave_apply()+284<-krp_slave_main()

GLOBAL CACHE ELEMENT DUMP (address: 0xf7e7c360):
id1: 0x1399a id2: 0x3 pkey: INVALID block: (3/80282)
lock: X rls: 0x7 acq: 0x0 latch: 3
flags: 0x20 fair: 255 recovery: 0 fpin: 'kclwh2'
bscn: 0x2.3927d9d3 bctx: (nil) write: 0 scan: 0x0
lcp: (nil) lnk: [NULL] lch: [0x31f91f2b0,0x31f91f2b0]
seq: 89 hist: 54 113 238 180 113 238 180 113 238 180 113 238 180 113 238 180
113 238 180 113
LIST OF BUFFERS LINKED TO THIS GLOBAL CACHE ELEMENT:
flg: 0x00280400 lflg: 0x8 state: MEDIA_RCV tsn: 2 tsh: 4 waiters: 4

- There are waiters for "Media Recovery" buffer.

- This issue may also sometime present as "buffer deadlock" / "gc buffer busy acquire" waits in RAC
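
To compare a live system against the symptoms above, two small queries may help. They are a sketch only: the event names are taken from this note, and the hex values are those from the example trace.

```sql
-- p1/p2 in the hanganalyze excerpt are hexadecimal; convert them to decimal.
-- 0x3 is file 3 and 0x1399a is block 80282, matching "block: (3/80282)" in the dump.
select to_number('3', 'x')         as file_no,
       to_number('1399a', 'xxxxx') as block_no
  from dual;

-- Count sessions per instance that are currently waiting on the events named in the note.
select inst_id, event, count(*) as waiters
  from gv$session
 where state = 'WAITING'
   and event in ('gc buffer busy release', 'gc buffer busy acquire', 'buffer deadlock')
 group by inst_id, event
 order by inst_id, event;
```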

Please note: The above is a summary description only. Actual symptoms can vary. Matching to any symptoms here does not confirm that you are encountering this problem. For questions about this bug please consult Oracle Support.

References

Bug:17695685
Note:245840.1 Information on the sections in this article