Monitoring Oracle Database Restore/Recovery Progress

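Data dictionary views

Every RMAN operation has a corresponding database session.
We can therefore query the data dictionary to check its progress.

1. RMAN sessions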

SQL> -- RMAN sessions
set linesize 100 trimspool on
COLUMN sid      FORMAT 9999
COLUMN serial#  ALIAS SER# FORMAT 99999
COLUMN spid     FORMAT a10
COLUMN username FORMAT a10
COLUMN status   FORMAT a2
COLUMN program  FORMAT a32
COLUMN logon_time form a15
COLUMN module form a30
COLUMN action form a35
COLUMN process form a14
SELECT
       s.sid ,
       s.serial# "ser#",
       s.username,
       to_char(s.logon_time,'DD-MM-RR hh24:mi') logon_time,
       s.osuser,
       s.process,
       p.spid ,
       s.machine,
       substr(s.status,1,1) status,
       s.program
FROM v$session s, v$process p
WHERE upper(s.program) like '%RMAN%'
AND   s.paddr = p.addr (+)
ORDER by s.logon_time, s.sid
/

2. Percentage of work completed

Run the following query at least 3 times at 5-minute intervals to see the progress/changes.

SQL>set echo on feedback on
    column path format a50
    set heading off
    select
           sl.sofar, sl.totalwork,
           round(sl.sofar/sl.totalwork*100,2) "% Complete"
    from   v$session_longops sl, v$session s, v$process p
    where  p.addr = s.paddr
    and    sl.sid=s.sid
    and    sl.serial#=s.serial#
    and    opname LIKE 'RMAN%'
    and    opname NOT LIKE '%aggregate%'
    and    totalwork != 0
    and    sofar <> totalwork;
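
Two runs of the query give enough information for a rough time-to-finish estimate. The following is a minimal sketch in Python, not part of the original procedure: the function names are hypothetical, the sample values are invented, and it assumes the work rate between the two samples stays constant.

```python
def percent_complete(sofar, totalwork):
    """Same arithmetic as the "% Complete" column in the query above."""
    return round(sofar / totalwork * 100, 2)


def estimate_remaining(sofar_1, sofar_2, totalwork, interval_seconds):
    """Estimate the seconds remaining from two SOFAR samples of one operation.

    Returns None when SOFAR did not advance between the samples, which is a
    hint that the session may be waiting or hung.
    """
    done = sofar_2 - sofar_1
    if done <= 0:
        return None
    rate = done / interval_seconds  # work units per second
    return (totalwork - sofar_2) / rate


# Invented sample values taken 5 minutes (300 seconds) apart.
print(percent_complete(150_000, 600_000))
print(estimate_remaining(120_000, 150_000, 600_000, 300))
```

A None result is not proof of a hang on its own; it only says that this row did not advance during the interval.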

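3. Session waits

Are any sessions waiting, and what are they waiting for?
Run the following query at least 3 times at 5-minute intervals to see the progress/changes.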

set linesize 200 trimspool on
col event form a25
col p1text form a15
col p1 form 999999
col p2text form a15
col p2 form 999999
col p3text form a10
col p3 form 9999
col waited form 9999
col waiting form 9999
select sid, event, p1text, p1, p2text, p2, p3text, p3,
wait_time waited, seconds_in_wait waiting
from gv$session_wait
where event not like 'SQL*Net%'
and event not like '%timer%'
and event not like 'rdbms%'
and event not like 'pipe%'
and event not like 'DIAG%'
and event not like 'Streams AQ%'
and event not like 'VKTM%'
and state = 'WAITING'
order by seconds_in_wait
/
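
Comparing two runs of the wait query by hand is error prone, so a small helper can flag sessions that stayed on the same wait for the whole interval. This is only a sketch: the function name, the 30-second tolerance, and the sample SIDs and values are assumptions made for illustration.

```python
def stuck_sessions(previous, current, interval_seconds, tolerance=30):
    """Return the SIDs that waited on the same event across two samples.

    previous and current map sid -> (event, seconds_in_wait), taken from two
    runs of the wait query. A session is flagged when its event is unchanged
    and seconds_in_wait grew by roughly the sampling interval.
    """
    flagged = []
    for sid, (event, waiting) in current.items():
        if sid not in previous:
            continue
        prev_event, prev_waiting = previous[sid]
        grew = waiting - prev_waiting
        if event == prev_event and grew >= interval_seconds - tolerance:
            flagged.append(sid)
    return sorted(flagged)


# Invented samples taken 300 seconds apart.
first = {143: ("RMAN backup & recovery I/O", 2),
         151: ("control file sequential read", 10)}
second = {143: ("RMAN backup & recovery I/O", 1),
          151: ("control file sequential read", 310)}
print(stuck_sessions(first, second, 300))
```

A flagged session has had one wait outstanding for the whole interval; a session whose seconds_in_wait keeps resetting is still cycling through short waits and is probably making progress.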

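4. Recovery progress

How is the recovery progressing?
V$RECOVERY_PROGRESS is populated only while a RECOVERY is in progress.
Restore operations do not populate this view.
So if we think the recovery is slow, is the session really in the recovery phase, or is it still restoring from the RMAN backup?

Here is an example of recovery progress: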

22:27:38 SQL> select START_TIME,TYPE,ITEM,UNITS,SOFAR,TOTAL from v$recovery_progress;
START_TIME                  TYPE            ITEM                             UNITS                         SOFAR      TOTAL
--------------------------- --------------- -------------------------------- ------------------------ ---------- ---------
12-nov-14 16:08:10          Media Recovery  Average Apply Rate               KB/sec                        29713          0
12-nov-14 16:08:10          Media Recovery  Redo Applied                     Megabytes                    660747          0
12-nov-14 16:08:10          Media Recovery  Last Applied Redo                SCN+Time                          0          0
12-nov-14 11:28:16          Media Recovery  Checkpoint Time per Log          Seconds                           6          6
12-nov-14 11:28:16          Media Recovery  Standby Apply Lag                Seconds                           0          0
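
The Average Apply Rate row can be turned into a rough duration for a given amount of redo. The sketch below is an illustration only: the function name is hypothetical, the 50 GB backlog is an invented figure, and it assumes 1 megabyte = 1024 KB and a rate that stays constant.

```python
def apply_seconds(redo_megabytes, rate_kb_per_sec):
    """Seconds needed to apply redo_megabytes at a steady rate_kb_per_sec."""
    if rate_kb_per_sec <= 0:
        raise ValueError("apply rate must be positive")
    return redo_megabytes * 1024 / rate_kb_per_sec


# Invented backlog: 50 GB of archived redo still to apply, at the
# 29713 KB/sec rate shown in the sample output above.
hours = apply_seconds(50 * 1024, 29713) / 3600
print(f"{hours:.2f} hours")
```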

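The redo apply rate is determined by several factors:

PARALLEL_EXECUTION_MESSAGE_SIZE
The default value of this parameter may not be large enough, so consider increasing it to its maximum, which is operating-system dependent: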

SQL> show parameter PARALLEL_EXECUTION_MESSAGE_SIZE
SQL> alter system set parallel_execution_message_size=65535 scope=spfile;

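This parameter change requires a database restart (bounce).

Native I/O rate at the hardware level
Consult the system administrator or hardware vendor.

Recovery parallelism
This is operating-system dependent; Oracle starts the number of parallel processes it needs to perform the task.
If you feel the degree needs to be specified manually, the command is: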

SQL> RECOVER datafile x,y,z parallel (degree 32);

or

SQL> recover parallel 32;

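If the redo apply rate is still unacceptable after the adjustments above, we can temporarily set db_block_checking to false to try to improve recovery performance.

RMAN logs

By default, the results of an RMAN operation are written to standard output.
There is no default log file.
We need to capture the results with the SPOOL TRACE option, or redirect standard output to a file.

Example of a datafile restore session: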

RMAN> spool trace to res5.out
RMAN> restore datafile 5;
RMAN> spool trace off

$ cat res5.out
RMAN> restore datafile 5;
Starting restore at 27 DEC 2011 14:05:03 [1]
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=143 devtype=DISK [2]
channel ORA_DISK_1: starting datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
restoring datafile 00005 to /opt/app/oracle/oradata/ORA102/example01.dbf [3]
channel ORA_DISK_1: reading from backup piece /opt/app/oracle/fra/ORA102/backupset/2011_12_27/o1_mf_nnndf_TAG20111227T122122_7hl7dm2n_.bkp [4]
channel ORA_DISK_1: restored backup piece 1
piece handle=/opt/app/oracle/fra/ORA102/backupset/2011_12_27/o1_mf_nnndf_TAG20111227T122122_7hl7dm2n_.bkp tag=TAG20111227T122122
channel ORA_DISK_1: restore complete, elapsed time: 00:00:15 [5]
Finished restore at 27 DEC 2011 14:05:19 [6]

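From the above we can see:

[1] The date and time the restore started
[2] The database session ID in v$session: 143
[3] The datafile number and the file name it will be restored to
[4] The backup piece name and tag
[5] The time taken to restore this datafile, and the channel used
[6] The date and time the restore completed

Example of a recover session: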

RMAN> recover datafile 5;
Starting recover at 27 DEC 2011 14:05:55
using channel ORA_DISK_1
starting media recovery
channel ORA_DISK_1: starting archive log restore to default destination [1]
channel ORA_DISK_1: restoring archive log [2]
archive log thread=1 sequence=77
...
channel ORA_DISK_1: restoring archive log
archive log thread=1 sequence=89
channel ORA_DISK_1: reading from backup piece /opt/app/oracle/fra/ORA102/backupset/2011_12_27/o1_mf_annnn_TAG20111227T135926_7hlf4jrk_.bkp
channel ORA_DISK_1: restored backup piece 1
piece handle=/opt/app/oracle/fra/ORA102/backupset/2011_12_27/o1_mf_annnn_TAG20111227T135926_7hlf4jrk_.bkp tag=TAG20111227T135926
channel ORA_DISK_1: restore complete, elapsed time: 00:01:35
...
channel default: deleting archive log(s)
archive log filename=/opt/app/oracle/fra/ORA102/archivelog/2011_12_27/o1_mf_1_88_7hlfmkr4_.arc recid=116 stamp=770998049
channel default: deleting archive log(s) [3]
archive log filename=/opt/app/oracle/fra/ORA102/archivelog/2011_12_27/o1_mf_1_89_7hlfmbd8_.arc recid=104 stamp=770998043
media recovery complete, elapsed time: 00:00:01 [4]
Finished recover at 27 DEC 2011 14:07:32 [5]

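[1] The archived logs are restored to the default archive destination. Before recovering, we must make sure there is space available to restore them.
[2] If the archived logs are not on disk, they are restored from backup.
[3] Once recovery completes, RMAN automatically deletes them from disk.
[4] The time taken to recover this datafile.
[5] The date and time the recovery completed.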
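
The Starting/Finished lines in a spooled RMAN log are enough to compute the wall-clock duration of each phase. The following sketch is one way to do that in Python. The function name is hypothetical, and it assumes the timestamps use the "27 DEC 2011 14:05:03" layout shown in the sample sessions (this depends on NLS_DATE_FORMAT) and that month names are in English.

```python
import re
from datetime import datetime

# Matches "Starting restore at ..." / "Finished recover at ..." and ignores a
# trailing annotation such as " [1]".
STAMP = re.compile(r"^(Starting|Finished) (restore|recover) at (.+?)(?: \[\d+\])?$")


def rman_durations(log_text, fmt="%d %b %Y %H:%M:%S"):
    """Return {operation: seconds} from Starting/Finished pairs in an RMAN log."""
    started, durations = {}, {}
    for line in log_text.splitlines():
        m = STAMP.match(line.strip())
        if not m:
            continue
        phase, op, stamp = m.groups()
        when = datetime.strptime(stamp, fmt)
        if phase == "Starting":
            started[op] = when
        elif op in started:
            durations[op] = (when - started.pop(op)).total_seconds()
    return durations
```

An operation that has a Starting line but no Finished line is simply absent from the result, which is itself a sign that the phase is still running.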

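Monitoring Oracle database restore/recovery progress

In a production environment, backup and recovery with RMAN is a routine task.
It is therefore important to know the tips and techniques for monitoring a restore/recover operation and for determining whether it is actually working, slow, or hung.

As a rule, a restore should take about as long as the backup did, or longer.
So if the backup takes 10 hours to complete, a restore to the same host will take at least 10 hours.

Another good indicator is the duration of our previous restore/recover operations.
Monitor the logs and views and watch the rate of change.
Restore and recover operations are very resource intensive, so it is important to know whether the process is working or hung.

User-managed recovery logs

Here is an example of a user-managed recovery.
We perform the recovery from SQL*Plus: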

$ sqlplus
SQL*Plus: Release 10.2.0.5.0 - Production on Wed Dec 28 09:59:41 2011
Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
Enter user-name: / as sysdba
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Data Mining and Real Application Testing options
SQL> alter database datafile 5 offline; [1]
Database altered.
SQL> recover datafile 5; [2]
ORA-00279: change 2989857 generated at 12/27/2011 12:50:00 needed for thread 1
ORA-00289: suggestion :
/opt/app/oracle/fra/ORA102/archivelog/2011_12_28/o1_mf_1_79_7hnmjl2y_.arc
ORA-00280: change 2989857 for thread 1 is in sequence #79 [3]

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-00279: change 2989860 generated at 12/27/2011 12:50:01 needed for thread 1
ORA-00289: suggestion :
/opt/app/oracle/fra/ORA102/archivelog/2011_12_28/o1_mf_1_80_7hnmjmbc_.arc
ORA-00280: change 2989860 for thread 1 is in sequence #80
ORA-00278: log file
'/opt/app/oracle/fra/ORA102/archivelog/2011_12_28/o1_mf_1_79_7hnmjl2y_.arc' no
longer needed for this recovery [4]

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
auto [5]
ORA-00279: change 2989874 generated at 12/27/2011 12:50:39 needed for thread 1
ORA-00289: suggestion :
/opt/app/oracle/fra/ORA102/archivelog/2011_12_28/o1_mf_1_81_7hnmjmkf_.arc
ORA-00280: change 2989874 for thread 1 is in sequence #81
ORA-00278: log file
'/opt/app/oracle/fra/ORA102/archivelog/2011_12_28/o1_mf_1_80_7hnmjmbc_.arc' no
longer needed for this recovery
...
ORA-00279: change 2991001 generated at 12/27/2011 12:58:00 needed for thread 1
ORA-00289: suggestion :
/opt/app/oracle/fra/ORA102/archivelog/2011_12_28/o1_mf_1_87_7hnmjoc7_.arc
ORA-00280: change 2991001 for thread 1 is in sequence #87
ORA-00278: log file
'/opt/app/oracle/fra/ORA102/archivelog/2011_12_28/o1_mf_1_86_7hnmjkm0_.arc' no
longer needed for this recovery

Log applied.
Media recovery complete. [6]
SQL>
SQL> alter database datafile 5 online; [7]
Database altered.

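[1] Take the datafile offline in preparation for restoring it from a user-managed backup.
[2] After restoring the datafile from the user-managed backup, recover it.
[3] The first archived log needed to recover this file.
[4] We pressed ENTER, which asks Oracle to apply the suggested log.
[5] If there are many archived logs to apply and they are all in the archive destination, use Oracle's AUTO option to apply the rest of the required archived logs. Otherwise we would have to specify each requested archived log manually, or press ENTER at every prompt.
[6] Recovery is now complete.
[7] Bring the datafile online so that it can be used again.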
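
When the SQL*Plus output is captured to a file, the ORA-00280 lines show which log sequence the recovery has reached. A minimal sketch for pulling them out follows; the function name is hypothetical and the pattern is based only on the message layout in the sample session above.

```python
import re

SEQ = re.compile(r"ORA-00280: change (\d+) for thread (\d+) is in sequence #(\d+)")


def requested_logs(output):
    """Return (thread, sequence, change) tuples in the order Oracle asked for them."""
    return [(int(t), int(s), int(c)) for c, t, s in SEQ.findall(output)]
```

The last tuple in the list is the log currently being requested, so comparing it between two points in time shows how quickly the recovery is moving through the archived logs.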

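Media manager logs

When restoring from tape, confirm that data is actually being read from tape and that the session is not waiting for the media manager to service the request.
Is the tape busy or idle?
Ask the media management support team to confirm the rate at which data is being read from tape.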


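Operating system utilities

The size of the file being restored should keep increasing until it reaches its actual size.
The timestamp should also change as Oracle updates the file.
Use an operating system utility such as "ls -lt" to see this information.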

$ ls -ltr [full path and file name being restored]

For example:

$ ls -ltr /database/db251/asbs/BLOB_DOC_IMAGES_B12.dbf
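
Instead of rerunning ls by hand, the same check can be scripted by sampling the file size twice. This is a sketch with a hypothetical function name; the sleep parameter exists only so the wait can be replaced in a test.

```python
import os
import time


def growth_rate(path, interval_seconds=60, sleep=time.sleep):
    """Return the bytes per second by which `path` grew over one interval."""
    before = os.stat(path).st_size
    sleep(interval_seconds)
    after = os.stat(path).st_size
    return (after - before) / interval_seconds
```

For example, growth_rate("/database/db251/asbs/BLOB_DOC_IMAGES_B12.dbf", 300) samples the file from the example above five minutes apart. A result of 0 means the size did not change during that interval, so check the timestamp and the session waits before concluding that the restore is hung.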

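We can also use operating system utilities such as vmstat, sar, and iostat to monitor resource utilization.
Is the hardware running at full capacity?
Where is the bottleneck?
Are other I/O-intensive operations running on the host?
If needed, install Oracle's OSWatcher utility for more information.

If the file is being restored to ASM, we should also be able to check for its existence in ASM.
Note, however, that we may only see it in ASM once it has been fully restored.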

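Alert log

Restore operations

Only RMAN restore operations write to the alert.log.
User-managed restore sessions do not appear in the alert.log because they are performed outside Oracle.

If we see corruption errors in the alert.log during an RMAN RESTORE operation, don't panic.
Before restoring a datafile, RMAN checks for its existence and validity.
If the file is already on disk but is invalid or corrupt, that is reported.
For example, at Sun Sep 27 08:25:54 2015 we see datafile 1043 reported as corrupt: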

Hex dump of (file 1043, block 1) in trace file /backup/claprd01/diag/rdbms/claprd01/CLAPRD01/trace/CLAPRD01_ora_14287316.trc
Corrupt block relative dba: 0x05000001 (file 1043, block 1)
Bad header found during kcvxfh v8
Data in bad block:
type: 0 format: 2 rdba: 0x05000001
last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x00000001
check value in block header: 0x4a7
computed block checksum: 0x0
Reading datafile '+SYSFILES/epicaccess50.dbf' for corruption at rdba: 0x05000001 (file 1043, block 1)
Reread (file 1043, block 1) found same corrupt data (no logical check)
Sun Sep 27 08:25:54 2015

Later we see the file restored from a valid backup, with no further corruption reported on this datafile:

Sun Sep 27 08:39:08 2015
Full restore complete of datafile 1043 +SYSFILES/epicaccess50.dbf.  Elapsed time: 1:27:39
 checkpoint is 90364602816
 last deallocation scn is 45006515743
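
The "Full restore complete" lines give a per-datafile restore time that can be collected from the alert.log. The sketch below assumes the message layout shown in the excerpt above; the function name is hypothetical.

```python
import re

RESTORE_DONE = re.compile(
    r"Full restore complete of datafile (\d+) (\S+?)\.\s+Elapsed time: (\d+):(\d+):(\d+)"
)


def restore_times(alert_text):
    """Return {file_number: (file_name, elapsed_seconds)} for completed restores."""
    result = {}
    for num, name, h, m, s in RESTORE_DONE.findall(alert_text):
        result[int(num)] = (name, int(h) * 3600 + int(m) * 60 + int(s))
    return result
```

Datafiles that are missing from the result have not finished restoring yet, which helps answer whether the session is still in the restore phase.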

Recovery operations

All recovery sessions, whether user-managed or RMAN, are also written to the alert.log.
Here is an example of an RMAN recovery session:

Tue Dec 27 14:05:55 EST 2011
alter database recover datafile list clear
Tue Dec 27 14:05:55 EST 2011
Completed: alter database recover datafile list clear
Tue Dec 27 14:05:55 EST 2011
alter database recover if needed
 datafile 5
Media Recovery Start
 parallel recovery started with 2 processes
ORA-279 signalled during: alter database recover if needed
 datafile 5
...
Tue Dec 27 14:05:56 EST 2011
The input backup piece /opt/app/oracle/fra/ORA102/backupset/2011_12_27/o1_mf_annnn_TAG20111227T135926_7hlf4jrk_.bkp is in compressed format. [1]
Tue Dec 27 14:07:23 EST 2011
Archivelog restore complete. Elapsed time: 0:00:01 [2]
Archivelog restore complete. Elapsed time: 0:00:01 
...
Tue Dec 27 14:07:31 EST 2011
alter database recover logfile '/opt/app/oracle/fra/ORA102/archivelog/2011_12_27/o1_mf_1_77_7hlfmdc7_.arc'
...
Tue Dec 27 14:07:31 EST 2011
Media Recovery Log /opt/app/oracle/fra/ORA102/archivelog/2011_12_27/o1_mf_1_77_7hlfmdc7_.arc
ORA-279 signalled during: alter database recover logfile '/opt/app/oracle/fra/ORA102/archivelog/2011_12_27/o1_mf_1_77_7hlfmdc7_.arc' [3]
...
Tue Dec 27 14:07:31 EST 2011
Media Recovery Log /opt/app/oracle/fra/ORA102/archivelog/2011_12_27/o1_mf_1_87_7hlfmk96_.arc
Tue Dec 27 14:07:31 EST 2011
Recovery of Online Redo Log: Thread 1 Group 1 Seq 88 Reading mem 0
  Mem# 0: /opt/app/oracle/oradata/ORA102/redo01.log [4]
Tue Dec 27 14:07:31 EST 2011
Recovery of Online Redo Log: Thread 1 Group 2 Seq 89 Reading mem 0
  Mem# 0: /opt/app/oracle/oradata/ORA102/redo02.log
Tue Dec 27 14:07:31 EST 2011
Media Recovery Complete (ORA102)[5]

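[1] This is actually a compressed backup. This information appears only in the alert.log, not in the RMAN restore log.
[2] The time taken to restore the archived logs.
[3] ORA-279 is informational: it identifies the archived log needed for recovery.
[4] For a complete recovery, Oracle also needs to apply the redo in the online logs.
[5] End of recovery.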
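
To watch the rate of change during recovery, the "Media Recovery Log" lines can be listed from the alert.log in the order they were applied. A minimal sketch, assuming the line layout in the excerpt above and using a hypothetical function name:

```python
import re

APPLIED = re.compile(r"^Media Recovery Log (\S+)", re.MULTILINE)


def applied_logs(alert_text):
    """Return the archived log paths from "Media Recovery Log" lines, in order."""
    return APPLIED.findall(alert_text)
```

Running this against the alert.log every few minutes and comparing the length of the list shows how many archived logs were applied in each interval.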

Date: 2020-09-17 00:11:25  Source: oir  Author: oir
