Oracle数据恢复顾问(DRA)

参考:Data Recovery Advisor - RMAN command line example (文档 ID 762339.1)

数据恢复顾问(Data Recovery Advisor)是一款Oracle数据库工具。该工具会自动诊断数据故障,确定并提出相应修复方案,并执行客户要求的修复。以上修复是基于数据故障是一个硬盘上持久性数据的损坏或者丢失。

数据恢复顾问会在遇到问题时,自动采集数据故障信息,并且,协助执行对于故障的修复。你可以手动修复一个数据故障,或者要求数据恢复顾问来为你执行修复动作。

DRA可以通过企业管理器(EM)Grid Control和/或者恢复管理器(RMAN)来访问。本文将概述在RMAN中使用的命令,结合Oracle MOS和自己的测试。

在RMAN中,有四种DRA的命令:

List Failure - 列出先前执行过的故障评估结果。可能的话,重新验证现有故障并关闭它们。Advise Failure - 提出手动和自动修复方案。Repair Failure - 通过运行由ADVISE FAILURE建议的最佳修复方案来自动修复故障。完成之后会重新验证现有故障。Change Failure - 使你可以改变故障的状态。

下面通过两个例子来说明DRA工具的用法

在测试之前使用RMAN对数据库进行全备

场景一、模拟控制文件丢失

关闭数据库,mv掉controlfile

[ora_tst@test rman]$ mv /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf.bak

启动数据库,报错

SQL> startup
ORACLE instance started.

Total System Global Area 1068937216 bytes
Fixed Size                  2166536 bytes
Variable Size             427819256 bytes
Database Buffers          624951296 bytes
Redo Buffers               14000128 bytes
ORA-00205: error in identifying control file, check alert log for more info

告警日志中错误提示:

ALTER DATABASE   MOUNT
ORA-00210: cannot open the specified control file
ORA-00202: control file: '/u01/oracle/TEST/db/apps_st/data/cntrl01.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory

很容易可以发现是由于cntrl01.dbf丢失导致数据库无法mount

下面通过DRA来检测故障,并修复

[ora_tst@test ~]$ rman target /

Recovery Manager: Release 11.1.0.7.0 - Production on Mon Apr 21 13:35:09 2014

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

connected to target database: TEST (not mounted)

RMAN> list failure;

using target database control file instead of recovery catalog
List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
782        CRITICAL OPEN      21-APR-14     Control file /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf is missing

通过list failure命令可以发现故障所在,使用list failure ### detail;  ( where ### equlas the failure number)可以查看故障的详细信息。

RMAN> list failure 782 detail;

List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
782        CRITICAL OPEN      21-APR-14     Control file /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf is missing
  Impact: Database cannot be mounted

下面可以通过advise failure;命令让Oracle告诉我们遇到这个故障,应该怎么做

RMAN> advise failure;

List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
782        CRITICAL OPEN      21-APR-14     Control file /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf is missing
  Impact: Database cannot be mounted

analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=383 device type=DISK
RMAN-06495: must explicitly specify DBID with SET DBID command
analyzing automatic repair options complete

Mandatory Manual Actions
========================
no manual actions available

Optional Manual Actions
=======================
no manual actions available

Automated Repair Options
========================
Option Repair Description
------ ------------------
1      Use a multiplexed copy to restore control file /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf 
  Strategy: The repair includes complete media recovery with no data loss
  Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_2401635629.hm

advise failure命令提示,我们可以通过拷贝冗余的controlfile来恢复出cntrl01.dbf,并且Oracle在/u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_2401635629.hm中给出具体的恢复脚本

恢复脚本,我们还可以通过repair failure preview命令来获得

RMAN> repair failure preview;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_2401635629.hm

contents of repair script:
   # restore control file using multiplexed copy
   restore controlfile from '/u01/oracle/TEST/db/apps_st/data/cntrl02.dbf';
   sql 'alter database mount';

Oracle提示我们运行

   restore controlfile from '/u01/oracle/TEST/db/apps_st/data/cntrl02.dbf';
   sql 'alter database mount';

命令来恢复cntrl01.dbf

执行上述命令

RMAN> restore controlfile from '/u01/oracle/TEST/db/apps_st/data/cntrl02.dbf';

Starting restore at 21-APR-14
using channel ORA_DISK_1

channel ORA_DISK_1: copied control file copy
output file name=/u01/oracle/TEST/db/apps_st/data/cntrl01.dbf
output file name=/u01/oracle/TEST/db/apps_st/data/cntrl02.dbf
output file name=/u01/oracle/TEST/db/apps_st/data/cntrl03.dbf
Finished restore at 21-APR-14

RMAN> sql 'alter database mount';

sql statement: alter database mount
released channel: ORA_DISK_1

这里我是手工执行的脚本,也可以通过

RMAN> repair failure;

来自动修复故障。

数据库起到了mount状态,说明故障修复成功。

RMAN> sql 'alter database open';

sql statement: alter database open

场景二、数据文件丢失

关闭数据库

[ora_tst@test trace]$ mv /u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf /u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf.bak

SQL> startup
ORA-32004: obsolete and/or deprecated parameter(s) specified
ORACLE instance started.

Total System Global Area 1068937216 bytes
Fixed Size                  2166536 bytes
Variable Size             427819256 bytes
Database Buffers          624951296 bytes
Redo Buffers               14000128 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 401 - see DBWR trace file
ORA-01110: data file 401: '/u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf'

启动数据库时报错,查看告警日志,如下:

ALTER DATABASE OPEN
Errors in file /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/trace/TEST_dbw0_27581.trc:
ORA-01157: cannot identify/lock data file 401 - see DBWR trace file
ORA-01110: data file 401: '/u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory

下面通过DRA来查看故障,并解决

RMAN> list failure;

using target database control file instead of recovery catalog
List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
825        HIGH     OPEN      21-APR-14     One or more non-system datafiles are missing

RMAN> list failure 825 detail;

List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
825        HIGH     OPEN      21-APR-14     One or more non-system datafiles are missing
  Impact: See impact for individual child failures
  List of child failures for parent failure ID 825
  Failure ID Priority Status    Time Detected Summary
  ---------- -------- --------- ------------- -------
  828        HIGH     OPEN      21-APR-14     Datafile 401: '/u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf' is missing
    Impact: Some objects in tablespace APPS_TS_TX_DATA might be unavailable

通过命令很明显的可以发现故障所在。

RMAN> advise failure;

List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
825        HIGH     OPEN      21-APR-14     One or more non-system datafiles are missing
  Impact: See impact for individual child failures
  List of child failures for parent failure ID 825
  Failure ID Priority Status    Time Detected Summary
  ---------- -------- --------- ------------- -------
  828        HIGH     OPEN      21-APR-14     Datafile 401: '/u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf' is missing
    Impact: Some objects in tablespace APPS_TS_TX_DATA might be unavailable

analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=382 device type=DISK
analyzing automatic repair options complete

Mandatory Manual Actions
========================
no manual actions available

Optional Manual Actions
=======================
1. If file /u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf was unintentionally renamed or moved, restore it
2. If a standby database is available, then consider a Data Guard switchover or failover

Automated Repair Options
========================
Option Repair Description
------ ------------------
1      Restore and recover datafile 401 
  Strategy: The repair includes complete media recovery with no data loss
  Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_556356707.hm

Oracle告诉我们/u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf文件renamed或者moved需要恢复,Restore and recover datafile 401。

RMAN> repair failure preview;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_556356707.hm

contents of repair script:
   # restore and recover datafile
   restore datafile 401;
   recover datafile 401;

可通过

   restore datafile 401;
   recover datafile 401;

来恢复datafile 401。本次测试通过repair failure;命令来自动修改故障

RMAN> repair failure;

 Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_556356707.hm

contents of repair script:
   # restore and recover datafile
   restore datafile 401;
   recover datafile 401;

Do you really want to execute the above repair (enter YES or NO)? YES
" YES" is an invalid response - please re-enter.

Do you really want to execute the above repair (enter YES or NO)? YES
executing repair script

Starting restore at 21-APR-14
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00401 to /u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf
channel ORA_DISK_1: reading from backup piece /u01/oracle/TEST/db/tech_st/11.1.0/dbs/0ep69g7a_1_1
channel ORA_DISK_1: piece handle=/u01/oracle/TEST/db/tech_st/11.1.0/dbs/0ep69g7a_1_1 tag=TAG20140421T110305
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:05:35
Finished restore at 21-APR-14

Starting recover at 21-APR-14
using channel ORA_DISK_1

starting media recovery

archived log for thread 1 with sequence 22 is already on disk as file /u01/PROD/db/apps_st/data/archive/ARC1_22_825013351.arc
channel ORA_DISK_1: starting archived log restore to default destination
channel ORA_DISK_1: restoring archived log
archived log thread=1 sequence=21
channel ORA_DISK_1: reading from backup piece /u01/oracle/TEST/db/tech_st/11.1.0/dbs/0fp69laa_1_1
channel ORA_DISK_1: piece handle=/u01/oracle/TEST/db/tech_st/11.1.0/dbs/0fp69laa_1_1 tag=TAG20140421T123001
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
archived log file name=/u01/PROD/db/apps_st/data/archive/ARC1_21_825013351.arc thread=1 sequence=21
media recovery complete, elapsed time: 00:00:01
Finished recover at 21-APR-14
repair failure complete

Do you want to open the database (enter YES or NO)? YES
database opened

 

场景三、日志组丢失

关闭数据库

删除非当前日志组所有的日志文件

[ora_tst@test trace]$ rm -f /u01/oracle/TEST/db/apps_st/data/log01b.dbf

[ora_tst@test trace]$ rm -f /u01/oracle/TEST/db/apps_st/data/log01a.dbf

SQL> startup
ORA-32004: obsolete and/or deprecated parameter(s) specified
ORACLE instance started.

Total System Global Area 1068937216 bytes
Fixed Size                  2166536 bytes
Variable Size             427819256 bytes
Database Buffers          624951296 bytes
Redo Buffers               14000128 bytes
Database mounted.
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/u01/oracle/TEST/db/apps_st/data/log01b.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
ORA-00312: online log 1 thread 1: '/u01/oracle/TEST/db/apps_st/data/log01a.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

启动数据库时报错

使用DRA修复上述故障

RMAN> list failure;

using target database control file instead of recovery catalog
List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
999        CRITICAL OPEN      21-APR-14     Redo log group 1 is unavailable
1005       HIGH     OPEN      21-APR-14     Redo log file /u01/oracle/TEST/db/apps_st/data/log01a.dbf is missing
1002       HIGH     OPEN      21-APR-14     Redo log file /u01/oracle/TEST/db/apps_st/data/log01b.dbf is missing

RMAN> advise failure;

List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
999        CRITICAL OPEN      21-APR-14     Redo log group 1 is unavailable
1005       HIGH     OPEN      21-APR-14     Redo log file /u01/oracle/TEST/db/apps_st/data/log01a.dbf is missing
1002       HIGH     OPEN      21-APR-14     Redo log file /u01/oracle/TEST/db/apps_st/data/log01b.dbf is missing

analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=381 device type=DISK
analyzing automatic repair options complete

Mandatory Manual Actions
========================
no manual actions available

Optional Manual Actions
=======================
1. If file /u01/oracle/TEST/db/apps_st/data/log01a.dbf was unintentionally renamed or moved, restore it
2. If file /u01/oracle/TEST/db/apps_st/data/log01b.dbf was unintentionally renamed or moved, restore it
3. If a standby database is available, then consider a Data Guard switchover or failover

Automated Repair Options
========================
Option Repair Description
------ ------------------
1      Perform incomplete database recovery to SCN 5965141836565 
  Strategy: The repair includes point-in-time recovery with some data loss
  Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_3499717585.hm

RMAN> repair failure preview;

Strategy: The repair includes point-in-time recovery with some data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_3499717585.hm

contents of repair script:
   # database point-in-time recovery
   restore database until scn 5965141836565;
   recover database until scn 5965141836565;
   alter database open resetlogs;

RMAN> repair failure;

Strategy: The repair includes point-in-time recovery with some data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_3499717585.hm

contents of repair script:
   # database point-in-time recovery
   restore database until scn 5965141836565;
   recover database until scn 5965141836565;
   alter database open resetlogs;

Do you really want to execute the above repair (enter YES or NO)? YES
executing repair script

Starting restore at 21-APR-14
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to /u01/oracle/TEST/db/apps_st/data/system01.dbf
channel ORA_DISK_1: restoring datafile 00002 to /u01/oracle/TEST/db/apps_st/data/system02.dbf
channel ORA_DISK_1: restoring datafile 00003 to /u01/oracle/TEST/db/apps_st/data/system03.dbf
channel ORA_DISK_1: restoring datafile 00004 to /u01/oracle/TEST/db/apps_st/data/system04.dbf
channel ORA_DISK_1: restoring datafile 00005 to /u01/oracle/TEST/db/apps_st/data/system05.dbf
channel ORA_DISK_1: restoring datafile 00006 to /u01/oracle/TEST/db/apps_st/data/ctxd01.dbf
channel ORA_DISK_1: restoring datafile 00007 to /u01/oracle/TEST/db/apps_st/data/owad01.dbf
channel ORA_DISK_1: restoring datafile 00008 to /u01/oracle/TEST/db/apps_st/data/a_queue02.dbf
channel ORA_DISK_1: restoring datafile 00009 to /u01/oracle/TEST/db/apps_st/data/odm.dbf

................................

 

channel ORA_DISK_1: restoring datafile 00407 to /u01/oracle/TEST/db/apps_st/data/a_ref02.dbf
channel ORA_DISK_1: reading from backup piece /u01/oracle/TEST/db/tech_st/11.1.0/dbs/0ep69g7a_1_1
channel ORA_DISK_1: piece handle=/u01/oracle/TEST/db/tech_st/11.1.0/dbs/0ep69g7a_1_1 tag=TAG20140421T110305
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 01:16:04
Finished restore at 21-APR-14

Starting recover at 21-APR-14
using channel ORA_DISK_1

starting media recovery

archived log for thread 1 with sequence 21 is already on disk as file /u01/PROD/db/apps_st/data/archive/ARC1_21_825013351.arc
archived log for thread 1 with sequence 22 is already on disk as file /u01/PROD/db/apps_st/data/archive/ARC1_22_825013351.arc
archived log for thread 1 with sequence 23 is already on disk as file /u01/PROD/db/apps_st/data/archive/ARC1_23_825013351.arc
archived log for thread 1 with sequence 24 is already on disk as file /u01/PROD/db/apps_st/data/archive/ARC1_24_825013351.arc
archived log file name=/u01/PROD/db/apps_st/data/archive/ARC1_21_825013351.arc thread=1 sequence=21
archived log file name=/u01/PROD/db/apps_st/data/archive/ARC1_22_825013351.arc thread=1 sequence=22
archived log file name=/u01/PROD/db/apps_st/data/archive/ARC1_23_825013351.arc thread=1 sequence=23
media recovery complete, elapsed time: 00:00:19
Finished recover at 21-APR-14

database opened
repair failure complete

                       

 

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章