Oracle Data Recovery Advisor (DRA)

Reference: Data Recovery Advisor - RMAN command line example (Doc ID 762339.1)

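Data Recovery Advisor is an Oracle Database tool that automatically diagnoses data failures, determines and presents appropriate repair options, and executes repairs at the user's request. In this context, a data failure is the corruption or loss of persistent data on disk.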

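When a problem occurs, Data Recovery Advisor automatically gathers information about the data failure and helps carry out the repair. You can repair a data failure manually or ask Data Recovery Advisor to perform the repair for you.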

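DRA is available through Enterprise Manager (EM) Grid Control and through Recovery Manager (RMAN). This article outlines the commands used in RMAN, drawing on Oracle MOS and my own tests.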

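RMAN has four DRA commands:

List Failure - lists the results of previously executed failure assessments. Where possible, it revalidates existing failures and closes them.
Advise Failure - presents manual and automated repair options.
Repair Failure - repairs failures automatically by running the best repair option suggested by ADVISE FAILURE, then revalidates existing failures once the repair completes.
Change Failure - lets you change the status of a failure.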
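As a rough sketch of how the four commands fit together in one RMAN session, the shell function below only prints a command sequence for review. The function name dra_session_script and its failure-number argument are hypothetical and do not come from the MOS note. The CHANGE FAILURE line is left as an RMAN comment because it alters failure state and normally belongs after a manual fix.

```shell
#!/bin/sh
# Hypothetical helper: prints a DRA command sequence for RMAN.
# $1 is a failure number as reported by LIST FAILURE.
dra_session_script() {
    failnum="$1"
    cat <<EOF
LIST FAILURE;
LIST FAILURE ${failnum} DETAIL;
ADVISE FAILURE;
REPAIR FAILURE PREVIEW;
# CHANGE FAILURE ${failnum} CLOSED;
EOF
}

# Print the sequence for a made-up failure number
dra_session_script 782
```

Once the printed commands look right, they could be fed to RMAN with something like `dra_session_script 782 | rman target /`.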

The three scenarios below show how to use DRA.

Before testing, take a full backup of the database with RMAN.
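The backup command used for these tests is not shown. A minimal sketch, assuming a database in ARCHIVELOG mode and a disk backup to the default location, might look like the following; the function name full_backup_script is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints RMAN commands for a full backup before the tests.
# Assumes ARCHIVELOG mode and the default disk channel; this is not the
# author's actual backup command.
full_backup_script() {
    cat <<'EOF'
BACKUP DATABASE PLUS ARCHIVELOG;
BACKUP CURRENT CONTROLFILE;
LIST BACKUP SUMMARY;
EOF
}

full_backup_script
```

The output could be piped into `rman target /`. LIST BACKUP SUMMARY is there only to confirm that the backup sets were recorded.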

Scenario 1: Simulating a lost control file

Shut down the database and move the control file out of the way:

[ora_tst@test rman]$ mv /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf.bak

Start the database. It reports an error:

SQL> startup
ORACLE instance started.

Total System Global Area 1068937216 bytes
Fixed Size                  2166536 bytes
Variable Size             427819256 bytes
Database Buffers          624951296 bytes
Redo Buffers               14000128 bytes
ORA-00205: error in identifying control file, check alert log for more info

The alert log shows:

ALTER DATABASE   MOUNT
ORA-00210: cannot open the specified control file
ORA-00202: control file: '/u01/oracle/TEST/db/apps_st/data/cntrl01.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory

It is easy to see that the database cannot be mounted because cntrl01.dbf is missing.

Now use DRA to detect and repair the failure:

[ora_tst@test ~]$ rman target /

Recovery Manager: Release 11.1.0.7.0 - Production on Mon Apr 21 13:35:09 2014

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

connected to target database: TEST (not mounted)

RMAN> list failure;

using target database control file instead of recovery catalog
List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
782        CRITICAL OPEN      21-APR-14     Control file /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf is missing

The list failure command pinpoints the failure. Use list failure ### detail; (where ### equals the failure number) to see the details of a failure.

RMAN> list failure 782 detail;

List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
782        CRITICAL OPEN      21-APR-14     Control file /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf is missing
  Impact: Database cannot be mounted
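When scripting around list failure, the failure numbers can be pulled out of the report text. The awk filter below is a sketch that assumes the column layout shown above, with the failure ID, priority and status as the first three whitespace-separated fields; the function name open_failures is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads LIST FAILURE output on stdin and prints
# "<failure id> <priority>" for every failure whose status is OPEN.
open_failures() {
    awk '$1 ~ /^[0-9]+$/ && $3 == "OPEN" { print $1, $2 }'
}

# Sample input copied from the listing above; prints "782 CRITICAL"
open_failures <<'EOF'
Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
782        CRITICAL OPEN      21-APR-14     Control file /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf is missing
EOF
```

The header and separator lines are skipped because their first field is not a number, and the indented Impact lines are skipped for the same reason.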

Next, the advise failure; command asks Oracle what should be done about this failure:

RMAN> advise failure;

List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
782        CRITICAL OPEN      21-APR-14     Control file /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf is missing
  Impact: Database cannot be mounted

analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=383 device type=DISK
RMAN-06495: must explicitly specify DBID with SET DBID command
analyzing automatic repair options complete

Mandatory Manual Actions
========================
no manual actions available

Optional Manual Actions
=======================
no manual actions available

Automated Repair Options
========================
Option Repair Description
------ ------------------
1      Use a multiplexed copy to restore control file /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf 
  Strategy: The repair includes complete media recovery with no data loss
  Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_2401635629.hm

The advise failure output says that cntrl01.dbf can be restored from a multiplexed copy of the control file, and Oracle writes the actual repair script to /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_2401635629.hm.

The repair script can also be displayed with the repair failure preview command:

RMAN> repair failure preview;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_2401635629.hm

contents of repair script:
   # restore control file using multiplexed copy
   restore controlfile from '/u01/oracle/TEST/db/apps_st/data/cntrl02.dbf';
   sql 'alter database mount';

The script tells us to run the following commands to restore cntrl01.dbf:

   restore controlfile from '/u01/oracle/TEST/db/apps_st/data/cntrl02.dbf';
   sql 'alter database mount';

Run them:

RMAN> restore controlfile from '/u01/oracle/TEST/db/apps_st/data/cntrl02.dbf';

Starting restore at 21-APR-14
using channel ORA_DISK_1

channel ORA_DISK_1: copied control file copy
output file name=/u01/oracle/TEST/db/apps_st/data/cntrl01.dbf
output file name=/u01/oracle/TEST/db/apps_st/data/cntrl02.dbf
output file name=/u01/oracle/TEST/db/apps_st/data/cntrl03.dbf
Finished restore at 21-APR-14

RMAN> sql 'alter database mount';

sql statement: alter database mount
released channel: ORA_DISK_1

Here I ran the script by hand. Alternatively, RMAN can apply the repair automatically:

RMAN> repair failure;
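REPAIR FAILURE also accepts options that this test does not exercise. As a sketch, the function below prints a non-interactive variant. USING ADVISE OPTION and NOPROMPT are RMAN keywords, while the function name auto_repair_script and the default option number 1 are assumptions made for illustration. REPAIR FAILURE relies on an ADVISE FAILURE run earlier in the same RMAN session, which is why both commands appear.

```shell
#!/bin/sh
# Hypothetical helper: prints an unattended DRA repair sequence.
# $1 is the option number from the "Automated Repair Options" table (defaults to 1).
auto_repair_script() {
    option="${1:-1}"
    cat <<EOF
LIST FAILURE;
ADVISE FAILURE;
REPAIR FAILURE USING ADVISE OPTION ${option} NOPROMPT;
EOF
}

auto_repair_script 1
```

NOPROMPT skips the YES/NO confirmation that repair failure otherwise asks for, so running REPAIR FAILURE PREVIEW first is the safer habit.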

The database reached the MOUNT state, so the repair succeeded. Now open it:

RMAN> sql 'alter database open';

sql statement: alter database open

Scenario 2: Lost data file

Shut down the database, then move a data file out of the way:

[ora_tst@test trace]$ mv /u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf /u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf.bak

SQL> startup
ORA-32004: obsolete and/or deprecated parameter(s) specified
ORACLE instance started.

Total System Global Area 1068937216 bytes
Fixed Size                  2166536 bytes
Variable Size             427819256 bytes
Database Buffers          624951296 bytes
Redo Buffers               14000128 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 401 - see DBWR trace file
ORA-01110: data file 401: '/u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf'

Startup reports an error. The alert log shows the following:

ALTER DATABASE OPEN
Errors in file /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/trace/TEST_dbw0_27581.trc:
ORA-01157: cannot identify/lock data file 401 - see DBWR trace file
ORA-01110: data file 401: '/u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory

Now use DRA to inspect and resolve the failure:

RMAN> list failure;

using target database control file instead of recovery catalog
List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
825        HIGH     OPEN      21-APR-14     One or more non-system datafiles are missing

RMAN> list failure 825 detail;

List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
825        HIGH     OPEN      21-APR-14     One or more non-system datafiles are missing
  Impact: See impact for individual child failures
  List of child failures for parent failure ID 825
  Failure ID Priority Status    Time Detected Summary
  ---------- -------- --------- ------------- -------
  828        HIGH     OPEN      21-APR-14     Datafile 401: '/u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf' is missing
    Impact: Some objects in tablespace APPS_TS_TX_DATA might be unavailable

The output makes the failure obvious.

RMAN> advise failure;

List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
825        HIGH     OPEN      21-APR-14     One or more non-system datafiles are missing
  Impact: See impact for individual child failures
  List of child failures for parent failure ID 825
  Failure ID Priority Status    Time Detected Summary
  ---------- -------- --------- ------------- -------
  828        HIGH     OPEN      21-APR-14     Datafile 401: '/u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf' is missing
    Impact: Some objects in tablespace APPS_TS_TX_DATA might be unavailable

analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=382 device type=DISK
analyzing automatic repair options complete

Mandatory Manual Actions
========================
no manual actions available

Optional Manual Actions
=======================
1. If file /u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf was unintentionally renamed or moved, restore it
2. If a standby database is available, then consider a Data Guard switchover or failover

Automated Repair Options
========================
Option Repair Description
------ ------------------
1      Restore and recover datafile 401 
  Strategy: The repair includes complete media recovery with no data loss
  Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_556356707.hm

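As an optional manual action, Oracle suggests putting /u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf back if it was unintentionally renamed or moved. The automated repair option is to restore and recover datafile 401.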

RMAN> repair failure preview;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_556356707.hm

contents of repair script:
   # restore and recover datafile
   restore datafile 401;
   recover datafile 401;

Datafile 401 could be restored manually with

   restore datafile 401;
   recover datafile 401;

but in this test the repair failure; command is used to repair the failure automatically:

RMAN> repair failure;

 Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_556356707.hm

contents of repair script:
   # restore and recover datafile
   restore datafile 401;
   recover datafile 401;

Do you really want to execute the above repair (enter YES or NO)? YES
" YES" is an invalid response - please re-enter.

Do you really want to execute the above repair (enter YES or NO)? YES
executing repair script

Starting restore at 21-APR-14
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00401 to /u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf
channel ORA_DISK_1: reading from backup piece /u01/oracle/TEST/db/tech_st/11.1.0/dbs/0ep69g7a_1_1
channel ORA_DISK_1: piece handle=/u01/oracle/TEST/db/tech_st/11.1.0/dbs/0ep69g7a_1_1 tag=TAG20140421T110305
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:05:35
Finished restore at 21-APR-14

Starting recover at 21-APR-14
using channel ORA_DISK_1

starting media recovery

archived log for thread 1 with sequence 22 is already on disk as file /u01/PROD/db/apps_st/data/archive/ARC1_22_825013351.arc
channel ORA_DISK_1: starting archived log restore to default destination
channel ORA_DISK_1: restoring archived log
archived log thread=1 sequence=21
channel ORA_DISK_1: reading from backup piece /u01/oracle/TEST/db/tech_st/11.1.0/dbs/0fp69laa_1_1
channel ORA_DISK_1: piece handle=/u01/oracle/TEST/db/tech_st/11.1.0/dbs/0fp69laa_1_1 tag=TAG20140421T123001
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
archived log file name=/u01/PROD/db/apps_st/data/archive/ARC1_21_825013351.arc thread=1 sequence=21
media recovery complete, elapsed time: 00:00:01
Finished recover at 21-APR-14
repair failure complete

Do you want to open the database (enter YES or NO)? YES
database opened

 

Scenario 3: Lost redo log group
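Before shutting down, it can help to confirm which redo log group is current, since the repair DRA proposes may differ depending on the group's status. The function below is a sketch that prints two SQL*Plus queries against the v$log and v$logfile views; the function name redo_log_status_sql is hypothetical, and this check was not part of the original test.

```shell
#!/bin/sh
# Hypothetical helper: prints SQL*Plus statements that show each redo log
# group's status and its member files. Run them while the database is
# mounted or open.
redo_log_status_sql() {
    cat <<'EOF'
SELECT group#, sequence#, archived, status FROM v$log ORDER BY group#;
SELECT group#, member FROM v$logfile ORDER BY group#, member;
EOF
}

redo_log_status_sql
```

The statements could be run with something like `redo_log_status_sql | sqlplus -s "/ as sysdba"`. The heredoc delimiter is quoted so that the shell leaves `v$log` untouched.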

Shut down the database.

Delete every member of a non-current log group:

[ora_tst@test trace]$ rm -f /u01/oracle/TEST/db/apps_st/data/log01b.dbf

[ora_tst@test trace]$ rm -f /u01/oracle/TEST/db/apps_st/data/log01a.dbf

SQL> startup
ORA-32004: obsolete and/or deprecated parameter(s) specified
ORACLE instance started.

Total System Global Area 1068937216 bytes
Fixed Size                  2166536 bytes
Variable Size             427819256 bytes
Database Buffers          624951296 bytes
Redo Buffers               14000128 bytes
Database mounted.
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/u01/oracle/TEST/db/apps_st/data/log01b.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
ORA-00312: online log 1 thread 1: '/u01/oracle/TEST/db/apps_st/data/log01a.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

Startup reports an error.

Use DRA to repair the failure:

RMAN> list failure;

using target database control file instead of recovery catalog
List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
999        CRITICAL OPEN      21-APR-14     Redo log group 1 is unavailable
1005       HIGH     OPEN      21-APR-14     Redo log file /u01/oracle/TEST/db/apps_st/data/log01a.dbf is missing
1002       HIGH     OPEN      21-APR-14     Redo log file /u01/oracle/TEST/db/apps_st/data/log01b.dbf is missing

RMAN> advise failure;

List of Database Failures
=========================

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
999        CRITICAL OPEN      21-APR-14     Redo log group 1 is unavailable
1005       HIGH     OPEN      21-APR-14     Redo log file /u01/oracle/TEST/db/apps_st/data/log01a.dbf is missing
1002       HIGH     OPEN      21-APR-14     Redo log file /u01/oracle/TEST/db/apps_st/data/log01b.dbf is missing

analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=381 device type=DISK
analyzing automatic repair options complete

Mandatory Manual Actions
========================
no manual actions available

Optional Manual Actions
=======================
1. If file /u01/oracle/TEST/db/apps_st/data/log01a.dbf was unintentionally renamed or moved, restore it
2. If file /u01/oracle/TEST/db/apps_st/data/log01b.dbf was unintentionally renamed or moved, restore it
3. If a standby database is available, then consider a Data Guard switchover or failover

Automated Repair Options
========================
Option Repair Description
------ ------------------
1      Perform incomplete database recovery to SCN 5965141836565 
  Strategy: The repair includes point-in-time recovery with some data loss
  Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_3499717585.hm

RMAN> repair failure preview;

Strategy: The repair includes point-in-time recovery with some data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_3499717585.hm

contents of repair script:
   # database point-in-time recovery
   restore database until scn 5965141836565;
   recover database until scn 5965141836565;
   alter database open resetlogs;

RMAN> repair failure;

Strategy: The repair includes point-in-time recovery with some data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_3499717585.hm

contents of repair script:
   # database point-in-time recovery
   restore database until scn 5965141836565;
   recover database until scn 5965141836565;
   alter database open resetlogs;

Do you really want to execute the above repair (enter YES or NO)? YES
executing repair script

Starting restore at 21-APR-14
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to /u01/oracle/TEST/db/apps_st/data/system01.dbf
channel ORA_DISK_1: restoring datafile 00002 to /u01/oracle/TEST/db/apps_st/data/system02.dbf
channel ORA_DISK_1: restoring datafile 00003 to /u01/oracle/TEST/db/apps_st/data/system03.dbf
channel ORA_DISK_1: restoring datafile 00004 to /u01/oracle/TEST/db/apps_st/data/system04.dbf
channel ORA_DISK_1: restoring datafile 00005 to /u01/oracle/TEST/db/apps_st/data/system05.dbf
channel ORA_DISK_1: restoring datafile 00006 to /u01/oracle/TEST/db/apps_st/data/ctxd01.dbf
channel ORA_DISK_1: restoring datafile 00007 to /u01/oracle/TEST/db/apps_st/data/owad01.dbf
channel ORA_DISK_1: restoring datafile 00008 to /u01/oracle/TEST/db/apps_st/data/a_queue02.dbf
channel ORA_DISK_1: restoring datafile 00009 to /u01/oracle/TEST/db/apps_st/data/odm.dbf

................................

 

channel ORA_DISK_1: restoring datafile 00407 to /u01/oracle/TEST/db/apps_st/data/a_ref02.dbf
channel ORA_DISK_1: reading from backup piece /u01/oracle/TEST/db/tech_st/11.1.0/dbs/0ep69g7a_1_1
channel ORA_DISK_1: piece handle=/u01/oracle/TEST/db/tech_st/11.1.0/dbs/0ep69g7a_1_1 tag=TAG20140421T110305
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 01:16:04
Finished restore at 21-APR-14

Starting recover at 21-APR-14
using channel ORA_DISK_1

starting media recovery

archived log for thread 1 with sequence 21 is already on disk as file /u01/PROD/db/apps_st/data/archive/ARC1_21_825013351.arc
archived log for thread 1 with sequence 22 is already on disk as file /u01/PROD/db/apps_st/data/archive/ARC1_22_825013351.arc
archived log for thread 1 with sequence 23 is already on disk as file /u01/PROD/db/apps_st/data/archive/ARC1_23_825013351.arc
archived log for thread 1 with sequence 24 is already on disk as file /u01/PROD/db/apps_st/data/archive/ARC1_24_825013351.arc
archived log file name=/u01/PROD/db/apps_st/data/archive/ARC1_21_825013351.arc thread=1 sequence=21
archived log file name=/u01/PROD/db/apps_st/data/archive/ARC1_22_825013351.arc thread=1 sequence=22
archived log file name=/u01/PROD/db/apps_st/data/archive/ARC1_23_825013351.arc thread=1 sequence=23
media recovery complete, elapsed time: 00:00:19
Finished recover at 21-APR-14

database opened
repair failure complete
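Because this repair ended with open resetlogs, a follow-up check is worth considering. The sketch below prints RMAN commands that confirm no failures remain open, show the database incarnations, and take a fresh backup. None of this appears in the original test; the function name post_repair_script is hypothetical, and taking a new backup after RESETLOGS is offered as common practice rather than a DRA requirement.

```shell
#!/bin/sh
# Hypothetical helper: prints RMAN commands to run after a point-in-time repair.
# Assumes the same default disk backup configuration as the pre-test backup.
post_repair_script() {
    cat <<'EOF'
LIST FAILURE;
LIST INCARNATION OF DATABASE;
BACKUP DATABASE PLUS ARCHIVELOG;
EOF
}

post_repair_script
```

As with the other sketches, the output could be piped into `rman target /` after review.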

                       

 
