Oracle corrupt block(壞塊) 詳解

一. 壞塊說明


1.1 相關鏈接


       在看壞塊之前,先看幾個相關的鏈接,在後面的說明中,會用到鏈接中的一些內容。


 


ORA-600 各個參數含義說明


http://blog.csdn.net/tianlesoftware/article/details/6645809


 


Oracle 不同故障的恢復方案


http://blog.csdn.net/tianlesoftware/article/details/6106178


 


Current online Redo 和 Undo 損壞的處理方法


http://blog.csdn.net/tianlesoftware/article/details/6261475


 


Oracle DBV 工具 說明


http://blog.csdn.net/tianlesoftware/article/details/5015164


 


Oracle BBED 工具 說明


http://blog.csdn.net/tianlesoftware/article/details/5006580


 


Oracle bbed 五個 實用示例


http://blog.csdn.net/tianlesoftware/article/details/6684505


 


Oracle datafile block 格式 說明


http://blog.csdn.net/tianlesoftware/article/details/6654786


 


MOS 上的相關文檔:


       RMAN: Block-Level Media Recovery - Concept & Example [ID 144911.1]    


       FAQ:Physical Corruption [ID 403747.1]


       MasterNote for Handling Oracle Database Corruption Issues [ID 1088018.1]


       HandlingOracle Block Corruptions in Oracle7/8/8i/9i/10g/11g [ID 28814.1]


       ExtractingData from a Corrupt Table using ROWID Range Scans in Oracle8 and higher [ID61685.1]








官網的鏈接:


      ValidatingDatabase Files and Backups


       http://download.oracle.com/docs/cd/E11882_01/backup.112/e10642/rcmvalid.htm#CHDCEHFD


 


      Performing Block Media Recovery


       http://download.oracle.com/docs/cd/E11882_01/backup.112/e10642/rcmblock.htm#CHDCBIIB


 


老熊的一篇文章:


       Oracle怎樣標記壞塊及一次數據恢復


       http://www.laoxiong.net/how_to_mark_corruption_block_and_recovery.html


 


1.2  block corruption 分類


For purposes of the paper we will categorize corruption under three general areasand give best practices for prevention, detection and repair for each:


        Memory corruption


        Logical corruption(soft corruption)


       Media corruption(Physicalcorruption)








       Physicalor structural corruption can be defined as damage to internal data structureswhich do not allow Oracle software to find user data within the database.        Logical corruption involves Oracle beingable to find the data, but the data values are incorrect as far as the end useris concerned.


       Physica lcorruption due to hardware or software can occur in two general places -- inmemory (including various IO buffers and the Oracle buffer cache) or on disk.Operator error such as overwriting a file can also be defined as a physicalcorruption.   Logical corruption on theother hand is usually due to end-user error or non-robust(?) applicationdesign. A small physical corruption such as a single bit flip may be mistakenfor a logical error. 








1.3 查看blockcorruption


       可以通過v$database_block_corruption 查看database 的corruption。 官網對該視圖的定義如下:


       V$DATABASE_BLOCK_CORRUPTION displaysinformation about database blocks that were corrupted after the last backup.


 


SYS@dave2(db2)>desc v$database_block_corruption


 Name                                     Null?    Type


 ------------------------------------------------- ----------------------------


 FILE#                                             NUMBER


 BLOCK#                                            NUMBER


 BLOCKS                                            NUMBER


 CORRUPTION_CHANGE#                                 NUMBER


 CORRUPTION_TYPE                                    VARCHAR2(9)


 


SYS@dave2(db2)> select * fromv$database_block_corruption;


no rows selected


 


       這裏沒有壞塊,爲了演示這個效果,我用BBED 製造一個壞塊,然後在用bbed 恢復過來。


 


先確定block:


SYS@dave2(db2)> select * from dvd;


 


JOB


--------------------------------------------------------------------------------


DMM  is DBA!


dmme like Oracle!


 


SYS@dave2(db2)>select


 2  rowid,


 3  dbms_rowid.rowid_relative_fno(rowid)rel_fno,


  4 dbms_rowid.rowid_block_number(rowid)blockno,


 5 dbms_rowid.rowid_row_number(rowid) rowno


 6  from dvd;


 


ROWID                 REL_FNO    BLOCKNO     ROWNO


------------------ ---------- --------------------


AAAN9hAAGAAAAAcAAA          6         28          0


AAAN9hAAGAAAAAcAAB          6         28          1


 


用bbed 修改block 28:


[oracle@db2 ~]$ bbed parfile=/u01/bbed.par


Password: 


 


BBED: Release 2.0.0.0.0 - LimitedProduction on Mon Aug 15 22:15:10 2011


Copyright (c) 1982, 2005, Oracle.  All rights reserved.


*************!!! For Oracle Internal Use only !!! ***************


BBED> set dba 6,28


       DBA             0x0180001c(25165852 6,28)


BBED> map                  


 File: /u01/app/oracle/oradata/dave2/dave01.dbf(6)


 Block: 28                                   Dba:0x0180001c


------------------------------------------------------------


 KTBData Block (Table/Cluster)


 struct kcbh, 20 bytes                      @0       


 struct ktbbh, 96 bytes                     @20      


 struct kdbh, 14 bytes                      @124     


 struct kdbt[1], 4 bytes                    @138     


 sb2kdbr[2]                               @142     


 ub1freespace[8005]                       @146     


 ub1rowdata[37]                           @8151    


 ub4tailchk                               @8188    


 


BBED> d /voffset 0 count 128


 File: /u01/app/oracle/oradata/dave2/dave01.dbf(6)


 Block: 28     Offsets:    0 to  127 Dba:0x0180001c


-------------------------------------------------------


 06a20000 1c008001 f4a90780 00000104 l.¢......ô©......


 f5b40000 01000000 61df0000 f4a90780 lõ´......aß..ô©..


 00000000 03003200 19008001 03001000 l......2.........


 3b0a0000 00000000 00000000 00800000 l;...............


 dda90780 00000000 00000000 00000000 lÝ©..............


 00000000 00000000 00000000 00000000 l................


 0000000000000000 00000000 00000000 l ................


 00000000 00000000 00000000 00010200 l................


 


 <16 bytes per line>


 


BBED> modify /x 12345678 offset 0


 File: /u01/app/oracle/oradata/dave2/dave01.dbf(6)


 Block: 28               Offsets:    0to  127           Dba:0x0180001c


------------------------------------------------------------------------


 12345678 1c008001 f4a90780 00000104 f5b4000001000000 61df0000 f4a90780 


 00000000 03003200 19008001 03001000 3b0a000000000000 00000000 00800000 


 dda90780 00000000 00000000 00000000 0000000000000000 00000000 00000000 


 00000000 00000000 00000000 00000000 0000000000000000 00000000 00010200 


 


 <32 bytes per line>


 


BBED> sum apply


Check value for File 6, Block 28:


current = 0x5ab7, required = 0x5ab7


 


SYS@dave2(db2)>alter system flush buffer_cache;


System altered.


 


SYS@dave2(db2)> select * from dvd;


select * from dvd


*


ERROR at line 1:


ORA-01578: ORACLE data block corrupted(file # 6, block # 28)


ORA-01110: data file 6:'/u01/app/oracle/oradata/dave2/dave01.dbf'


 


這裏提示塊有錯誤,我們查看下一下:v$database_block_corruption:


 


SYS@dave2(db2)> select * fromv$database_block_corruption;


no rows selected


 


       這裏顯示爲空,但是我們之前看該視圖的定義的時候,說是自上次備份以來的壞塊。所以這裏我們驗證一下:


RMAN> backup validate datafile 6;


 


Starting backupat 15-AUG-11


using target database control file insteadof recovery catalog


allocated channel: ORA_DISK_1


channel ORA_DISK_1: sid=141 devtype=DISK


channel ORA_DISK_1: starting full datafilebackupset


channel ORA_DISK_1: specifying datafile(s)in backupset


input datafile fno=00006 name=/u01/app/oracle/oradata/dave2/dave01.dbf


channel ORA_DISK_1: backup set complete,elapsed time: 00:00:02


Finished backup at 15-AUG-11


 


在次select 查詢:


SYS@dave2(db2)>  select * from v$database_block_corruption;


 


FILE#    BLOCK#     BLOCKSCORRUPTION_CHANGE# CORRUPTIO


---------- ---------- ---------------------------- ---------


        6         15          1                  0 CORRUPT


         6        28          1                  0 CORRUPT


 


這次就查詢到結果了。 視圖顯示block的狀態爲corrupt。 對於該類型,共有一下幾種:


       (1)ALL ZERO:Block header on disk contained only zeros. The block may be valid ifit was never filled and if it is in an Oracle7 file. The buffer will bereformatted to the Oracle8 standard for an empty block.


       (2)FRACTURED: Block header looks reasonable, but the front and back of the blockare different versions.


       (3)CHECKSUM: optional check value shows that the block is not self-consistent.It is impossible to determine exactly why the check value fails, but itprobably fails because sectors in the middle of the block are from differentversions.


       (4)CORRUPT: Block is wrongly identified or is not a data block (for example,the data block address is missing)


       (5)LOGICAL: Specifies the range is for logically corrupt blocks.CORRUPTION_CHANGE# will have a nonzero value.


 


現在用BBED還原block:


BBED> revert


All changes made in this session will berolled back. Proceed? (Y/N) y


Reverted file'/u01/app/oracle/oradata/dave2/dave01.dbf', block 28


Warning: contents of previous BIFILE willbe lost. Proceed? (Y/N) y


 


BBED> sum apply


Check value for File 6, Block 28:


current = 0xb4f5, required = 0xb4f5


 


在次查詢,block 正常:


SYS@dave2(db2)> alter system flushbuffer_cache;


System altered.


 


SYS@dave2(db2)> select * from dvd;


JOB


--------------------------------------------------------------------------------


DMM  is DBA!


dmme like Oracle!


 


但是注意我們的v$database_block_corruption 視圖:


SYS@dave2(db2)> select * fromv$database_block_corruption;


FILE#    BLOCK#     BLOCKSCORRUPTION_CHANGE# CORRUPTIO


---------- ---------- ---------------------------- ---------


        6         15          1                  0 CORRUPT


        6         28          1                  0 CORRUPT


 


corruption block的信息還存在裏面。 之前經過該視圖和備份有關,我們用rman validate 一下datafile,之後就ok了。


 


RMAN> backup validate datafile 6;


 


Starting backupat 15-AUG-11


using channel ORA_DISK_1


channel ORA_DISK_1: starting full datafilebackupset


channel ORA_DISK_1: specifying datafile(s)in backupset


input datafile fno=00006name=/u01/app/oracle/oradata/dave2/dave01.dbf


channel ORA_DISK_1: backup set complete,elapsed time: 00:00:01


Finished backup at 15-AUG-11


 


SYS@dave2(db2)>select * from v$database_block_corruption;


 


FILE#    BLOCK#     BLOCKSCORRUPTION_CHANGE# CORRUPTIO


---------- ---------- ---------------------------- ---------


        6         15          1                  0 CORRUPT


 


--說明,我們修改的是block 28,它已經消失了。block 15 不是我們這本次測試修改的。 它是歷史遺留問題。這裏就不討論了。


 


如果用rmanvalidate 之後還沒有消失,可能是oracle的bug,參考:


       V$Database_Block_CorruptionDoes not clear after Block Recover Command [ID 422889.1]


 


1.4 使用RMAN 驗證和recover corruption


       之前有整理相關的文章,參考:


               RMAN 驗證 數據文件 和 備份 的有效性


              http://blog.csdn.net/tianlesoftware/article/details/6460464


       


       RMAN的默認validate 只驗證物理壞塊,不驗證邏輯壞塊。


 


驗證所有datafile 和歸檔是否有物理壞塊:


RMAN>backup validate database archivelogall;


或者


RMAN>backup validate database;


 


驗證所有datafile 和歸檔的物理壞塊和邏輯壞塊:


RMAN>backup validate check logicaldatabase archivelog all;


或者:


RMAN>backup validate check logicaldatabase;


 


       注意: 如果加上了archivelog all,就必須要有歸檔文件存在。


 


在Oracle 11g裏還可以單獨驗證某一個數據塊: 


       RMAN>VALIDATE DATAFILE 1 BLOCK 10;


 


       對於物理壞塊,我們可以通過recover database 或者recover datafile 來解決,但是對於邏輯壞塊這種方法就不行。 在後面單獨講到邏輯壞塊時有說明。對於邏輯壞塊可以嘗試對對象進行重建,如重建索引,重建表在導入數據。


       


       對於物理壞塊,如果不使用recover,那麼塊上的data 基本是丟失的。 我們可以採用相關的event或者通過rowid來跳過block,導出數據。 要保證塊上數據不丟失,就需要通過RMAN有效的備份來進行recover。


 


相關的recover 命令如下:


RMAN>blockrecover datafile 6 block 15;


或者:


RMAN> blockrecover corruption list;


該命令recover 的所有block 來自v$database_block_corruption視圖。


 


RMAN : Block-Level Media Recovery - Concept& Example [ID 144911.1]


 


二. Memory Corruption


2.1 Background


       Oracle allocates both shared and private memory. Shared memory is allocated when an Oracle instance starts and all processes(or threads) connecting to an Oracle database can access it. Oracle softwaredefines how this shared resource is accessed to prevent multiple processes fromsimultaneously writing to the same address. It also has to recover any incomplete changes made to memory by aprocess that dies abnormally.  The amountof shared memory allocated is static in size and is only freed when theinstance is shut down.  Private memory isallocated and freed as needed by each process (or thread) at the OS level. 


       Corruptionis more likely to occur within shared memory than private memory so we focusattention to the structures and algorithms used within shared memory (alsoknown as the Oracle SGA). 


       The SGA is divided into four portions - fixed, variable,Database Buffer cache, and Redo log buffer.  A diagram appears below. 


                          








2.2 Definition 


       Memory corruption can be defined as inconsistencies in the data structures that arerelated to handling memory.  Thisinconsistency could appear in any of the different parts of memory discussedabove.  As discussed in the sectionabove, corruption in memory can be caused either in the SGA or the PGA. 


       Onlycorruption in the database buffer cache portion of the SGA can potentially leadto data loss. 


       Thisis termed as ‘Cache corruption’ and is discussed in detail below.  Corruption in theother parts of SGA do not result in loss of data, but can still cause theinstance to crash.  On the otherhand, a corruption in the PGA causes only the corresponding process tocrash.  If this process is updating ablock in the buffer cache when this happens, then the background process, PMONdoes the necessary recovery on the block being changed by this process.  SMON and otherprocesses will rollback any other uncommitted data. 


 


2.3 Cache corruption


       TheOracle buffer cache is a mechanism where frequently accessed blocks are storedin memory for quicker access. The cache also maintains older versions of blocksfor consistent read purposes.  If thereis a corruption in this part of memory then there is a possibility of loss ofdata. 


       Theforeground processes read Oracle blocks from the disk into the buffercache.  There arecertain checks done on the data block when it is read from the disk.  


       For example, one of the checks is to compare the Incarnation Number (INC) andSequence Number (SEQ) data structures from the header of the data block withthe INCSEQ structure in the footer to make sure that the block versions match.This is done to avoid reading a block from disk whose header is corrupt. 


              


                           


The structuresdiscussed above is specific to version 7. 


 


2.4 Causes 


       In general, memory corruption is usually identified by abackground or shadow  process whenever ittries to access the part of memory that is corrupted.  Cache corruption, which might cause a loss ofdata, is usually caught by:


       (1)the shadow process when trying to read or update a block in memory, 


       (2)by the background process, DBWR, when trying to write a dirtied block frommemory to disk,


       (3)by the background process, PMON, while performing block recovery.  


 


Common causes include: 


       (1)Operatingsystem bugs causing bad reads or bad writes 


       (2)Hardware issues 


       (3)Oracle bug 


       (4)Non-Oracle program attaching and illegally writing to the sameshared memory address 


 


2.5  Prevention 


       There is very little that can be done from the user’s perspective to prevent memorycorruption. An INIT.ORA parameter, DB_BLOCK_CHECKSUM can be set to true.  


       This enables another check where the block checksum is calculated and compared withthe checksum stored in the block header. If they aredifferent then the block is considered corrupt on disk, and the block is notread into memory. This prevents corrupt blocks from entering the cache.When a block is changed and being written to disk from memory, DBWR calculatesa new checksum for the block by summing up its contents and writes it in theblock header.  


       There is a CPU overhead associated with this parameter since the checksum is calculated every time a block is read and written. 


       A similar parameter, LOG_BLOCK_CHECKSUM, can be set totrue in the INIT.ORA for verifying the records written to the redo log buffer.  This extra test prevents bad redo from beingapplied to a block in cache during recovery. There is CPU overhead for reasons similar to the DB_BLOCK_CHECKSUM. 


       The best form of corruption prevention from occurring in a production environmentis to test the hardware, operating system, database, application and so on forbugs before rolling them into production. This is also true when introducing new hardware, patches and upgrades. 


 


2.6  Detection 


       Memorycorruption can be detected from errors in the operating system logs indicatingany kind of memory problems.  These canalso be found from the Oracle ALERT.LOG by certain ORA-600 errors.  The first argument ofthe ORA-600 could be from 17000 through 17999 in case of memory corruption.  


       Somecache corruption can be detected by ORA-600s in the range from 2000 through8000.  One thing to remember is that notall ORA-600s imply memory corruption.


       對於memory corruption的錯誤提示,是ORA-600的第一個參數值從17000到17999或者2000到8000. 這裏只是一種可能,並不完全肯定。


       ORA-600 各個參數含義說明


       http://blog.csdn.net/tianlesoftware/article/details/6645809


 


2.7  Repair 








Location 




Data loss 




Repair 








SGA - Buffer cache 




Probable data loss 




⇒ If the corrupt block has been written to disk, then the object to which the block belongs to has to be repaired by methods discussed under Media corruption. ⇒ SHUTDOWN and STARTUP the instance ⇒ To diagnose the causes for the corruption, call Oracle support with appropriate files listed below the table. 








SGA - Redo buffer 




No data loss 




⇒ SHUTDOWN and STARTUP the instance. ⇒ To diagnose the causes for the corruption, call Oracle support with appropriate files listed below the table 








SGA - Shared SQL Area 




No data loss 




⇒ SHUTDOWN and STARTUP the instance. ⇒ To diagnose the causes for the corruption, call Oracle support with appropriate files listed below the table 








PGA 




No data loss 




⇒ To diagnose the causes for the corruption, call Oracle support with appropriate files listed below the table 




 


       To diagnose the cause of the corruption, call Oracle support with the followinginformation: 


       (1)INIT.ORA file 


       (2)ALERT.LOG 


       (3)Trace files for any ORA-600s found in the directory specified by theINIT.ORA parameter, USER_DUMP_DEST 


       (4)Heap dump : this can be obtained by executing the following commandin SVRMGR or SQLDBA: 


       SQL>ALTER SESSION SET EVENTS ‘IMMEDIATE TRACE NAME HEAPDUMP, LEVEL 10’; 


       This creates a trace file in the USER_DUMP_DEST. 


       (5)Reproducibletest case 


       (6)Thorough history of events that led to the corruption 


       (7)Record any noticeable changes to the environment such as newINIT.ORA parameters, new code, patches, and upgrades. 


 


三. Logical Corruption (soft corruption)


       LogicalCorruption can be defined as a situation where the actual data is not corruptedin a data block but a query results in a wrong set of data due to a problem inthe way data was loaded into the database or due to a misexecution of theoptimizer path.  A logical corruption is not an inconsistencyin a data block but an inconsistency in the result of a query. 


       邏輯壞塊通常不是data block的不一致,而是查詢結果的不一致。


       Forexample, a query that is expected to fetch 5 rows might result in 10 rowsbecause the data in the table was duplicated due to the lack of a primary orunique key. 


       比如我們查詢5行記錄,實際卻返回10行。 


 


3.1 Causes 


       Badapplication design that lacks validation or proper integrity checks OptimizerBugs (in rare cases) 


       通常是應用設計不完善或者是優化器的bug造成。


 


3.2 Prevention 


       The only kind of prevention methodology for avoiding user errors is to test thatapplications return valid results thoroughly before implementing them inproduction environments.   


       There cannot be a way to prevent logical corruption caused due to optimizer problemsunless the appropriate patches are applied or the database is upgraded to themost recent version. 


 


3.3 Detection 


       When logical corruption is caused by user errors, they are more difficult to detectthan those caused by optimizer problems. In case of user errors, the user should have a good knowledge of theapplication that is being run to identify inconsistencies in the queryresults.  In the case of optimizerproblems, inconsistencies may be accompanied by changes in the query executionplan leading to a different response time. 


       Also one may notice, invalid results only when using the cost based optimizer rather than the rule based optimizer. Other changes inthe query path can be caused by changing the optimizer mode, specifying a different optimizer hint, dropping or creatingindexes, or analyzing objects to generate new statistics. 


 


3.4 Repair 


       Whenthese problems are caused by user errors, they can be fixed by making theapplication more robust.  


       解決這個問題需要完善系統的設計。


       Fo rexample, if invalid data is successfully inserted into a table because of alack of integrity checking , then constraints should be created and the invaliddata will have to be found and deleted. In case of optimizer problems, contactWorldwide Customer Support Services to determine if this is a bug with anavailable patch.  


       They will need to be provided with the query, EXPLAIN PLAN and potentially theexport dump of the tables involved. 


 


四. Media Corruption(physical corruption )


       Media corruption can be defined as a situation where an inconsistency has occurred inthe data block structures in the physical disk as a result of mediafailures.  Media failures are failuresthat are caused due to hardware problems, operating system problems, controllerproblems, logical volume problems, and so on. As a result of a media corruption, the data in the corrupted block islost. 


       物理壞塊通常是block上的不一致,造成物理壞塊的原因可能是硬件故障,操作系統問題,控制器問題,邏輯卷問題等。 對於物理壞塊,其block上的數據會丟失。


 


       MediaCorruption could occur in different parts of the database and the detection,prevention and repair are different depending on the object that iscorrupted.  The following are differentobjects that could be corrupted: 


       物理壞塊可以存在數據庫的不同部分,對應不同部分的處理方法也不一樣,具體可以以下下幾種:


       ∗ Control file 


       ∗ Redo log file 


       ∗ Data file The block could belong to one of the following categoriesin case of data  file corruption: 


       (1)File header block 


       (2)Data dictionary object (SYSTEM tablespace) 


       (3)Undo header and Undo blocks (ROLLBACK tablespace) 


       (4)Sort blocks (TEMP tablespace) 


       (5)Data/Index blocks (DATA/INDEX tablespace) 


               =>Tables 


               =>Clusters 


               =>Indexes 


 


4.1 Control filecorruption 


       TheControl file is the file that has the structural information of thedatabase.  The control file hasinformation such as the database name, names and locations of the data filesand the redo log files belonging to the database, the creation timestamp of thedatabase, log sequence information, checkpoint information and so on. 


       控制文件包含了數據庫的一些結構信息。 


 


The control file can be dumped in asciiformat by executing the command: 


       ALTERSESSION SET EVENTS ‘IMMEDIATE TRACE NAME CONTROLF LEVEL 10’; 


 


       This command creates a trace file in the location defined by the INIT.ORA parameter,USER_DUMP_DEST.  


 


可以dump 控制文件。 有關控制文件的更多內容參考我的blog:


       Oracle 控制文件


       http://blog.csdn.net/tianlesoftware/article/details/4974440


 


       Whenany of the mirrors (Oracle’s multiplexing) of the control file is corrupted,critical information about the database cannot be accessed which will result ina database crash.  The following tablesexplain the potential causes, detection, prevention and repair of a corruptionin the current control file. 


       雖然控制文件有冗餘,但是任何一個控制文件出現損壞都會導致db crash。


 


 


4.1.1 Causes and Prevention


 


                          




4.1.2 Detectionand Repair


          


                        




   控制文件通常有三個,任何一個出現corrupt都會使db crash。 這種情況下,可以修改初始話參數,使用其他的幾個控制來啓動db,如果能啓動,就ok。不能啓動,就需要重建控制文件來解決。 關於控制文件的重建,我上面貼的鏈接裏有。


 


4.2  Redo log file corruption


       Redolog files are critical in protecting a database from failures.  The changes made to data in a database arerecorded in the redo log files.  


       redo 是在db failures來進行恢復的,所有的事務操作都會記錄到redolog裏。


 


       When there is a failure which prevents data from being written to the data files ondisk from memory, the changes can be obtained from the redo log files.  When a data file is corrupted, a backup froman earlier day can be restored and the changes from that day onwards can beapplied to the data file from the redo log files.  


 


       The redo log files are used only when a database is recovered from a failure.  There are two kinds of redo log files: Onlineredo log files and Archived redo log files. Archived redo log files are the spooled copies of the online redo logfiles.  


 


       Corruption could occur in an online redo log file or an archived redo log file.  If a corruption occurs on an online redo logfile, the instance crashes if there are no mirrors of the redo log file or ifthe mirrors are corrupted as well.  Ifthe corruption occurs in an archived redo log file, the database is notaffected unless a backup is restored and recovery is being done for which it isneeded. 


       corruption 可能發生在online log 或archived log。 如果發生在online log,並且log 沒有鏡像或者鏡像log也出現了壞塊,這種情況下,db 會crash。 如果出現在歸檔文件上,僅當使用備份進行恢復時纔會有影響。


 


4.2.1  Causes and  Prevention


                  




  4.2.2 Detectionand Repair


                 


 


Forcing the database tostartup using _ALLOW_RESETLOGS_CORRUPTION: 


       This parameter is undocumented and unsupported. 


       The_allow_resetlogs_corruptionshould only be done as a last resort. Usually when a database is opened witheither the RESETLOGS or NORESETLOGS option, the status and checkpointstructures in all the file headers of all data files are checked to make surethat they are consistent.  Once this ischecked, the redo logs are zeroed out in case of RESETLOGS.  When the _ALLOW_RESETLOGS_CORRUPTIONparameter is set, the file header checks are bypassed.  This means that we do not make sure that thefiles are consistent and open the database. It will potentially cause some lost data and lost data integrity.  The database should be rebuilt since data andthe data dictionary could be corrupt in ways that are not immediatelydetectable.  This could lead to futureoutages or put the database in a state where it cannot be salvaged at all.  There is no guarantee that this will work. 


 


Advantages 


       nsome circumstances, database can be brought up even when no valid backups areavailable 


 


Disadvantages 


(1)The database will not be in aconsistent state once the undocumented parameter is used and so it has to berebuilt by doing a full database export, recreate the database and a fulldatabase import. 


(2)Not guaranteed to work 


 


關於online redocorrupt,曾經整理過,參考:


Current online Redo 和 Undo 損壞的處理方法


http://blog.csdn.net/tianlesoftware/article/details/6261475


 


Oracle 不同故障的恢復方案


http://blog.csdn.net/tianlesoftware/article/details/6106178


 


4.3 Data file corruption


       The data files are the physical storage unit of data stored in a database.  Each data file ismade up of data blocks which can be divided into 5 different types (in thecontext of media corruption): 


       (1)File header blocks : File header block is the first Oracle block inevery data file in an Oracle database. This block keeps track of a lot of information about the data file thatit belongs to. 


       (2)Data Dictionary blocks : SYSTEM tablespace consists of datadictionary objects.  Data dictionaryobjects are objects that keep track of information stored in the database suchas the information about the tablespaces, information about the data files,information of amount of free space in each tablespace. 


       (3)Undo header and Undo blocks: ROLLBACK tablespace consists ofrollback segments that are made up of undo header blocks and undo blocks.  These blocks are used to undo a transactionwhen it fails or when the user executes a ROLLBACK command. 


       (4)Sort blocks: TEMP tablespace consists of temp segments that are madeup of sort blocks. Sorts are usually done in memory where they are allocated asize specified by the INIT.ORA parameter, SORT_AREA_SIZE amount of space.  If the sort is so huge that it cannot fit inthe allocated space in memory, then temp segments are created in the user’stemporary tablespace for doing the sort on disk. 


       (5)Data blocks: DATA tablespace consists of tables, clusters andindexes. 


 


       The following tables explain the potential causes of data file corruption and alsothe different ways to prevent data file corruption. 


 


4.3.1 Causes and Prevention


 


                




       Now that we have discussed the causes and the prevention, let us discuss thedetection and repair which is different for corruption in different blocks. 


 


4.3.2 File header block 


       Fileheader block is the first Oracle block in a data file.  This block keeps track of information aboutthe data file itself (i.e., file metadata) including different checkpointstructures (explained below), status of the file (e.g., hot backup in progress,hot backup ended, media recovery required, instance recovery required), andresetlogs information (information on the time when the database has beenbrought with resetlogs option most recently). 


       File header block 是data file的第一個block。 它保存了datafile的一些信息。


 


       Checkpointingis the process of writing the blocks that have been changed in memory to disk.The control file and the data file headers are updated after every checkpointis done in the database.  This is anautomatic action executed by the background process called the Database Writer(DBWR).  This can also be forced bycertain user commands such as a normal SHUTDOWN, a normal OFFLINE of atablespace and so on.  


       checkpoint進程會將已經變更的block從內存寫如磁盤。 每次執行checkpoint時,dbwr 進程會更新controlfile 和 datafileheaders。 當normal shutdown 或者normal offline tablespace時,也會更新control file 和datafile headers.


 


       Thedata file header block keeps track of the most recent checkpoint informationwhich denotes that all blocks in this data file that has been changed beforethis checkpoint has been written to disk from memory and so a failure in thememory will not affect the data in this data file before the checkpoint. 


       datafile header 保存了最近一次checkpoint的信息,它表示該checkpint 之前已經改變的block已經從內存寫入磁盤了。 當memory 發生錯誤時,不會影響該checkpoint 之前的data。


 


       Whena file header block is corrupted, the information stored in this block cannotbe accessed which means that the objects in the data file cannot be accessedeither.  The following tables give anidea of the potential causes for a corruption in a file header block and theprevention, detection and repair of the same. 


       當 fileheader block 發生corrupted時,存儲在該block裏的信息無法讀取,那麼的data file 也就無能讀取。 




              






在我的blog 裏的示例四,就是使用bbed修改data file header的:


Oracle bbed 五個 實用示例


http://blog.csdn.net/tianlesoftware/article/details/6684505


 


4.3.3 Data dictionary object (SYSTEM tablespace)


       Datadictionary objects are objects that reside in the SYSTEM tablespace which hascritical information about the objects in the database and their relationshipsand attributes. 


       Data dictionary 對象存放在system 表空間,它保存了數據庫中對象的重要信息,如對象之間的關係和屬性等。


 


Data dictionary objects found in the systemtablespace are described below: 


SYSTEM 表空間下的data dictionary objects 有如下幾種:


       1)Objects created by the script SQL.BSQ run by the CREATEDATABASE command found under $ORACLE_HOME/dbs directory.  There are certain tables, indexes andclusters that cannot be dropped and recreated in the SYSTEM tablespace.  These are the tables that are used whenbringing up the database.  They arecalled the bootstrap objects and are found in the SQL.BSQ.  It is not an easy task to locate datadictionary objects that can be dropped and recreated since the relationsbetween the different objects could result in an inconsistent database if wedrop the wrong object. 


       datadictionary 是在創建數據庫時通過$ORACLE_HOME/dbs/sql.bsq腳本創建的,並且這些核心的表,索引和clusters 不能被drop 和recreate。 他們在db 啓動時使用。


       2) Views createdon the fixed data structures (V$ views). 


       3)SYSTEM rollback segment created after the databasecreation. If the corruption is in this segment, then the most recentbackup should be restored and a point in time recovery should be done on thedatabase up to the time when the corruption could have occurred. 


       system rollback segment 在數據庫創建之後創建。 關於systemrollback segment在我的blog裏有說明:


              Current online Redo 和 Undo 損壞的處理方法


              http://blog.csdn.net/tianlesoftware/article/details/6261475


       4)Compatibility segment (this is the only segment of type ‘CACHE’ in the SYSTEMtablespace). The Compatibility segment is a segment that keeps track of thefeatures being used in the database which will be used when the database isbeing downgraded to an earlier version. This segment is used to make sure thatthe features being used in the current version are disabled before beingdowngraded to the earlier version.  Ifthe compatibility segment has a corruption then, the database can be brought upby shutting down and starting it up.  Ifthe problem is still not fixed call Oracle support with appropriate trace filesand the alert.log. 


 


       The supported way of fixing data dictionary corruption is to restore from a backupand roll forward using the archived redo logs. 


 


 


 


4.3.4 Undo header and Undo blocks (ROLLBACK tablespace)


       Rollback segments are undo segments that have information about the transaction that hasbeen executed so that it can be rolled back in case of failure of thetransaction or when the user asks for the transaction to be rolled backexplicitly.  


       They are made up of undo header blocks and undo blocks which are required to accessundo information to provide for consistent reads and transactionconsistency.  If the rollback segment iscorrupted, the transaction consistency of the data blocks (including datadictionary objects) can be jeopardized.


 










 


The corruption found when undoing atransaction could fall under three categories :


在以下三種情況下,可以會出現corruption:


       (1)belongs to the object (table/index/cluster) that has the data onwhich the transaction was executed (data block) 


       (2)belongs to the undo block that is being used to undo the transaction(undo header) 


       (3)belongs to the undo segment header block of the segment where theundo block is found (undo block) 


 


對於第一種情況,出現corruption 的位置是data block。 


       Thefirst case where the corruption is in the object to which the activetransaction belongs to, we have to identify the object first.  This can be done by setting an event in theINIT.ORA as follows: 


       event= “10015 trace name context forever, level 10” 


       這種情況的壞塊是active transaction,當我們重啓DB後,會進行相關的transaction recover(Rolling Back)。 當recover 完成,就可以正常訪問對應的block了。 相關文章參考:


       還原真實的cache recovery


       http://blog.csdn.net/tianlesoftware/article/details/6547891


       Oracle 實例恢復時 前滾(roll forward) 後滾(rollback) 問題


       http://blog.csdn.net/tianlesoftware/article/details/6286330


 


       Thisevent traces the undo segment recovery when the database is started, the eventputs out a trace in the directory specified by the INIT.ORA parameter,USER_DUMP_DEST.  This trace file containsa transaction table for each of the rollback segments that are onlined in thedatabase.  The trace file has a messagethat says ‘error recovering tx(#, #) object #’. Tx(#, #) refers to thetransaction information and the object # is the object id of the object thathas a corruption.  


       設置event時候,會dump DB 啓動是的undo segment recovery,在trace裏會有‘error recovering tx(#, #) object#’信息,其中Tx(#, #)指的是事務信息,object #是corruption 的對象id。找到對象ID 後可以通過如下SQL查詢:


 


The following query gives the name of theobject that is corrupted: 


       SQL>SELECTOWNER, OBJECT_NAME, OBJECT_TYPE, STATUS  FROM SYS.DBA_OBJECTS WHERE OBJECT_ID = <object #_from_tracefile>; 


 


對於第二種情況和第三種情況的處理,參考我的blog:


              Current online Redo 和 Undo 損壞的處理方法


              http://blog.csdn.net/tianlesoftware/article/details/6261475


 


4.3.5 Sort blocks (TEMP tablespace) 


       Sortsare usually done in the part of memory allocated from the SGA.  This is defined by the INIT.ORA parameterSORT_AREA_SIZE.  If the sort space neededfor a sort is so big that it cannot fit in the sort area defined in memory,then it is done on disk by creating segments called the Temporary segments. 


       當SORT_AREA_SIZE 指定的sort 空間不夠時,會在磁盤上創建一個temporary segments來作爲排序使用。 建議每個用戶指定自己的temporary tablespace。


       Itis advisable to create a separate tablespace called the TEMP tablespace.  After creating this tablespace, alter theusers to use this as their temporary tablespace by executing the followingcommand: 


       ALTERUSER user_name TEMPORARY TABLESPACE TEMP; 


       


       This way, the temporary segments created by any user will be in the TEMP tablespaceand it provides easy manageability. 


 


Detection


· Usually thissegment is never corrupted since they are reformatted every time they get used


 


Repair


· Tempsegments are reused frequently


· If problempersists, either move or drop and recreate the temp tablespace


 


4.3.6 Data/Index blocks (DATA/INDEX tablespace) 


       Whena data block is corrupted, when it belongs to a table segment, cluster segmentor an index segment, the detection mechanisms are the same: 


       datablock的corrupt可以出現在table segment,cluster segment和index segment。


 


可以通過以下方法來檢測corruption:


       (1)DBVERIFY can be used to detect the corrupted blocks in thedata file 


       (2)ANALYZE command run on the objects give errors (ANALYZE<table/cluster/index> <table/cluster/index_name> VALIDATESTRUCTURE;) When an ANALYZE (with CASCADE option) is run on a table or cluster,it cross verifies the index and data/cluster blocks along with the integritychecks done for the block. 


       (3)DB_BLOCK_CHECKSUM can be set to TRUE in the INIT.ORA file.When a block is changed and being written to disk from memory, DBWR calculatesa checksum for the block by summing up its contents and writes it in the blockalong with it on disk.  The next time when the block is being read by theforeground process, it calculates the checksum again for the block that isbeing read and compared with the checksum already written in the block ondisk.  If both are different then theblock has been corrupted on disk and so the block is not read into memory sothat it prevents cache corruption.  There is an overhead associated with this parameter since thechecksum is calculated each time it is read and written. 


       (4)Events 10210, 10211, 10212 can be set in the INIT.ORA file todetect software corrupt blocks.  When there is a corruption in a block, it is not detecteduntil the block is being updated. So any SELECTs on a corrupted block is executed until it is marked assoftware corrupt.  When the events areset in the INIT.ORA, the blocks are checked for integrity by comparing certaindata structures and once there is an inconsistency found, the seq or the sequence of the block is set to 0 in theblock header representing that the block is software corrupt.  


The events can be set as follows: 


       event= “10210 trace name context forever, level 10”  (for data blocks)


       event= “10211 trace name context forever, level 10”  (for index blocks) 


       event= “10212 trace name context forever, level 10”  (for cluster blocks) 


       (5)Users receiving ORA-1578 when trying to access an object.  The query from DBA_EXTENTS given in pg# 30shows that the error is on a table, cluster or index in the data tablespace,SQL語句如下:


       SQL>selectsegment_type, segment_name from sys.dba_extents where file_id =<file_id_from_ora-1578> and <block_id_from_ora-1578> betweenblock_id and (block_id+blocks-1);


       (6)ALERT.LOG shows ORA-1578 or ORA-600s with the first argument in therange of 2000 to 8000 


 


4.3.6.1 Tables


       Whena data block is corrupted in a table, it should be understood that the data inthe corrupted block is lost.  The onlyway to not lose any data from the table is to restore from a valid backup andrecover until a point in time before the corruption occurred.


       datablock 壞塊通常意味着數據的丟失,如果要保證沒有數據丟失,需要通過有效的備份進行恢復。這塊參考1.4節。


 


方法一:Event method


       Event10231 can be set to skip corrupted blocks on full table scans in the INIT.ORAfile.  The object can be exported aftersetting this event.  This is notguaranteed to work for every kind of corruption.  


       This works only when the block is soft corrupted(邏輯壞塊), sequence is set to 0. The event can be set as follows: 


       event= “10231 trace name context forever, level 10” 


       對於table上的邏輯壞塊,並且sequence 設置爲0,可以設置10231 event,當全表掃描時,可以跳過corruption,從而讀取數據。


 


 


方法二:ROWID method


       Extractthe data that does not belong to the corrupted block using ROWIDs. Every row inevery table in an Oracle database has a ROWID column which is usually not displayedunless SELECTed explicitly.  


       通過rowid,我們可以抽取出非corrupted block上的數據。關於ROWID的更多內容,參考我的Blog:


       Oracle Rowid 介紹


       http://blog.csdn.net/tianlesoftware/article/details/5020718


 


(1)通過函數構建ROWID


函數參數如下:     


function ROWID_CREATE(rowid_type    IN number,


                           object_number INnumber,


                           relative_fno  IN number,


                           block_number  IN number,


                           row_number    IN number)


              return ROWID;


 


    -- rowid_type      - type(restricted=0/extended=1)


    -- object_number   - data objectnumber


    -- relative_fno    - relative filenumber


    -- block_number    - block numberin this file


    -- row_number      - row number inthis block


 


這些參數可以通過如下方式獲取:


ROWID_TYPE:


       Thisis 1 because we are using the extended rowid format.


 


RELATIVE_FNO:


       Thisshould have been available when you came to this article. It can also be foundfrom the DBA_EXTENTS view given the absolute file number and block number ofthe corrupt block:


       SQL>SELECTtablespace_name, relative_fno, segment_type, owner, segment_name, partition_nameFROM dba_extents WHERE file_id = <AFN>    AND  <BL> between block_id and block_id + blocks-1;


 


OBJECT_NUMBER:


       For a non-partitioned table, select the DATA_OBJECT_ID from DBA_OBJECTS for theproblem table:


 


SQL>SELECT data_object_id FROM dba_objectsWHERE object_name = '<TABLE-NAME>' AND owner = '<TABLE-OWNER>' ;


 


Note that a partitionedtable has an object number for each partition.


--注意,對於分區表,每個分區有一個對象id:


Select the DATA_OBJECT_ID from DBA_OBJECTSthus:


       SQL>SELECTdata_object_id FROM dba_objects WHERE object_name = '<TABLE-NAME>' ANDowner = '<TABLE-OWNER>' AND subobject_name = '<PARTITION-NAME>' ;


       相對與普通表,普通表多了一個subojbect_name 條件。


 


BLOCK_NUMBER andROW_NUMBER:


       Theblock number of the corrupt block should be available before you came to thisarticle. (Eg: It is reported in an ORA-1578 error, or as a Page Number byDBVerify).


       Fora ROWID range scan we generally want to select all rows BEFORE the corruptblock, then all rows AFTER the corrupt block. The first row in a block is rowzero (0) and so we want all rowids LESS THAN "Block <BL> row 0"and then GREATER THAN OR EQUAL TO "Block <BL>+1 row 0".


 


(2)創建ROWID


  You can now create the rowid strings to use in a predicate thus:


現在可以創建rowid:


 


The "LOW_RID"is the lowest rowid INSIDE the corrupt block:


SQL>SELECT dbms_rowid.rowid_create(1,<DATA_OBJECT_ID>,<RFN>,<BL>,0)LOW_RID  from DUAL;


 


The "HI_RID" isthe first rowid AFTER the corrupt block:


SQL>SELECT  dbms_rowid.rowid_create(1,<DATA_OBJECT_ID>,<RFN>,<BL>+1,0)HI_RID       from DUAL;


 


(3)重建數據


       Itis now possible to use CREATE TABLE AS SELECT or INSERT ... SELECT to get datawithout accessing the corrupt block using a query of the form:


       根據剛纔查詢的rowid,跳過corrupt block來進行createtable 或者insert 操作:


SQL>CREATE TABLE salvage_table AS SELECT/*+ ROWID(A) */ * FROM <owner.tablename> A WHERE rowid <'<low_rid>'   ;


SQL>INSERT INTO salvage_table SELECT /*+ROWID(A) */ * FROM <owner.tablename> A WHERE rowid >='<hi_rid>';


注意hint裏的A是表的別名。


 


       Fora table partition then only the problem partition need be selected from byusing the PARTITION(xxx) option in the FROM clause:


       對於分區表,僅需要對問題分區進行處理:


 


SQL>CREATE TABLE salvage_table AS SELECT/*+ ROWID(A) */ * FROM <owner.tablename> PARTITION(<partition_name>) A WHERE rowid < '<lo_rid>';


SQL>INSERT INTO salvage_table SELECT /*+ROWID(A) */ * FROM <owner.tablename> PARTITION (<partition_name>) AWHERE rowid >= '<hi_rid>';


       注意: 採用上面這種rowid的方法,不能處理含有LONG字段的表,對於LONG字段的表,只能使用帶有where 條件的export/import.


 


       If the corrupt block is the table segment header, this method won't work. You stillhave the option of using any indexes on the corrupt table to extract the data.


       如果是塊頭出現corrupt,那麼這個方法不使用。 當塊頭出現問題,整個塊的data 都不可讀取。 如果是非塊頭,我們可以使用這種方法挽回更多的數據。


 


  Use the following query to determine if the affected block is thesegment header :


       可以使用如下SQL判斷是否是塊頭block:


SQL>selectfile_id,block_id,blocks,extent_id from dba_extents where owner='<owner>'and segment_name='<table_name>' and segment_type='TABLE'   order by extent_id;


 


FILE_ID BLOCK_ID    BLOCKS EXTENT_ID


--------- --------- --------- ---------


8    94854     20780         0 <- EXTENT_IDZERO is segment header


 


方法三: Index method


       If there are any indexes on the corrupt table then it is possible to get someinformation about what data was in the corrupt block from the index. Thisrequires selecting indexed columns from the table for rowids in the corruptblock. We already know the ROWID range covered by the corrupt block from theSELECT dbms_rowid.rowid_create ... statements above. 


       To extract the column data use one of the following forms of select statement:


 


If the columns requiredat NOT NULLable you can use a fast full scan:


如果列是非空的,可是使用fast full scan:


       SQL>SELECT/*+ INDEX_FFS(X <index_name>) */ <index_column1>,<index_column2> ...  FROM<tablename> X WHERE rowid >= '<low_rid>'          AND rowid <  '<hi_rid>' ;


 


If the columns required are NULLable thenyou cannot use an index fast full scan and must use a range scan. This requiresyou to know a minimum possible value for the leading index column to ensure youenable the index scan:


如果列是可以null的,那麼必須使用range scan:


SQL>SELECT /*+ INDEX(X<index_name>) */ <index_column1>, <index_column2> ...  FROM <tablename> X  WHERE rowid >= '<low_rid>'        ANDrowid <  '<hi_rid>'  AND <index_column1> >=<min_col1_value>;


    


       Using this technique for all indexes on the table may be able to retrieve some of thedata. See <View:DBA_IND_COLUMNS> for which columns make up each index.


 


關於第二和第三中方法,MOS上有示例:


       ExtractingData from a Corrupt Table using ROWID Range Scans in Oracle8 and higher [ID61685.1]


 


 


4.3.6.2 Clusters


       Clusters can be defined as a way to store more than one tables physically together sincethe tables have some common columns.  Bystoring the related rows from multiple tables together, the access time can bereduced.  


       For example, if a cluster is formed with the EMP and DEPT tables, the cluster blocklooks like: 


 


 








       A query that runs across EMP and DEPT will have to access just one block to getboth tables’ rows.  The IO is less andthe access time is less too. 


       The detection and repair for a cluster block corruption is similar to that of atable block corruption but the only difference is that when there is acorruption in a cluster block, all the objects that make up the cluster has tobe fixed. 


 


4.3.6.3 indexes


Detection


·(1) DBVERIFY gives the corrupted blockinformation


·(2)ALERT.LOG shows corruption errors


·(3)Users running queries against theindex get ORA-1578 on the index


·(4)ORA-600 with first argument in the rangebetween 2000 and 8000


 


Repair


·(1) Drop and recreate the indexsegment


·(2)Restore from a valid backup and recover


 


       索引上的corrupt block 是比較好處理的,因爲可以對索引進行重建,所以不會造成data lose。


 


 


五. dbms_repair 包與壞塊


       Oracle從8i 開始提供了DBMS_REPAIR包,該包可以用來發現、標識並修改數據文件中的壞塊。dbms_repair包沒有進行授權,只有sys用戶可以執行。


       任何工具都不是萬能的,使用這個包的同時會帶來數據丟失、表和索引返回數據不一致,完整性約束破壞等其他問題。因此當出現錯誤時,應當首先從物理備份或邏輯備份恢復,使用dbms_repair只是在沒有備份的情況下使用的一種手段,這種方式一般都會造成數據的丟失。


       dbms_repair包的工作原理比較簡單,是將檢查到的壞塊標註出來,使隨後的dml操作跳過該塊,同時,dbms_repair包還提供了用於保存索引中包含的標註爲壞塊中的鍵值,以及修復freelist和segment bitmap的過程。


       


官網的說明:


       DBMS_REPAIR Examples


       http://download.oracle.com/docs/cd/B28359_01/server.111/b28310/repair004.htm#ADMIN11828


       Using the DBMS_REPAIR Package


       http://download.oracle.com/docs/cd/B28359_01/server.111/b28310/repair003.htm#ADMIN11815


 


yangtingkun的blog:


       DBMS_REPAIR的使用


       http://yangtingkun.itpub.net/post/468/9121


 


5.1 準備工作


create tablespace block datafile '/u01/block.dbf' size 5M;


create table DMM tablespace block as select * from all_tables;


commit;


CREATE INDEX indx_dmm on DMM(TABLE_NAME);


select count(*) from DMM;


  COUNT(*)


----------


12896


 


5.2.創建管理表


SQL> conn sys/admin as sysdba;


已連接。


SQL> execDBMS_REPAIR.ADMIN_TABLES('REPAIR_TABLE',1,1,'USERS');


PL/SQL procedure successfully completed


SQL> execDBMS_REPAIR.ADMIN_TABLES('ORPHAN_TABLE',2,1,'USERS');


PL/SQL procedure successfully completed


 


5.3 檢查壞塊:dbms_repair.check_object


 Set serveroutput on;


DECLARE


   cc  NUMBER;


BEGIN


   DBMS_REPAIR.check_object (schema_name => 'SYS', -- 注意此處是用戶名


                             object_name     => 'DMM',


                             corrupt_count   => cc);


   DBMS_OUTPUT.put_line ( TO_CHAR (cc));


END;


       正常情況下輸入爲0。 如果有壞塊,可以在創建的REPAIR_TABLE中查看塊損壞信息:   


 


/* Formatted on 2009-12-17 13:18:19 (QP5v5.115.810.9015) */


SELECT   object_name,


         relative_file_id,


         block_id,


         marked_corrupt,


         corrupt_description,


         repair_description,


         CHECK_TIMESTAMP


  FROM   repair_table;                                                                                                     


                                 


       注意:在8i下,check_object只會檢查壞塊,MARKED_CORRUPT爲false,故需要進行 定位壞塊(fix_corrupt_blocks) ,修改MARKED_CORRUPT爲true,同時更新CHECK_TIMESTAMP。


       9i以後經過check_object,MARKED_CORRUPT的值已經標識爲TRUE了。所以可以直接進行第四步了。


 


5.4 .定位壞塊:dbms_repair.fix_corrupt_blocks     


只有將壞塊信息寫入定義的REPAIR_TABLE後,才能定位壞塊。 


/* Formatted on 2009-12-17 13:29:01 (QP5v5.115.810.9015) */


DECLARE


   cc   NUMBER;


BEGIN


   DBMS_REPAIR.fix_corrupt_blocks (schema_name   => 'SYS',


                                   object_name   => 'DMM',


                                   fix_count     => cc);


   DBMS_OUTPUT.put_line (a => TO_CHAR (cc));


END;


 


 5.5 .跳過壞塊


我們前面雖然定位了壞塊,但是,如果我們訪問table:


SQL> select count(*) from SYS.DMM;


ORA-01578: ORACLE 數據塊損壞(文件號14,塊號154)


ORA-01110: 數據文件 14: 'D: /BLOCK.DBF'


 


還是會得到錯誤信息。 這裏需要用skip_corrupt_blocks來跳過壞塊:


 


/* Formatted on 2009-12-17 13:30:17 (QP5v5.115.810.9015) */


exec dbms_repair.skip_corrupt_blocks(schema_name => 'SYS',object_name => 'DMM',flags => 1);


 


SQL> select count(*) from SYS.DMM;


  COUNT(*)


----------


     12850


丟失了12896-12850=46行數據。


 


5.6.處理index上的無效鍵值;dump_orphan_keys


 /* Formatted on 2009-12-17 13:34:55(QP5 v5.115.810.9015) */


DECLARE


   cc   NUMBER;


BEGIN


   DBMS_REPAIR.dump_orphan_keys (schema_name         => 'SYS',


                                 object_name         => 'INDX_DMM',


                                 object_type         => 2,


                                 repair_table_name   => 'REPAIR_TABLE',


                                 orphan_table_name   => 'ORPHAN_TABLE',


                                 key_count           => CC);


END;


 


通過以下命令可以知道丟失行的信息:


SQL> SELECT * FROM ORPHAN_TABLE;


 


我們根據這個結果來考慮是否需要rebuild index.


 


5.7重建freelist:rebuild_freelists


 /* Formatted on 2009-12-17 13:37:57(QP5 v5.115.810.9015) */


exec dbms_repair.rebuild_freelists(schema_name => 'SYS',object_name => 'DMM');


 


 


 


 


 


 


 


 


-------------------------------------------------------------------------------------------------------


Blog:http://blog.csdn.net/tianlesoftware 


Weibo:http://weibo.com/tianlesoftware
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章