After the reboot, the databases on both nodes started up normally, but the corrupt block in the data file was still there:
SQL> run
1* select name,status from v$datafile
NAME STATUS
-------------------- --------------------
/dev/rlv_system_8g SYSTEM
/dev/rlv_undot11_8g ONLINE
/dev/rlv_sysaux_8g ONLINE
/dev/rlv_user_8g ONLINE
/dev/rlv_undot12_8g ONLINE
/dev/rlv_raw37_16g RECOVER
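The STATUS column above is what drives the next steps. As a small illustrative sketch (Python, not anything from the actual session; the "action" strings are my own summary of the usual meaning of each V$DATAFILE status):

```python
# Toy triage of V$DATAFILE rows: SYSTEM and ONLINE need nothing,
# OFFLINE needs to be brought online, RECOVER needs media recovery.
STATUS_ACTION = {
    "SYSTEM":  "normal (SYSTEM/undo file, always online)",
    "ONLINE":  "normal",
    "OFFLINE": "needs ALTER DATABASE DATAFILE ... ONLINE",
    "RECOVER": "needs media recovery (RECOVER DATAFILE ...)",
}

def triage(rows):
    """Return (name, action) for every file that needs attention."""
    return [(name, STATUS_ACTION.get(status, "unknown status"))
            for name, status in rows
            if status not in ("ONLINE", "SYSTEM")]

rows = [
    ("/dev/rlv_system_8g",  "SYSTEM"),
    ("/dev/rlv_undot11_8g", "ONLINE"),
    ("/dev/rlv_raw37_16g",  "RECOVER"),
]
print(triage(rows))  # only /dev/rlv_raw37_16g needs work
```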
On zhyw2, I ran the recovery, trying to repair the corruption:
SQL> recover datafile '/dev/rlv_raw37_16g';
ORA-00279: change 11318004822236 generated at 08/13/2010 16:42:39 needed for
thread 2
ORA-00289: suggestion : /arch2/bsp1922_2_229_713969898.arc
ORA-00280: change 11318004822236 for thread 2 is in sequence #229
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00279: change 11321028506146 generated at 08/17/2010 09:42:36 needed for
thread 2
ORA-00289: suggestion : /arch2/bsp1922_2_230_713969898.arc
ORA-00280: change 11321028506146 for thread 2 is in sequence #230
ORA-00278: log file '/arch2/bsp1922_2_229_713969898.arc' no longer needed for
this recovery
Log applied.
Media recovery complete.
The data file was successfully repaired. At this point it was still offline, so I brought it back online with the following command:
alter database datafile '/dev/rlv_raw37_16g' online;
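What RECOVER DATAFILE did above was walk the archived logs forward from the file's checkpoint SCN (11318004822236) until the file was consistent. A toy model in Python of that log-selection logic; only the two SCNs Oracle printed are real, the SCN boundaries I give for sequences 228 and 230 are made up for illustration:

```python
# Toy model (not Oracle internals): a log is needed if its SCN range
# extends past the datafile's checkpoint SCN; logs that end at or
# before the checkpoint contain nothing the file is missing.
def logs_needed(file_ckpt_scn, archives):
    """archives: sorted list of (sequence, first_scn, next_scn)."""
    return [seq for seq, first, nxt in archives if nxt > file_ckpt_scn]

archives = [
    (228, 11317000000000, 11318004822236),  # hypothetical bounds: ends at the ckpt SCN
    (229, 11318004822236, 11321028506146),  # first SCNs of 229/230 are from the session
    (230, 11321028506146, 11321999999999),  # hypothetical upper bound
]
print(logs_needed(11318004822236, archives))  # [229, 230], as recovery applied
```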
The customer mentioned that the temporary tablespace kept reporting out-of-space errors, and asked me to take a look while we had the downtime anyway. I started by checking whether there was any free space left in oravg7:
[root@zhyw2]#lsvg -l oravg7
oravg7:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lv_raw93_16g raw 512 512 1 open/syncd N/A
lv_raw94_16g raw 512 512 1 open/syncd N/A
lv_raw95_16g raw 512 512 1 open/syncd N/A
lv_raw96_16g raw 512 512 1 open/syncd N/A
lv_raw97_16g raw 512 512 1 closed/syncd N/A
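In `lsvg -l` output, an LV in open/syncd state is in use, while closed/syncd means nothing currently has it open. A small sketch, assuming this exact output format, of how the free LVs could be picked out programmatically:

```python
# Parse the data rows of `lsvg -l` and return LVs whose LV STATE
# column (6th field) is "closed/syncd", i.e. unused candidates.
LSVG_OUTPUT = """\
lv_raw93_16g raw 512 512 1 open/syncd N/A
lv_raw94_16g raw 512 512 1 open/syncd N/A
lv_raw95_16g raw 512 512 1 open/syncd N/A
lv_raw96_16g raw 512 512 1 open/syncd N/A
lv_raw97_16g raw 512 512 1 closed/syncd N/A
"""

def free_lvs(text):
    free = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[5] == "closed/syncd":
            free.append(fields[0])
    return free

print(free_lvs(LSVG_OUTPUT))  # ['lv_raw97_16g']
```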
I planned to use the LV lv_raw97_16g as an additional file for the temp tablespace. First I checked the permissions on this LV on both nodes; they looked fine:
[root@zhyw2]#cd /dev/
[root@zhyw2]#ls -l *raw97*
brw-rw---- 1 root system 106, 5 Mar 26 05:43 lv_raw97_16g
crw-rw---- 1 oracle dba 106, 5 Mar 26 05:43 rlv_raw97_16g
[root@zhyw2]#
[root@zhyw1]#ls -l *raw97*
brw-rw---- 1 root system 106, 5 Mar 16 14:49 lv_raw97_16g
crw-rw-r-- 1 oracle oinstall 106, 5 Mar 16 14:49 rlv_raw97_16g
Then I enlarged the temp tablespace with the following Oracle command:
alter tablespace temp add tempfile '/dev/rlv_raw97_16g' size 15872m;
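A note on the 15872m figure: the LV has 512 LPs, which at the (assumed, not stated in the original) 32 MB physical-partition size gives 16384 MB, so the tempfile is deliberately sized a bit below the raw device so it cannot overrun the LV. A quick check of that arithmetic:

```python
# Sizing sketch under an assumed 32 MB PP size: 512 LPs -> 16384 MB LV,
# and the 15872m tempfile leaves a 512 MB safety margin inside it.
lps, pp_mb = 512, 32          # pp_mb is my assumption, not from the text
lv_mb = lps * pp_mb
tempfile_mb = 15872
print(lv_mb, lv_mb - tempfile_mb)  # 16384 512
assert tempfile_mb < lv_mb
```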
With all the work done, I waited while Engineer Huang and his team brought the application back up and confirmed it was healthy, then stayed on site a while longer. Everything remained normal, and I thought I could finally go home and get a good night's sleep.
Around 4 p.m., while I was half asleep, Engineer Ding called again: "Engineer Cheng, sorry, our production database is in trouble again. The business side says queries are very slow. Could you come over and take another look?" Well, it seemed nothing good comes easy. What would be waiting for me this time? (to be continued)
When I got back to the customer site, Engineer Huang, who is responsible for the application on the vendor side, was already waiting for me. He told me the database was behaving very abnormally: there were no error messages, but as soon as the database did a sort it became very slow, and some of the application's queries were failing frequently. He strongly suspected the EMC storage still had bad blocks, and in particular that the LV under the temp tablespace was corrupt.

I checked the current IO utilization on both nodes, which looked fairly normal, and then generated an AWR report for each node:

zhyw2:

Hit ratios:
Buffer Nowait %:              99.95    Redo NoWait %:      100.00
Buffer Hit %:                 99.90    In-memory Sort %:   100.00
Library Hit %:                79.74    Soft Parse %:        65.09
Execute to Parse %:           60.45    Latch Hit %:         98.06
Parse CPU to Parse Elapsd %:   8.98    % Non-Parse CPU:     99.68

Wait events:
Top 5 Timed Events
Event                      Waits        Time(s)    Avg Wait(ms)  % Total Call Time  Wait Class
CPU time                                1,335,611                68.5
enq: CF - contention       503,175      239,024    475           12.3               Other
enq: TS - contention       388,571      175,924    453            9.0               Other
library cache lock         74,915       31,273     417            1.6               Concurrency
gc cr multi block request  162,634,284  29,927     0              1.5               Cluster

zhyw1:

Hit ratios:
Buffer Nowait %:             100.00    Redo NoWait %:      100.00
Buffer Hit %:                 99.87    In-memory Sort %:   100.00
Library Hit %:                80.86    Soft Parse %:        61.23
Execute to Parse %:           64.13    Latch Hit %:         98.09
Parse CPU to Parse Elapsd %:  12.10    % Non-Parse CPU:     99.56

Wait events:
Top 5 Timed Events
Event                        Waits    Time(s)  Avg Wait(ms)  % Total Call Time  Wait Class
CPU time                              942,235                40.9
enq: CF - contention         847,042  404,685  478           17.6               Other
enq: DX - contention         71,048   208,067  2,929          9.0               Other
enq: TS - contention         140,442  63,747   454            2.8               Other
inactive transaction branch  35,518   34,682   976            1.5               Other

I noticed that the library cache hit ratio was low on both cluster instances, and that among the global wait events, waits on lock resources stood out. I also looked at SGA_MAX_SIZE and SGA_TARGET on the two nodes; both were currently 16G. Considering that in this RAC environment each database server has 72G of memory, with more than 40% of it normally idle, it would be worth carving out more of it for the SGA.

My suspicion was that the recent, fairly large growth in business volume had turned some of the system's existing resources into bottlenecks, and that this was why access had become slow. But given that the EMC DMX1500 had failed before, the possibility that it was affecting system IO could not be ruled out, so I first suggested that the EMC engineers check whether the disk resources assigned to this RAC system had bad blocks.

The following command lists the disks:

# ./symdev list pd

Symmetrix ID: (withheld)

        Device Name           Directors      Device
--------------------------- -------------- ------------------------------------
                                                                          Cap
Sym  Physical               SA :P  DA :IT  Config        Attribute    Sts (MB)
--------------------------- -------------- ------------------------------------
0022 /dev/rhdisk3           08A:1  01A:C0  2-Way Mir     N/Grp'd VCM  WD      3
0029 /dev/rhdiskpower0      08A:1  16B:D1  2-Way Mir     N/Grp'd      RW      3
002A /dev/rhdiskpower1      08A:1  01A:C2  2-Way Mir     N/Grp'd      RW      3
0059 /dev/rhdiskpower64     09A:1  16B:C2  2-Way Mir     N/Grp'd      RW      3
005A /dev/rhdiskpower65     09A:1  01C:C2  2-Way Mir     N/Grp'd      RW      3
0084 /dev/rhdiskpower2      08A:1  16A:D0  2-Way Mir     N/Grp'd      RW   1024
0085 /dev/rhdiskpower3      08A:1  16B:D3  2-Way Mir     N/Grp'd      RW   1024
013A /dev/rhdiskpower4      08A:1  16B:CE  RDF1+Mir      Grp'd    (M) RW  49140
013E /dev/rhdiskpower5      08A:1  01C:D5  RDF1+Mir      Grp'd    (M) RW  49140
0142 /dev/rhdiskpower6      08A:1  16C:D6  RDF1+Mir      Grp'd    (M) RW  49140
0146 /dev/rhdiskpower7      08A:1  01C:D7  RDF1+Mir      Grp'd    (M) RW  49140
014A /dev/rhdiskpower8      08A:1  16C:D8  RDF1+Mir      Grp'd    (M) RW  49140
014E /dev/rhdiskpower9      08A:1  16A:C3  RDF1+Mir      Grp'd    (M) RW  49140
0152 /dev/rhdiskpower10     08A:1  01A:C2  RDF1+Mir      Grp'd    (M) RW  49140
0156 /dev/rhdiskpower11     08A:1  16A:C1  RDF1+Mir      Grp'd    (M) RW  49140
015A /dev/rhdiskpower12     08A:1  01A:D9  RDF1+Mir      Grp'd    (M) RW  49140
015E /dev/rhdiskpower13     08A:1  16A:D8  RDF1+Mir      Grp'd    (M) RW  49140
0162 /dev/rhdiskpower14     08A:1  01A:DB  RDF1+Mir      Grp'd    (M) RW  49140
0166 /dev/rhdiskpower15     08A:1  16A:DA  RDF1+Mir      Grp'd    (M) RW  49140
016A /dev/rhdiskpower16     08A:1  01A:D5  RDF1+Mir      Grp'd    (M) RW  49140
016E /dev/rhdiskpower17     08A:1  16A:D4  RDF1+Mir      Grp'd    (M) RW  49140
0172 /dev/rhdiskpower18     08A:1  01A:D7  RDF1+Mir      Grp'd    (M) RW  49140
0176 /dev/rhdiskpower19     08A:1  16A:D6  RDF1+Mir      Grp'd    (M) RW  49140
017A /dev/rhdiskpower20     08A:1  16C:C3  RDF1+Mir      Grp'd    (M) RW  49140
017E /dev/rhdiskpower21     08A:1  01C:C2  RDF1+Mir      Grp'd    (M) RW  49140
0182 /dev/rhdiskpower22     08A:1  16C:C1  RDF1+Mir      Grp'd    (M) RW  49140
0186 /dev/rhdiskpower23     08A:1  01C:C0  RDF1+Mir      Grp'd    (M) RW  49140
018A /dev/rhdiskpower24     08A:1  16C:D0  RDF1+Mir      Grp'd    (M) RW  49140
018E /dev/rhdiskpower25     08A:1  16C:DA  RDF1+Mir      Grp'd    (M) RW  49140
0192 /dev/rhdiskpower26     08A:1  01C:D9  RDF1+Mir      Grp'd    (M) RW  49140
0196 /dev/rhdiskpower27     08A:1  16C:DC  RDF1+Mir      Grp'd    (M) RW  49140
019A /dev/rhdiskpower28     08A:1  01C:DB  RDF1+Mir      Grp'd    (M) RW  49140
019E /dev/rhdiskpower29     08A:1  16B:CE  RDF1+Mir      Grp'd    (M) RW  49140
01A2 /dev/rhdiskpower30     08A:1  01C:D5  RDF1+Mir      Grp'd    (M) RW  49140
01A6 /dev/rhdiskpower31     08A:1  16C:D6  RDF1+Mir      Grp'd    (M) RW  49140
01AA /dev/rhdiskpower32     08A:1  01C:D7  RDF1+Mir      Grp'd    (M) RW  49140
01AE /dev/rhdiskpower33     08A:1  16C:D8  RDF1+Mir      Grp'd    (M) RW  49140
01B2 /dev/rhdiskpower34     08A:1  16A:C3  RDF1+Mir      Grp'd    (M) RW  49140
01B6 /dev/rhdiskpower35     08A:1  01A:C2  RDF1+Mir      Grp'd    (M) RW  49140
01BA /dev/rhdiskpower36     08A:1  16A:C1  RDF1+Mir      Grp'd    (M) RW  49140
01BE /dev/rhdiskpower37     08A:1  01A:D9  RDF1+Mir      Grp'd    (M) RW  49140
01C2 /dev/rhdiskpower38     08A:1  16A:D8  RDF1+Mir      Grp'd    (M) RW  49140
01C6 /dev/rhdiskpower39     08A:1  01A:DB  RDF1+Mir      Grp'd    (M) RW  49140
01CA /dev/rhdiskpower40     08A:1  16A:DA  RDF1+Mir      Grp'd    (M) RW  49140
01CE /dev/rhdiskpower41     08A:1  01A:D5  RDF1+Mir      Grp'd    (M) RW  49140
01D2 /dev/rhdiskpower42     08A:1  16A:D4  RDF1+Mir      Grp'd    (M) RW  49140
01D6 /dev/rhdiskpower43     08A:1  01A:D7  RDF1+Mir      Grp'd    (M) RW  49140
01DA /dev/rhdiskpower44     08A:1  16A:D6  RDF1+Mir      Grp'd    (M) RW  49140
01DE /dev/rhdiskpower45     08A:1  16C:C3  RDF1+Mir      Grp'd    (M) RW  49140
01E2 /dev/rhdiskpower46     08A:1  01C:C2  RDF1+Mir      Grp'd    (M) RW  49140
01E6 /dev/rhdiskpower47     08A:1  16C:C1  RDF1+Mir      Grp'd    (M) RW  49140
01EA /dev/rhdiskpower48     08A:1  01C:C0  RDF1+Mir      Grp'd    (M) RW  49140
01EE /dev/rhdiskpower49     08A:1  16C:D0  RDF1+Mir      Grp'd    (M) RW  49140
01F2 /dev/rhdiskpower50     08A:1  16C:DA  RDF1+Mir      Grp'd    (M) RW  49140
01F6 /dev/rhdiskpower51     08A:1  01C:D9  RDF1+Mir      Grp'd    (M) RW  49140
01FA /dev/rhdiskpower52     08A:1  16C:DC  RDF1+Mir      Grp'd    (M) RW  49140
01FE /dev/rhdiskpower53     08A:1  01C:DB  RDF1+Mir      Grp'd    (M) RW  49140
0202 /dev/rhdiskpower54     08A:1  16B:CE  RDF1+Mir      Grp'd    (M) RW  49140
0206 /dev/rhdiskpower55     08A:1  01C:D5  RDF1+Mir      Grp'd    (M) RW  49140
020A /dev/rhdiskpower56     08A:1  16C:D6  RDF1+Mir      Grp'd    (M) RW  49140
020E /dev/rhdiskpower57     08A:1  01C:D7  RDF1+Mir      Grp'd    (M) RW  49140
0212 /dev/rhdiskpower58     08A:1  16C:D8  RDF1+Mir      Grp'd    (M) RW  49140
0216 /dev/rhdiskpower59     08A:1  16A:C3  RDF1+Mir      Grp'd    (M) RW  49140
021A /dev/rhdiskpower60     08A:1  01A:C2  RDF1+Mir      Grp'd    (M) RW  49140
021E /dev/rhdiskpower61     08A:1  16A:C1  RDF1+Mir      Grp'd    (M) RW  49140
0222 /dev/rhdiskpower62     08A:1  01A:D9  RDF1+Mir      Grp'd    (M) RW  49140
0226 /dev/rhdiskpower63     08A:1  16A:D8  RDF1+Mir      Grp'd    (M) RW  49140

I suggested that EMC's Engineer Wang check the disks with IDs 0022 through 0226 for bad blocks.

At the same time, an SRDF relationship from the DMX 1500 to the DMX 950 had already been set up for this database, but it had never been synchronized. To keep the business data safe, it was necessary to start the underlying replication, so I suggested to Wang that we get SRDF running. We chose the approach below.

The EMC engineer first ran the following operations to check the relevant information:

# ./symcfg disc
This operation may take up to a few minutes. Please be patient...

[root@zhyw1]#./symdg list

                D E V I C E      G R O U P S

                                                 Number of
Name           Type   Valid  Symmetrix ID  Devs  GKs  BCVs  VDEVs  TGTs
zhyw_1500_950  RDF1   Yes    (withheld)      60    0     0      0     0

[root@zhyw1]#./symrdf -g zhyw_1500_950 query |more

Device Group (DG) Name      : zhyw_1500_950
DG's Type                   : RDF1
DG's Symmetrix ID           : (withheld)    (Microcode Version: 5773)
Remote Symmetrix ID         : 000290301387  (Microcode Version: 5773)
RDF (RA) Group Number       : 2 (01)

          Source (R1) View                Target (R2) View      MODES
--------------------------------    ------------------------   -----  ------------
         ST                     LI       ST
Standard  A                      N        A
Logical   T  R1 Inv   R2 Inv    K T  R1 Inv   R2 Inv                  RDF Pair
Device  Dev E Tracks  Tracks    S Dev E Tracks  Tracks         MDA    STATE
--------------------------------    ------------------------   -----  ------------
DEV001  013A RW      0  786240  NR 02EB WD      0  786240      C.D    Suspended
DEV002  013E RW      0  786240  NR 02EF WD      0  786240      C.D    Suspended
DEV003  0142 RW      0  786240  NR 02F3 WD      0  786240      C.D    Suspended
DEV004  0146 RW      0  786240  NR 02F7 WD      0  786240      C.D    Suspended
DEV005  014A RW      0  786240  NR 02FB WD      0  786240      C.D    Suspended
DEV006  014E RW      0  786240  NR 02FF WD      0  786240      C.D    Suspended
DEV007  0152 RW      0  786240  NR 0303 WD      0  786240      C.D    Suspended
DEV008  0156 RW      0  786240  NR 0307 WD      0  786240      C.D    Suspended
DEV009  015A RW      0  786240  NR 030B WD      0  786240      C.D    Suspended
DEV010  015E RW      0  786240  NR 030F WD      0  786240      C.D    Suspended
DEV011  0162 RW      0  786240  NR 0313 WD      0  786240      C.D    Suspended
DEV012  0166 RW      0  786240  NR 0317 WD      0  786240      C.D    Suspended
DEV013  016A RW      0  786240  NR 031B WD      0  786240      C.D    Suspended
DEV014  016E RW      0  786240  NR 031F WD      0  786240      C.D    Suspended
DEV015  0172 RW      0  786240  NR 0323 WD      0  786240      C.D    Suspended
DEV016  0176 RW      0  786240  NR 0327 WD      0  786240      C.D    Suspended
DEV017  017A RW      0  786240  NR 032B WD      0  786240      C.D    Suspended
DEV018  017E RW      0  786240  NR 032F WD      0  786240      C.D    Suspended
DEV019  0182 RW      0  786240  NR 0333 WD      0  786240      C.D    Suspended
DEV020  0186 RW      0  786240  NR 0337 WD      0  786240      C.D    Suspended
DEV021  018A RW      0  786240  NR 033B WD      0  786240      C.D    Suspended
DEV022  018E RW      0  786240  NR 033F WD      0  786240      C.D    Suspended
DEV023  0192 RW      0  786240  NR 0343 WD      0  786240      C.D    Suspended
DEV024  0196 RW      0  786240  NR 0347 WD      0  786240      C.D    Suspended
DEV025  019A RW      0  786240  NR 034B WD      0  786240      C.D    Suspended
DEV026  019E RW      0  786240  NR 034F WD      0  786240      C.D    Suspended
DEV027  01A2 RW      0  786240  NR 0353 WD      0  786240      C.D    Suspended
DEV028  01A6 RW      0  786240  NR 0357 WD      0  786240      C.D    Suspended
DEV029  01AA RW      0  786240  NR 035B WD      0  786240      C.D    Suspended
DEV030  01AE RW      0  786240  NR 035F WD      0  786240      C.D    Suspended
DEV031  01B2 RW      0  786240  NR 0363 WD      0  786240      C.D    Suspended
DEV032  01B6 RW      0  786240  NR 0367 WD      0  786240      C.D    Suspended
DEV033  01BA RW      0  786240  NR 036B WD      0  786240      C.D    Suspended
DEV034  01BE RW      0  786240  NR 036F WD      0  786240      C.D    Suspended
DEV035  01C2 RW      0  786240  NR 0373 WD      0  786240      C.D    Suspended
DEV036  01C6 RW      0  786240  NR 0377 WD      0  786240      C.D    Suspended
DEV037  01CA RW      0  786240  NR 037B WD      0  786240      C.D    Suspended
DEV038  01CE RW      0  786240  NR 037F WD      0  786240      C.D    Suspended
DEV039  01D2 RW      0  786240  NR 0383 WD      0  786240      C.D    Suspended
DEV040  01D6 RW      0  786240  NR 0387 WD      0  786240      C.D    Suspended
DEV041  01DA RW      0  786240  NR 038B WD      0  786240      C.D    Suspended
DEV042  01DE RW      0  786240  NR 038F WD      0  786240      C.D    Suspended
DEV043  01E2 RW      0  786240  NR 0393 WD      0  786240      C.D    Suspended
DEV044  01E6 RW      0  786240  NR 0397 WD      0  786240      C.D    Suspended
DEV045  01EA RW      0  786240  NR 039B WD      0  786240      C.D    Suspended
DEV046  01EE RW      0  786240  NR 039F WD      0  786240      C.D    Suspended
DEV047  01F2 RW      0  786240  NR 03A3 WD      0  786240      C.D    Suspended
DEV048  01F6 RW      0  786240  NR 03A7 WD      0  786240      C.D    Suspended
DEV049  01FA RW      0  786240  NR 03AB WD      0  786240      C.D    Suspended
DEV050  01FE RW      0  786240  NR 03AF WD      0  786240      C.D    Suspended
DEV051  0202 RW      0  786240  NR 03B3 WD      0  786240      C.D    Suspended
DEV052  0206 RW      0  786240  NR 03B7 WD      0  786240      C.D    Suspended
DEV053  020A RW      0  786240  NR 03BB WD      0  786240      C.D    Suspended
DEV054  020E RW      0  786240  NR 03BF WD      0  786240      C.D    Suspended
DEV055  0212 RW      0  786240  NR 03C3 WD      0  786240      C.D    Suspended
DEV056  0216 RW      0  786240  NR 03C7 WD      0  786240      C.D    Suspended
DEV057  021A RW      0  786240  NR 03CB WD      0  786240      C.D    Suspended
DEV058  021E RW      0  786240  NR 03CF WD      0  786240      C.D    Suspended
DEV059  0222 RW      0  786240  NR 03D3 WD      0  786240      C.D    Suspended
DEV060  0226 RW      0  786240  NR 03D7 WD      0  786240      C.D    Suspended

Total         --------  --------        --------  --------
  Track(s)           0  47174400               0  47174400
  MB(s)              0   2948400               0   2948400

Legend for MODES:
 M(ode of Operation): A = Async, S = Sync, E = Semi-sync, C = Adaptive Copy
 D(omino)           : X = Enabled, . = Disabled
 A(daptive Copy)    : D = Disk Mode, W = WP Mode, . = ACp off

We then started the synchronization from the source array to the target array with the following command:

[root@zhyw1]#./symrdf -g zhyw_1500_950 resume

Execute an RDF 'Resume' operation for device group 'zhyw_1500_950' (y/[n]) ? y

An RDF 'Resume' operation execution is in progress for device group 'zhyw_1500_950'. Please wait...

    Merge device track tables between source and target.......Started.
    Devices: 015A-0179 in (3435,002)......................... Merged.
    Devices: 013A-0159 in (3435,002)......................... Merged.
    Devices: 019A-01B9 in (3435,002)......................... Merged.
    Devices: 017A-0199 in (3435,002)......................... Merged.
    Devices: 01BA-01D9 in (3435,002)......................... Merged.
    Devices: 01DA-01F9 in (3435,002)......................... Merged.
    Devices: 021A-0229 in (3435,002)......................... Merged.
    Devices: 01FA-0219 in (3435,002)......................... Merged.
    Merge device track tables between source and target.......Done.
    Resume RDF link(s)........................................Started.
    Resume RDF link(s)........................................Done.

The RDF 'Resume' operation successfully executed for device group 'zhyw_1500_950'.

We watched the progress of the synchronization with:

[root@zhyw1]#./symrdf -g zhyw_1500_950 query -i 5

Once the sync had settled, the rate from the source array to the target array was about 280MB/s. Since the whole synchronization would take roughly 3 hours, I decided to use the time to lie down on the sofa for a while. When Engineer Ding came into the monitoring room and found me on the sofa, he suggested I use their rest room instead. That sounded good to me; better to recharge before the next round.

I was woken from my nap by Engineer Xu of the application team, who told me the sync was almost finished, but had become very slow towards the end.

The slow SRDF sync:

[root@zhyw1]#./symrdf -g zhyw_1500_950 query -i 5

DEV001  013A RW      0       0  RW 02EB WD      0       0      S..    Synchronized
DEV002  013E RW      0       0  RW 02EF WD      0       0      S..    Synchronized
DEV003  0142 RW      0       0  RW 02F3 WD      0       0      S..    Synchronized
DEV004  0146 RW      0       0  RW 02F7 WD      0       0      S..    Synchronized
DEV005  014A RW      0   31569  RW 02FB WD      0   31569      S..    SyncInProg
DEV006  014E RW      0       0  RW 02FF WD      0       0      S..    Synchronized
DEV007  0152 RW      0       0  RW 0303 WD      0       0      S..    Synchronized
DEV008  0156 RW      0       0  RW 0307 WD      0       0      S..    Synchronized
DEV009  015A RW      0       0  RW 030B WD      0       0      S..    Synchronized
DEV010  015E RW      0       0  RW 030F WD      0       0      S..    Synchronized
DEV011  0162 RW      0       0  RW 0313 WD      0       0      S..    Synchronized
DEV012  0166 RW      0       0  RW 0317 WD      0       0      S..    Synchronized
DEV013  016A RW      0       0  RW 031B WD      0       0      S..    Synchronized
DEV014  016E RW      0       0  RW 031F WD      0       0      S..    Synchronized
DEV015  0172 RW      0       0  RW 0323 WD      0       0      S..    Synchronized
DEV016  0176 RW      0       0  RW 0327 WD      0       0      S..    Synchronized
DEV017  017A RW      0   35641  RW 032B WD      0   35641      S..    SyncInProg
DEV018  017E RW      0       0  RW 032F WD      0       0      S..    Synchronized
DEV019  0182 RW      0       0  RW 0333 WD      0       0      S..    Synchronized
DEV020  0186 RW      0       0  RW 0337 WD      0       0      S..    Synchronized
DEV021  018A RW      0       0  RW 033B WD      0       0      S..    Synchronized
DEV022  018E RW      0       0  RW 033F WD      0       0      S..    Synchronized
DEV023  0192 RW      0       0  RW 0343 WD      0       0      S..    Synchronized
DEV024  0196 RW      0       0  RW 0347 WD      0       0      S..    Synchronized
DEV025  019A RW      0       0  RW 034B WD      0       0      S..    Synchronized
DEV026  019E RW      0       0  RW 034F WD      0       0      S..    Synchronized
DEV027  01A2 RW      0       0  RW 0353 WD      0       0      S..    Synchronized
DEV028  01A6 RW      0       0  RW 0357 WD      0       0      S..    Synchronized
DEV029  01AA RW      0       0  RW 035B WD      0       0      S..    Synchronized
DEV030  01AE RW      0   33371  RW 035F WD      0   33363      S..    SyncInProg
DEV031  01B2 RW      0       0  RW 0363 WD      0       0      S..    Synchronized
DEV032  01B6 RW      0       0  RW 0367 WD      0       0      S..    Synchronized
DEV033  01BA RW      0       0  RW 036B WD      0       0      S..    Synchronized
DEV034  01BE RW      0       0  RW 036F WD      0       0      S..    Synchronized
DEV035  01C2 RW      0       0  RW 0373 WD      0       0      S..    Synchronized
DEV036  01C6 RW      0       0  RW 0377 WD      0       0      S..    Synchronized
DEV037  01CA RW      0       0  RW 037B WD      0       0      S..    Synchronized
DEV038  01CE RW      0       0  RW 037F WD      0       0      S..    Synchronized
DEV039  01D2 RW      0       0  RW 0383 WD      0       0      S..    Synchronized
DEV040  01D6 RW      0       0  RW 0387 WD      0       0      S..    Synchronized
DEV041  01DA RW      0       0  RW 038B WD      0       0      S..    Synchronized
DEV042  01DE RW      0   30881  RW 038F WD      0   30848      S..    SyncInProg
DEV043  01E2 RW      0       0  RW 0393 WD      0       0      S..    Synchronized
DEV044  01E6 RW      0       0  RW 0397 WD      0       0      S..    Synchronized
DEV045  01EA RW      0       0  RW 039B WD      0       0      S..    Synchronized
DEV046  01EE RW      0       0  RW 039F WD      0       0      S..    Synchronized
DEV047  01F2 RW      0       0  RW 03A3 WD      0       0      S..    Synchronized
DEV048  01F6 RW      0       0  RW 03A7 WD      0       0      S..    Synchronized
DEV049  01FA RW      0       0  RW 03AB WD      0       0      S..    Synchronized
DEV050  01FE RW      0       0  RW 03AF WD      0       0      S..    Synchronized
DEV051  0202 RW      0       0  RW 03B3 WD      0       0      S..    Synchronized
DEV052  0206 RW      0       0  RW 03B7 WD      0       0      S..    Synchronized
DEV053  020A RW      0       0  RW 03BB WD      0       0      S..    Synchronized
DEV054  020E RW      0       0  RW 03BF WD      0       0      S..    Synchronized
DEV055  0212 RW      0   30301  RW 03C3 WD      0   30301      S..    SyncInProg
DEV056  0216 RW      0       0  RW 03C7 WD      0       0      S..    Synchronized
DEV057  021A RW      0       0  RW 03CB WD      0       0      S..    Synchronized
DEV058  021E RW      0       0  RW 03CF WD      0       0      S..    Synchronized
DEV059  0222 RW      0       0  RW 03D3 WD      0       0      S..    Synchronized
DEV060  0226 RW      0       0  RW 03D7 WD      0       0      S..    Synchronized

Total         --------  --------        --------  --------
  Track(s)           0    161763               0    161722
  MB(s)            0.0   10110.2             0.0   10107.6

Synchronization rate         : 0.0 MB/S
Estimated time to completion : 3 days, 02:53:24

Legend for MODES:
 M(ode of Operation): A = Async, S = Sync, E = Semi-sync, C = Adaptive Copy
 D(omino)           : X = Enabled, . = Disabled
 A(daptive Copy)    : D = Disk Mode, W = WP Mode, . = ACp off

From what I could see, the sync had essentially hung, with only a few disks left unfinished. The EMC engineer's explanation was that fewer disk channels were now available, so the sync had slowed down.

I figured that stopping IO against the source array should speed up the sync, so I asked Engineer Huang whether we could stop the database application now. He said no problem; the approved downtime ran until 5 a.m., so there was still time. I quickly shut down the database application.

Sure enough, once the application was stopped the sync rate climbed back up, reaching 38M/s, and we were all quite excited. Soon only one DEV was left. But strangely, that device's sync never completed. Faced with such a slow rate, the EMC engineers had no explanation either, and they immediately went to check the array. After a while, EMC's Engineer Zhang confirmed that the following disk had bad blocks:

DEV017  017A RW      0       1  RW 032B WD      0       1      S..    SyncInProg

I looked it up: 017A corresponds to hdiskpower20,

017A /dev/rhdiskpower20     09B:1  16C:C3  RDF1+Mir      N/Grp'd  (M) RW  49140

and this disk belongs exactly to oravg7, the VG holding the temp tablespace. Engineer Huang's original suspicion had been right!

The disk had a problem and had to be repaired, but how to repair it deserved careful thought. First, I checked how many database files this disk affected.

The following command shows the disk's corresponding pdisk:

./symdev list pd

017A /dev/rhdiskpower20     09B:1  16C:C3  RDF1+Mir      N/Grp'd  (M) RW  49140

The following command maps the pdisk to its VG:

# lspv
hdiskpower16  00c450b57c60a7cf  oravg7  concurrent
hdiskpower17  00c450b57c56d256  oravg7  concurrent
hdiskpower18  00c450b57c5a2e83  oravg7  concurrent
hdiskpower19  00c450b57c5ee285  oravg7  concurrent
hdiskpower20  00c450b57c530a23  oravg7  concurrent
hdiskpower21  00c450b57c57d497  oravg7  concurrent
hdiskpower22  00c450b57c5bb0e0  oravg7  concurrent
hdiskpower23  00c450b57c5f9003  oravg7  concurrent

And the LVs in this VG:

# lsvg -l oravg7
oravg7:
LV NAME        TYPE  LPs  PPs  PVs  LV STATE      MOUNT POINT
lv_raw93_16g   raw   512  512  1    open/syncd    N/A
lv_raw94_16g   raw   512  512  1    open/syncd    N/A
lv_raw95_16g   raw   512  512  1    open/syncd    N/A
lv_raw96_16g   raw   512  512  1    open/syncd    N/A
lv_raw97_16g   raw   512  512  1    open/syncd    N/A
lv_raw98_16g   raw   512  512  1    closed/syncd  N/A
lv_raw99_16g   raw   512  512  1    closed/syncd  N/A
lv_raw100_16g  raw   512  512  1    closed/syncd  N/A
lv_raw101_16g  raw   512  512  1    closed/syncd  N/A
lv_raw102_16g  raw   512  512  1    closed/syncd  N/A
lv_raw104_16g  raw   512  512  1    closed/syncd  N/A
lv_raw105_16g  raw   512  512  1    closed/syncd  N/A
lv_raw106_16g  raw   512  512  1    closed/syncd  N/A
lv_raw107_16g  raw   512  512  1    closed/syncd  N/A
lv_raw108_16g  raw   512  512  1    closed/syncd  N/A
lv_raw109_16g  raw   512  512  1    closed/syncd  N/A
lv_raw110_16g  raw   512  512  2    closed/syncd  N/A
lv_raw111_16g  raw   512  512  2    closed/syncd  N/A
lv_raw112_16g  raw   512  512  2    closed/syncd  N/A
lv_raw113_16g  raw   512  512  2    closed/syncd  N/A
lv_raw114_16g  raw   512  512  2    closed/syncd  N/A
lv_raw115_16g  raw   512  512  2    closed/syncd  N/A
lv_raw116_16g  raw   512  512  2    closed/syncd  N/A

At the database level, I then checked which of these LVs were in use. They mapped to the tablespaces BSPLOG, TEMP, and SRADM_TBS, i.e. those were the affected tablespaces:

SQL> run
  1* select a.name,b.name from v$tablespace a ,v$datafile b where a.ts#=b.ts#

BSPLOG     /dev/rlv_raw94_16g
BSPLOG     /dev/rlv_raw93_16g
BSPLOG     /dev/rlv_raw95_16g
SRADM_TBS  /dev/rlv_raw96_16g

SQL> select name from v$tempfile;

NAME
--------------------------------------------------------------------------------
/dev/rlv_temp_8g
/dev/rlv_raw97_16g

Engineer Wang explained to me that the BSPLOG tablespace held very little data and was easy to restore, that SRADM_TBS belonged to an earlier Streams replication setup and could be ignored, and that the last LV backed a tempfile, which could likewise be rebuilt. Hearing that, I felt much more confident, and planned the recovery as follows:

1> Take an exp backup of the users on the database's BSPLOG tablespace.
2> Move the temporary tablespace onto the healthy oravg3 and oravg6, then stop the database.
3> Have the EMC engineers repair the underlying disk fracture.
4> Start the database; recover if there are problems.
5> Recreate the BSPLOG tablespace and its users, and imp the data back from the backup taken in step 1.

Right, time to get to work.

1》 First, create the LVs for the temporary tablespace and set their permissions:

mklv -y 'lv_temp2_14g' -T O -w n -s n -r n -t raw oravg3 448
mklv -y 'lv_temp3_14g' -T O -w n -s n -r n -t raw oravg6 448
mklv -y 'lv_templinshi_14g' -T O -w n -s n -r n -t raw oravg2 448
# chown oracle:oinstall /dev/rlv_temp2_14g
# chown oracle:oinstall /dev/rlv_temp3_14g
[root@zhyw2]#chown oracle:oinstall /dev/rlv_templinshi_1

2》 Create a temp1 tablespace and use it as a bridge to rebuild temp:

SQL> create temporary tablespace temp1
  2  tempfile '/dev/rlv_templinshi_1' size 500m reuse
  3  autoextend on next 100m maxsize unlimited
  4  extent management local uniform size 1m;

Tablespace created.

SQL> alter database default temporary tablespace temp1;

Database altered.

SQL> drop tablespace temp including contents and datafiles;

Tablespace dropped.

SQL> create temporary tablespace temp
  2  tempfile '/dev/rlv_temp_8g' size 8000m reuse
  3  autoextend on next 100m maxsize unlimited
  4  extent management local uniform size 1m;

Tablespace created.
SQL> alter tablespace temp add tempfile '/dev/rlv_temp2_14g' size 13824m;

Tablespace altered.

SQL> alter tablespace temp add tempfile '/dev/rlv_temp3_14g' size 13824m;

Tablespace altered.

SQL> alter database default temporary tablespace temp;
SQL> drop tablespace temp1 including contents and datafiles;

3》 Back up the BSPLOG tablespace's user, giaplog:

nohup exp giaplog/password file='/tmp/exp/giaplog.dmp' owner=giaplog log='/tmp/exp/giaplog.log' &

4》 The EMC engineers repaired the fault.

5》 Started the database, stopped the listener, recreated the bsplog tablespace and the user above, and imported the data back.

6》 Analyzed the data:

SQL> execute dbms_utility.analyze_schema('GIAPLOG','COMPUTE');

PL/SQL procedure successfully completed
Executed in 6.9 seconds
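As an afterword, the storage numbers in the SRDF output are internally consistent: dividing each device's 49,140 MB capacity by its 786,240 tracks gives the classic 64 KB Symmetrix track, and the total copy volume at the observed ~280 MB/s works out to roughly the 3 hours we had planned for. A quick check of that arithmetic:

```python
# Verify the SRDF figures: 60 devices x 786,240 tracks, with the
# 64 KB Symmetrix track size implied by the listed capacities.
TRACK_KB = 64              # consistent with 49140 MB / 786240 tracks
tracks_per_dev = 786_240
devs = 60

dev_mb = tracks_per_dev * TRACK_KB // 1024   # capacity per device
total_mb = dev_mb * devs                     # total full-copy volume
hours = total_mb / 280 / 3600                # at the observed ~280 MB/s
print(dev_mb, total_mb, round(hours, 1))     # 49140 2948400 2.9
```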