ASM disk in the FORCING state

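Conclusion:
If one failgroup of a diskgroup goes offline and stays offline longer than the time defined by disk_repair_time, ASM drops the disks in that failgroup.
If, after the drop, the number of remaining failgroups is below the minimum required by the redundancy policy (2 for normal, 3 for high), or the remaining space cannot hold the redundant copies, the rebalance cannot complete and the dropped disks are left in the FORCING state.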
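
To spot disks stuck in this condition, a query along the following lines can be run in the ASM instance. It is a minimal sketch that relies only on the standard v$asm_disk and v$asm_diskgroup views; the column aliases are illustrative.

```sql
-- List every disk of a mounted diskgroup that is not in the NORMAL state,
-- with its failgroup and the seconds left before ASM drops it (REPAIR_TIMER).
SELECT g.name        AS diskgroup,
       d.name        AS disk,
       d.failgroup,
       d.mode_status,
       d.state,
       d.repair_timer
FROM   v$asm_disk d
       JOIN v$asm_diskgroup g ON g.group_number = d.group_number
WHERE  d.state <> 'NORMAL'
ORDER  BY g.name, d.failgroup, d.name;
```

A disk that stays in the list with STATE = FORCING after the rebalance has stopped is the case described above.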

If a disk is in the FORCING state, specify NAME when adding a disk with ALTER DISKGROUP; the name to use is the original name of the FORCING disk.

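For example, disk SSDDG_0001 of diskgroup ssddg was dropped and its state is FORCING. After the diskgroup is restored with the command below, the state of SSDDG_0001 returns to NORMAL: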

alter diskgroup ssddg add failgroup rac2 disk '/dev/raw/raw3' name SSDDG_0001 force;
Experiment procedure:

The test subject is a diskgroup with 3 failgroups. For the purposes of the experiment, disk_repair_time is set to 5 minutes:

SQL>  select name,failgroup,path from v$asm_disk where group_number=(select group_number from v$asm_diskgroup where  name='SSDDG');
NAME        FAILG PATH
--------------- ----- --------------------------------------------------------------------------------
SSDDG_0001    RAC2  /dev/raw/raw3
SSDDG_0000    RAC1  /dev/raw/raw2
SSDDG_0001_0    RAC3  /dev/raw/raw5
ASMCMD> lsattr -G ssddg -lm
Group_Name  Name                     Value       RO  Sys  
SSDDG       access_control.enabled   FALSE       N   Y    
SSDDG       access_control.umask     066         N   Y    
SSDDG       au_size                  1048576     Y   Y    
SSDDG       cell.smart_scan_capable  FALSE       N   N    
SSDDG       compatible.asm           11.2.0.3.0  N   Y    
SSDDG       compatible.rdbms         11.2.0.3    N   Y    
SSDDG       content.type             data        N   Y    
SSDDG       disk_repair_time         5m          N   Y    
SSDDG       idp.boundary             auto        N   Y    
SSDDG       idp.type                 dynamic     N   Y    
SSDDG       sector_size              512         Y   Y
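
The statement that set the 5m value is not shown in these notes. A minimal sketch of how it could be done, assuming the diskgroup name SSDDG from the listing above, is:

```sql
-- Shorten the repair window so an offline disk is dropped after 5 minutes.
ALTER DISKGROUP ssddg SET ATTRIBUTE 'disk_repair_time' = '5m';

-- Confirm the new value from inside the ASM instance.
SELECT a.name, a.value
FROM   v$asm_attribute a
       JOIN v$asm_diskgroup g ON g.group_number = a.group_number
WHERE  g.name = 'SSDDG'
AND    a.name = 'disk_repair_time';
```

The attribute accepts values in minutes (m) or hours (h).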

The diskgroup holds very little data:

ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576     20480    18245                0           18245              0             N  DATADG/
MOUNTED  EXTERN  N         512   4096  1048576      2048     1653                0            1653              0             Y  OCRVOTE/
MOUNTED  NORMAL  N         512   4096  1048576     24576    24237             8192            8022              0             N  SSDDG/
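
As a cross-check on the lsdg figures, Usable_file_MB can be recomputed from the other columns. The query below is a sketch against v$asm_diskgroup; the alias usable_check_mb is illustrative, and truncating the fractional part is an assumption about how the reported value is rounded.

```sql
-- Free space minus the mirror reserve, divided by the number of copies
-- kept for each redundancy type.
SELECT name,
       type,
       free_mb,
       required_mirror_free_mb,
       usable_file_mb,
       TRUNC((free_mb - required_mirror_free_mb) /
             CASE type WHEN 'EXTERN' THEN 1
                       WHEN 'NORMAL' THEN 2
                       WHEN 'HIGH'   THEN 3 END) AS usable_check_mb
FROM   v$asm_diskgroup;
```

For SSDDG this gives (24237 - 8192) / 2 = 8022.5, which agrees with the reported 8022. Req_mir_free_MB is 8192, one third of Total_MB, which is consistent with three equally sized disks.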

Experiment 1:
Manually take the disk SSDDG_0001_0 offline:

SQL> select name,failgroup,path,mode_status,state from v$asm_disk where group_number=(select group_number from v$asm_diskgroup where  name='SSDDG');
NAME        FAILG PATH                                           MODE_STATUS    STATE
--------------- ----- -------------------------------------------------------------------------------- -------------- ----------------
SSDDG_0001_0    RAC3                                               OFFLINE          NORMAL
SSDDG_0001    RAC2  /dev/raw/raw3                                       ONLINE          NORMAL
SSDDG_0000    RAC1  /dev/raw/raw2                                       ONLINE          NORMAL
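
The offline statement itself is not shown in these notes. The OFFLINE state above can be produced with something like the following sketch, which assumes the disk and diskgroup names from the listing:

```sql
-- Take a single disk offline (disk name taken from the listing above).
ALTER DISKGROUP ssddg OFFLINE DISK SSDDG_0001_0;

-- Watch the countdown: REPAIR_TIMER is the number of seconds left
-- before ASM drops the offline disk.
SELECT name, failgroup, mode_status, state, repair_timer
FROM   v$asm_disk
WHERE  group_number = (SELECT group_number
                       FROM   v$asm_diskgroup
                       WHERE  name = 'SSDDG');
```

Bringing the disk back with ALTER DISKGROUP ssddg ONLINE DISK SSDDG_0001_0 before the timer reaches zero avoids the drop.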

After 5 minutes the disk is dropped:

NAME        FAILG PATH                                           MODE_STATUS    STATE
--------------- ----- -------------------------------------------------------------------------------- -------------- ----------------
_DROPPED_0001_S RAC3                                               OFFLINE          FORCING
SDDG
SSDDG_0001    RAC2  /dev/raw/raw3                                       ONLINE          NORMAL
SSDDG_0000    RAC1  /dev/raw/raw2                                       ONLINE          NORMAL

Final result after the drop:

NAME        FAILG PATH                                           MODE_STATUS    STATE
--------------- ----- -------------------------------------------------------------------------------- -------------- ----------------
SSDDG_0001    RAC2  /dev/raw/raw3                                       ONLINE          NORMAL
SSDDG_0000    RAC1  /dev/raw/raw2                                       ONLINE          NORMAL

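Conclusion: after the offline disk is dropped, the diskgroup still has 2 failgroups and enough space for normal redundancy, so no record of the dropped disk remains.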

Experiment 2: continuing from Experiment 1, take one more failgroup offline. Final result after the drop:

NAME        FAILG PATH                                           MODE_STATUS    STATE
--------------- ----- -------------------------------------------------------------------------------- -------------- ----------------
_DROPPED_0002_S RAC2                                               OFFLINE          FORCING
SDDG
SSDDG_0000    RAC1  /dev/raw/raw2                                       ONLINE          NORMAL

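Experiment 3: same environment as Experiment 1. Create a 10 GB datafile on ssddg, so that dropping one failgroup leaves the diskgroup without enough space for normal redundancy (the mirrored file occupies about 20 GB, while the two remaining failgroups provide only about 16 GB). Take one failgroup offline. Final result after the drop: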

NAME        FAILG PATH                                           MODE_STATUS    STATE
--------------- ----- -------------------------------------------------------------------------------- -------------- ----------------
_DROPPED_0002_S RAC3                                               OFFLINE          FORCING
SDDG
SSDDG_0000    RAC1  /dev/raw/raw2                                       ONLINE          NORMAL
SSDDG_0001    RAC2  /dev/raw/raw3                                       ONLINE          NORMAL

Output of asmcmd lsop:

[grid@node1 ~]$ asmcmd lsop
Group_Name  Dsk_Num  State  Power  EST_WORK  EST_RATE  EST_TIME  
SSDDG       REBAL    ERRS   1
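
The same information is available from SQL. The query below is a sketch against v$asm_operation as documented for 11.2 (the release shown by compatible.asm above); ERROR_CODE is expected to carry the Oracle error that stopped the operation.

```sql
-- Show pending or failed ASM operations per diskgroup.
-- STATE = 'ERRS' means the operation stopped because of an error.
SELECT g.name AS diskgroup,
       o.operation,
       o.state,
       o.power,
       o.error_code
FROM   v$asm_operation o
       JOIN v$asm_diskgroup g ON g.group_number = o.group_number;
```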

Alert log contents:

Thu Mar 31 15:31:00 2016
ERROR: ORA-15041 thrown in ARB0 for group number 3
Errors in file /oracle/11.2.0/grid/log/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_22412.trc:
ORA-15041: diskgroup "SSDDG" space exhausted
Thu Mar 31 15:31:00 2016
NOTE: stopping process ARB0
NOTE: rebalance interrupted for group 3/0xf7e87524 (SSDDG)