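0. Problem description
Disk I/O stayed saturated because of a RAID controller issue:
The RAID controller's default configuration (on Huawei servers) is to bypass the controller cache when the controller has no battery.
None of our servers has a RAID battery fitted, so all data passed through the controller with no caching and no write merging and was written straight to disk. Because the writes included a large amount of random I/O, disk I/O was saturated.
After the RAID controller issue was fixed, I/O dropped. The later rise visible in the I/O monitoring comes from business jobs doing heavy computation, which is normal.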
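I. Introduction to the MegaCli command
MegaCli is a tool for managing and maintaining hardware RAID; among other things, it can display RAID information.
For each disk, MegaCli reports two error counters, for example: Media Error Count: 0 Other Error Count: 0
Media Error Count indicates possible disk errors, such as bad sectors. A non-zero value deserves attention, and the larger the value, the higher the risk.
Other Error Count indicates that the disk may be loose and may need to be reseated. MegaCli can check every disk in the array.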
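Both counters can be checked for every disk in one pass. The following is a minimal sketch, assuming that `-PDList` prints a `Slot Number: N` line followed by `Media Error Count: N` and `Other Error Count: N` lines for each disk, in that order. The function name `report_error_counts` is made up for this example.

```shell
# report_error_counts: read `MegaCli64 -PDList -aALL` output on stdin and
# print one line per disk whose Media or Other Error Count is non-zero.
# Assumption: each disk section lists "Slot Number", "Media Error Count"
# and "Other Error Count" in that order.
report_error_counts() {
    awk -F': *' '
        /^Slot Number/       { slot = $2 }
        /^Media Error Count/ { media = $2 }
        /^Other Error Count/ {
            other = $2
            if (media + 0 > 0 || other + 0 > 0)
                printf "slot %s: media=%d other=%d\n", slot, media, other
        }
    '
}

# Usage (requires MegaCli64 on the host):
#   /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL | report_error_counts
```

No output means every disk reported zero for both counters.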
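II. Installation
1. System environment
dmidecode -t1 | grep -E "Manufacturer|Product Name|Serial Number"
cat /etc/redhat-release
The first command shows the manufacturer, product model, and serial number; the second shows the OS release.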
2. Download and install
rpm -qa | grep -E 'Lib_Utils|MegaCli'    # check whether the packages are already installed
https://raw.githubusercontent.com/crazy-zhangcong/tools/master/MegaCli8.07.10.tar.gz
ftp://download2.boulder.ibm.com/ecc/sar/CMA/XSA/ibm_utl_sraidmr_megacli-8.00.48_linux_32-64.zip
After extracting the archive there is a Linux directory:
[root@localhost MegaCli8.07.10]# tree
├── Linux
│   ├── Lib_Utils-1.00-09.noarch.rpm
│   ├── MegaCli-8.02.21-1.noarch.rpm
[root@localhost Linux]# rpm -ivh Lib_Utils-1.00-09.noarch.rpm MegaCli-8.02.21-1.noarch.rpm
[root@localhost Linux]# ln -sv /opt/MegaRAID/MegaCli/MegaCli64 /usr/bin/
"/usr/bin/MegaCli64" -> "/opt/MegaRAID/MegaCli/MegaCli64"
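Scripts that call MegaCli may want to confirm that the binary is actually present before running anything. The helper below is a small sketch; the name `find_megacli` is hypothetical, and its default candidates are simply the install path and the symlink created above.

```shell
# find_megacli: print the first executable MegaCli binary among the candidate
# paths given as arguments; return 1 if none exists.
# With no arguments it tries the install path and the /usr/bin symlink.
find_megacli() {
    [ "$#" -gt 0 ] || set -- /opt/MegaRAID/MegaCli/MegaCli64 /usr/bin/MegaCli64
    for candidate in "$@"; do
        if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage:
#   MEGACLI=$(find_megacli) || { echo "MegaCli64 is not installed" >&2; exit 1; }
```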
III. Disk commands
1. Common query commands
/opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL    # show RAID level
/opt/MegaRAID/MegaCli/MegaCli64 -AdpAllInfo -aALL    # show RAID controller information
/opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL    # show disk information
/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -aAll    # show battery information
/opt/MegaRAID/MegaCli/MegaCli64 -FwTermLog -Dsply -aALL    # show RAID controller log
/opt/MegaRAID/MegaCli/MegaCli64 -adpCount    # show the number of adapters
/opt/MegaRAID/MegaCli/MegaCli64 -AdpGetTime -aALL    # show adapter time
/opt/MegaRAID/MegaCli/MegaCli64 -AdpAllInfo -aAll    # show information for all adapters
/opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -LALL -aAll    # show information for all logical disk groups
/opt/MegaRAID/MegaCli/MegaCli64 -PDList -aAll    # show information for all physical disks
/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL | grep 'Charger Status'    # show charging status
/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL    # show BBU status information
/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuCapacityInfo -aALL    # show BBU capacity information
/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuDesignInfo -aALL    # show BBU design parameters
/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuProperties -aALL    # show current BBU properties
/opt/MegaRAID/MegaCli/MegaCli64 -cfgdsply -aALL    # show RAID controller model, RAID settings, and disk information
2. View the cache policy
/opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -Cache -L0 -a0    # cache policy of logical drive 0 on adapter 0
/opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -Cache -L1 -a0    # cache policy of logical drive 1 on adapter 0
/opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -Cache -LALL -a0    # cache policy of all logical drives on adapter 0
/opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -Cache -LALL -aALL    # cache policy of all logical drives on all adapters
/opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -DskCache -LALL -aALL    # physical disk cache policy of all logical drives on all adapters
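The situation from the problem description (a write-back policy that the controller bypasses because it has no battery) can be spotted by comparing the configured and the effective cache policy. The sketch below parses `-LDInfo` output rather than `-LDGetProp`. It assumes that each virtual drive section starts with `Virtual Drive: N` and contains a `Default Cache Policy:` line followed by a `Current Cache Policy:` line; verify this against your controller's output before relying on it. The function name `cache_policy_mismatch` is made up.

```shell
# cache_policy_mismatch: read `MegaCli64 -LDInfo -Lall -aALL` output on stdin
# and print each virtual drive whose current cache policy differs from its
# default (configured) one. The line formats are assumptions, see above.
cache_policy_mismatch() {
    awk '
        /^Virtual Drive:/        { vd = $3 }
        /^Default Cache Policy:/ { sub(/^[^:]*: */, ""); def = $0 }
        /^Current Cache Policy:/ {
            sub(/^[^:]*: */, "")
            if ($0 != def)
                printf "VD %s: default=[%s] current=[%s]\n", vd, def, $0
        }
    '
}

# Usage:
#   /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL | cache_policy_mismatch
```

A drive that is configured as write back but currently runs as write through would be listed; drives whose two policies agree produce no output.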
Change virtual disk cache and access parameters
Description
Allows you to change the following virtual disk parameters:
-WT (Write through), WB (Write back): Selects write policy.
-NORA (No read ahead), RA (Read ahead), ADRA (Adaptive read ahead): Selects read policy.
-Cached, -Direct: Selects cache policy.
-RW, -RO, Blocked: Selects access policy.
-EnDskCache: Enables disk cache.
-DisDskCache: Disables disk cache.
MegaCli -LDSetProp {WT | WB | NORA | RA | ADRA | -Cached | Direct} | {-RW | RO | Blocked} | {-Name[string]} | {-EnDskCache | DisDskCache} -Lx | -L0,1,2 | -Lall -aN | -a0,1,2 | -aALL
MegaCli -LDSetProp WT -L0 -a0
3. Set the cache policy
Display virtual disk cache and access parameters
MegaCli -LDGetProp -Cache | -Access | -Name | -DskCache -Lx | -L0,1,2 | -Lall -aN | -a0,1,2 | -aALL
Displays the cache and access policies of the virtual disk(s):
-WT (Write through), WB (Write back): Selects write policy.
-NORA (No read ahead), RA (Read ahead), ADRA (Adaptive read ahead): Selects read policy.
-Cache, -Cached, Direct: Displays cache policy.
-Access, -RW, -RO, Blocked: Displays access policy.
-DskCache: Displays physical disk cache policy.
Cache policy abbreviations:
WT (Write through)
WB (Write back)
NORA (No read ahead)
RA (Read ahead)
ADRA (Adaptive read ahead)
C (Cached)
D (Direct)
Examples (pass exactly one of the listed values; the alternatives are not a shell pipeline):
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp WB -L0 -a0    # or WT, NORA, RA, ADRA
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp -Cached -L0 -a0    # or -Direct
Enable or disable the disk cache:
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp -EnDskCache -L0 -a0    # or -DisDskCache
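Because a mistyped policy name changes controller behaviour, it can help to build the command first and review it before running it. The sketch below only prints the command line. The function name `ldsetprop_cmd` is hypothetical, and it accepts only the write and read policy names listed above.

```shell
# ldsetprop_cmd POLICY LD ADAPTER: print (do not run) the MegaCli command that
# sets one write or read policy on one logical drive. Unknown policy names
# are rejected with exit status 1.
ldsetprop_cmd() {
    policy=$1 ld=$2 adapter=$3
    case $policy in
        WT|WB|NORA|RA|ADRA) ;;
        *) echo "unknown policy: $policy" >&2; return 1 ;;
    esac
    printf '%s -LDSetProp %s -L%s -a%s\n' \
        /opt/MegaRAID/MegaCli/MegaCli64 "$policy" "$ld" "$adapter"
}

# Usage: review the printed command, then run it by hand.
#   ldsetprop_cmd WB 0 0
```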
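4. Create an array
Disks are addressed as [enclosure:slot]. Create a RAID 5 array from physical disks 2, 3, and 4, with physical disk 5 as the array's hot spare:
/opt/MegaRAID/MegaCli/MegaCli64 -CfgLdAdd -r5 [1:2,1:3,1:4] WB Direct -Hsp[1:5] -a0
Create an array without a hot spare:
/opt/MegaRAID/MegaCli/MegaCli64 -CfgLdAdd -r5 [1:2,1:3,1:4] WB Direct -a0
Create a RAID 10 array: physical disks 2,3 and 4,5 each form a RAID 1 pair, and the two RAID 1 pairs are then striped as RAID 0:
/opt/MegaRAID/MegaCli/MegaCli64 -CfgSpanAdd -r10 -Array0[1:2,1:3] -Array1[1:4,1:5] WB Direct -a0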
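The bracketed disk list is easy to mistype, and an unquoted `[1:2,1:3,1:4]` is also a shell glob pattern that can expand if a matching single-character file name exists in the current directory. A small helper can build the list and keep it quoted. This is a sketch; the name `pd_list` is made up, and it assumes all disks sit in the same enclosure.

```shell
# pd_list ENCLOSURE SLOT...: print the bracketed [E:S,E:S,...] argument
# for disks that all share one enclosure ID.
pd_list() {
    encl=$1; shift
    list=
    for slot in "$@"; do
        # Append "E:S", prefixed by a comma for every entry after the first.
        list="${list:+$list,}$encl:$slot"
    done
    printf '[%s]\n' "$list"
}

# Usage (prints the command instead of running it):
#   echo /opt/MegaRAID/MegaCli/MegaCli64 -CfgLdAdd -r5 "$(pd_list 1 2 3 4)" WB Direct -a0
```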
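5. Delete an array
/opt/MegaRAID/MegaCli/MegaCli64 -CfgLdDel -L1 -a0
6. Add a disk online
/opt/MegaRAID/MegaCli/MegaCli64 -LDRecon -Start -r5 -Add -PhysDrv[1:4] -L1 -a0
7. After an array is created it goes through an initialization (block synchronization) phase; check its progress with: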
/opt/MegaRAID/MegaCli/MegaCli64 -LDInit -ShowProg -LALL -aALL
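Or show it in a dynamic text-mode display:
/opt/MegaRAID/MegaCli/MegaCli64 -LDInit -ProgDsply -LALL -aALL
8. Check the progress of the array's background initialization
/opt/MegaRAID/MegaCli/MegaCli64 -LDBI -ShowProg -LALL -aALL
Or show it in a dynamic text-mode display:
/opt/MegaRAID/MegaCli/MegaCli64 -LDBI -ProgDsply -LALL -aALL
9. Set disk 5 as a global hot spare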
/opt/MegaRAID/MegaCli/MegaCli64 -PDHSP -Set [-EnclAffinity] [-nonRevertible] -PhysDrv[1:5] -a0
10. Set a disk as the dedicated hot spare of a specific array
/opt/MegaRAID/MegaCli/MegaCli64 -PDHSP -Set [-Dedicated [-Array1]] [-EnclAffinity] [-nonRevertible] -PhysDrv[1:5] -a0
11. Remove a global hot spare
/opt/MegaRAID/MegaCli/MegaCli64 -PDHSP -Rmv -PhysDrv[1:5] -a0
12. Take a physical disk offline or bring it online
/opt/MegaRAID/MegaCli/MegaCli64 -PDOffline -PhysDrv [1:4] -a0
/opt/MegaRAID/MegaCli/MegaCli64 -PDOnline -PhysDrv [1:4] -a0
13. Check the rebuild progress of a physical disk
/opt/MegaRAID/MegaCli/MegaCli64 -PDRbld -ShowProg -PhysDrv [1:5] -a0
Or show it in a dynamic text-mode display:
/opt/MegaRAID/MegaCli/MegaCli64 -PDRbld -ProgDsply -PhysDrv [1:5] -a0
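For unattended monitoring, the percentage can be pulled out of the `-ShowProg` output. The sketch below assumes the progress line contains the text `Completed NN%`; that wording is an assumption and may differ between MegaCli versions, so check it against real output first. The function name `rebuild_percent` is made up.

```shell
# rebuild_percent: read `-PDRbld -ShowProg` output on stdin and print the
# number in front of "%" on a line containing "Completed NN%".
# Prints nothing if no such line is present.
rebuild_percent() {
    sed -n 's/.*Completed \([0-9][0-9]*\)%.*/\1/p'
}

# Usage (polls once a minute until no progress line is reported):
#   while pct=$(/opt/MegaRAID/MegaCli/MegaCli64 -PDRbld -ShowProg -PhysDrv '[1:5]' -a0 | rebuild_percent) && [ -n "$pct" ]; do
#       echo "rebuild at ${pct}%"; sleep 60
#   done
```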
14. Disk state changes, from pulling a disk to reinserting it
Device | Normal | Damage | Rebuild | Normal
Virtual Drive | Optimal | Degraded | Degraded | Optimal
Physical Drive | Online | Failed -> Unconfigured | Rebuild | Online
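IV. Other commands
1. Current RAID cache state. If the cache is set to WB (write back), pay attention to battery discharge and set the battery discharge mode to automatic learn mode.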
/opt/MegaRAID/MegaCli/MegaCli64 -ldgetprop -dskcache -lall -aall
2. Battery settings
Display BBU status information
MegaCli -AdpBbuCmd -GetBbuStatus -aN|-a0,1,2|-aALL
MegaCli -AdpBbuCmd -GetBbuStatus -aALL
Display BBU capacity information
MegaCli -AdpBbuCmd -GetBbuCapacityInfo -aN|-a0,1,2|-aALL
MegaCli -AdpBbuCmd -GetBbuCapacityInfo -aALL
Display BBU design parameters
MegaCli -AdpBbuCmd -GetBbuDesignInfo -aN|-a0,1,2|-aALL
MegaCli -AdpBbuCmd -GetBbuDesignInfo -aALL
Display current BBU properties
MegaCli -AdpBbuCmd -GetBbuProperties -aN|-a0,1,2|-aALL
MegaCli -AdpBbuCmd -GetBbuProperties -aALL
Start a BBU learning cycle
Description Starts the learning cycle on the BBU.
No parameter is needed for this option.
MegaCli -AdpBbuCmd -BbuLearn -aN|-a0,1,2|-aALL
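3. Check RAID disk status with a script
MEGACLI="/opt/MegaRAID/MegaCli/MegaCli64"
# Firmware state of each physical disk (for example Online, Failed, Rebuild)
$MEGACLI -pdlist -aALL | grep "Firmware state" | awk -F : '{print $2}' | awk -F , '{print $1}'
# Media and Other error counters of each physical disk
$MEGACLI -pdlist -aALL | grep -E "Media Error|Other Error" | awk -F : '{print $2}'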
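For use from cron or a monitoring agent, the state check can also return an exit status and name the affected slot. The following is a sketch under stated assumptions: `-PDList` prints a `Slot Number: N` line before each disk's `Firmware state: ...` line, and any state that does not start with `Online` or `Hotspare` is treated as a problem. That rule is a choice made for this example, not something MegaCli defines, and the function name `check_pd_states` is hypothetical.

```shell
# check_pd_states: read `MegaCli64 -PDList -aALL` output on stdin, print every
# disk whose firmware state does not start with "Online" or "Hotspare", and
# return 1 if at least one such disk was found (0 otherwise).
check_pd_states() {
    awk -F': *' '
        /^Slot Number/    { slot = $2 }
        /^Firmware state/ {
            if ($2 !~ /^(Online|Hotspare)/) {
                printf "slot %s: %s\n", slot, $2
                bad = 1
            }
        }
        END { exit bad + 0 }
    '
}

# Usage:
#   /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL | check_pd_states || echo "RAID needs attention"
```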