CentOS 7: Disk I/O Surge Caused by a RAID Card Problem, and the Fix

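0. Problem description

A RAID card problem kept disk I/O saturated:

By default, the RAID card bypasses its cache when it has no battery. (Huawei servers)

None of our servers has a RAID battery fitted, so data passing through the RAID card is neither cached nor merged and is written straight to disk. Because the writes contain a large share of random I/O, disk I/O was saturated.

After the RAID card problem was fixed, I/O dropped. The later rise in the I/O monitoring graph comes from business workloads: heavy computation drives I/O up, which is normal.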

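I. Introduction to the MegaCli command

MegaCli is a tool for managing and maintaining hardware RAID; among other things, it can display RAID information.
For each physical disk, MegaCli reports two counters, for example: Media Error Count: 0 Other Error Count: 0
Media Error Count indicates possible disk errors, such as bad sectors. A non-zero value deserves attention, and the larger the value, the higher the risk.
Other Error Count indicates that the disk may be loose and may need to be reseated. MegaCli can check every disk in the array.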
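
The check described above can be sketched as a small filter. This is only an illustration: the function name report_pd_errors is made up here, and it assumes that the -PDList output prints a "Slot Number: N" line before each disk's "Media Error Count" and "Other Error Count" lines.

```shell
# Sketch: read `MegaCli64 -PDList -aALL` output on stdin and print only the
# slots whose error counters are non-zero.
# Assumption: each disk's "Slot Number: N" line precedes its counter lines.
report_pd_errors() {
    awk -F': *' '
        /^Slot Number/       { slot = $2 }
        /^Media Error Count/ { if ($2 + 0 > 0) printf "slot %s: media errors=%d\n", slot, $2 }
        /^Other Error Count/ { if ($2 + 0 > 0) printf "slot %s: other errors=%d\n", slot, $2 }
    '
}

# Usage (needs root and an installed MegaCli):
#   /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL | report_pd_errors
```

No output means every counter was zero.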

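II. Installation

1. System environment

dmidecode -t1 | egrep "Manufacturer|Product Name"
cat /etc/redhat-release
These show the manufacturer, the product model and the OS release; add "Serial Number" to the egrep pattern to show the serial number as well.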

 

2. Download and install

rpm -qa | egrep 'Lib_Utils|MegaCli'    # check whether the packages are already installed
https://raw.githubusercontent.com/crazy-zhangcong/tools/master/MegaCli8.07.10.tar.gz
ftp://download2.boulder.ibm.com/ecc/sar/CMA/XSA/ibm_utl_sraidmr_megacli-8.00.48_linux_32-64.zip
After extraction there is a Linux directory:

[root@localhost MegaCli8.07.10]# tree
├── Linux
│   ├── Lib_Utils-1.00-09.noarch.rpm
│   ├── MegaCli-8.02.21-1.noarch.rpm
 
[root@localhost Linux]# rpm -ivh Lib_Utils-1.00-09.noarch.rpm MegaCli-8.02.21-1.noarch.rpm
[root@localhost Linux]# ln -sv /opt/MegaRAID/MegaCli/MegaCli64 /usr/bin/
"/usr/bin/MegaCli64" ->"/opt/MegaRAID/MegaCli/MegaCli64"
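
Since the binary can now be reached both through the symlink and through its install path, a script may want to locate it before calling it. A minimal sketch follows; the helper name find_megacli is hypothetical and not part of MegaCli.

```shell
# Sketch: print the first candidate path that is an executable file,
# or fail with a message on stderr.
find_megacli() {
    for candidate in "$@"; do
        if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "MegaCli64 not found" >&2
    return 1
}

# Usage:
#   MEGACLI=$(find_megacli /usr/bin/MegaCli64 /opt/MegaRAID/MegaCli/MegaCli64) || exit 1
```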

 

III. Disk commands

1. Common query commands

/opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL    # show RAID levels
/opt/MegaRAID/MegaCli/MegaCli64 -AdpAllInfo -aALL    # show RAID card information
/opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL    # show disk information
/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -aAll    # show battery information
/opt/MegaRAID/MegaCli/MegaCli64 -FwTermLog -Dsply -aALL    # show the RAID card log
/opt/MegaRAID/MegaCli/MegaCli64 -adpCount    # show the number of adapters
/opt/MegaRAID/MegaCli/MegaCli64 -AdpGetTime -aALL    # show the adapter time
/opt/MegaRAID/MegaCli/MegaCli64 -AdpAllInfo -aAll    # show information for all adapters
/opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -LALL -aAll    # show information for all logical disk groups
/opt/MegaRAID/MegaCli/MegaCli64 -PDList -aAll    # show information for all physical disks
/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL | grep 'Charger Status'    # show the charging status
/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL    # show BBU status information
/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuCapacityInfo -aALL    # show BBU capacity information
/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuDesignInfo -aALL    # show BBU design parameters
/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuProperties -aALL    # show current BBU properties
/opt/MegaRAID/MegaCli/MegaCli64 -cfgdsply -aALL    # show RAID card model, RAID settings and disk information

 

2. View the disk cache policy

/opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -Cache -L0 -a0    # cache policy of RAID group 0 on RAID card 0
/opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -Cache -L1 -a0    # cache policy of RAID group 1 on RAID card 0
/opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -Cache -LALL -a0    # cache policy of all RAID groups on RAID card 0
/opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -Cache -LALL -aALL    # cache policy of all RAID groups on all RAID cards
/opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -DskCache -LALL -aALL    # physical disk cache policy of all RAID groups on all RAID cards
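
The situation from the problem description, where a card without a battery silently stops using its cache, can be spotted by comparing the default and current policies. The sketch below is an assumption-laden illustration: the function name check_cache_fallback is made up, and it assumes that the -LDInfo output contains "Virtual Drive: N ...", "Default Cache Policy: ..." and "Current Cache Policy: ..." lines with the write policy as the first comma-separated item.

```shell
# Sketch: read `MegaCli64 -LDInfo -Lall -aALL` output on stdin and report
# logical drives whose current write policy differs from the default one.
# Assumption: the write policy is the first comma-separated item on the
# "Default Cache Policy" and "Current Cache Policy" lines.
check_cache_fallback() {
    awk -F': *' '
        /^Virtual Drive/        { vd = $2; sub(/ .*/, "", vd) }
        /^Default Cache Policy/ { split($2, d, ","); def = d[1] }
        /^Current Cache Policy/ {
            split($2, c, ","); cur = c[1]
            if (def != "" && cur != def)
                printf "VD %s: default=%s current=%s\n", vd, def, cur
        }
    '
}

# Usage (needs root and an installed MegaCli):
#   /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL | check_cache_fallback
```

Any line printed points at a logical drive that is not running with its configured write policy.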

 Setting the disk cache mode and access policy (Change Virtual Disk Cache and Access Parameters)

Description: Allows you to change the following virtual disk parameters:
-WT (Write through), WB (Write back): Selects write policy.
-NORA (No read ahead), RA (Read ahead), ADRA (Adaptive read ahead): Selects read policy.
-Cached, -Direct: Selects cache policy.
-RW, -RO, Blocked: Selects access policy.
-EnDskCache: Enables disk cache.
-DisDskCache: Disables disk cache.
MegaCli -LDSetProp {WT | WB | NORA | RA | ADRA | -Cached | Direct} |
{-RW | RO | Blocked} |
{-Name[string]} |
{-EnDskCache | DisDskCache} -Lx |
-L0,1,2 | -Lall -aN | -a0,1,2 | -aALL
MegaCli -LDSetProp WT -L0 -a0

 

3. Set the disk cache policy

Displaying the disk cache and access policy (Display Virtual Disk Cache and Access Parameters)

MegaCli -LDGetProp -Cache | -Access | -Name | -DskCache -Lx|-L0,1,2|
-Lall -aN|-a0,1,2|-aALL
Displays the cache and access policies of the virtual disk(s):
-WT (Write through), WB (Write back): Selects write policy.
-NORA (No read ahead), RA (Read ahead), ADRA (Adaptive read ahead): Selects read policy.
-Cache, -Cached, Direct: Displays cache policy.
-Access, -RW, -RO, Blocked: Displays access policy.
-DskCache: Displays physical disk cache policy.

 Cache policy abbreviations:

WT (Write through)
WB (Write back)
NORA (No read ahead)
RA (Read ahead)
ADRA (Adaptive read ahead)
C (Cached)
D (Direct)

 

Examples (pick one of the alternatives separated by |; the | itself is not typed):
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp WT|WB|NORA|RA|ADRA -L0 -a0
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp -Cached|-Direct -L0 -a0
Enable / disable the disk cache:
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp -EnDskCache|-DisDskCache -L0 -a0
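
Because the policy lines above are syntax templates rather than literal commands, a small dry-run helper can turn one chosen keyword into a concrete command line. This is a sketch only: the function name ldsetprop_cmd is hypothetical, it accepts just the keywords listed in this article, and it prints the command instead of running it.

```shell
# Sketch: validate one policy keyword and print (not run) the matching
# -LDSetProp command for logical drive $2 on adapter $3.
ldsetprop_cmd() {
    policy=$1; ld=$2; adapter=$3
    case "$policy" in
        WT|WB|NORA|RA|ADRA|-Cached|-Direct|-EnDskCache|-DisDskCache) ;;
        *) echo "unknown policy: $policy" >&2; return 1 ;;
    esac
    printf '%s -LDSetProp %s -L%s -a%s\n' \
        /opt/MegaRAID/MegaCli/MegaCli64 "$policy" "$ld" "$adapter"
}

# Usage: review the printed command before running it by hand.
#   ldsetprop_cmd WB 0 0
```

Printing first is a deliberate choice: changing the write policy on a card without a battery trades safety for speed, so the command is better reviewed than executed blindly.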

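4. Create an array

Create a RAID 5 array from physical disks 2, 3 and 4, with physical disk 5 as the array's hot spare:
/opt/MegaRAID/MegaCli/MegaCli64 -CfgLdAdd -r5 [1:2,1:3,1:4] WB Direct -Hsp[1:5] -a0
Create an array without a hot spare:
/opt/MegaRAID/MegaCli/MegaCli64 -CfgLdAdd -r5 [1:2,1:3,1:4] WB Direct -a0
Create a RAID 10 array: physical disks 2,3 and 4,5 each form a RAID 1, and the two RAID 1 groups are then combined as RAID 0:
/opt/MegaRAID/MegaCli/MegaCli64 -CfgSpanAdd -r10 -Array0[1:2,1:3] -Array1[1:4,1:5] WB Direct -a0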
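
The bracketed lists in these commands follow MegaCli's enclosure:slot notation, so [1:2,1:3,1:4] means slots 2, 3 and 4 of enclosure 1. A hypothetical helper (the name pd_list is made up for this sketch) that builds such a list from an enclosure ID and slot numbers might look like this:

```shell
# Sketch: build a "[E:S,E:S,...]" physical-drive list.
# $1 is the enclosure ID, the remaining arguments are slot numbers.
pd_list() {
    encl=$1; shift
    list=""
    for slot in "$@"; do
        list="${list:+$list,}$encl:$slot"
    done
    printf '[%s]\n' "$list"
}

# Usage:
#   pd_list 1 2 3 4
```

The enclosure ID and slot numbers of a given server can be read from the -PDList output.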

5. Delete an array

/opt/MegaRAID/MegaCli/MegaCli64 -CfgLdDel -L1 -a0

6. Add a disk online

/opt/MegaRAID/MegaCli/MegaCli64 -LDRecon -Start -r5 -Add -PhysDrv[1:4] -L1 -a0

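7. After an array is created it goes through an initialization (block synchronization) phase, whose progress can be checked

/opt/MegaRAID/MegaCli/MegaCli64 -LDInit -ShowProg -LALL -aALL
Or show it in a dynamic text-mode display:
/opt/MegaRAID/MegaCli/MegaCli64 -LDInit -ProgDsply -LALL -aALL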

8. Check the array's background initialization progress

/opt/MegaRAID/MegaCli/MegaCli64 -LDBI -ShowProg -LALL -aALL
Or show it in a dynamic text-mode display:
/opt/MegaRAID/MegaCli/MegaCli64 -LDBI -ProgDsply -LALL -aALL

9. Set disk 5 as a global hot spare

/opt/MegaRAID/MegaCli/MegaCli64 -PDHSP -Set [-EnclAffinity] [-nonRevertible] -PhysDrv[1:5] -a0

10. Set a disk as the dedicated hot spare of a specific array

/opt/MegaRAID/MegaCli/MegaCli64 -PDHSP -Set [-Dedicated [-Array1]] [-EnclAffinity] [-nonRevertible] -PhysDrv[1:5] -a0

11. Remove a global hot spare

/opt/MegaRAID/MegaCli/MegaCli64 -PDHSP -Rmv -PhysDrv[1:5] -a0

12. Take a physical disk offline / bring it online

/opt/MegaRAID/MegaCli/MegaCli64 -PDOffline -PhysDrv [1:4] -a0
/opt/MegaRAID/MegaCli/MegaCli64 -PDOnline -PhysDrv [1:4] -a0

13. Check the rebuild progress of a physical disk

/opt/MegaRAID/MegaCli/MegaCli64 -PDRbld -ShowProg -PhysDrv [1:5] -a0
Or show it in a dynamic text-mode display:
/opt/MegaRAID/MegaCli/MegaCli64 -PDRbld -ProgDsply -PhysDrv [1:5] -a0

14. Disk state changes, from pulling a disk to re-inserting it

Device         | Normal  | Damage                 | Rebuild  | Normal
Virtual Drive  | Optimal | Degraded               | Degraded | Optimal
Physical Drive | Online  | Failed -> Unconfigured | Rebuild  | Online

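IV. Other commands

1. Current RAID cache state. If the RAID cache is set to WB, keep battery discharge in mind and set the battery discharge mode to automatic learning.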

/opt/MegaRAID/MegaCli/MegaCli64 -ldgetprop -dskcache -lall -aall

2. Battery settings

Show battery status information (Display BBU Status Information)
MegaCli -AdpBbuCmd -GetBbuStatus -aN|-a0,1,2|-aALL
MegaCli -AdpBbuCmd -GetBbuStatus -aALL

Show battery capacity (Display BBU Capacity Information)
MegaCli -AdpBbuCmd -GetBbuCapacityInfo -aN|-a0,1,2|-aALL
MegaCli -AdpBbuCmd -GetBbuCapacityInfo -aALL

Show battery design parameters (Display BBU Design Parameters)
MegaCli -AdpBbuCmd -GetBbuDesignInfo -aN|-a0,1,2|-aALL
MegaCli -AdpBbuCmd -GetBbuDesignInfo -aALL

Show battery properties (Display Current BBU Properties)
MegaCli -AdpBbuCmd -GetBbuProperties -aN|-a0,1,2|-aALL
MegaCli -AdpBbuCmd -GetBbuProperties -aALL

Start a battery learning cycle (Start BBU Learning Cycle)
Description: Starts the learning cycle on the BBU.
No parameter is needed for this option.
MegaCli -AdpBbuCmd -BbuLearn -aN|-a0,1,2|-aALL

3. Check RAID disk status with a script

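MEGACLI="/opt/MegaRAID/MegaCli/MegaCli64"
# Firmware state of each physical disk (the text before the first comma)
$MEGACLI -pdlist -aALL | grep "Firmware state" | awk -F : '{print $2}' | awk -F , '{print $1}'
# Media and other error counters of each physical disk
$MEGACLI -pdlist -aALL | grep -E "Media Error|Other Error" | awk -F : '{print $2}'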
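
For monitoring, a per-state count is often easier to alert on than a raw list. The following sketch (the function name count_pd_states is made up here) tallies the "Firmware state" lines, assuming they look like "Firmware state: Online, Spun Up", with the state being the text before the first comma.

```shell
# Sketch: read `MegaCli64 -pdlist -aALL` output on stdin and print one
# "<state> <count>" line per firmware state, sorted by state name.
count_pd_states() {
    awk -F': *' '
        /^Firmware state/ { split($2, s, ","); n[s[1]]++ }
        END { for (k in n) printf "%s %d\n", k, n[k] }
    ' | sort
}

# Usage (needs root and an installed MegaCli):
#   /opt/MegaRAID/MegaCli/MegaCli64 -pdlist -aALL | count_pd_states
```

A check script could then raise an alert whenever a state other than Online appears in the result.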

 

Reposted from: "Viewing RAID information on CentOS" - xuefy - 博客園 (cnblogs.com)
