The Relationship Between Linux Load Average and CPU Utilization

To troubleshoot a high load, first identify the factors that affect the load average, then check them one by one.

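Definition of load average: the average number of processes in the running (or runnable) state and the uninterruptible state, that is, the average number of active processes.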

1. Judge whether the load average is reasonable

(1) Compare the system load with the total number of logical CPUs. The system is overloaded when: load average > number of CPUs

(2) Check the system load average

$ top
top - 21:52:21 up 386 days, 4:10, 1 user, load average: 0.00, 0.00, 0.00
Tasks: 164 total, 1 running, 163 sleeping, 0 stopped, 0 zombie

or
$ uptime
21:53:01 up 386 days, 4:11, 1 user, load average: 0.00, 0.00, 0.00 # the 1 min, 5 min, and 15 min load averages, in that order

(3) Check the number of CPUs

$ grep 'model name' /proc/cpuinfo
model name   : QEMU Virtual CPU version (cpu64-rhel6)
model name   : QEMU Virtual CPU version (cpu64-rhel6)
model name   : QEMU Virtual CPU version (cpu64-rhel6)
model name   : QEMU Virtual CPU version (cpu64-rhel6)

$ grep 'model name' /proc/cpuinfo|wc -l
4

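(4) Use the 1 min, 5 min, and 15 min values from (2) to judge whether the load is rising or falling

1 min load > 5 min load > 15 min load: the load is rising

1 min load < 5 min load < 15 min load: the load is falling

(5) As a rule of thumb, once the load exceeds 80% of the CPU count, start looking for the cause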
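Checks (1), (4), and (5) can be sketched as two small shell functions. This is only an illustration: the function names `load_trend` and `load_status` and the labels they print are made up for this example, and the 0.8 default simply mirrors the 80% rule of thumb above.

```shell
# Classify the trend from the 1, 5 and 15 minute load averages.
# Prints "mixed" when the three values are not strictly ordered.
load_trend() {
    awk -v a="$1" -v b="$2" -v c="$3" 'BEGIN {
        if (a + 0 > b + 0 && b + 0 > c + 0)      print "rising"
        else if (a + 0 < b + 0 && b + 0 < c + 0) print "falling"
        else                                     print "mixed"
    }'
}

# Compare a load value with the CPU count.
# $1 = load, $2 = number of CPUs, $3 = warning fraction (assumed default 0.8).
load_status() {
    awk -v load="$1" -v cpus="$2" -v frac="${3:-0.8}" 'BEGIN {
        if (load + 0 > cpus + 0)             print "overloaded"
        else if (load + 0 > cpus * frac)     print "investigate"
        else                                 print "ok"
    }'
}
```

On a live Linux host, one way to feed these is `read -r one five fifteen _ < /proc/loadavg`, followed by `load_trend "$one" "$five" "$fifteen"` and `load_status "$one" "$(nproc)"`. Note that `nproc` reports the CPUs available to the calling process, which can be fewer than the machine total.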

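2. Once step 1 confirms that the load is high, how do you troubleshoot it?

When the load rises or is already high, CPU utilization may rise over the same period, but not always. Why?

Go back to the definitions. Load average is the average number of processes in the running (or runnable) state and the uninterruptible state over a period of time; uninterruptible processes are typically blocked on I/O such as disk access. CPU utilization is the share of time within a period that the CPU spends busy. Load average therefore counts processes that are using the CPU, processes waiting for the CPU, and processes waiting for I/O, so a high load does not necessarily mean high CPU utilization.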
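To make the definition concrete, the sketch below counts how many entries in a list of process states are R (running or runnable) or D (uninterruptible sleep). The function name `count_active` is hypothetical, and the result is only an instantaneous sample of processes: the kernel averages over time and counts individual tasks (threads), so the number will not match the load average exactly.

```shell
# Count state codes on stdin (one per line) that start with R or D,
# i.e. the two states that contribute to the load average.
count_active() {
    awk '$1 ~ /^[RD]/ { n++ } END { print n + 0 }'
}
```

On a live system the states can be supplied with `ps -eo state= | count_active`; the trailing `=` suppresses the header line.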

Commands used: mpstat and pidstat (install them with sudo yum install sysstat or sudo apt-get install sysstat)

(1) $ watch -d uptime # watch how the load changes

Every 2.0s: uptime Sun Jan 6 23:54:40 2019

23:54:40 up 387 days, 6:13, 1 user, load average: 0.01, 0.01, 0.00

(2) $ mpstat -P ALL 5 1 # report CPU metrics for every CPU: one report covering a 5 s interval
Linux 2.6.32-696.16.1.el6.x86_64 (XXX-hostname-000) 01/06/2019 _x86_64_ (4 CPU)

11:55:25 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
11:55:30 PM all 2.20 0.00 1.25 0.00 0.00 0.30 0.10 0.00 96.15 # all CPUs combined
11:55:30 PM 0 2.79 0.00 2.00 0.00 0.00 0.60 0.20 0.00 94.41 # logical CPU 0
11:55:30 PM 1 2.40 0.00 1.20 0.00 0.00 0.20 0.00 0.00 96.20
11:55:30 PM 2 2.00 0.00 1.00 0.00 0.00 0.20 0.00 0.00 96.81
11:55:30 PM 3 1.80 0.00 1.00 0.00 0.00 0.40 0.00 0.00 96.79

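With these two steps, check whether %usr or %iowait rises when the load rises.

Scenario a) %usr is high and CPU utilization exceeds 80%: consider adding capacity.

Scenario b) %iowait is high but CPU utilization is not: check for heavy disk I/O, such as bulk queries or writes, or large JSON payloads.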
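The distinction between scenarios a) and b) can be sketched as a filter over mpstat output. The function name `classify_mpstat`, the labels it prints, and the 20% default for %iowait are assumptions made for illustration (the 80% default for %usr comes from scenario a). Columns are located through the header row because their positions vary with the sysstat version and time format.

```shell
# Read mpstat output on stdin and classify the "all" CPU row.
# $1 = %usr limit (default 80), $2 = %iowait limit (assumed default 20).
classify_mpstat() {
    awk -v usr_limit="${1:-80}" -v io_limit="${2:-20}" '
        # Header row: remember which column holds each metric.
        /%usr/ { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # First summary row covering all CPUs.
        ("CPU" in col) && $(col["CPU"]) == "all" {
            if ($(col["%usr"]) + 0 > usr_limit + 0)          print "cpu-bound"
            else if ($(col["%iowait"]) + 0 > io_limit + 0)   print "io-bound"
            else                                             print "neither"
            exit
        }'
}
```

A possible invocation is `mpstat -P ALL 5 1 | classify_mpstat`. The function stops at the first "all" row, so the trailing Average section is never read.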

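For both cases, use pidstat to find the responsible process (pidstat -u reports per-process CPU usage; for scenario b), pidstat -d reports per-process disk I/O):

$ pidstat -u 5 1 # one report of per-process CPU statistics covering a 5 s interval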
Linux 2.6.32-696.16.1.el6.x86_64 (XXX-hostname-000) 01/06/2019 _x86_64_ (4 CPU)

11:57:57 PM PID %usr %system %guest %CPU CPU Command
11:58:02 PM 1383 0.00 0.20 0.00 0.20 3 xxx
11:58:02 PM 3961 1.00 1.40 0.00 2.40 0 xxx
11:58:02 PM 4048 0.20 0.00 0.00 0.20 2 xxx
11:58:02 PM 8473 0.00 0.20 0.00 0.20 1 java
11:58:02 PM 20394 1.60 0.00 0.00 1.60 2 php
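With many processes in the report, it helps to rank them by %CPU. The sketch below does that for pidstat -u output; the function name `top_cpu_procs` is hypothetical, and the parsing assumes a header row that contains PID, %CPU, and Command columns, as in the output above.

```shell
# Read pidstat -u output on stdin and print "PID %CPU Command"
# for the N busiest processes ($1, default 3), highest %CPU first.
top_cpu_procs() {
    awk '
        # Skip the trailing Average section to avoid listing processes twice.
        /^Average:/ { next }
        # Header row: remember which column holds each field.
        /%CPU/ { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Data rows are the ones with a numeric PID.
        ("PID" in col) && $(col["PID"]) ~ /^[0-9]+$/ {
            print $(col["PID"]), $(col["%CPU"]), $(col["Command"])
        }' | LC_ALL=C sort -k2,2nr | head -n "${1:-3}"
}
```

For example, `pidstat -u 5 1 | top_cpu_procs 5` would list the five processes with the highest %CPU in the sampled interval.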

Scenario c) Many processes competing for the CPU
$ pidstat -u 5 1
14:23:25 UID PID %usr %system %guest %wait %CPU CPU Command
14:23:30 0 3190 25.00 0.00 0.00 74.80 25.00 0 stress
14:23:30 0 3191 25.00 0.00 0.00 75.20 25.00 0 stress
14:23:30 0 3192 25.00 0.00 0.00 74.80 25.00 1 stress
14:23:30 0 3193 25.00 0.00 0.00 75.00 25.00 1 stress
14:23:30 0 3194 24.80 0.00 0.00 74.60 24.80 0 stress
14:23:30 0 3195 24.80 0.00 0.00 75.00 24.80 0 stress
14:23:30 0 3196 24.80 0.00 0.00 74.60 24.80 1 stress
14:23:30 0 3197 24.80 0.00 0.00 74.80 24.80 1 stress
14:23:30 0 3200 0.00 0.20 0.00 0.20 0.20 0 pidstat

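As the output shows, 8 processes are competing for 2 CPUs, and each spends about 75% of its time waiting for a CPU (the %wait column). Processes that demand more than the CPUs can deliver are what ultimately overload the CPU.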
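The same observation can be sketched as a count of processes whose %wait exceeds a limit. The function name `count_cpu_starved` and the 50% default limit are assumptions for illustration, and the sketch only works with sysstat versions that print a %wait column, as in the output above.

```shell
# Read pidstat -u output on stdin and count the processes whose
# %wait (time spent waiting for a CPU) exceeds $1 (assumed default 50).
count_cpu_starved() {
    awk -v limit="${1:-50}" '
        # Skip the trailing Average section to avoid double counting.
        /^Average:/ { next }
        # Header row: remember which column holds each field.
        /%wait/ { for (i = 1; i <= NF; i++) col[$i] = i; next }
        ("%wait" in col) && $(col["%wait"]) + 0 > limit + 0 { n++ }
        END { print n + 0 }'
}
```

Run as `pidstat -u 5 1 | count_cpu_starved`. A count well above the number of CPUs suggests CPU contention between processes rather than a single busy process.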

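There is also the opposite case, where CPU utilization is high but the load is not. This usually points to a single process; to locate it, see: https://blog.csdn.net/wangtingting_100/article/details/80666709

To summarize: a rising load can be caused by rising CPU usage or by I/O wait. Rising CPU usage can in turn be caused by heavy traffic or by a single process.

Note: these are summary notes from Ni Pengfei's performance optimization course.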
