引言
程序運行卡的時候,我們經常需要定位到底程序在執行什麼代碼最消耗CPU,這裏有個我經常用的小技巧,分享一下。
過程如下:
首先,Linux下面top命令可以直接找出最消耗CPU的進程
top
可以看到如下顯示,我們看到前面有三個都是100% CPU佔用的
top - 19:53:23 up 118 days, 6:09, 4 users, load average: 3.00, 2.68, 1.68
Tasks: 446 total, 1 running, 438 sleeping, 7 stopped, 0 zombie
Cpu(s): 8.6%us, 4.2%sy, 0.0%ni, 87.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 132103700k total, 76585104k used, 55518596k free, 424052k buffers
Swap: 16383996k total, 0k used, 16383996k free, 36290728k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
156888 hdfs 20 0 34.9g 544m 11m S 100.8 0.4 9:44.08 java
156928 hdfs 20 0 34.9g 545m 11m S 100.8 0.4 9:42.84 java
156769 hdfs 20 0 34.9g 544m 11m S 100.2 0.4 9:53.26 java
153980 hdfs 20 0 64.0g 24g 55m S 1.7 19.6 5879:56 presto-server
105 root 20 0 0 0 0 S 0.3 0.0 45:26.08 events/6
106 root 20 0 0 0 0 S 0.3 0.0 4:41.29 events/7 代碼片
接下來,我們一個個分析,先找第一個最消耗CPU的,我們使用命令
top -Hp 156888
這個命令可以展示進程內最消耗CPU的線程,可以得到如下結果
top - 20:00:25 up 118 days, 6:16, 2 users, load average: 3.00, 2.91, 2.14
Tasks: 39 total, 1 running, 38 sleeping, 0 stopped, 0 zombie
Cpu(s): 9.3%us, 3.5%sy, 0.0%ni, 87.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 132103700k total, 76578608k used, 55525092k free, 424052k buffers
Swap: 16383996k total, 0k used, 16383996k free, 36290948k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
156926 hdfs 20 0 34.9g 544m 11m R 99.8 0.4 16:43.39 java
156901 hdfs 20 0 34.9g 544m 11m S 0.3 0.4 0:00.16 java
156888 hdfs 20 0 34.9g 544m 11m S 0.0 0.4 0:00.00 java
156889 hdfs 20 0 34.9g 544m 11m S 0.0 0.4 0:00.12 java
156890 hdfs 20 0 34.9g 544m 11m S 0.0 0.4 0:00.16 java
156891 hdfs 20 0 34.9g 544m 11m S 0.0 0.4 0:00.17 java
156892 hdfs 20 0 34.9g 544m 11m S 0.0 0.4 0:00.18 java
156893 hdfs 20 0 34.9g 544m 11m S 0.0 0.4 0:00.17 java
排名第一的便是我們這個進程下面的線程號了。
接下來,我們使用一個java中的jstack命令去定位哪一行:
jstack 156888
這樣子直接獲取信息內容是這樣子的:
"Service Thread" #17 daemon prio=9 os_prio=0 tid=0x00002b47080f6800 nid=0x264fc runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread11" #16 daemon prio=9 os_prio=0 tid=0x00002b47080f3800 nid=0x264fb waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread10" #15 daemon prio=9 os_prio=0 tid=0x00002b47080f1000 nid=0x264fa waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread9" #14 daemon prio=9 os_prio=0 tid=0x00002b47080ef000 nid=0x264f9 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread8" #13 daemon prio=9 os_prio=0 tid=0x00002b47080ed000 nid=0x264f8 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
我們大致讀得出這個線程號,但是我們需要定位最消耗cpu的那個線程,我們看到nid項其實就是線程號,不過是16進制輸出的,所以我們也要把我們的線程號轉換成16進制,可以直接用計算器,沒問題,我這裏用linux下面打印命令:
printf "%x\n" 156926
結果爲:
264fe
這個時候我們可以用這個數字結合我們的jstack命令去查詢了:
jstack 156888 |grep 264fe
我們這樣子可以搜索到這個線程,但是隻有一行結果
"thread1" #18 prio=5 os_prio=0 tid=0x00002b4708167000 nid=0x264fe runnable [0x00002b4723545000]
我們可以在grep前後增加要顯示的行數,可以看到上下文的內容,完整命令如下:
jstack 156888 |grep -A 20 -B 10 264fe
我們可以輸出我們要的結果
"thread1" #18 prio=5 os_prio=0 tid=0x00002b4708167000 nid=0x264fe runnable [0x00002b4723545000]
java.lang.Thread.State: RUNNABLE
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
- locked <0x0000000080208ea8> (a java.io.BufferedOutputStream)
at java.io.PrintStream.write(PrintStream.java:482)
- locked <0x00000000802048c0> (a java.io.PrintStream)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:104)
- locked <0x00000000802048a8> (a java.io.OutputStreamWriter)
at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:185)
at java.io.PrintStream.write(PrintStream.java:527)
- eliminated <0x00000000802048c0> (a java.io.PrintStream)
at java.io.PrintStream.print(PrintStream.java:669)
at java.io.PrintStream.println(PrintStream.java:806)
- locked <0x00000000802048c0> (a java.io.PrintStream)
at Demo1_16.lambda$main$0(Demo1_16.java:6)
at Demo1_16$$Lambda$1/834600351.run(Unknown Source)
從內容上面看,我們尅看到是在第6行代碼上面有個打印,我把完整代碼貼出來:
public class Demo1_16 {
public static void main(String[] args) {
new Thread(()->{
while (true){
System.out.println("hi!");
}
},"thread1").start();
}
}
可以看到是有個死循環導致的,至於爲什麼有三個佔用最高的,那是因爲我同時啓動了三個這個程序。
我們把命令可以直接連起來:
jstack 156888 |grep `printf "%x" 264fe` -C6 --color
總結:
要定位最消耗的代碼,我們的思路是縮小範圍,順序是進程->線程->代碼,最後再看實際代碼去解決問題