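How do you troubleshoot a thread deadlock in production?

This is a note on a question I was asked in an interview at a wholly owned Tencent subsidiary: a deadlock occurs in the production environment, how do you troubleshoot it? My answer was to find the affected machine and its process ID, then run jstack pid to find the cause of the deadlock. Below I walk through a real example to fix it in my memory.

First, write some code that deadlocks: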

public class Atr implements Runnable {

    private String lockA;
    private String lockB;
    public Atr(String lockA, String lockB) {
        this.lockB = lockB;
        this.lockA = lockA;
    }

    @Override
    public void run() {
        synchronized (lockA) {
            System.out.println("Acquired lock A");
            try {
                Thread.sleep(1000); // give the other thread time to take its first lock, so the deadlock is reliable
                synchronized (lockB) {
                    System.out.println("Acquired lock B");
                }
            } catch (InterruptedException e) {
                e.printStackTrace();
            }

        }
    }
    public static void main(String[] args) {
        String lockA = "lockA";
        String lockB = "lockB";
        new Thread(new Atr(lockA, lockB), "VVVVV").start();
        new Thread(new Atr(lockB, lockA), "BBBBB").start();
    }

}


Then run the following command in that directory to compile the source into Atr.class:

javac Atr.java

Then run the following command in the same directory:

java Atr

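Note: if you see "Could not find or load main class", first check that the Java CLASSPATH environment variable is configured correctly. For reference, here is the Java environment configuration on my Linux server (for Windows, look up the equivalent setup online):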

export JAVA_HOME=/usr/local/jdk1.8.0_211
export JRE_HOME=${JAVA_HOME}/jre  
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib  
export PATH=${JAVA_HOME}/bin:$PATH

Once the environment is sorted out, run java Atr to start the program and watch its output.

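The program prints only two lines and then nothing more, yet it never exits. It is deadlocked.

Thread "VVVVV" uses synchronized to lock lockA while thread "BBBBB" uses synchronized to lock lockB. Once both threads have finished their first print, "VVVVV" tries to lock lockB and "BBBBB" tries to lock lockA. But "BBBBB" still holds lockB and "VVVVV" still holds lockA, so neither thread can proceed, and the result is a deadlock.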
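The JVM can also report this condition from inside the process. The following is a minimal sketch, not part of the original walkthrough: it recreates the same two-thread lock cycle and then asks the standard java.lang.management.ThreadMXBean for deadlocked threads. The names DeadlockProbe, startWorker, waitForDeadlock and runDemo are made up for illustration, the locks are plain Objects rather than Strings, and the workers are daemon threads so that the demo can exit.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class DeadlockProbe {

    // Hypothetical helper: a daemon thread that locks `first`, pauses, then wants `second`.
    static void startWorker(String name, Object first, Object second) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                try {
                    Thread.sleep(200); // give the other thread time to take its first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    System.out.println(name + " acquired both locks");
                }
            }
        }, name);
        t.setDaemon(true); // daemon threads let the JVM exit even though they stay deadlocked
        t.start();
    }

    // Polls the JVM until it reports a deadlock or the timeout expires.
    static List<String> waitForDeadlock(long timeoutMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (report.isEmpty() && System.currentTimeMillis() < deadline) {
            long[] ids = mx.findDeadlockedThreads(); // null when nothing is deadlocked
            if (ids != null) {
                for (ThreadInfo info : mx.getThreadInfo(ids)) {
                    if (info != null) {
                        report.add(info.getThreadName()
                                + " waits for a lock held by " + info.getLockOwnerName());
                    }
                }
            } else {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        Collections.sort(report);
        return report;
    }

    // Builds the lock cycle (A then B versus B then A) and returns the JVM's report.
    static List<String> runDemo() {
        Object lockA = new Object();
        Object lockB = new Object();
        startWorker("VVVVV", lockA, lockB);
        startWorker("BBBBB", lockB, lockA);
        return waitForDeadlock(5000);
    }

    public static void main(String[] args) {
        for (String line : runDemo()) {
            System.out.println(line);
        }
    }
}
```

On a HotSpot JVM this should report that each of the two threads is waiting for a lock held by the other. A check like this can run inside a health endpoint or a scheduled task, but it only complements a thread dump: it tells you which threads are stuck, not where in the code they are.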

Next, use jps -l to find the pid of the running Java program.

In this run, the process pid is 15107.

Next, run jstack with that pid:

jstack 15107

The console shows the following stack information:

Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.211-b12 mixed mode):

"Attach Listener" #12 daemon prio=9 os_prio=0 tid=0x00007f162c001000 nid=0x3c92 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"DestroyJavaVM" #11 prio=5 os_prio=0 tid=0x00007f167c009800 nid=0x3b04 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"BBBBB" #10 prio=5 os_prio=0 tid=0x00007f167c0e3800 nid=0x3b15 waiting for monitor entry [0x00007f1661de4000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at Atr.run(Atr.java:17)
        - waiting to lock <0x00000000e1c5bf90> (a java.lang.String)
        - locked <0x00000000e1c5bfc8> (a java.lang.String)
        at java.lang.Thread.run(Thread.java:748)

"VVVVV" #9 prio=5 os_prio=0 tid=0x00007f167c0e2000 nid=0x3b14 waiting for monitor entry [0x00007f1661ee5000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at Atr.run(Atr.java:17)
        - waiting to lock <0x00000000e1c5bfc8> (a java.lang.String)
        - locked <0x00000000e1c5bf90> (a java.lang.String)
        at java.lang.Thread.run(Thread.java:748)

"Service Thread" #8 daemon prio=9 os_prio=0 tid=0x00007f167c0ce800 nid=0x3b12 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007f167c0c1800 nid=0x3b11 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f167c0bf800 nid=0x3b10 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f167c0bc800 nid=0x3b0f waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f167c0bb000 nid=0x3b0e runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f167c088000 nid=0x3b0d in Object.wait() [0x00007f16625ec000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000e1c08ed0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
        - locked <0x00000000e1c08ed0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:216)

"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f167c085800 nid=0x3b0c in Object.wait() [0x00007f16626ed000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000e1c06bf8> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:502)
        at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
        - locked <0x00000000e1c06bf8> (a java.lang.ref.Reference$Lock)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

"VM Thread" os_prio=0 tid=0x00007f167c07b800 nid=0x3b0b runnable 

"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f167c01e800 nid=0x3b05 runnable 

"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f167c020800 nid=0x3b06 runnable 

"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007f167c022000 nid=0x3b07 runnable 

"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007f167c024000 nid=0x3b08 runnable 

"GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007f167c026000 nid=0x3b09 runnable 

"GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007f167c027800 nid=0x3b0a runnable 

"VM Periodic Task Thread" os_prio=0 tid=0x00007f167c0d1800 nid=0x3b13 waiting on condition 

JNI global references: 5


Found one Java-level deadlock:
=============================
"BBBBB":
  waiting to lock monitor 0x00007f16380062c8 (object 0x00000000e1c5bf90, a java.lang.String),
  which is held by "VVVVV"
"VVVVV":
  waiting to lock monitor 0x00007f1638004e28 (object 0x00000000e1c5bfc8, a java.lang.String),
  which is held by "BBBBB"

Java stack information for the threads listed above:
===================================================
"BBBBB":
        at Atr.run(Atr.java:17)
        - waiting to lock <0x00000000e1c5bf90> (a java.lang.String)
        - locked <0x00000000e1c5bfc8> (a java.lang.String)
        at java.lang.Thread.run(Thread.java:748)
"VVVVV":
        at Atr.run(Atr.java:17)
        - waiting to lock <0x00000000e1c5bfc8> (a java.lang.String)
        - locked <0x00000000e1c5bf90> (a java.lang.String)
        at java.lang.Thread.run(Thread.java:748)

Found 1 deadlock.

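In the dump above, the "Found one Java-level deadlock" section identifies the two threads involved in the deadlock. The "Java stack information for the threads listed above" section then gives more detail about it. Together they say: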

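Thread "BBBBB", stopped at line 17, currently holds the lock on <0x00000000e1c5bfc8> and is waiting for <0x00000000e1c5bf90>.

Thread "VVVVV", stopped at line 17, currently holds the lock on <0x00000000e1c5bf90> and is waiting for <0x00000000e1c5bfc8>.

Each thread holds one resource and needs the one held by the other, which is what causes the deadlock. Now that the cause is known, the deadlock can be analyzed and fixed case by case.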
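One common way to break a lock cycle like this is to make every thread acquire the locks in the same order. The sketch below is only an illustration under stated assumptions, not necessarily the right fix for a given production issue: OrderedAtr and runBoth are hypothetical names, and ordering the String locks with compareTo is an assumption chosen to keep the example short.

```java
public class OrderedAtr implements Runnable {

    private final String first;
    private final String second;

    public OrderedAtr(String lockA, String lockB) {
        // Assumption for this sketch: always take the lexicographically smaller lock first,
        // regardless of the order in which the caller passed the two locks.
        if (lockA.compareTo(lockB) <= 0) {
            first = lockA;
            second = lockB;
        } else {
            first = lockB;
            second = lockA;
        }
    }

    @Override
    public void run() {
        synchronized (first) {
            try {
                Thread.sleep(100); // same pause as the deadlocking version, just shorter
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                System.out.println(Thread.currentThread().getName() + " acquired both locks");
            }
        }
    }

    // Starts the same two threads as Atr.main and reports whether both finished.
    static boolean runBoth() {
        Thread v = new Thread(new OrderedAtr("lockA", "lockB"), "VVVVV");
        Thread b = new Thread(new OrderedAtr("lockB", "lockA"), "BBBBB");
        v.start();
        b.start();
        try {
            v.join(5000);
            b.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !v.isAlive() && !b.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both threads finished: " + runBoth());
    }
}
```

Because both threads now compete for the same first lock, one of them simply waits until the other has released both locks, and no cycle can form. In real code it is usually better to lock on dedicated private objects than on String literals, since interned strings are shared across the whole JVM.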
