A Few Things About Slow Hive Queries

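When you run queries against HiveServer2 over JDBC, you will sometimes see unexpected latency. To find the cause, you can collect a jstack dump, which prints the call stack of every thread in the HiveServer2 process so you can analyze it. So how do you analyze a jstack dump?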

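1. In general, if a call stack contains org.apache.thrift.server.TServlet.doPost, you can usually treat that thread as one serving a Hive client request that arrived as an HTTP POST.
2. Check whether the thread is waiting on a lock. If it is, that wait is very likely the reason the query is slow. Locks show up in several forms; a line such as parking to wait for <0x00007fc009bcabf8> can be treated as one.
3. The following is a thread that was causing a slow Hive query.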

"HiveServer2-HttpHandler-Pool: Thread-17151"
  java.lang.Thread.State: WAITING (parking)
  at sun.misc.Unsafe.park(Native Method)
  - parking to wait for  <0x00007fc009bcabf8> (a java.util.concurrent.Semaphore$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at java.util.concurrent.Semaphore.acquire(Semaphore.java:312)
at org.apache.hive.service.cli.session.HiveSessionImpl.acquire(HiveSessionImpl.java:315)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:471)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:466)
at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:510)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1377)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1362)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.thrift.server.TServlet.doPost(TServlet.java:83)
at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doPost(ThriftHttpServlet.java:206)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:565)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:479)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:225)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1031)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:406)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:965)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
at org.eclipse.jetty.server.Server.handle(Server.java:349)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:449)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:925)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:857)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:76)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:609)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:45)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
So how do we analyze the call stack above?

The key line is java.util.concurrent.Semaphore.acquire. The Semaphore is acting as a lock here: the thread above is blocked trying to acquire it, and because it waits too long, the query is delayed.
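The two checks from the numbered steps (a TServlet.doPost frame plus a "parking to wait for" line) can be applied to a whole dump mechanically. Below is a minimal sketch, not a full jstack parser. The class name JstackScan, the method findParkedQueryThreads, and the second thread "idle-worker" in the sample are made up for illustration; the sample dump is heavily shortened.

```java
import java.util.ArrayList;
import java.util.List;

class JstackScan {
    /**
     * Returns the header line of every thread block that contains both a
     * TServlet.doPost frame and a "parking to wait for" line.
     * Assumes each thread block starts with a line beginning with a double quote.
     */
    static List<String> findParkedQueryThreads(String dump) {
        List<String> hits = new ArrayList<>();
        String header = null;
        boolean isQuery = false;
        boolean isParked = false;
        for (String raw : dump.split("\\R")) {
            String line = raw.trim();
            if (line.startsWith("\"")) {
                // A new thread block begins: flush the previous one first.
                if (header != null && isQuery && isParked) {
                    hits.add(header);
                }
                header = line;
                isQuery = false;
                isParked = false;
            } else if (line.contains("org.apache.thrift.server.TServlet.doPost")) {
                isQuery = true;
            } else if (line.contains("parking to wait for")) {
                isParked = true;
            }
        }
        // Flush the last thread block in the dump.
        if (header != null && isQuery && isParked) {
            hits.add(header);
        }
        return hits;
    }

    public static void main(String[] args) {
        // Shortened sample; "idle-worker" is a hypothetical thread.
        String dump = String.join("\n",
            "\"HiveServer2-HttpHandler-Pool: Thread-17151\"",
            "  java.lang.Thread.State: WAITING (parking)",
            "  at sun.misc.Unsafe.park(Native Method)",
            "  - parking to wait for  <0x00007fc009bcabf8> (a java.util.concurrent.Semaphore$NonfairSync)",
            "  at java.util.concurrent.Semaphore.acquire(Semaphore.java:312)",
            "  at org.apache.thrift.server.TServlet.doPost(TServlet.java:83)",
            "",
            "\"idle-worker\"",
            "  java.lang.Thread.State: RUNNABLE",
            "  at java.lang.Thread.run(Thread.java:748)");
        for (String h : findParkedQueryThreads(dump)) {
            System.out.println(h);
        }
    }
}
```

With the sample above, only the HiveServer2-HttpHandler-Pool thread is reported. On a real dump you would read the file into a string and pass it to the same method.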

Next, let's look at the source code behind org.apache.hive.service.cli.session.HiveSessionImpl.acquire(HiveSessionImpl.java:315).

HiveSessionImpl.java - https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java

 protected void acquire(boolean userAccess, boolean isOperation) {
   if (isOperation && operationLock != null) {
     try {
       // The thread in the dump is stuck on this call:
       // at java.util.concurrent.Semaphore.acquire(Semaphore.java:312)
       operationLock.acquire();
     } catch (InterruptedException e) {
       Thread.currentThread().interrupt();
       throw new RuntimeException(e);
     }
   }
   // ... (rest of the method not shown in this excerpt)
 }

operationLock is declared as private final Semaphore operationLock, and that explains it. The java.util.concurrent.Semaphore.acquire frame in the call stack is simply operationLock.acquire() being executed; operationLock is the Semaphore the thread is waiting on.
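To see why a single-permit Semaphore behaves like a lock, here is a small standalone sketch. It is not Hive code; the class name SinglePermitDemo is made up, and it uses tryAcquire() instead of the blocking acquire() so that the demo finishes instead of parking the thread the way the HiveServer2 thread does.

```java
import java.util.concurrent.Semaphore;

class SinglePermitDemo {
    /** Returns { acquired while the permit is held, acquired after release }. */
    static boolean[] run() {
        Semaphore operationLock = new Semaphore(1);

        // First "operation" takes the only permit.
        operationLock.acquireUninterruptibly();

        // Second "operation" finds no permit. With acquire() it would park here;
        // tryAcquire() returns false immediately instead.
        boolean whileHeld = operationLock.tryAcquire();

        // First operation finishes and gives the permit back.
        operationLock.release();

        // Now the second operation can proceed.
        boolean afterRelease = operationLock.tryAcquire();

        return new boolean[] { whileHeld, afterRelease };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("acquired while first operation holds the permit: " + r[0]);
        System.out.println("acquired after release: " + r[1]);
    }
}
```

The first line prints false and the second prints true: whoever comes second has to wait until the permit is released.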

The check if (isOperation && operationLock != null) guards that call. Since the thread got as far as acquire(), isOperation && operationLock != null was clearly true, and that is why it went for the Semaphore.

So the next question is: how can we make isOperation && operationLock != null false, so that the thread never tries to take this lock?

Reading further in HiveSessionImpl.java:

this.operationLock = serverConf.getBoolVar(
ConfVars.HIVE_SERVER2_PARALLEL_OPS_IN_SESSION) ? null : new Semaphore(1);

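And there is the answer. HIVE_SERVER2_PARALLEL_OPS_IN_SESSION is a configuration parameter. Set it to true and operationLock is null, so isOperation && operationLock != null becomes false, operationLock.acquire() is never executed, the thread never waits for the lock, and the problem is solved!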
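The effect of the flag can be modeled in a few lines. This is a simplified sketch and not the real HiveSessionImpl: the class SessionLockModel and its methods tryStartOperation and finishOperation are made up, and tryAcquire() stands in for the blocking acquire() so the result can be printed. As far as I know the matching property name in hive-site.xml is hive.server2.parallel.ops.in.session, but please check that against your Hive version.

```java
import java.util.concurrent.Semaphore;

class SessionLockModel {
    private final Semaphore operationLock;

    SessionLockModel(boolean parallelOpsInSession) {
        // Same shape as the constructor line quoted from HiveSessionImpl.java.
        this.operationLock = parallelOpsInSession ? null : new Semaphore(1);
    }

    /** Returns true if an operation could start right now without waiting. */
    boolean tryStartOperation() {
        if (operationLock != null) {
            return operationLock.tryAcquire();
        }
        return true; // no lock at all: nothing to wait for
    }

    /** Marks one operation as finished and frees the permit, if there is one. */
    void finishOperation() {
        if (operationLock != null) {
            operationLock.release();
        }
    }

    public static void main(String[] args) {
        SessionLockModel serial = new SessionLockModel(false);
        System.out.println("flag=false, 1st operation starts: " + serial.tryStartOperation());
        System.out.println("flag=false, 2nd operation starts: " + serial.tryStartOperation());

        SessionLockModel parallel = new SessionLockModel(true);
        System.out.println("flag=true,  1st operation starts: " + parallel.tryStartOperation());
        System.out.println("flag=true,  2nd operation starts: " + parallel.tryStartOperation());
    }
}
```

With the flag off, the second operation in the same session cannot start until the first one finishes. With the flag on, both start right away, which is the behavior we want for the slow query above.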
