Common Misconceptions in Java Thread Pool Configuration: How Many Have You Fallen For?

{"type":"doc","content":[
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Preface","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Creating and destroying threads are relatively heavyweight operations for the operating system, so thread pooling is practiced in nearly every language. Thread pools are an important part of Java as well. Thanks to Doug Lea's implementation they are very convenient to use, but if we do not understand how they work internally, it is easy to misread their configuration parameters.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Technical books and blog posts often describe what a thread pool does when a task is submitted as follows:","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"1. When a task is submitted, the pool first checks whether the number of running threads has reached the core pool size; if not, it creates a new thread.","attrs":{}}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2. If the number of running threads has already reached the core pool size, the task is placed in the BlockingQueue.","attrs":{}}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"3. If the BlockingQueue is full, the pool tries to grow the number of threads up to the maximum pool size.","attrs":{}}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"4. If the number of threads has already reached the maximum pool size, the rejection policy is applied and the submission is rejected. The flow is shown in the figure below (taken from the Meituan tech blog):","attrs":{}}]}]}
],"attrs":{}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/70/705649a21d5143fb030d4806fc8d6a09.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This description is not wrong, but some of its points are easy to misread if they are not examined closely, and the scenario it describes is idealized. If the runtime environment is ignored during configuration, some very strange problems can appear.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Core pool","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"I call the part of the pool made up of the first coreSize threads the core pool. It is the resident part of the thread pool: its threads are normally not destroyed, and the vast majority of submitted tasks should be executed by them.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Misunderstanding when threads are created","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The most common misconception about the core pool concerns when its threads are created. I think it is fair to put 10% of the blame on Doug Lea, because he wrote in the documentation “If fewer than corePoolSize threads are running, try to start a new thread with the given command as its first task”. The word \"running\" is ambiguous here: we tend to read it as meaning that the thread has been scheduled by the operating system and holds a time slice, or that it is currently executing a task.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Based on that reading, it is easy to conclude that if the task QPS is very low, the number of threads in the pool will never reach coreSize. For example, with coreSize set to 1000, an actual QPS of 1, and each task taking 1s, the core pool would stay at 1 thread, and even with traffic jitter it would only grow to 3, because one thread executing one task per second is exactly enough to handle 1 QPS without creating new threads.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"How threads are created","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A simple test shows otherwise: print the thread stacks with jstack and count the threads in the pool, and you will see the count grow with every submitted task until it reaches coreSize. The core pool is designed to be a resident pool that carries everyday traffic, so it should be filled as quickly as possible. Until coreSize is reached, every submitted task therefore creates a new thread, even if existing threads are idle. The corresponding source code is:","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"public void execute(Runnable command) {\n    ...\n    int c = ctl.get();\n    if (workerCountOf(c) < corePoolSize) { // workerCountOf() returns the number of threads in the pool\n        if (addWorker(command, true))\n            return;\n        c = ctl.get();\n    }\n    ...\n}\n","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The \"running\" in the documentation likewise just means that the thread has been created. Once created, a thread sits in a while loop, trying to fetch tasks from the BlockingQueue and execute them, so calling it running is not a stretch.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This is why, when we warm up a high-concurrency service, JIT optimization of hot code by the JVM is not really the goal; warming up thread pools, connection pools, and local caches is what matters most.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"BlockingQueue","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The BlockingQueue is another important component of the thread pool. It is the intermediary in the pool's “producer-consumer” model, and it also buffers large bursts of traffic. It is, however, often misunderstood and misconfigured.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Execution model","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The most common mistake is not understanding the pool's execution model. The first thing to be clear about is that the pool has no precise dispatching capability: it cannot tell which threads are idle and hand submitted tasks to them.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The thread pool follows the “producer-consumer” pattern. Apart from the task that triggers the creation of a thread (that thread's firstTask), which never enters the BlockingQueue, every task must go into the BlockingQueue and wait to be consumed by a pool thread. Which thread ends up consuming a task depends entirely on operating system scheduling. The producer side looks like this:","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"public void execute(Runnable command) {\n    ...\n    if (isRunning(c) && workQueue.offer(command)) { // isRunning() checks whether the pool is in the RUNNING state\n        int recheck = ctl.get();\n        if (!isRunning(recheck) && remove(command))\n            reject(command);\n        else if (workerCountOf(recheck) == 0)\n            addWorker(null, false);\n    }\n    ...\n}\n","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The consumer side looks like this:","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"private Runnable getTask() {\n    for (;;) {\n        ...\n        Runnable r = timed ?\n            workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :\n            workQueue.take();\n        if (r != null)\n            return r;\n        ...\n    }\n}\n","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"The buffering role of the BlockingQueue","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Given the “producer-consumer” model, we might assume that a pool with enough consumers cannot run into trouble. That is not the case: we also have to take the level of concurrency into account.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Consider this scenario: 1000 tasks are submitted to the pool at the same time for concurrent execution. If the pool is already fully initialized, all of them must be placed in the BlockingQueue to wait for consumption. In the extreme case where the consumer threads have not completed a single task, all 1000 requests have to sit in the BlockingQueue at once, and if the configured BlockingQueue size is below 1000, the excess requests are rejected.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How likely is this extreme case? Very likely. The operating system gives I/O threads a very high scheduling priority, and our tasks usually start with I/O becoming ready or completing (for example, Tomcat accepting an HTTP request). So it is quite possible that the threads being scheduled are mostly Tomcat threads, which keep submitting requests to the pool while the consumer threads do not get scheduled, and requests pile up.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A service I am responsible for ran into exactly this kind of unexpected rejection. In a load test at 2000 QPS with an average response time of 20ms, 40 threads (2000 * 0.02s) should be enough to keep up with the production rate without any backlog. Yet with a BlockingQueue size of 50, requests were still rejected by the pool even though coreSize was 1000.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this situation the value of the BlockingQueue is that it is a container that can hold tasks for a long time and gives the pool a buffer at very low cost. As shown above, the pool can absorb up to BlockingQueue size tasks submitted at the same moment. We call the maximum number of simultaneously submitted tasks the concurrency, and knowing it is critical when configuring a thread pool.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Calculating concurrency","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We usually measure service load in QPS, so thread pool parameters are often derived from it. But QPS and concurrency are sometimes only loosely correlated, and QPS has to be combined with task execution time to estimate peak concurrency.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Take an interface whose requests arrive at strictly regular intervals with an average QPS of 1000. What is its peak concurrency? From QPS alone we cannot tell: if each task takes 1ms, the concurrency is only 1, whereas if each task takes 1s, peak concurrency is 1000.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Does knowing the execution time let us calculate concurrency, then? Still no. If request intervals are uneven, a whole minute's worth of requests may arrive within a single second, in which case the concurrency has to be multiplied by 60. That is why, even with QPS and execution time known, concurrency can only be estimated.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"My rule of thumb for concurrency is QPS * average response time, doubled to leave headroom. If the business is important, there is no harm in making the BlockingQueue size larger (1000 or more), since each queued task uses very little memory.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Consider the runtime","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"GC","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Besides the situations above, GC is another important factor. We all know that GC is stop-the-world, but the \"world\" here is the JVM. Preparing and completing the I/O for a request is done by the operating system, so while the JVM is paused the OS keeps accepting requests, and they are processed once the JVM resumes. GC therefore causes requests to pile up.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The concurrency calculation above must account for the requests that accumulate during a GC pause all being handled at once. The backlog can be estimated simply as QPS * GC pause time, and again, remember to leave headroom.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Business peaks","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Beyond that, thread pool parameters must reflect the business scenario. If most of an interface's traffic comes from a scheduled job, average QPS is meaningless, and the BlockingQueue size should be set to a larger value. If traffic is very uneven, with high load during only a short period of the day, and thread resources are tight, leave generous headroom in the pool's maxSize. When traffic spikes are pronounced but response time is not very sensitive, a larger BlockingQueue can also be used so that tasks are allowed to queue up to some extent.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Of course, beyond experience and calculation, regular load testing is undoubtedly the better way to understand how the service really behaves.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Summary","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"My biggest takeaway from working through thread pool configuration is: read the source code! Read the source code! Read the source code! Summaries in books and articles are not enough to fully grasp some important concepts, and even if you understand most of it, it is easy to get caught out in the corners. Only with a deep understanding of the principles can you configure flexibly when facing complex situations.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}
]}
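The three main points of the article can be checked with a small sketch: threads are created on every submission until coreSize is reached, a burst larger than the queue is rejected when consumers make no progress, and queue capacity can be estimated from QPS. The class and method names below (ThreadPoolPitfalls, poolSizesUnderLowLoad, rejectedInBurst, suggestedQueueCapacity) are hypothetical and do not come from the article. Adding the GC backlog on top of the doubled steady-state concurrency is only one possible reading of the rule of thumb, not a formula the article states. The latch that blocks the workers merely models the limit case in which consumer threads never get scheduled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ThreadPoolPitfalls {

    /** Submits tasks strictly one at a time and records the pool size after each one. */
    static List<Integer> poolSizesUnderLowLoad(int corePoolSize, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                corePoolSize, corePoolSize, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(16));
        List<Integer> sizes = new ArrayList<>();
        try {
            for (int i = 0; i < tasks; i++) {
                pool.submit(() -> { }).get();   // wait, so at most one task is ever in flight
                sizes.add(pool.getPoolSize());  // still grows by one until corePoolSize is reached
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return sizes;
    }

    /** Blocks every worker, submits a burst, and returns how many tasks were rejected. */
    static int rejectedInBurst(int threads, int queueCapacity, int burst) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(queueCapacity));
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < burst; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await();    // consumer makes no progress during the burst
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;                 // default AbortPolicy: queue full, pool at max size
                }
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return rejected;
    }

    /** Assumed combination: 2 * (QPS * avg response time) + QPS * GC pause. */
    static long suggestedQueueCapacity(long qps, long avgResponseMillis, long gcPauseMillis) {
        long steadyConcurrency = qps * avgResponseMillis / 1000; // e.g. 2000 QPS * 20 ms = 40
        long gcBacklog = qps * gcPauseMillis / 1000;             // requests piling up during a pause
        return 2 * steadyConcurrency + gcBacklog;
    }

    public static void main(String[] args) {
        System.out.println(poolSizesUnderLowLoad(5, 7));           // [1, 2, 3, 4, 5, 5, 5]
        System.out.println(rejectedInBurst(2, 3, 10));             // 5
        System.out.println(suggestedQueueCapacity(2000, 20, 100)); // 280
    }
}
```

In rejectedInBurst the first two tasks become the firstTask of the two workers, the next three fill the queue, and the remaining five are rejected, which is why the accepted burst is bounded by the thread count plus the queue capacity. The 100 ms GC pause passed to suggestedQueueCapacity is an illustrative value, not a measurement from the article.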