A Hands-On Example of Analyzing and Fixing a Java Memory Leak

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For the past few days I have been wrestling with a Java “memory leak”. The memory used by a Java application kept climbing in a steady, regular pattern until it finally crossed the monitoring threshold. It was time for Sherlock Holmes to step in!"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"General steps for analyzing a memory leak"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When a Java application shows signs of leaking memory, we usually analyze it with the following steps:"}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Dump the heap used by the Java application."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Use a Java heap analysis tool to find suspect objects that occupy more memory than expected (usually because there are too many of them)."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"If necessary, analyze the reference relationships between the suspect objects and other objects."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"Read the program's source code to find out why there are so many suspect objects."}]}]}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"dump heap"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If a Java application is leaking memory, do not rush to kill it; preserve the scene first. For an Internet-facing application, you can switch the traffic to other servers. The point of preserving the scene is to dump the heap of the running JVM."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The jmap tool that ships with the JDK can do this. It is invoked as follows:"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"jmap -dump:format=b,file=heap.bin <pid>
\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"format=b means that the dump file is written in binary format."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"file=heap.bin means that the dump file is named heap.bin."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"<pid> is the process ID of the JVM."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(On Linux) first run ps aux | grep java to find the pid of the JVM, then run jmap -dump:format=b,file=heap.bin <pid> to obtain the heap dump file."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"analyze heap"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Turning a binary heap dump into human-readable information naturally calls for a specialized tool. I recommend Memory Analyzer."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Memory Analyzer, MAT for short, is an open source project of the Eclipse Foundation, contributed by SAP and IBM. Software from the big vendors holds up well: MAT can analyze heaps containing hundreds of millions of objects, quickly compute the memory each object occupies, show the reference relationships between objects, and automatically detect leak suspects. It is powerful, and its interface is friendly and easy to use."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MAT's interface is built on Eclipse and ships in two forms: as an Eclipse plugin and as an Eclipse 
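RCP application. MAT presents its analysis results as charts and reports, which are clear at a glance. In short, I like this tool a lot. Here are two official screenshots:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e4/e44168001bda03170a543df5de032cd9.png","alt":"image.png","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/27/271c94a98fc85c28ebc2f0b4d5172396.png","alt":"image.png","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Back to the matter at hand. I opened heap.bin in MAT, and it was easy to see that there were unexpectedly many char[] instances, taking up more than 90% of the memory. Generally speaking, char[] does occupy a lot of memory in a JVM and is very numerous, because String objects use char[] as their internal storage. But this time the char[] instances were far too greedy: on closer inspection there were tens of thousands of them, each occupying several hundred KB. This means the Java program was holding on to tens of thousands of large String objects. Given the program's logic, that should not happen, so something had to be wrong somewhere."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Following the trail"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"I picked one of the suspicious char[] instances at random and used the Path To GC Roots feature to find its reference path, which showed that the String object was referenced from a HashMap. That was expected too: most Java memory leaks happen because objects are left behind in a global HashMap and never released. However, this HashMap serves as a cache with a threshold on the number of entries, and entries are evicted automatically once the threshold is reached. By that logic there should be no memory leak. Although the cache already held tens of thousands of String objects, it still had not reached the preset threshold (which was set fairly high, because we had estimated that the String objects would be small)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"But another question caught my attention: why were the cached String objects so huge? Their internal char[] arrays were hundreds of K long. Although the number of String objects in the cache had not reached the threshold, their size far exceeded our expectations, which ultimately consumed a large amount of memory and produced the symptoms of a memory leak (more precisely, excessive memory consumption)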
."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Following this trail further, I looked at how the large String objects ended up in the HashMap. Reading the source code, I found that there were indeed large String objects, but they were not put into the HashMap themselves. Instead, each large String was split (by calling String.split), and the small String objects produced by the split were put into the HashMap."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That was strange. What went into the HashMap were clearly the small String objects produced by the split, so how could they take up so much space? Could something be wrong with the split method of the String class?"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Reading the code"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With that question in mind, I read the code of the String class in Sun JDK 6, mainly the implementation of the split method:"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"public String[] split(String regex, int limit) {\n    return Pattern.compile(regex).split(this, limit);\n}\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As you can see, String.split calls Pattern.split. Let us continue with the code of Pattern.split:"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"public String[] split(CharSequence input, int limit) {\n    int index = 0;\n    boolean matchLimited = limit > 0;\n    ArrayList<String> matchList = new ArrayList<String>();\n    Matcher m = matcher(input);\n    // Add segments before each match found\n    while(m.find()) {\n        if (!matchLimited || matchList.size() < limit - 1) {\n            String match = input.subSequence(index, m.start()).toString();\n            matchList.add(match);\n            index = m.end();\n        } else if (matchList.size() == limit - 1) { // last one\n            String match = input.subSequence(index, input.length()).toString();\n            matchList.add(match);\n            index = m.end();\n        }\n    }\n    // If no match was found, return this\n    if (index == 0)\n        return new String[] {input.toString()};\n    // Add remaining segment\n    if (!matchLimited || matchList.size() < limit)\n
        matchList.add(input.subSequence(index, input.length()).toString());\n    // Construct result\n    int resultSize = matchList.size();\n    if (limit == 0)\n        while (resultSize > 0 && matchList.get(resultSize-1).equals(\"\"))\n            resultSize--;\n    String[] result = new String[resultSize];\n    return matchList.subList(0, resultSize).toArray(result);\n}\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Note the line that builds each segment: String match = input.subSequence(index, m.start()).toString();"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The match here is one of the small String objects produced by the split, and it is really the result of calling subSequence on the large String object. Let us look at the code of String.subSequence:"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"public CharSequence subSequence(int beginIndex, int endIndex) {\n    return this.substring(beginIndex, endIndex);\n}\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"String.subSequence in turn calls String.substring, so let us keep going:"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"public String substring(int beginIndex, int endIndex) {\n    if (beginIndex < 0) {\n        throw new StringIndexOutOfBoundsException(beginIndex);\n    }\n    if (endIndex > count) {\n        throw new StringIndexOutOfBoundsException(endIndex);\n    }\n    if (beginIndex > endIndex) {\n        throw new StringIndexOutOfBoundsException(endIndex - beginIndex);\n    }\n    return ((beginIndex == 0) && (endIndex == count)) ? this :\n        new String(offset + beginIndex, endIndex - beginIndex, value);\n}\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Lines 11 and 12 finally give us a clue: if the substring is exactly the whole original string, the original String object is returned; otherwise a new String object is created, but that new object appears to use the char[] of the original String object. The String constructor confirms this:"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"// Package private constructor which shares value array for speed.
\nString(int offset, int count, char value[]) {\n    this.value = value;\n    this.offset = offset;\n    this.count = count;\n}\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To avoid copying memory and to run faster, Sun JDK reuses the char[] of the original String object directly and uses an offset and a length to identify the different string contents. In other words, a small String object produced by substring still points to the char[] of the original large String object, and the same is true of split. This explains why the char[] of every String object in the HashMap was so large."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Explaining the cause"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The previous section already identified the cause; this section summarizes it:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The program obtains one large String object from each request, and the internal char[] of that object is hundreds of K long."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The program splits the large String object and puts the small String objects produced by the split into a HashMap that serves as a cache."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Sun JDK 6 optimizes String.split so that the String objects it produces directly reuse the char[] of the original String object."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Every String object in the HashMap therefore actually points to a huge char[]."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The HashMap's upper limit is in the tens of thousands of entries, so the total size of the cached String objects is tens of thousands × hundreds of KB = gigabytes."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Gigabytes of memory are taken up by the cache and a great deal of memory is wasted, which produces the symptoms of a memory leak."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"The fix"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With the cause identified, the fix follows. We still need split, but instead of putting the String objects it produces directly into the HashMap, we first pass each one through the String copy constructor String(String 
original), which is safe, as the code shows:"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"/**\n * Initializes a newly created {@code String} object so that it represents\n * the same sequence of characters as the argument; in other words, the\n * newly created string is a copy of the argument string. Unless an\n * explicit copy of {@code original} is needed, use of this constructor is\n * unnecessary since Strings are immutable.\n *\n * @param  original\n *         A {@code String}\n */\npublic String(String original) {\n    int size = original.count;\n    char[] originalValue = original.value;\n    char[] v;\n    if (originalValue.length > size) {\n        // The array representing the String is bigger than the new\n        // String itself.  Perhaps this constructor is being called\n        // in order to trim the baggage, so make a copy of the array.\n        int off = original.offset;\n        v = Arrays.copyOfRange(originalValue, off, off+size);\n    } else {\n        // The array representing the String is the same\n        // size as the String, so no point in making a copy.
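\n        v = originalValue;\n    }\n    this.offset = 0;\n    this.count = size;\n    this.value = v;\n}\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That said, new String(string) looks rather odd in code. Perhaps substring and split should offer an option that lets the programmer control whether the char[] of the original String object is reused."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Is it a bug?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Although the implementation of substring and split caused this problem, does that make it a bug in the String class? I find it hard to say. The optimization is quite reasonable, since the result of substring or split is always a contiguous subsequence of the original string. All one can say is that String is more than just a core class: to the JVM it is a type as important as the primitive types."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"It is understandable that a JDK implementation applies every optimization it can to String. But optimizations bring hazards of their own, and we programmers can only use them well if we understand them well enough."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"A few additions"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There is one point I did not explain clearly."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"My program is a web application. For every request it receives, it creates a large String object, splits it, and finally puts the String objects produced by the split into a global cache. If it has received 50,000 requests, there are 50,000 large String objects, and all 50,000 of them are kept alive by the global cache, which causes the memory leak. I thought I was caching 50,000 small Strings, but they all turned out to be large ones."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A reader later suggested solving the problem described here with \"java.io.StreamTokenizer\". That is indeed the ultimate solution, and much better than the “new 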
String()” approach I described above."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Source: https://club.perfma.com/article/1815828"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
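As a rough illustration of the fix described in the article (copying each piece produced by split before it goes into the cache), here is a minimal Java sketch. The class name CompactSplit, the helpers splitCompact and boundedCache, and the sample data are hypothetical and are not taken from the original program. On JDK 6 the copy constructor trims the shared char[]; from JDK 7u6 onward substring already copies, so the extra new String(...) is harmless but no longer needed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CompactSplit {

    /** Splits {@code big} and returns parts that do not pin its backing array. */
    public static String[] splitCompact(String big, String regex) {
        String[] parts = big.split(regex);
        for (int i = 0; i < parts.length; i++) {
            // On JDK 6 the copy constructor allocates a right-sized char[]
            // when the source string shares a larger array.
            parts[i] = new String(parts[i]);
        }
        return parts;
    }

    /** Hypothetical bounded cache: drops the eldest entry once maxEntries is exceeded. */
    public static <K, V> Map<K, V> boundedCache(final int maxEntries) {
        return new LinkedHashMap<K, V>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        // Simulate one large request body: a short id followed by a big payload.
        StringBuilder sb = new StringBuilder("id-42,");
        for (int i = 0; i < 200000; i++) {
            sb.append('x');
        }
        String big = sb.toString();

        Map<String, String> cache = boundedCache(2);
        String[] parts = splitCompact(big, ",");
        cache.put("request-1", parts[0]); // only the 5-character id is retained
        System.out.println(cache.get("request-1") + " length=" + parts[0].length());
    }
}
```

Running main prints `id-42 length=5`. The entry-count limit alone does not bound memory, which is why the sketch also makes sure each cached value is small.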