Memory Management Design Essentials

{"type":"doc","content":[{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"System Design Essentials is a series of articles that studies system design methods in depth. Each article analyzes both the theory of system design and concrete implementations in several real-world scenarios. The series is updated quarterly or half-yearly; if there is a topic you would like to learn about, leave a comment below the article."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Persistent disk storage is no longer a scarce resource today, but CPU and memory are still relatively expensive. In "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/system-design-scheduler\/","title":null,"type":null},"content":[{"type":"text","text":"Scheduling System Design Essentials"}]},{"type":"text","text":", the author described the strategies and principles that operating systems and programming languages use to schedule CPU resources. This article covers how another commonly scarce resource in computers, memory, is managed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/da\/daba08fda14c91dfe4894e4026ea86c6.png","alt":"system-design-and-memory-management","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 1 - 
Memory system design essentials"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Memory management systems and modules play an important role in both operating systems and programming languages. Using any resource involves two actions, acquiring and releasing it, and the two corresponding processes in memory management are memory allocation and garbage collection. The core goal of a memory management system is to serve as many programs or modules as possible with limited memory."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ed\/ed1a73a2ea165154af51ca80132da421.png","alt":"table-of-content","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 2 - Outline and contents of the article"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Although most systems split memory management into several complex modules and add intermediate layers for caching and translation, a memory management system can always be reduced to two modules: the memory allocator (Allocator) and the garbage collector (Collector). Studies of memory management also introduce a third module, the user program (Mutator), which helps us understand the workflow of the whole system."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/af\/aff17b9fb289056225caf688cbcd88a2.png","alt":"mutator-allocator-collector","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 3 - Modules of a memory management system"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"User program (Mutator): 
creates objects through the allocator and updates the pointers that objects hold;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Memory allocator (Allocator): handles memory allocation requests from the user program;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Garbage collector (Collector): marks objects in memory and reclaims memory that is no longer needed;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These three modules are the core of a memory management system; together they keep memory in a relatively balanced state while the application runs, and our discussion of memory management is organized around them. This section covers the theory of memory management from three angles: basic concepts, memory allocation, and garbage collection."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Basic concepts"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This part introduces the basic questions of memory management. We briefly cover the memory layout of an application, the common concepts involved in memory management, and the broad categories of memory management approaches, to give readers a top-level view of the topic."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Memory layout"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The operating system gives each application running on it a huge region of virtual memory. Note that, unlike main memory and physical memory, virtual memory does not physically exist; it is a logical concept constructed by the operating system. An application's memory is generally divided into the following regions:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/1b\/1b7db8556459675add2a0586a51505b7.png","alt":"memory-layout","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 4 - 
Memory layout"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Stack segment (Stack): stores local variables and function arguments during program execution, and grows from high addresses toward low addresses;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Heap segment (Heap): the region for dynamic memory allocation, managed through "},{"type":"codeinline","content":[{"type":"text","text":"malloc"}]},{"type":"text","text":", "},{"type":"codeinline","content":[{"type":"text","text":"new"}]},{"type":"text","text":", "},{"type":"codeinline","content":[{"type":"text","text":"free"}]},{"type":"text","text":", "},{"type":"codeinline","content":[{"type":"text","text":"delete"}]},{"type":"text","text":" and similar functions and operators;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Uninitialized data segment (BSS): stores uninitialized global and static variables;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Data segment (Data): stores global and static variables that have predefined values in the source code;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Text segment (Text): stores the read-only executable code of the program, that is, machine instructions;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Although these five segments store different data, we can group them into three kinds of memory allocation: static memory, stack memory, and heap memory."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Static memory"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Static memory can be traced back as far as 1960 and the ALGOL 
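language"},{"type":"sup","content":[{"type":"text","text":"1"}]},{"type":"text","text":". The lifetime of a static variable can span the entire program. The layout of all static memory is fixed at compile time and no new static memory is allocated at run time. Because everything is determined at compile time, each variable gets a fixed-size region, and these fixed regions also mean that static memory cannot support recursive function calls:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/44\/44a3e953992802be120a7b6102029546.png","alt":"static-allocation-features","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 5 - Characteristics of static memory"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because the compiler can determine the addresses of static variables, they are the only variables in a program that can be addressed by absolute address. When the program is loaded into memory, static variables are stored directly in its BSS or data segment, and they are destroyed when the program exits. Given these properties, no run-time mechanism is needed to manage static memory."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Stack memory"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The stack is a common memory region in applications, and it manages its data in last-in, first-out order"},{"type":"sup","content":[{"type":"text","text":"2"}]},{"type":"text","text":". When the application calls a function, the function's arguments are pushed onto the top of the stack; when the function returns, the stack space used by that function is destroyed. The instructions that manage stack memory are all generated by the compiler, and the BP and SP 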
registers hold information about the current stack, so no involvement from the engineer is required; however, only data structures of fixed size can be allocated on the stack."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/a4\/a4022dcd8c2351f89cb9843c611d9251.png","alt":"stack-allocation-features","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 6 - Characteristics of stack memory"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because stack memory is released dynamically and linearly, it can support recursive function calls. However, allocating stack memory dynamically at run time also makes stack overflow possible: if a recursive function exceeds the program's stack memory limit, a stack overflow error occurs."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Heap memory"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Heap memory is also common in applications. Compared with stack memory, which is reclaimed automatically once execution leaves the function's scope, heap memory lets a callee return memory to its caller and offers more flexibility in allocation, but that flexibility also brings memory safety problems such as memory leaks and dangling pointers."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/0d\/0de5e31e7baaa1d721e9dda223b87834.png","alt":"heap-allocation-features","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 7 - 
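Characteristics of heap memory"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because heap memory is requested manually by the engineer, it must be released when it is no longer in use. If used memory is never released, it leaks and the program occupies more system memory; if it is released before use ends, the result is a dangerous dangling pointer, where other objects point to memory that the system has already reclaimed or reused. Although a process's memory can be divided into many regions, memory management usually refers to managing heap memory, that is, to solving the problems of memory leaks and dangling pointers."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Management approaches"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Memory management can be broadly divided into manual and automatic approaches. Manual management generally means that the engineer requests memory when needed through functions such as "},{"type":"codeinline","content":[{"type":"text","text":"malloc"}]},{"type":"text","text":" and releases it when it is no longer needed by calling functions such as "},{"type":"codeinline","content":[{"type":"text","text":"free"}]},{"type":"text","text":"; with automatic management, the programming language's memory management system does the work, in most cases without the engineer's involvement, and releases unused memory automatically."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ef\/ef06c61559d7c4a4b0135edd7dfd775a.png","alt":"memory-management-approaches","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 8 - Manual and automatic management"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Manual and automatic management are simply two different approaches; this part introduces each of them and the choices that different programming languages have made."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Manual management"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Manual memory management is a relatively traditional approach. C\/C++ 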
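and other systems programming languages do not include "},{"type":"text","marks":[{"type":"strong"}],"text":"narrowly defined"},{"type":"text","text":" automatic memory management, so engineers must request and release memory themselves. An ideal engineer who could determine exactly when to allocate and release memory would get better run-time performance from manual management without causing any memory safety problems. Such engineers rarely exist in reality, though: human factors (Human Factor) always introduce mistakes, and memory leaks and dangling pointers are among the most common errors in languages such as C\/C++. Manual memory management also takes up a great deal of the engineer's attention, since it often requires deciding whether an object belongs on the stack or the heap and when heap memory should be released. The maintenance cost is comparatively high, and this is the trade-off that has to be made."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Automatic management"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Automatic memory management is more or less standard in modern programming languages. Because the job of the memory management module is well defined, automatic management can be introduced at compile time or in the language runtime. The most common mechanism is garbage collection, and some languages also use automatic reference counting to help manage memory."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Automatic memory management saves engineers a great deal of time dealing with memory and lets them focus on core business logic, which improves development efficiency. In general it handles memory leaks and dangling pointers well, but it also adds overhead and affects the language's run-time performance."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Object header"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The object header is the key metadata for implementing automatic memory management; both the memory allocator and the garbage collector read it to obtain the information they need. When we request memory through functions such as "},{"type":"codeinline","content":[{"type":"text","text":"malloc"}]},{"type":"text","text":", the memory usually has to be aligned to the pointer size (4 bytes on 32-bit architectures and, on 64-bit architectures, 8 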
bytes). Besides the memory used for alignment, every object on the heap also needs a corresponding object header:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/97\/97f54cb6cecf1fbee7ed522a700a4cec.png","alt":"object-header","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 9 - Object header and object"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Different automatic memory management mechanisms store different information in the object header. Languages that use garbage collection store mark bits "},{"type":"codeinline","content":[{"type":"text","text":"MarkBit"}]},{"type":"text","text":"\/"},{"type":"codeinline","content":[{"type":"text","text":"MarkWord"}]},{"type":"text","text":", for example Java and Go; languages that use automatic reference counting store in the object header a reference count 
"},{"type":"codeinline","content":[{"type":"text","text":"RefCount"}]},{"type":"text","text":", for example Objective-C."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Programming languages typically store the object header together with the object. Because storing the header inline may hurt the locality of data access, some languages instead set aside a separate memory region for object headers and link header and object implicitly through memory addresses."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Memory allocation"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The memory allocator is an important component of the memory management system; its main responsibility is to handle memory requests from the user program. Although that responsibility is important, "},{"type":"text","marks":[{"type":"strong"}],"text":"allocating and using memory is in fact a process that increases the entropy of the system"},{"type":"text","text":", so the design and operation of a memory allocator are comparatively simple. Here we introduce the two types of memory allocator."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There are only two kinds of memory allocator: the linear allocator (Sequential Allocator) and the free-list allocator (Free-list Allocator). Every allocator used in a memory management mechanism is a variant of one of these two. Their designs are completely different, and so are their use cases and characteristics; we introduce the principles of each in turn."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Linear allocator"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Linear allocation (Bump 
Allocator) is an efficient way to allocate memory, but it has significant limitations. A linear allocator only needs to maintain a pointer to a particular position in memory. When the user program requests memory, the allocator checks the remaining free memory, returns the allocated region, and updates the pointer's position, that is, it moves the pointer shown in the figure below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/1e\/1ee0759d64c0ca3c745bc1cf3a6870bc.png","alt":"bump-allocator","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 10 - Linear allocator"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"From how a linear allocator works, we can infer that it runs fast and is simple to implement; however, it cannot reuse memory when memory is released. As the figure below shows, if allocated memory is reclaimed, the linear allocator cannot reuse the part shown in red:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/c8\/c8017a289cd9fb8af76c479cf7d7972d.png","alt":"bump-allocator-reclaim-memory","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 11 - Reclaiming memory with a linear allocator"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because of this property, a linear allocator needs to be paired with a suitable garbage collection algorithm. Algorithms such as mark-compact (Mark-Compact), copying collection (Copying GC), and generational collection (Generational GC) defragment live objects by copying them and periodically merge free memory, so the efficiency of the linear allocator can be used to improve allocation performance."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because a linear allocator has to be paired with a garbage collection algorithm that copies objects, C and C++ 
and other languages that expose pointers directly cannot use this strategy. The next section describes the design of common garbage collection algorithms in detail."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Free-list allocator"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A free-list allocator (Free-List Allocator) can reuse memory that has been released; internally it maintains a data structure similar to a linked list. When the user program requests memory, the allocator walks through the free blocks in order, finds one that is large enough, allocates the requested memory from it, and updates the list:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/e3\/e390a34ded14f0f787af9e9f66d5dab8.png","alt":"free-list-allocator","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 12 - Free-list allocator"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because the blocks are linked together in a list, an allocator of this kind can reuse reclaimed memory; but because allocation has to traverse the list, its time complexity is "},{"type":"codeinline","content":[{"type":"text","text":"O(n)"}]},{"type":"text","text":". A free-list allocator can use different strategies to choose among the blocks in the list; the four most common are:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First fit (First-Fit): traverse from the head of the list and choose the first block that is large enough for the request;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next fit (Next-Fit): 
traverse from where the previous search ended and choose the first block that is large enough for the request;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Best fit (Best-Fit): traverse the entire list from the head and choose the best-fitting block;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Segregated fit (Segregated-Fit): split memory into several lists, each holding blocks of the same size; on a request, first find the list that satisfies it, then choose a suitable block from that list;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We will not go into the first three strategies here. The allocation strategy used by Go is somewhat similar to the fourth, so let us look at how that strategy works using the figure below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/95\/95d6f6c5ff9418814b8febe2c6f937f4.png","alt":"segregated-list","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 13 - Segregated-fit strategy"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the figure shows, this strategy splits memory into lists made up of 4-, 8-, 16-, and 32-byte blocks. When we ask the allocator for 8 
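bytes of memory, it finds a free block in the second list in the figure and returns it. Segregated fit reduces the number of blocks that have to be traversed and so makes allocation more efficient."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Garbage collection"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Garbage collection is a form of automatic memory management"},{"type":"sup","content":[{"type":"text","text":"3"}]},{"type":"text","text":". The garbage collector is an important component of the memory management system: the allocator is responsible for allocating memory on the heap, while the collector releases objects that the user program no longer uses. When garbage collection comes up, many people first think of stopping the program (stop-the-world, STW) and garbage collection pauses (GC Pause). Garbage collection does cause STW, but that is not the whole story; this section covers the concepts and theory of garbage collection and garbage collectors in detail."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"What is garbage"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Before analyzing garbage collection in depth, we need to define what garbage means; a clear definition helps us understand more precisely which problem garbage collection solves and what it is responsible for. In computer science, garbage includes objects, data, and other memory regions in a computer system that will not be used in any future computation. Because memory is limited, the memory occupied by garbage has to be returned to the heap and reused later"},{"type":"sup","content":[{"type":"text","text":"4"}]},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/06\/0654166cae154dec99f00618671c2641.png","alt":"garbages","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 14 - Semantic and syntactic garbage"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Garbage falls into two kinds, semantic garbage and syntactic garbage. Semantic garbage (Semantic Garbage) consists of objects or data that the program will never access again; syntactic garbage (Syntactic 
Garbage) consists of objects or data in the program's memory space that cannot be reached (Unreachable) from the root objects."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Semantic garbage consists of objects that "},{"type":"text","marks":[{"type":"strong"}],"text":"will never be used"},{"type":"text","text":", which may include abandoned memory and unused variables. The garbage collector cannot deal with semantic garbage in a program; part of it has to be handled by the compiler. Syntactic garbage consists of objects that "},{"type":"text","marks":[{"type":"strong"}],"text":"cannot be reached from the root nodes"},{"type":"text","text":" in the object graph, so syntactic garbage is in general also semantic garbage:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/3c\/3cbad53f53ea0599029d806377d78859.png","alt":"syntactic-garbage","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 15 - Unreachable syntactic garbage"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"What the garbage collector can find and reclaim is the unreachable syntactic garbage in the object graph. By analyzing the references between objects, we can identify the objects that are unreachable from the root nodes, and these are reclaimed during the collector's sweep phase."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Collector performance"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Throughput (Throughput) and maximum pause time (Pause time) are the two main metrics for evaluating a garbage collector. Heap usage efficiency and locality of access are also commonly used. We briefly describe how each of these metrics bears on a garbage collector."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Throughput"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Garbage collector throughput has two interpretations. One is the speed of the collector while it runs, that is, how much memory it can mark and sweep per unit of time, which can be computed by dividing the heap size by the total GC 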
time."}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"HEAP_SIZE \/ TOTAL_GC_TIME\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The other way to compute throughput is to divide the total running time of the program by the total running time of all GC cycles. GC time is overhead for the application as a whole, so this ratio shows how large a share of resources that overhead consumes, which also reflects how efficiently the GC runs."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Maximum pause time"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Some phases of garbage collection trigger STW, during which the user program cannot run. The longest STW time seriously affects the tail latency of handling requests or providing service, so it is another metric to consider when measuring collector performance."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/a3\/a31d3835c6ca43c1bd0e1239ab65ee93.png","alt":"maximum-gc-pause","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 16 - Maximum pause time"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In languages that use an STW 
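garbage collector, the user program cannot run during any phase of garbage collection. A concurrent mark-sweep collector runs concurrently all the work that can run alongside the user program, which reduces the maximum pause time."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Heap usage efficiency"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Heap usage efficiency is another important metric. To identify garbage, object headers carrying specific information have to be added to memory, and these headers are overhead introduced by the garbage collector. Just as network bandwidth does not equal the final download speed, because protocol headers and checksums consume some of it, the size of object headers affects how efficiently heap memory is used. Fragmentation that appears as the heap is used also reduces efficiency: to keep memory aligned, many gaps are left in memory, and these gaps are likewise overhead of memory management."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Locality of access"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Locality of access is a topic that any discussion of memory management has to cover. Spatial locality means that within a short period the processor tends to access the same or adjacent memory regions repeatedly. The operating system manages memory in units of pages; ideally, a sensible memory layout lets both the garbage collector and the application take full advantage of spatial locality to run more efficiently."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Collector types"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Garbage collectors can broadly be divided into direct (Direct) collectors and tracing (Tracing) collectors. Direct collectors include reference counting (Reference-Counting); tracing collectors include strategies such as mark-sweep, mark-compact, and copying collection. Reference counting collectors are not especially common, and only a few programming languages manage memory this way."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/c6\/c62e6cc1e3af8ef3cc1911ea761c6a21.png","alt":"garbage-collector-types","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 17 - 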
Garbage collector types"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Besides direct and tracing collectors, which are the relatively common methods, memory can also be managed through ownership or manually. This part introduces the design of four types of garbage collector, reference counting, mark-sweep, mark-compact, and copying collection, along with their strengths and weaknesses."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Reference counting"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A reference counting collector is a direct collector. Whenever references between objects change, the reference counts of the objects involved are updated. Each object's count records how many objects currently point to it, and when the count drops to zero the object is released automatically. In languages that use reference counting, garbage collection happens in real time while the user program runs, so in theory there is no STW and no noticeable garbage collection pause."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/c1\/c1c084a388b6be9e7306db03bf22430a.png","alt":"reference-counting","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 18 - Reference counts of objects"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the figure shows, a reference counting collector requires the application to store a reference count in the object header; that count is the extra memory overhead this type of collector introduces. Here is an example of how reference counting works. When a language that uses a reference counting collector executes the following assignment:"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"obj.field = new_ref;\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"The object previously referenced by "},{"type":"codeinline","content":[{"type":"text","text":"obj"}]},{"type":"text","text":", namely 
"},{"type":"codeinline","content":[{"type":"text","text":"old_ref"}]},{"type":"text","text":", has its reference count "},{"type":"text","marks":[{"type":"strong"}],"text":"decremented by one"},{"type":"text","text":";"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"The new object referenced by "},{"type":"codeinline","content":[{"type":"text","text":"obj"}]},{"type":"text","text":", namely "},{"type":"codeinline","content":[{"type":"text","text":"new_ref"}]},{"type":"text","text":", has its reference count "},{"type":"text","marks":[{"type":"strong"}],"text":"incremented by one"},{"type":"text","text":";"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"If the reference count of "},{"type":"codeinline","content":[{"type":"text","text":"old_ref"}]},{"type":"text","text":" drops to zero, the object is released and its memory reclaimed;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This type of collector has two fairly common problems: recursive object reclamation and circular references:"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Recursive reclamation: every time an object's references change, its new reference count has to be computed. Once an object is released, all of its references have to be visited recursively and the counter of each referenced object decremented; when many objects are involved this can cause a GC pause;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Circular references: 
對象的相互引用在對象圖中也非常常見,如果對象之間的引用都是強引用,循環引用會導致多個對象的計數器都不會歸零,最終會造成內存泄漏;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"遞歸回收是使用引用計數時不得不面對的問題,我們很難在工程上解決該問題;不過使用引用計數的編程語言卻可以利用弱引用來解決循環引用的問題,弱引用也是對象之間的引用關係,"},{"type":"text","marks":[{"type":"strong"}],"text":"建立和銷燬弱引用關係都不會修改雙方的引用計數"},{"type":"text","text":",這就能避免對象之間形成強引用的循環,不過這也需要工程師對引用關係作出額外的並且正確的判斷。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/12\/1225d7409a0b41c0f2ac193e619b04ba.png","alt":"strong-and-weak-reference","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 19 - 強引用與弱引用"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"除了弱引用之外,一些編程語言也會在引用計數的基礎上加入標記清除技術,通過從根對象遍歷並標記堆中仍然存活的對象,回收未被標記的對象來解決循環引用的問題。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"引用計數垃圾收集器是一種非移動(Non-moving)的垃圾回收策略,它在回收內存的過程中不會移動已有的對象,很多編程語言都會對工程師直接暴露內存的指針,所以 C、C++ 以及 Objective-C 
等編程語言其實都可以使用引用計數來解決內存管理的問題。"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"標記清除"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"標記清除(Mark-Sweep)是最簡單也最常見的垃圾收集策略,它的執行過程可以分成"},{"type":"text","marks":[{"type":"strong"}],"text":"標記"},{"type":"text","text":"和"},{"type":"text","marks":[{"type":"strong"}],"text":"清除"},{"type":"text","text":"兩個階段,標記階段會使用深度優先或者廣度優先算法掃描堆中的存活對象,而清除階段會回收內存中的垃圾。當我們使用該策略回收垃圾時,它會首先從根節點出發沿着對象的引用遍歷堆中的全部對象,能夠被訪問到的對象是存活的對象,不能被訪問到的對象就是內存中的垃圾。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如下圖所示,內存空間中包含多個對象,我們從根對象出發依次遍歷對象的子對象並將從根節點可達的對象都標記成存活狀態,即 A、C 和 D 三個對象,剩餘的 B、E 和 F 三個對象因爲從根節點不可達,所以會被當做垃圾:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ac\/ac7eec250a997b9739e2cef64455150b.png","alt":"mark-sweep-mark-phase","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 20 - 標記清除的標記階段"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"標記階段結束後會進入清除階段,在該階段中收集器會依次遍歷堆中的所有對象,釋放其中沒有被標記的 B、E 和 F 
三個對象並將新的空閒內存空間以鏈表的結構串聯起來,方便內存分配器的使用。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/f5\/f53c8a3f8040a3cc41336d45c27c4740.png","alt":"mark-sweep-sweep-phase","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 21 - 標記清除的清除階段"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"使用標記清除算法的編程語言需要在對象頭中加入表示對象存活的標記位(Mark Bit),標記位與操作系統的寫時複製不兼容,因爲即使內存頁中的對象沒有被修改,垃圾收集器也會修改內存頁中對象相鄰的標記位導致內存頁的複製。我們可以使用位圖(Bitmap)標記避免這種情況,表示對象存活的標記與對象分別存儲,清理對象時也只需要遍歷位圖,能夠降低清理過程的額外開銷。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如上圖所示,使用標記清除算法的垃圾收集器一般會使用基於空閒鏈表的分配器,因爲對象在不被使用時會被就地回收,所以長時間運行的程序會出現很多內存碎片,這會降低內存分配器的分配效率,在實現上我們可以將空閒鏈表按照對象大小分成不同的區以減少內存中的碎片。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"標記清除策略是一種實現簡單的垃圾收集策略,但是它的內存碎片化問題也比較嚴重,簡單的內存回收策略也增加了內存分配的開銷和複雜度,當用戶程序申請內存時,我們也需要在內存中找到足夠大的塊分配內存。"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"標記壓縮"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"標記壓縮(Mark-Compact)也是比較常見的垃圾收集算法,與標記清除算法類似,標記壓縮的執行過程可以分成"},{"type":"text","marks":[{"type":"strong"}],"text":"標記"},{"type":"text","text":"和"},{"type":"text","marks":[{"type":"strong"}],"text":"壓縮"},{"type":"text","text":"兩個階段。該算法在標記階段也
會從根節點遍歷對象,查找並標記所有存活的對象;在壓縮階段,我們會將所有存活的對象緊密排列,『擠出』存活對象之間的縫隙:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/a6\/a6e9890a5d52e88c25901d000deb5bcc.png","alt":"mark-compact","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 22 - 標記壓縮算法"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"因爲在壓縮階段我們需要移動存活的對象,所以這是一種移動(Moving)收集器,如果編程語言向用戶程序直接暴露對象的內存地址(例如支持使用裸指針訪問對象),那麼我們就無法安全地使用該算法。標記的過程相對比較簡單,我們在這裏以 Lisp 2 壓縮算法爲例重點介紹該算法的壓縮階段:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"計算當前對象遷移後的最終位置並將位置存儲在轉發地址(Forwarding 
Address)中;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"根據當前對象子對象的轉發地址,將引用指向新的位置;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"將所有存活的對象移動到對象頭中轉發地址的位置;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"從上述過程我們可以看出,使用標記壓縮算法的編程語言不僅要在對象頭中存儲標記位,還需要存儲當前對象的轉發地址,這增加了對象在內存中的額外開銷。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"標記壓縮算法的實現比較複雜,在執行的過程中需要遍歷三次堆中的對象,作爲 moving 的垃圾收集器,它不適用於 C、C++ 等編程語言;壓縮算法的引入可以減少程序中的內存碎片,我們可以直接使用最簡單的線性分配器爲用戶程序快速分配內存。"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"複製垃圾回收"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"複製垃圾回收(Copying GC)也是跟蹤垃圾收集器的一種,它會將應用程序的堆分成兩個大小相等的區域,如下圖所示,其中左側區域負責爲用戶程序分配內存空間,而右側區域用於垃圾回收。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/3c\/3c797f5d230bf5f7648407e00a985514.png","alt":"copying-gc","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 23 - 
複製垃圾回收"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當用戶程序使用的內存超過上圖中的左側區域就會出現內存不足(Out-of-memory、OOM),垃圾收集器在這時會開啓新的垃圾收集循環,複製垃圾回收的執行過程可以分成以下的四個階段:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"複製階段 — 從 GC 根節點出發遍歷內存中的對象,將發現的存活對象遷移到右側的內存中;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"轉發階段 — 在原始對象的對象頭或者在原位置設置新對象的轉發地址(Forwarding Address),如果其他對象引用了該對象可以從轉發地址轉到新的地址;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"修復指針 — 遍歷當前對象持有的引用,如果引用指向了左側堆中的對象,回到第一步遷移發現的新對象;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"交換階段 — 當內存中不存在需要遷移的對象之後,交換左右兩側的內存區域;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/fb\/fb3b868335a6cd772d778d7bde49821a.png","alt":"copying-gc-copy-phase","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 24 - 
複製垃圾回收的複製階段"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如上圖所示,當我們把 A 對象複製到右側的區域後,會將原始的 A 對象指向新的 A 對象,這樣其他引用 A 的對象可以快速找到它的新地址;因爲 A 對象的複製是『像素級複製』,所以 A 對象仍然會指向左側內存的 C 對象,這時需要將 C 對象複製到新的內存區域並修改 A 對象的指針。在最後,當不存在需要拷貝的對象時,我們可以直接交換兩個內存區域的指針。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"複製垃圾回收與標記壓縮算法一樣都會拷貝對象,能夠減少程序中的內存碎片,我們可以使用線性的分配器快速爲用戶程序分配內存。因爲只需要掃描一半的堆,遍歷堆的次數也會減少,所以可以減少垃圾回收的時間,但是這也會降低內存的利用率。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"高級垃圾回收"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"內存管理是一個相對比較大的話題,我們在上一小節介紹了垃圾回收的一些基本概念,其中包括常見的垃圾回收算法:引用計數、標記清除、標記壓縮和複製垃圾回收,這些算法都是比較基本的垃圾回收算法,我們在這一節中將詳細介紹一些高級的垃圾回收算法,它們會利用基本的垃圾回收算法和新的數據結構構建更復雜的收集器。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"分代垃圾收集器"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"分代垃圾回收(Generational garbage collection)是在生產環境中比較常見的垃圾收集算法,該算法主要建立在弱分代假設(Weak Generational Hypothesis)上 —— 
大多數的對象會在生成後馬上變成垃圾,只有極少數的對象可以存活很久"},{"type":"sup","content":[{"type":"text","text":"5"}]},{"type":"text","text":"。根據該經驗,分代垃圾回收會把堆中的對象分成多個代,不同代垃圾回收的觸發條件和算法都完全不同。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/d6\/d6e93d9c390a357cc06ec54ba993b867.png","alt":"young-and-old-generation","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 25 - 青年代和老年代"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"常見的分代垃圾回收會將堆分成青年代(Young、Eden)和老年代(Old、Tenured),所有的對象在剛剛初始化時都會進入青年代,而青年代觸發 GC 的頻率也更高;而老年代的對象 GC 頻率相對比較低,只有青年代的對象經過多輪 GC 沒有被釋放纔可能被晉升(Promotion)到老年代,晉升的過程與複製垃圾回收算法的執行過程相差無幾。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"青年代的垃圾回收被稱作是 Minor GC 循環,而老年代的垃圾回收被稱作 Major GC 循環,Full GC 循環一般是指整個堆的垃圾回收,需要注意的是很多時候我們都會混淆 Major GC 循環和 Full GC 循環,在討論時一定要先搞清楚雙方對這些名詞的理解是否一致。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"青年代的垃圾回收只會掃描整個堆的一部分,這能夠減少一次垃圾回收需要的掃描的堆大小和程序的暫停時間,提高垃圾回收的吞吐量。然而分代也爲垃圾回收引入了複雜度,其中最常見的問題是"},{"type":"text","marks":[{"type":"italic"}],"text":"跨代引用(Intergenerational Pointer)"},{"type":"text","text":",即老年代引用了青年代的對象,如果堆中存在跨代引用,那麼在 Minor GC 
循環中我們不僅應該遍歷垃圾回收的根對象,還需要從包含跨代引用的對象出發標記青年代中的對象。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/c9\/c99e5ff022e0ff4bf9e0d4845d9f0d11.png","alt":"intergenerational-pointer","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 26 - 跨代引用"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"爲了處理分代垃圾回收的跨代引用,我們需要解決兩個問題,分別是如何"},{"type":"text","marks":[{"type":"strong"}],"text":"識別"},{"type":"text","text":"堆中的跨代引用以及如何"},{"type":"text","marks":[{"type":"strong"}],"text":"存儲"},{"type":"text","text":"識別的跨代引用,在通常情況下我們會使用"},{"type":"text","marks":[{"type":"italic"}],"text":"寫屏障(Write Barrier)"},{"type":"text","text":"識別跨代引用並使用"},{"type":"text","marks":[{"type":"italic"}],"text":"卡表(Card Table)"},{"type":"text","text":"存儲相關的數據。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"注意:卡表只是標記或者存儲跨代引用的一種方式,除了卡表我們也可以使用記錄集(Remembered 
Set)存儲跨代引用的老年代對象或者使用頁面標記按照操作系統內存頁的維度標記老年代的對象。"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"寫屏障是當對象之間的指針發生改變時調用的代碼片段,這段代碼會判斷該指針是不是從老年代對象指向青年代對象的跨代引用。如果該指針是跨代引用,我們會在如下所示的卡表中標記老年代對象所在的區域:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/77\/77644d4178464aa5ddbcefcb6c9a9fa6.png","alt":"card-table","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 27 - 卡表"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"卡表與位圖比較相似,它也由一系列的比特位組成,其中每一個比特位都對應着老年代中的一塊內存,如果該內存中的對象存在指向青年代對象的指針,那麼這塊內存在卡表中就會被標記,當觸發 Minor GC 循環時,除了從根對象遍歷青年代堆之外,我們還會從卡表標記區域內的全部老年代對象開始遍歷青年代。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"分代垃圾回收基於弱分代假設,結合了複製垃圾回收、寫屏障以及卡表等技術,將內存中的堆區分割成了青年代和老年代等區域,爲不同的代使用不同的內存分配和垃圾回收算法,可以有效地減少 GC 循環遍歷的堆大小和處理時間,但是寫屏障技術也會帶來額外開銷,移動收集器的特性也使它無法在 C、C++ 等編程語言中使用,在部分場景下弱分代假設不一定會成立,如果大多數的對象都會活得很久,那麼使用分代垃圾回收可能會起到反效果。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"標記區域收集器"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"標記區域收集器(Mark-Region Garbage Collector)是 2008 年提出的垃圾收集算法"},{"type":"sup","content":[{"type":"text","text":"6"}]},{"type":"text","text":",這個算法也被稱作混合垃圾回收(Immix 
GC),它結合了標記清除和複製垃圾回收算法,我們使用前者來追蹤堆中的存活對象,使用後者減少內存中存在的碎片。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/35\/35f052ea20b5b3cd0f12eb623001f7f6.png","alt":"mark-region-garbage-collector","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 28 - 標記區域收集器"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Immix 垃圾回收算法包含兩個組件,分別是用於標記區域的收集器和去碎片化機制"},{"type":"sup","content":[{"type":"text","text":"7"}]},{"type":"text","text":"。標記區域收集器與標記清除收集器比較類似,它將堆內存拆分成特定大小的內存塊,再將所有的內存塊拆分成特定大小的線。當用戶程序申請內存時,它會在上述內存塊中查找空閒的線並使用線性分配器快速分配內存;通過引入粗粒度的內存塊和細粒度的線,可以更好地控制內存的分配和釋放。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/2f\/2f868794367ad2c542d96d807a7846f7.png","alt":"sequential-allocator-cursor","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 29 - 
線性分配器的光標"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"標記區域收集器與標記清除收集器比較類似,因爲它們不會移動對象,所以都會面臨內存碎片化的問題。如下圖所示,標記區域收集器在回收內存時都是以塊和線爲單位進行回收的,所以只要當前內存線中包含存活對象,收集器就會保留該片內存區域,這會帶來我們在上面提到的內存碎片。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Immix 引入的機會轉移(Opportunistic Evacuation)機制能夠有效地減少程序中的碎片化,當收集器在內存塊中遇到可以被轉移的對象,它就會使用複製垃圾回收算法將當前塊中的存活對象移動到新的塊中並釋放原塊中的內存。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"標記區域收集器將堆內存分成了粗粒度的內存塊和細粒度的內存線,結合了標記清除算法和複製垃圾回收幾種基本垃圾收集器的特性,既能夠提升垃圾收集器的吞吐量,還能夠利用線性分配器提高內存的分配速度,但是該收集器的實現相對比較複雜。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"增量併發收集器"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"相信很多人對垃圾收集器的印象都是暫停程序(Stop the world,STW),隨着用戶程序申請越來越多的內存,系統中的垃圾也逐漸增多;當程序的內存佔用達到一定閾值時,整個應用程序就會全部暫停,垃圾收集器會掃描已經分配的所有對象並回收不再使用的內存空間,當這個過程結束後,用戶程序纔可以繼續執行。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"傳統的垃圾收集算法會在垃圾收集的執行期間暫停應用程序,一旦觸發垃圾收集,垃圾收集器就會搶佔 CPU 的使用權佔據大量的計算資源以完成標記和清除工作,然而很多追求實時的應用程序無法接受長時間的 
STW。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/0d\/0d40cdda98f49324ae68dc2f8df462bd.png","alt":"stop-the-world-collector","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 30 - 垃圾收集與暫停程序"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"遠古時代的計算資源還沒有今天這麼豐富,今天的計算機往往都是多核的處理器,垃圾收集器一旦開始執行就會浪費大量的計算資源,爲了減少應用程序暫停的最長時間和垃圾收集的總暫停時間,我們會使用下面的策略優化現代的垃圾收集器:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"增量垃圾收集 — 增量地標記和清除垃圾,降低應用程序暫停的最長時間;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"併發垃圾收集 — 
利用多核的計算資源,在用戶程序執行時併發標記和清除垃圾;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"因爲增量和併發兩種方式都可以與用戶程序交替運行,所以我們需要"},{"type":"text","marks":[{"type":"strong"}],"text":"使用屏障技術"},{"type":"text","text":"保證垃圾收集的正確性;與此同時,應用程序也不能等到內存溢出時觸發垃圾收集,因爲當內存不足時,應用程序已經無法分配內存,這與直接暫停程序沒有什麼區別,增量和併發的垃圾收集需要提前觸發並在內存不足前完成整個循環,避免程序的長時間暫停。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"增量式(Incremental)的垃圾收集是減少程序最長暫停時間的一種方案,它可以將原本時間較長的暫停時間切分成多個更小的 GC 時間片,雖然從垃圾收集開始到結束的時間更長了,但是這也減少了應用程序暫停的最大時間:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/07\/07f93b1992fe22cf7567b01803d191cd.png","alt":"incremental-collector","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 31 - 增量垃圾收集器"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"需要注意的是,增量式的垃圾收集需要與三色標記法一起使用,爲了保證垃圾收集的正確性,我們需要在垃圾收集開始前打開寫屏障,這樣用戶程序對內存的修改都會先經過寫屏障的處理,保證了堆內存中對象關係的強三色不變性或者弱三色不變性。雖然增量式的垃圾收集能夠減少最大的程序暫停時間,但是增量式收集也會增加一次 GC 
循環的總時間,在垃圾收集期間,因爲寫屏障的影響用戶程序也需要承擔額外的計算開銷,所以增量式的垃圾收集也不是隻有優點的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"併發(Concurrent)的垃圾收集不僅能夠減少程序的最長暫停時間,還能減少整個垃圾收集階段的時間,通過開啓讀寫屏障、"},{"type":"text","marks":[{"type":"strong"}],"text":"利用多核優勢與用戶程序並行執行"},{"type":"text","text":",併發垃圾收集器確實能夠減少垃圾收集對應用程序的影響:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/97\/9763112d58beeda265593ca3525dc935.png","alt":"concurrent-collector","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 32 - 併發垃圾收集器"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"雖然併發收集器能夠與用戶程序一起運行,但是並不是所有階段都可以與用戶程序一起運行,部分階段還是需要暫停用戶程序的,不過與傳統的算法相比,併發的垃圾收集可以將能夠併發執行的工作儘量併發執行;當然,因爲讀寫屏障的引入,併發的垃圾收集器也一定會帶來額外開銷,不僅會增加垃圾收集的總時間,還會影響用戶程序,這是我們在設計垃圾收集策略時必須要注意的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"但是因爲增量併發收集器的併發標記階段會與用戶程序一同或者交替運行,所以可能出現"},{"type":"text","marks":[{"type":"strong"}],"text":"標記爲垃圾的對象被用戶程序中的其他對象重新引用"},{"type":"text","text":",當垃圾回收的標記階段結束後,被錯誤標記爲垃圾的對象會被直接回收,這就會帶來非常嚴重的問題,想要解決增量併發收集器的這個問題,我們需要了解三色抽象和屏障技術。"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"三色抽象"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"爲了解決原始標記清除算法帶來的長時間 
STW,多數現代的追蹤式垃圾收集器都會實現三色標記算法的變種以縮短 STW 的時間。三色標記算法將程序中的對象分成白色、黑色和灰色三類"},{"type":"sup","content":[{"type":"text","text":"8"}]},{"type":"text","text":":"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"白色對象 — 潛在的垃圾,其內存可能會被垃圾收集器回收;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"黑色對象 — 活躍的對象,包括不存在任何引用外部指針的對象以及從根對象可達的對象;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"灰色對象 — 活躍的對象,因爲存在指向白色對象的外部指針,垃圾收集器會掃描這些對象的子對象;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/02\/02469b39daccd9774df7a19d819eb57d.png","alt":"tri-color-objects","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 33 - 
三色的對象"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在垃圾收集器開始工作時,程序中不存在任何的黑色對象,垃圾收集的根對象會被標記成灰色,垃圾收集器只會從灰色對象集合中取出對象開始掃描,當灰色集合中不存在任何對象時,標記階段就會結束。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/92\/9229323d893674e1817633f2c4deebc8.png","alt":"tri-color-mark-sweep","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 34 - 三色標記垃圾收集器的執行過程"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"三色標記垃圾收集器的工作原理很簡單,我們可以將其歸納成以下幾個步驟:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"從灰色對象的集合中選擇一個灰色對象並將其標記成黑色;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"將該黑色對象指向的所有白色對象都標記成灰色,保證該對象和被該對象引用的對象都不會被回收;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"重複上述兩個步驟直到對象圖中不存在灰色對象;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當三色的標記清除的標記階段結束之後,應用程序的堆中就不存在任何的灰色對象,我們只能看到黑色的存
活對象以及白色的垃圾對象,垃圾收集器可以回收這些白色的垃圾,下面是使用三色標記垃圾收集器執行標記後的堆內存,堆中只有對象 D 爲待回收的垃圾:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ad\/ad506c51e342ef7c0d8a4985116f14ff.png","alt":"tri-color-mark-sweep-after-mark-phase","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 35 - 三色標記後的堆"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"因爲用戶程序可能在標記執行的過程中修改對象的指針,所以三色標記清除算法本身是不可以併發或者增量執行的,它仍然需要 STW,在如下所示的三色標記過程中,用戶程序建立了從 A 對象到 D 對象的引用,但是因爲程序中已經不存在灰色對象了,所以 D 對象會被垃圾收集器錯誤地回收。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/6c\/6cc8fe6b420e1a7ed1589a5f1f216c2b.png","alt":"tri-color-mark-sweep-and-mutator","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 36 - 
三色標記與用戶程序"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本來不應該被回收的對象卻被回收了,這在內存管理中是非常嚴重的錯誤,我們將這種錯誤稱爲懸掛指針,即指針沒有指向特定類型的合法對象,影響了內存的安全性"},{"type":"sup","content":[{"type":"text","text":"9"}]},{"type":"text","text":",想要併發或者增量地標記對象還是需要使用屏障技術。"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"垃圾回收屏障"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"內存屏障技術是一種屏障指令,它可以讓 CPU 或者編譯器在執行內存相關操作時遵循特定的約束,目前多數的現代處理器都會亂序執行指令以最大化性能,但是該技術能夠保證代碼對內存操作的順序性,在內存屏障前執行的操作一定會先於內存屏障後執行的操作"},{"type":"sup","content":[{"type":"text","text":"10"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"想要在併發或者增量的標記算法中保證正確性,我們需要達成以下兩種三色不變性(Tri-color invariant)中的任意一種:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"強三色不變性 — 黑色對象不會指向白色對象,只會指向灰色對象或者黑色對象;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"弱三色不變性 — 
黑色對象指向的白色對象必須包含一條從灰色對象經由多個白色對象的可達路徑"},{"type":"sup","content":[{"type":"text","text":"11"}]},{"type":"text","text":";"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/22\/2267a763b097fd9433bf6defceb82a51.png","alt":"strong-weak-tricolor-invariant","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 37 - 三色不變性"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"上圖分別展示了遵循強三色不變性和弱三色不變性的堆內存,遵循上述兩個不變性中的任意一個,我們都能保證垃圾收集算法的正確性,而屏障技術就是在併發或者增量標記過程中保證三色不變性的重要技術。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"垃圾收集中的屏障技術更像是一個鉤子方法,它是在用戶程序讀取對象、創建新對象以及更新對象指針時執行的一段代碼,根據操作類型的不同,我們可以將它們分成讀屏障(Read barrier)和寫屏障(Write barrier)兩種,因爲讀屏障需要在讀操作中加入代碼片段,對用戶程序的性能影響很大,所以編程語言往往都會採用寫屏障保證三色不變性。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們在這裏想要介紹的是以下幾種寫屏障技術,分別是 Dijkstra 提出的插入寫屏障"},{"type":"sup","content":[{"type":"text","text":"12"}]},{"type":"text","text":"和 Yuasa 提出的刪除寫屏障"},{"type":"sup","content":[{"type":"text","text":"13"}]},{"type":"text","text":",這裏會分析它們如何保證三色不變性和垃圾收集器的正確性。"}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"插入寫屏障"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Dijkstra 在 1978 
年提出了插入寫屏障,通過如下所示的寫屏障,用戶程序和垃圾收集器可以在交替工作的情況下保證程序執行的正確性:"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"writePointer(slot, ptr):\n shade(ptr)\n *field = ptr\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"上述插入寫屏障的僞代碼非常好理解,每當我們執行類似 "},{"type":"codeinline","content":[{"type":"text","text":"*slot = ptr"}]},{"type":"text","text":" 的表達式時,我們會執行上述寫屏障通過 "},{"type":"codeinline","content":[{"type":"text","text":"shade"}]},{"type":"text","text":" 函數嘗試改變指針的顏色。如果 "},{"type":"codeinline","content":[{"type":"text","text":"ptr"}]},{"type":"text","text":" 指針是白色的,那麼該函數會將該對象設置成灰色,其他情況則保持不變。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/35\/35139f92e3d8a3f87e5b7e84d447c07b.png","alt":"dijkstra-insert-write-barrier","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 38 - Dijkstra 插入寫屏障"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"假設我們在應用程序中使用 Dijkstra 提出的插入寫屏障,在一個垃圾收集器和用戶程序交替運行的場景中會出現如上圖所示的標記過程:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"垃圾收集器將根對象指向 A 對象標記成黑色並將 A 對象指向的對象 B 
標記成灰色;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"用戶程序修改 A 對象的指針,將原本指向 B 對象的指針指向 C 對象,這時觸發寫屏障將 C 對象標記成灰色;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"垃圾收集器依次遍歷程序中的其他灰色對象,將它們分別標記成黑色;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Dijkstra 的插入寫屏障是一種相對保守的屏障技術,它會將"},{"type":"text","marks":[{"type":"strong"}],"text":"有存活可能的對象都標記成灰色"},{"type":"text","text":"以滿足強三色不變性。在如上所示的垃圾收集過程中,實際上不再存活的 B 對象最後沒有被回收;而如果我們在第二和第三步之間將指向 C 對象的指針改回指向 B,垃圾收集器仍然認爲 C 對象是存活的,這些被錯誤標記的垃圾對象只有在下一個循環纔會被回收。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"插入式的 Dijkstra 寫屏障雖然實現非常簡單並且也能保證強三色不變性,但是它也有很明顯的缺點。因爲棧上的對象在垃圾收集中也會被認爲是根對象,所以爲了保證內存的安全,Dijkstra 必須爲棧上的對象增加寫屏障或者在標記階段完成重新對棧上的對象對象進行掃描,這兩種方法各有各的缺點,前者會大幅度增加寫入指針的額外開銷,後者重新掃描棧對象時需要暫停程序,垃圾收集算法的設計者需要在這兩者之前做出權衡。"}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"刪除寫屏障"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Yuasa 在 1990 年的論文 Real-time garbage collection on general-purpose machines 中提出了刪除寫屏障,因爲一旦該寫屏障開始工作,它就會保證開啓寫屏障時堆上所有對象的可達,所以也被稱作快照垃圾收集(Snapshot GC)"},{"type":"sup","content":[{"type":"text","text":"14"}]},{"type":"text","text":":"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This guarantees that no objects will 
become unreachable to the garbage collector traversal all objects which are live at the beginning of garbage collection will be reached even if the pointers to them are overwritten."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"該算法會使用如下所示的寫屏障保證增量或者併發執行垃圾收集時程序的正確性:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"writePointer(slot, ptr)\n shade(*slot)\n *slot = ptr\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"上述代碼會在老對象的引用被刪除時,將白色的老對象塗成灰色,這樣刪除寫屏障就可以保證弱三色不變性,老對象引用的下游對象一定可以被灰色對象引用。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/bd\/bdbca825189e27868f6af0fa5e7008fd.png","alt":"yuasa-delete-write-barrier","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"圖 39 - Yuasa 刪除寫屏障"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"假設我們在應用程序中使用 Yuasa 
提出的刪除寫屏障,在一個垃圾收集器和用戶程序交替運行的場景中會出現如上圖所示的標記過程:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"垃圾收集器將根對象指向 A 對象標記成黑色並將 A 對象指向的對象 B 標記成灰色;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"用戶程序將 A 對象原本指向 B 的指針指向 C,觸發刪除寫屏障,但是因爲 B 對象已經是灰色的,所以不做改變;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"用戶程序將 B 對象原本指向 C 的指針刪除,觸發刪除寫屏障,白色的 C 對象被塗成灰色"},{"type":"text","text":";"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"垃圾收集器依次遍歷程序中的其他灰色對象,將它們分別標記成黑色;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"上述過程中的第三步觸發了 Yuasa 刪除寫屏障的着色,因爲用戶程序刪除了 B 指向 C 對象的指針,所以 C 和 D 兩個對象會分別違反強三色不變性和弱三色不變性:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"強三色不變性 — 黑色的 A 對象直接指向白色的 C 對象;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"弱三色不變性 — 垃圾收集器無法從某個灰色對象出發,經過幾個連續的白色對象訪問白色的 C 和 D 
;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"By shading C, the Yuasa deletion write barrier ensures that C and the downstream object D survive this garbage collection cycle, which prevents dangling pointers and keeps the user program correct."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Summary"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Memory management remains a very important topic today; any discussion of a programming language's performance and convenience has to address its memory management mechanism. When designing that mechanism, a language usually has to choose between manual and automatic management. To reduce the burden on engineers, most modern programming languages manage memory automatically with garbage collection, while a few pursue maximum performance through manual management."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"It is impossible to cover every aspect of memory management in a single article; a detailed treatment of the related techniques would probably take one or several books. This article focuses mainly on garbage collection and does not cover mechanisms such as ownership and lifetimes in Rust or smart pointers in C++; interested readers can explore them on their own."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Wikipedia: Static variable "},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Static_variable","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/en.wikipedia.org\/wiki\/Static_variable"}]},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/system-design-memory-management\/#fnref:1","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Wikipedia: Stack-based memory allocation 
"},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Stack-based_memory_allocation","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/en.wikipedia.org\/wiki\/Stack-based_memory_allocation"}]},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/system-design-memory-management\/#fnref:2","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Wikipedia: Garbage collection (computer science) "},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Garbage_collection_(computer_science)","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/en.wikipedia.org\/wiki\/Garbage_collection_(computer_science)"}]},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/system-design-memory-management\/#fnref:3","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"Wikipedia: Garbage (computer science) "},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Garbage_(computer_science)","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/en.wikipedia.org\/wiki\/Garbage_(computer_science)"}]},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/system-design-memory-management\/#fnref:4","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"Garbage Collection in Java (1) - Heap Overview 
"},{"type":"link","attrs":{"href":"http:\/\/insightfullogic.com\/2013\/Feb\/20\/garbage-collection-java-1\/","title":null,"type":null},"content":[{"type":"text","text":"http:\/\/insightfullogic.com\/2013\/Feb\/20\/garbage-collection-java-1\/"}]},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/system-design-memory-management\/#fnref:5","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":6,"align":null,"origin":null},"content":[{"type":"text","text":"Immix: A Mark-Region Garbage Collector with Space Efficiency, Fast Collection, and Mutator Performance. Stephen M. Blackburn. Kathryn S. McKinley. 2008. "},{"type":"link","attrs":{"href":"http:\/\/www.cs.utexas.edu\/users\/speedway\/DaCapo\/papers\/immix-pldi-2008.pdf","title":null,"type":null},"content":[{"type":"text","text":"http:\/\/www.cs.utexas.edu\/users\/speedway\/DaCapo\/papers\/immix-pldi-2008.pdf"}]},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/system-design-memory-management\/#fnref:6","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":7,"align":null,"origin":null},"content":[{"type":"text","text":"The CS 6120 Course Blog. Siqiu Yao. 2019. 
"},{"type":"link","attrs":{"href":"https:\/\/www.cs.cornell.edu\/courses\/cs6120\/2019fa\/blog\/immix\/","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/www.cs.cornell.edu\/courses\/cs6120\/2019fa\/blog\/immix\/"}]},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/system-design-memory-management\/#fnref:7","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":8,"align":null,"origin":null},"content":[{"type":"text","text":"Wikipedia: Tri-color marking "},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Tracing_garbage_collection#Tri-color_marking","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/en.wikipedia.org\/wiki\/Tracing_garbage_collection#Tri-color_marking"}]},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/system-design-memory-management\/#fnref:8","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":9,"align":null,"origin":null},"content":[{"type":"text","text":"Wikipedia: Dangling pointer "},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Dangling_pointer","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/en.wikipedia.org\/wiki\/Dangling_pointer"}]},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/system-design-memory-management\/#fnref:9","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":10,"align":null,"origin":null},"content":[{"type":"text","text":"Wikipedia: Memory barrier 
"},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Memory_barrier","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/en.wikipedia.org\/wiki\/Memory_barrier"}]},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/system-design-memory-management\/#fnref:10","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":11,"align":null,"origin":null},"content":[{"type":"text","text":"P. P. Pirinen. Barrier techniques for incremental tracing. In ACM SIGPLAN Notices, 34(3), 20–25, October 1998. "},{"type":"link","attrs":{"href":"https:\/\/dl.acm.org\/doi\/10.1145\/301589.286863","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/dl.acm.org\/doi\/10.1145\/301589.286863"}]},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/system-design-memory-management\/#fnref:11","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":12,"align":null,"origin":null},"content":[{"type":"text","text":"E. W. Dijkstra, L. Lamport, A. J. Martin, C. S. Scholten, and E. F. Steffens. On-the-fly garbage collection: An exercise in cooperation. Communications of the ACM, 21(11), 966–975, 1978. 
"},{"type":"link","attrs":{"href":"https:\/\/www.cs.utexas.edu\/users\/EWD\/transcriptions\/EWD05xx\/EWD520.html","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/www.cs.utexas.edu\/users\/EWD\/transcriptions\/EWD05xx\/EWD520.html"}]},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/system-design-memory-management\/#fnref:12","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":13,"align":null,"origin":null},"content":[{"type":"text","text":"T. Yuasa. Real-time garbage collection on general-purpose machines. Journal of Systems and Software, 11(3):181–198, 1990. "},{"type":"link","attrs":{"href":"https:\/\/www.sciencedirect.com\/science\/article\/pii\/016412129090084Y","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/www.sciencedirect.com\/science\/article\/pii\/016412129090084Y"}]},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/system-design-memory-management\/#fnref:13","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":14,"align":null,"origin":null},"content":[{"type":"text","text":"Paul R Wilson. 
“Uniprocessor Garbage Collection Techniques” "},{"type":"link","attrs":{"href":"https:\/\/www.cs.cmu.edu\/~fp\/courses\/15411-f14\/misc\/wilson94-gc.pdf","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/www.cs.cmu.edu\/~fp\/courses\/15411-f14\/misc\/wilson94-gc.pdf"}]},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/system-design-memory-management\/#fnref:14","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":15,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":15,"align":null,"origin":null},"content":[{"type":"text","text":"Reprinted from: "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/","title":"xxx","type":null},"content":[{"type":"text","text":"Draveness"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":15,"align":null,"origin":null},"content":[{"type":"text","text":"Original link: "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/system-design-memory-management\/","title":"xxx","type":null},"content":[{"type":"text","text":"內存管理設計精要"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":15,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":15,"align":null,"origin":null}}]}