Serverless and Lightweight Virtualization with Firecracker · NSDI '20

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This article introduces a paper from the 2020 NSDI conference: "},{"type":"link","attrs":{"href":"https:\/\/www.usenix.org\/system\/files\/nsdi20-paper-agache.pdf","title":null,"type":null},"content":[{"type":"text","text":"Firecracker: Lightweight Virtualization for Serverless Applications"}]},{"type":"sup","content":[{"type":"text","text":"1"}]},{"type":"text","text":". Firecracker, the system the paper implements, provides lightweight virtualization on a host machine. Many developers now choose serverless containers and services to reduce operational overhead, improve hardware utilization, and scale in and out quickly, but serverless workloads in turn place higher demands on container isolation, security, and performance."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When the same hardware serves multiple tenants, we want different workloads to be isolated from one another in both security and performance, with as little extra overhead as possible. For a long time, however, the prevailing view was that we had to choose between strong security and low latency."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/30\/3073f7d0e5b7e1a7f9a90dd9d2ada60a.png","alt":"virtualization-and-container","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 1 - 
Security and low latency"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Virtualization provides strong security but introduces substantial overhead; containers are the opposite, offering weaker security guarantees at a much lower overhead. Given this trade-off, public and private clouds each make their own choice according to their needs:"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Public clouds use virtual machines to guarantee security; the overhead is higher, but the cost can be passed on to customers;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Private clouds use containers to guarantee performance; isolation between containers is weaker, but the customers are usually internal teams within the same company, so security is rarely the primary concern and overall performance can take priority;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Isolation between tenants is always the first concern of a public cloud. Some resource contention between tenants is sometimes acceptable, but no customer will accept that a service they pay for might be attacked by another tenant."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/36\/3632adcb3102a236f206bde84d04adef.png","alt":"firecracker-logo","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 2 - Firecracker"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Firecracker"},{"type":"sup","content":[{"type":"text","text":"2"}]},{"type":"text","text":", the system this article covers, is a new virtual machine monitor (VMM) that provides both strong security guarantees and low overhead. It currently powers the AWS 
function and compute services (Lambda and Fargate), supporting millions of workloads and trillions of requests per month."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Isolation"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Firecracker offers the workloads it runs good compatibility and performance, adds very little overhead, and can host thousands of functions on a single machine. None of that is the focus of this article, though; here we look more closely at the isolation mechanisms discussed in the paper."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/2e\/2ede8231963607ab0bd26abcf3f1e47c.png","alt":"isolation-options","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 3 - Isolation options"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Linux containers, language-specific isolation, and virtualization are the most common isolation options today. We briefly describe how the three compare."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Linux containers"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Linux containers combine several kernel features to provide operational and security isolation, including:"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Control groups (cgroups): provide 
limits on CPU, memory, and other resources;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Namespaces: give kernel resources such as user IDs, process IDs, and network interfaces separate namespaces;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Secure computing (seccomp-bpf): restricts which system calls a process may make and which arguments it may pass;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Change root (chroot): provides an isolated file system;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Containers usually rely on restricting system calls for security, and many container runtimes build their security model around the system call interface. Google's gVisor"},{"type":"sup","content":[{"type":"text","text":"3"}]},{"type":"text","text":", for example, emulates some system calls in user space, which significantly reduces the functionality the host kernel has to expose."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Language-specific isolation"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Another common technique for isolating workloads is the programming language virtual machine, such as the Java Virtual Machine (JVM) or the V8 engine that runs JavaScript. Trading flexibility and compatibility for security in this way suits some scenarios well. Chromium, the engine underlying Chrome, 
gives each site its own process so that a site cannot read unrelated data and cause security problems."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Virtualization"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Modern virtualization uses hardware features to isolate virtual hardware, page tables, and operating system kernels. Virtualization solves the security problem, but fixing one problem tends to surface another, and heavyweight virtualization brings the following challenges:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Low density and high overhead: the virtual machine monitor and each separately running kernel consume extra CPU and memory, which limits how many virtual machines a single host can run;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Long startup time: boot time also affects the experience of using virtual machines. Many developers will remember the long wait for a VM to start; on my machine, a local VM's startup time feels second only to that of the JetBrains IDEs;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Complex and error-prone implementation: virtualization software is often extremely complex. QEMU, a virtual machine monitor, contains more than 1,400,000 lines of code and can make up to 270 different system calls"},{"type":"sup","content":[{"type":"text","text":"4"}]},{"type":"text","text":", and it is hard to guarantee the reliability of such a large codebase;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/af\/afbb2a894902f868b33c1b451c052c7b.png","alt":"challenges-on-virtualization","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 4 - 
Challenges of virtualization"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Firecracker is written in Rust, a safer language, and implements a minimal virtual machine monitor in about 50,000 lines of code as a replacement for QEMU. The new VMM works together with the Linux Kernel-based Virtual Machine (KVM) to provide the runtime environment for different workloads."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Summary"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As a virtual machine monitor, Firecracker relies on the Linux Kernel-based Virtual Machine (KVM) to provide minimal virtual machines, called MicroVMs. The main reason is that the existing Linux components already offer the right features, performance, and design; bypassing them would carry a large implementation cost, and it would also make the system harder for operations engineers to understand and to operate."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The paper analyzes Firecracker's performance in detail. Taking boot time as an example, the pre-configured variant boots in 100 to 150 ms, while Firecracker without pre-configuration takes roughly 150 to 250 ms. Beyond millisecond-level boot times, Firecracker needs only about 3 MB of memory overhead per MicroVM, which significantly increases deployment density on a single host. Although Firecracker does well on boot time and overhead, its I\/O throughput is much worse than that of the other systems compared."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ac\/ac733fadbfef5bd72a270ca2ca9920c6.png","alt":"firecracker-io-thoughput","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 5 - I\/O throughput"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"It is worth noting that the paper reports oversubscribing resources by a factor of twenty in test environments and by a factor of ten in production without any problems. It seems that Lambda, AWS's function engine, 
really can be highly profitable; apparently the way to make money is to sell one unit of resources ten or even twenty times over."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Recommended reading"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/draveness.me\/papers-mesos\/","title":null,"type":null},"content":[{"type":"text","text":"Design Principles of the Mesos Cluster Manager · NSDI '11"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/draveness.me\/papers-thunderbolt\/","title":null,"type":null},"content":[{"type":"text","text":"Power Oversubscription in Data Centers · OSDI '20"}]}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Agache, Alexandru, et al. “Firecracker: Lightweight Virtualization for Serverless Applications.” 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20). 2020. "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/papers-firecracker\/#fnref:1","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Firecracker · Secure and fast microVMs for serverless computing. "},{"type":"link","attrs":{"href":"https:\/\/github.com\/firecracker-microvm\/firecracker","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/github.com\/firecracker-microvm\/firecracker"}]},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/papers-firecracker\/#fnref:2","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"gVisor · Application Kernel for Containers "},{"type":"link","attrs":{"href":"https:\/\/github.com\/google\/gvisor","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/github.com\/google\/gvisor"}]},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/papers-firecracker\/#fnref:3","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"Chia-Che Tsai, Bhushan Jain, Nafees Ahmed Abdul, and Donald E. Porter. A Study of Modern Linux API Usage and Compatibility: What to Support When You’re Supporting. In Proceedings of the Eleventh European Conference on Computer Systems, EuroSys ’16, pages 16:1–16:16, New York, NY, USA, 2016. ACM. URL: "},{"type":"link","attrs":{"href":"http:\/\/doi.acm.org\/10.1145\/2901318.2901341","title":null,"type":null},"content":[{"type":"text","text":"http:\/\/doi.acm.org\/10.1145\/2901318.2901341"}]},{"type":"text","text":", doi:10.1145\/2901318.2901341. "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/papers-firecracker\/#fnref:4","title":null,"type":null},"content":[{"type":"text","text":"↩︎"}]}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"Reposted from: "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/","title":"xxx","type":null},"content":[{"type":"text","text":"面向信仰編程"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"Original link: "},{"type":"link","attrs":{"href":"https:\/\/draveness.me\/papers-firecracker\/","title":"xxx","type":null},"content":[{"type":"text","text":"Serverless and Lightweight Virtualization with Firecracker · NSDI '20"}]}]}]}