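First Serverless Paper from a Chinese Cloud Vendor Accepted at a Top International Conference: How to Speed Up Container Startup Under Bursty Traffic?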

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Author | Wang Ao (王驁)","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Source | ","attrs":{}},{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/s/YrPS9vGw2pAZXyyFzXwRlg","title":"","type":null},"content":[{"type":"text","text":"Serverless WeChat official account","attrs":{}}]}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Overview","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"USENIX ATC (USENIX Annual Technical Conference) is a top-tier academic conference in computer systems and is on the China Computer Federation (CCF) list of recommended Class A international conferences. This year the conference received 341 submissions and accepted 64, an acceptance rate of 18.8%.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Alibaba Cloud Serverless team is the first to propose a decentralized fast image distribution technique for FaaS, and the team's paper was accepted at USENIX ATC '21. What follows is a walkthrough of the paper's core content, which focuses on reducing the end-to-end cold-start latency of functions on the Custom Container Runtime of Alibaba Cloud Function Compute.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"USENIX ATC will be held online on July 14-16. Paper information: ","attrs":{}},{"type":"link","attrs":{"href":"https://www.usenix.org/conference/atc21/presentation/wang-ao","title":"","type":null},"content":[{"type":"text","text":"https://www.usenix.org/conference/atc21/presentation/wang-ao","attrs":{}}]},{"type":"text","text":"​","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Abstract","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Serverless Computing (FaaS) is a new cloud computing paradigm that lets customers focus solely on their own business logic, while the underlying virtualization, resource management, and elastic scaling are left to the cloud provider to maintain. Serverless 
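Computing now supports the container ecosystem, which unlocks a wide range of business scenarios. However, container images are complex and large, and FaaS workloads are highly dynamic and hard to predict, so many industry-leading products and techniques do not work well on FaaS platforms, and efficient container distribution remains a challenge there. In this paper, we design and present FaaSNet. FaaSNet is a highly scalable, lightweight system middleware that uses an image acceleration format for container distribution. Its target scenario is large-scale container image startup (function cold starts) under bursty FaaS traffic. The core component of FaaSNet is the Function Tree (FT), a decentralized, self-balancing binary tree topology in which all nodes are equivalent.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We integrated FaaSNet into the Function Compute product. Experimental results show that, under highly concurrent request loads, FaaSNet delivers 13.4x the container startup speed of native Function Compute (hereafter FC). In addition, when a burst of requests destabilizes end-to-end latency, FaaSNet needs 75.2% less time than FC to bring end-to-end latency back to its normal level.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Paper Overview","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"1. Background and Challenges","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"FC added support for custom container images (https://developer.aliyun.com/article/772788) in September 2020, and AWS Lambda followed in December of the same year by announcing Lambda container image support, which shows the broad trend of FaaS embracing the container ecosystem. In February 2021, Function Compute also launched image acceleration (https://developer.aliyun.com/article/781992). Together, these two features unlock more FaaS use cases: users can migrate their containerized business logic to Function Compute seamlessly, and GB-scale images can start within seconds.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When the Function Compute backend receives a large burst of requests that triggers too many function cold starts, the container registry's bandwidth comes under heavy pressure even with image acceleration enabled. Many machines pull image data from the same container registry at once, so the registry service hits a bandwidth bottleneck or throttles requests, and pulling image data takes longer (even with the image acceleration format). A straightforward remedy is to increase the bandwidth of the Function Compute backend registry, but that does not address the root cause and adds system overhead.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1) Workload 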
Analysis","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We first analyzed production data from two major FC regions (Beijing and Shanghai):","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/76/76aaa187d49b5ef2fbfc7d473399eb76.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Figure (a) shows the latency of image pulls by the FC system during function cold starts. About 80% of image pulls in Beijing and about 90% in Shanghai take longer than 10 seconds;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Figure (b) shows the share of the whole cold start spent pulling the image. Likewise, for 80% of functions in the Beijing region and 90% of functions in the Shanghai region, pulling the image accounts for more than 60% of the total cold-start latency;","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The workload analysis shows that the vast majority of a function's cold-start time is spent fetching container image data, so optimizing this part of the latency can greatly improve cold-start performance.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"According to our production operations records, one representative large customer concurrently launches 4,000 function images in an instant. Each image is 1.8 GB before decompression and 3-4 GB after. The moment the burst of requests arrives and containers start to launch, the container service raises throttling alerts, which lengthens the latency of some requests and, in severe cases, produces container startup failures. Scenarios like this are what we urgently need to solve.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2) State-of-the-art 
Comparison","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Academia and industry offer several related techniques for speeding up image distribution, for example:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Alibaba's DADI: ","attrs":{}},{"type":"text","marks":[{"type":"italic","attrs":{}}],"text":"https://www.usenix.org/conference/atc20/presentation/li-huiba","attrs":{}},{"type":"text","text":"; Dragonfly: ","attrs":{}},{"type":"text","marks":[{"type":"italic","attrs":{}}],"text":"https://github.com/dragonfly/dragonfly","attrs":{}},{"type":"text","text":"; and Uber's open-source Kraken: ","attrs":{}},{"type":"text","marks":[{"type":"italic","attrs":{}}],"text":"https://github.com/uber/kraken/","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"DADI","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"DADI provides a very efficient image acceleration format that supports on-demand reads (FaaSNet also uses a container acceleration format). For image distribution, DADI uses a tree topology and networks nodes at image-layer granularity: each layer corresponds to one tree, so every VM belongs to multiple logical trees. DADI's P2P distribution relies on several root nodes with large performance specifications (CPU, bandwidth) that both fetch data from the origin and act as managers of the peers in the topology. DADI's tree structure is fairly static: because container provisioning usually does not last long, by default the DADI root node waits only 20 
minutes before dissolving the topology logic, rather than maintaining it indefinitely.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Dragonfly","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Dragonfly is also a P2P-based image and file distribution network. Its components include the Supernode (master node) and dfget (peer node). Like DADI, Dragonfly depends on several large-specification Supernodes to support the whole cluster. It also uses central Supernodes to manage and maintain a fully connected topology (multiple dfget nodes each contribute different pieces of the same file, so that data reaches the target node through peer-to-peer transfer), so Supernode performance is a potential bottleneck for the throughput of the entire cluster.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Kraken","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kraken's origin and tracker nodes act as central nodes that manage the whole network, and an agent runs on every peer node. The Kraken tracker only manages and organizes the connections among peers in the cluster; Kraken lets peer nodes negotiate data transfer among themselves. But Kraken is also a layer-based container image distribution network, so its networking logic likewise becomes a fairly complex fully connected mesh. From this review of the three industry-leading techniques, we can see several things they have in common:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, all three use the image layer as the unit of distribution. This networking logic is too fine-grained, so each peer node may hold multiple active 
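data connections at the same time;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Second, all three rely on central nodes to manage the networking logic and coordinate the peers in the cluster, and the central nodes of DADI and Dragonfly are also responsible for fetching data from the origin. In production, this design requires deploying several large-specification machines to carry very high traffic, and it also requires parameter tuning to reach the expected performance targets.","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With these prerequisites in mind, consider the design under the FC ECS architecture. Each machine in the FC ECS architecture has 2 CPU cores, 4 GB of memory, and 1 Gbps of intranet bandwidth, and the lifecycle of these machines is unreliable: they may be reclaimed at any time.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This leads to three fairly serious problems:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Insufficient intranet bandwidth. In a fully connected topology, bandwidth contention arises easily and degrades data transfer performance. A fully connected topology is also not function-aware, which can easily cause security problems in FC: the machines that execute function logic are not trusted by FC system components, which leaves the risk that tenant A intercepts tenant B's data;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Limited CPU and bandwidth specifications. Because Function Compute bills on a pay-per-use basis, the lifecycle of machines in our cluster is unreliable, and we cannot set aside several machines from the pool as central nodes to manage the whole cluster. The system overhead of such machines would be a significant burden, and their reliability cannot be guaranteed because machines can fail. What FC needs is a technique that preserves the pay-as-you-go property and can form a network on the fly.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The multi-function problem. None of the three systems has a function-awareness mechanism. For example, in DADI P2P 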
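a single node may hold too many images and become a hotspot, which degrades performance. More seriously, pulls across multiple functions are inherently unpredictable. When concurrent pulls for multiple functions saturate the bandwidth, other services downloading from remote sources at the same time, such as code packages and third-party dependencies, are affected as well, which creates an availability problem for the whole system.","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With these problems in mind, the next section describes the FaaSNet design in detail.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"2. Design - FaaSNet","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The three mature industry P2P solutions above are not function-aware, their in-cluster topologies are mostly fully connected networks, and they place certain demands on machine performance. These assumptions do not fit the FC ECS system implementation. We therefore propose the Function Tree (hereafter FT), a function-level, function-aware logical tree topology.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1) FaaSNet Architecture","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/87/87303d65a4e1eea081a3290bca7d27f2.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The gray parts of the figure are the components that FaaSNet modifies; all the white modules keep FC's existing system architecture. Note that all Function Trees in FaaSNet are managed on the FC scheduler. On each VM, a VM agent communicates with the scheduler over gRPC to receive upstream and downstream messages, and the VM agent is also responsible for fetching image data from upstream and distributing it downstream.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2) Decentralized, Function/Image-Level, Self-Balancing Tree Topology","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To solve the three problems above, we first raise the topology to the function/image level, which effectively reduces the number of network connections on each VM. In addition, we design a tree topology based on the AVL tree. Next, we describe our Function Tree 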
design in detail.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Function Tree","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Decentralized, self-balancing binary tree topology","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The design of FT is inspired by the AVL tree algorithm. FT currently has no notion of node weight: all nodes are equivalent, including the root. When any number of nodes are added to or removed from the tree, the whole tree keeps a perfect-balanced structure, that is, the heights of the left and right subtrees of any node differ by at most 1. After a node joins or leaves, FT adjusts its own shape (left or right rotation) to restore balance. In the right-rotation example in the figure below, node 6 is about to be reclaimed. Its removal unbalances the left and right subtrees under node 1, so a right rotation is needed to restore balance. State2 shows the final state after the rotation, in which node 2 has become the new root. Note: every node represents an ECS machine in FC.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/82/828146c3a268187ef0a71d932cf5f088.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In FT, all nodes are equivalent. Their main responsibilities are: 1. pulling data from the upstream node; 2. 
distributing data to the two downstream child nodes. (Note that in FT we do not designate a root node. The only difference between the root and the other nodes is that its upstream is the origin. The root is not responsible for any metadata management; the next part explains how we manage metadata.)","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Overlapping of multiple FTs across multiple peer nodes","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/8d/8d3ffc1b1acc8ca6d7ea1ae86592a2c1.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A peer node will inevitably host different functions of the same user, so a peer node is bound to belong to multiple FTs. In the figure above, the example has three FTs, one each for func 0-2. Because each FT is managed independently, FT can still help every node find its correct upstream node even when transfers overlap. In addition, we cap the number of functions a machine can hold in order to achieve function-awareness, which further addresses the problem of uncontrolled data pulls across multiple functions.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Discussion of Design Correctness","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"By integrating with FC, we can see that because all nodes in an FT are equivalent, we do not depend on any central node;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The manager of the topology logic does not live inside the cluster. Instead, an FC system component (the scheduler) maintains this in-memory state and delivers it over gRPC, together with the container-creation request, to each peer 
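node;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"FT fits the highly dynamic nature of FaaS workloads well. Nodes can join or leave the cluster at any scale, and FT updates its shape automatically;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Networking at the coarser granularity of a function, and implementing FT with a binary tree data structure, greatly reduces the number of network connections on each peer node;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Networking with per-function isolation makes the system function-aware by nature, which improves its security and stability.","attrs":{}}]}]}],"attrs":{}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"3. Performance Evaluation","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the experiments we used an image from the Alibaba Cloud database DAS application scenario, with python as the base image. The container image is over 700 MB before decompression and has 29 layers. We discuss the stress test here; for the full results, please refer to the paper. The systems we compared against are Alibaba's DADI, Dragonfly, and Uber's open-source Kraken framework.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1) Stress Test","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The latency recorded in the stress test is the average end-to-end cold-start latency perceived by users. First, image acceleration clearly improves end-to-end latency over traditional FC. But as concurrency rises, more machines pull data from the central container registry at the same time, and the resulting contention for network bandwidth drives end-to-end latency up (orange and purple bars). In FaaSNet, thanks to our decentralized design, the load on the origin stays low: no matter how high the concurrency, only one root 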
node pulls data from the origin and distributes it downstream. The system therefore scales extremely well, and average latency does not rise as concurrency increases.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"​","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/b1/b132d1f81c61583373dccda8d0979cf1.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At the end of the stress test, we examined how performance behaves when functions with different images (multiple functions) are placed on the same VM. Here we compared FC with image acceleration enabled and DADI P2P installed (DADI+P2P) against FaaSNet.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/75/75ab319efc47272bd6adc1592b16da55.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The vertical axis in the figure above shows normalized end-to-end latency. As the number of functions with different images grows, DADI P2P has more layers to handle, and because each ECS instance in FC is small, the bandwidth pressure on each VM becomes too high. Performance degrades, and end-to-end latency stretches to more than 200%. FaaSNet, by contrast, establishes connections at the image level, so its connection count is far lower than that of DADI P2P's layer trees, and it still maintains good performance.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Conclusion","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"High scalability and fast image distribution help FaaS providers better unlock custom container image scenarios. FaaSNet uses a lightweight, decentralized, self-balancing Function Tree to avoid the performance bottleneck of central nodes. It introduces no additional system overhead and fully reuses FC's existing 
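system components and architecture. FaaSNet forms its network in real time as the workload changes to achieve function-awareness, with no need for workload analysis or preprocessing in advance.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"FaaSNet's target scenarios are not limited to FaaS. In many cloud-native scenarios, such as Kubernetes and Alibaba SAE, it can also help handle bursty traffic and relieve the pain point of excessive cold starts hurting user experience, addressing slow container cold starts at the root.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"FaaSNet is the first paper by a Chinese cloud vendor published at a top international conference on accelerating container startup under bursty traffic in Serverless scenarios. We hope this work opens new opportunities for container-based FaaS platforms, fully opens the door to the container ecosystem, and unlocks more application scenarios, such as machine learning and big data analytics.","attrs":{}}]}]}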
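The Function Tree behaviour described in section 2 (equivalent nodes, AVL-style rotations when a VM joins or is reclaimed, every node pulling from its parent while the root pulls from the origin) can be sketched in Python. This is a minimal illustration under stated assumptions, not FaaSNet's implementation: it swaps in a textbook AVL tree keyed by a made-up integer VM id, whereas the article does not say how FT orders its nodes. The names `Node`, `insert`, `delete`, `upstreams`, `is_balanced`, and the `registry` label are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union

ORIGIN = "registry"  # hypothetical label for the origin (container registry)


@dataclass
class Node:
    vm: int  # hypothetical VM id, used here as the AVL key (an assumption)
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    height: int = 1


def height(n: Optional[Node]) -> int:
    return n.height if n else 0


def update(n: Node) -> Node:
    n.height = 1 + max(height(n.left), height(n.right))
    return n


def rotate_right(y: Node) -> Node:
    x = y.left
    y.left, x.right = x.right, y  # x becomes the new subtree root
    update(y)
    return update(x)


def rotate_left(x: Node) -> Node:
    y = x.right
    x.right, y.left = y.left, x  # y becomes the new subtree root
    update(x)
    return update(y)


def rebalance(n: Node) -> Node:
    """Restore the invariant |height(left) - height(right)| <= 1 at n."""
    update(n)
    balance = height(n.left) - height(n.right)
    if balance > 1:
        if height(n.left.left) < height(n.left.right):
            n.left = rotate_left(n.left)
        return rotate_right(n)
    if balance < -1:
        if height(n.right.right) < height(n.right.left):
            n.right = rotate_right(n.right)
        return rotate_left(n)
    return n


def insert(n: Optional[Node], vm: int) -> Node:
    """A VM joins the tree."""
    if n is None:
        return Node(vm)
    if vm < n.vm:
        n.left = insert(n.left, vm)
    elif vm > n.vm:
        n.right = insert(n.right, vm)
    return rebalance(n)


def delete(n: Optional[Node], vm: int) -> Optional[Node]:
    """A VM is reclaimed and leaves the tree."""
    if n is None:
        return None
    if vm < n.vm:
        n.left = delete(n.left, vm)
    elif vm > n.vm:
        n.right = delete(n.right, vm)
    else:
        if n.left is None or n.right is None:
            return n.left or n.right
        succ = n.right
        while succ.left:
            succ = succ.left
        n.vm = succ.vm  # replace with in-order successor
        n.right = delete(n.right, succ.vm)
    return rebalance(n)


def upstreams(n, parent: Union[int, str] = ORIGIN, out=None) -> Dict[int, Union[int, str]]:
    """Map each VM to the node it pulls from; only the root pulls from the origin."""
    out = {} if out is None else out
    if n:
        out[n.vm] = parent
        upstreams(n.left, n.vm, out)
        upstreams(n.right, n.vm, out)
    return out


def is_balanced(n: Optional[Node]) -> bool:
    if n is None:
        return True
    return (abs(height(n.left) - height(n.right)) <= 1
            and is_balanced(n.left) and is_balanced(n.right))


if __name__ == "__main__":
    root = None
    for vm in range(1, 8):      # seven VMs join
        root = insert(root, vm)
    for vm in (7, 6, 5):        # three VMs are reclaimed
        root = delete(root, vm)
    print(root.vm, upstreams(root))
```

In this sketch each VM has at most one upstream and two downstream connections per tree, which is the property the article credits for the low connection count. Reclaiming VMs 7, 6, and 5 triggers a right rotation that promotes a new root, loosely mirroring the right-rotation figure in section 2.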