Serverless over Storage

{"type":"doc","content":[
{"type":"heading","attrs":{"align":"center","level":3},"content":[{"type":"text","text":"What Is Serverless?","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Taken literally, and set against “serverful”, the word seems to mean “no servers”. That obviously cannot be true: whatever the name suggests, every program ultimately has to run on a machine.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To understand serverless, it helps to first look back at how services are usually run in the serverful way.","attrs":{}}]},
{"type":"heading","attrs":{"align":"center","level":3},"content":[{"type":"text","text":"The Evolution from Serverful to Serverless","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The way physical machines are used seems never to have changed, whether ten years ago or today. The typical routine is purchasing hardware, unboxing it, powering it on, configuring RAID, plugging in network cables, adjusting switches, and running a full configuration check, along with checking the quality of memory, disks, and firmware, because a machine might well die after two days. Bringing a whole environment online is manual labor.","attrs":{}}]},
{"type":"heading","attrs":{"align":"center","level":3},"content":[{"type":"text","text":"Virtualization","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pain is what drives engineers to change things.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The hardware has already been painstakingly prepared. When a developer then needs a machine, does the IT administrator really have to repeat the whole process again?","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Virtualization can fairly be said to have liberated IT administrators. Beyond raising resource utilization, and with it the financial return, by running more virtual machines on one physical machine, virtualization also made deploying and running a “machine” far more convenient.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With a set of hardware virtualized in software and combined with virtual machine template images, creating and moving virtual machines on physical hosts takes only minutes.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The heavy, repetitive manual work that bringing a machine online used to require has all but disappeared.","attrs":{}}]},
{"type":"heading","attrs":{"align":"center","level":3},"content":[{"type":"text","text":"Cloud","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Single-host virtualization cannot meet the needs of large-scale scenarios, such as scheduling and network virtualization. This is where the cloud arrived: you can choose a public cloud, or build your own private cloud with something like OpenStack.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"You do not even need to care about the underlying hardware as long as it is a common architecture; operating systems, networking, and storage can all be installed and scaled out automatically.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Even in the cloud era, however, the way application software runs has not changed; the software simply sees a virtual hardware environment. For users, the only difference is that preparing the base environment for the software has become faster.","attrs":{}}]},
{"type":"heading","attrs":{"align":"center","level":3},"content":[{"type":"text","text":"Containerization","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Although virtualization makes good use of resources and greatly improves convenience, technology keeps rolling forward and engineers always want more. Virtualization is still “heavy”: images are large, each virtual machine carries what is largely a duplicate operating system, and a physical host can only run so many of them.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Containerization, especially with the rise of Kubernetes since 2017, brought change once again. A container is just a lightweight process, and a software provider only needs to maintain a Dockerfile, build a much smaller image, and deploy it on a container platform. Shipping an application no longer means worrying about dependencies, conflicts, or arguments like “it works on my machine, so it must be your environment”.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A more understandable name for serverless is functions-as-a-service. I think the intent behind the name is that you stop caring about servers, or even thinking about them, and simply run your code.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Consider that even with containerization, application developers still have to deal with questions such as how to set up a REST API framework, how to handle workflows, how to load-balance when traffic rises, and how to handle message middleware. They may also have to look after security upgrades, vulnerability scanning, and other chores that have little to do with business logic.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In 2019, UC Berkeley published the paper “Cloud Programming Simplified: A Berkeley View on Serverless Computing” (https://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-3.pdf), which offers a vivid analogy:","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the cloud context, serverful computing is like programming in low-level assembly language whereas serverless computing is like programming in a higher-level language such as Python. An assembly language programmer computing a simple expression such as c = a + b must select one or more registers to use, load the values into those registers, perform the arithmetic, and then store the result. This mirrors several of the steps of serverful cloud programming, where one first provisions resources or identifies available ones, then loads those resources with necessary code and data, performs the computation, returns or stores the results, and eventually manages resource release. The aim and opportunity in serverless computing is to give cloud programmers benefits similar to those in the transition to high-level programming languages.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So what is serverless, in summary? You only need to care about your business logic; everything else is handed over to the platform and tooling that run around it.","attrs":{}}]},
{"type":"heading","attrs":{"align":"center","level":3},"content":[{"type":"text","text":"Implementations","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Each major public cloud has its own implementation, for example AWS Lambda, Alibaba Cloud Function Compute, and Azure Functions, but each is used differently, which carries a risk of lock-in. So what are the options for a private environment?","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Fn project","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Fn project is an open-source project from Oracle. It looks very simple and direct: all it needs to run is Docker. The problem is that it is not very active; there appear to have been no new commits in the last six months.","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e4/e4657150d1db0e7147f4b8317a2ca5f2.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Kubeless","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The most orthodox-sounding name of the lot. Kubeless is a project contributed by Bitnami, built natively on Kubernetes and implemented through Custom Resource Definitions (CRDs). Under the influence of Knative, however, its future is unclear, and even its founder has suggested shutting the project down.","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/18/1864c74f67271b34b8df7ef2bd1bddd0.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Fission","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A project contributed by Platform9. It can draw on the rich feature set of Kubernetes while also delivering better performance where it matters, for example on cold start. It is also the second project I got running, after Fn project.","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/c9/c951ff46c8abf1ee75a41eb4ed9d1cbb.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Knative","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A project with a distinguished pedigree that brings everything together. Open-sourced by Google, its main contributors at present are Google, Pivotal, IBM, and Red Hat, and it is built on Kubernetes and Istio.","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/4b/4bda7a8c4a2ef4fde1f09f296d4c8282.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"heading","attrs":{"align":"center","level":3},"content":[{"type":"text","text":"Serverless over Storage","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Dr. Wang Jian of Alibaba puts it especially well in his book 《在線》: “demand is what makes you competitive”. Think of it, then build it; build it, then use it. In conversations with colleagues working on AI, we found that squeezing every bit of performance out of the whole workflow makes a big difference to the overall result, which prompted us to think about how to store and process data more efficiently.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Leaving Computable Storage aside for now, take AI as an example. As a new data processing technology, AI covers four stages: collection, preparation, training, and inference. Each stage involves moving and processing data.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Data collection: ","attrs":{}},{"type":"text","text":"Data is gathered from different sources and stored. It varies in size and format, and it is usually unstructured data in the form of files.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Data preparation: ","attrs":{}},{"type":"text","text":"Because the data differs in size and format, it must be converted to a uniform format before it can be used for AI model training. This step normalizes data of different formats and sizes.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Training: ","attrs":{}},{"type":"text","text":"The training workload is very intensive and usually needs high-performance GPUs or other accelerators to execute a series of mathematical functions, so its resource demands are very high. For certain training jobs, the time required depends even more on the performance of the deployed storage.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Inference: ","attrs":{}},{"type":"text","text":"Inference is the stage where the AI is put to the test. Depending on the scenario, inference infrastructure requires different processor, memory, and storage configurations.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Data preparation usually relies on batch processing tools such as Hadoop to clean the data. When the Hadoop compute nodes are separate from the distributed storage nodes, the typical sequence is read, compute, write, which means data flows out of the storage cluster and then back in. Can we avoid as much of that data movement as possible and bring compute closer to storage?","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With a serverless framework running on top of YRCloudFile, we can offer more practical functions such as data copying, compression, and decompression, operations that are better suited to running on the storage side.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The following example submits a data copy request (a function) to the serverless framework so that it is executed automatically on the backend storage; the submitter does not need to care how the data is processed on the backend.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/6b/6bf8b14795edac39f437dbdd7dd22d24.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After creating the Function and Trigger with the framework, accessing the corresponding URL is all it takes to carry out the action.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/40/40036e0d8f60a4d23251dd38f11e9f7b.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If you are quick enough to get into the corresponding Function container, you will see the directory inside it that references the storage.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/fe/fe358048e248cfac80fdd895c5fa79ed.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This is only the simplest data copy example. We can write more complex data processing functions and submit them directly to the serverless framework, leaving the backend storage to optimize and process the complex operations. Engineers can quickly implement the features users need, and can even build workflow pipelines, opening up more possibilities for applications.","attrs":{}}]}
]}
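The article shows the data copy function only as screenshots, so the following is a minimal, framework-agnostic sketch of what such a storage-side function might look like, not the author's actual code. The names `copy_handler`, `resolve_under`, and `demo`, the `src`/`dst` parameters, and the idea that the storage appears inside the function container as an ordinary mounted directory are all assumptions made for illustration; a temporary directory stands in for the real storage mount.

```python
import shutil
import tempfile
from pathlib import Path


def resolve_under(root: Path, relative: str) -> Path:
    """Resolve `relative` beneath `root`, rejecting paths that escape it."""
    root = root.resolve()
    candidate = (root / relative).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"path escapes storage root: {relative}")
    return candidate


def copy_handler(params: dict, storage_root: Path) -> dict:
    """Hypothetical function body: copy params['src'] to params['dst'].

    Both paths are interpreted relative to `storage_root`, which is assumed
    to be the directory where the storage is mounted in the container.
    """
    src = resolve_under(storage_root, params["src"])
    dst = resolve_under(storage_root, params["dst"])
    if not src.is_file():
        return {"status": 404, "error": f"no such file: {params['src']}"}
    dst.parent.mkdir(parents=True, exist_ok=True)
    # The copy happens next to the data; nothing is streamed to the caller.
    shutil.copy2(src, dst)
    return {"status": 200, "bytes_copied": dst.stat().st_size}


def demo() -> dict:
    """Exercise the handler against a temporary directory used as the mount."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "raw").mkdir()
        (root / "raw" / "sample.bin").write_bytes(b"x" * 1024)
        request = {"src": "raw/sample.bin", "dst": "prepared/sample.bin"}
        return copy_handler(request, root)


if __name__ == "__main__":
    print(demo())
```

In a real deployment the framework's HTTP trigger would supply `params` from the request, and `storage_root` would point at the mounted storage directory; the caller only sees the small status dictionary, which matches the idea that the submitter does not need to care how the data is handled on the backend.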