Design and Implementation of Palink, Palfish's Real-Time Computing Platform
In the early days of Palfish (https://tech.ipalfish.com/blog/2021/06/01/palink/), a series of real-time requirements emerged. Algorithm engineers wanted real-time user feature data for real-time recommendation, and product managers wanted the data team to provide real-time metric dashboards for operational analysis. At that stage, the data platform engineers mainly met these requests by developing jobs on the Spark real-time compute engine. However, these jobs were not managed by any unified platform, and the way a job was developed, submitted, and kept available varied entirely from person to person.

As the business grew faster, more and more real-time scenarios appeared, which placed higher demands on the development efficiency and quality assurance of real-time jobs. For this reason, last year we set out to build a company-wide real-time computing platform for Palfish, codenamed Palink, a combination of "Palfish" and "Flink". We chose Flink as the platform's only real-time compute engine because of its strong performance and leading position in the real-time field in recent years, and because its active community offers a great deal of practical experience to draw on. The Palink project is now in production use and meets Palfish's real-time needs well.

Core Principles

After surveying the real-time computing services offered by vendors such as Alibaba Cloud and NetEase, we settled on the overall product form of Palink. Throughout the system design we kept to the following core principles:

- Simplicity: keep the design simple and ship quickly; do not over-pursue feature completeness, and focus on meeting core needs.
- High quality: hold the project to strict quality standards and think the core modules through carefully.
- Extensibility: keep extensibility high so that later iterations and upgrades are easy.

System Design

Overall Platform Architecture

The diagram below shows the overall architecture of the platform:

[Figure: palink] https://static001.geekbang.org/infoq/8c/8ca26dab825b3e04022b48c3d11d42a6.png

The platform consists of four parts:

- Web UI: the front-end operation pages.
- Palink (Go) service: the real-time job management service. It manages job metadata and all state throughout a job's lifecycle, and it handles all front-end traffic. Its core modules are job scheduling, job submission, job status synchronization, and job HA management.
- PalinkProxy (Java) service: the SQL service. Flink SQL jobs are compiled by this module and submitted to the remote cluster. Its core modules are SQL syntax validation, SQL job debugging, and SQL job compilation and submission.
- Flink on YARN: cluster resource management based on Hadoop YARN.

We split the back end into two services, implemented in Go and Java respectively, for three reasons. First, Palfish has a very mature Go-based microservice framework. Services can be built on it quickly, and it comes with a full set of supporting tooling, including service monitoring; more than 95% of the company's services are currently built on this framework. Second, the SQL module is a secondary development on top of an open-source project (covered in detail later in this article), and that project is written in Java. Third, the cost of one extra remote call between internal services is acceptable. This choice also reflects the fast-delivery requirement in our simplicity principle. In fact, using Go as the core development language is very characteristic of Palfish, and this will also show in upcoming articles in the Palfish big-data series.

The rest of this article focuses on the design of several core Palink modules.

Job Scheduling & Execution

On receiving a job-creation request from the front end, the back-end service generates one PalinkJob record and one PalinkJobCommand record and persists them to the DB. PalinkJobCommand is an entity abstracted for the job submission and execution phase, and the whole scheduling process advances around state changes of this entity. Its structure is as follows:

[Table: PalinkJobCommand structure; the table content is truncated in the source]