Apache Kylin in Practice at ZTO Express

{"type":"doc","content":[
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Big data technology has evolved continuously since its inception, with pain points driving each round of innovation. On Singles' Day (November 11) 2019, ZTO Express received more than 200 million orders in a single day. On average, more than 20 TB of data is generated daily, and real-time computing processes more than 100 billion records per day. Data at this scale poses major challenges for storage and computation. So how does ZTO analyze such massive data?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"1 The Evolution of OLAP at ZTO"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1.1 Platform Architecture"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/c7\/34\/c77095a9a07aae41f88288d7ed05de34.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The figure above shows ZTO Express's big data platform architecture. Kylin and Presto are the main OLAP engines. On the far right are the monitoring systems for each big data component, and the top layer is the platform tooling layer, which includes the scheduling system, the ad-hoc query system, and more."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1.2 The History of OLAP at ZTO"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"1) Impala"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Before 2017, Impala was our main engine for data analysis and report computation. Compared with Hive, Impala has the following notable advantages:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Fast queries: significantly better performance than Hive."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Compatible with the Hive data warehouse: it can analyze data stored in Hive."}]}]}
]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"But as data volumes grew and business requirements became more complex, Impala also revealed some problems:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"High memory requirements and insufficient stability: processes occasionally crashed."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"C++ technology stack: it added operations overhead and made custom development difficult."}]}]}
]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"2) Presto"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Against this background, ZTO introduced Presto in 2017, and in the first half of this year introduced Alluxio to accelerate the Hive tables that Presto queries most often, further improving Presto's query speed. Presto has the following advantages:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Stable service: servers rarely crash."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Very good performance: it handles interactive queries and can even run some ETL jobs."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Rich data source support: its plugin mechanism makes it easy to analyze data in Hive, Kudu, Kafka, TiDB, and other components, and even to join across data sources, for example joining Hive and Kafka data in a single SQL statement."}]}]}
]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Despite its many advantages, Presto also has several shortcomings:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Demanding cluster requirements: faster queries require more machines, more network bandwidth, more memory, and more powerful CPUs."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Repeated computation: identical queries still pull the data again and redo the distributed computation. This sometimes hurts us: a busy cluster, a heavily loaded NameNode, or network jitter can all slow queries down. To mitigate this, we introduced Alluxio to accelerate the Hive tables that Presto uses most, which greatly speeds up Hive table scans."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Trade-offs and compromises: you have to compromise between query speed and query complexity. Section 2, “Why Apache Kylin”, covers this point in detail."}]}]}
]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"3) Apache Kylin"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To solve this problem, we evaluated and adopted Apache Kylin in 2018. Kylin handles multidimensional analysis of massive data well and offers sub-second query response times. In addition, Kylin has the following standout advantages:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Standard SQL support: it provides JDBC\/ODBC\/REST API interfaces, which simplifies system integration."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Sub-second query speed: this is rare in the big data world, where very few engines reach sub-second latency. It is not just a speedup; it is a major improvement in user experience."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Query speed is largely independent of fact table size: as data grows, other OLAP engines slow down to varying degrees. With Kylin, data growth mainly affects cube build time and has little effect on query speed."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Low cluster requirements and stable service: even small and medium-sized companies can afford to run Kylin. Over the past two-plus years, our Kylin cluster has been stable, with no abnormal process exits."}]}]}
]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kylin's advantages are many and prominent, but it undeniably has shortcomings too:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Cube optimization has a high barrier to entry: it takes dedicated study and practice."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"It only suits multidimensional analysis with a fixed schema: in other words, the model cannot change frequently."}]}]}
]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"2 Why Apache Kylin"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Why did ZTO choose Kylin? Only because it better resolves the Presto trade-off mentioned above? Not entirely."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.1 Introduction to Apache Kylin"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/c8\/a5\/c885150f4abfyy8700a23e1a5d8242a5.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, recall the definition from the official website: Apache Kylin™ is an open source, distributed analytical data warehouse that provides a SQL query interface and multidimensional analysis (OLAP) capabilities on top of Hadoop\/Spark to support extremely large datasets, and it can query huge tables in sub-second time."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"A few keywords stand out in this definition. Open source: it is free, and the source code can be freely studied and modified. Sub-second: query performance is outstanding."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kylin has many features; the following four stand out:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Precomputation: it trades space for time by computing the possible aggregations from the model ahead of time, so the query engine does far less work at query time."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"High performance: at ZTO, more than 97% of Kylin queries return within 1 s."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Distributed: deploying more nodes multiplies query throughput."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Easy to integrate: it provides JDBC\/REST API interfaces for system integration."}]}]}
]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.2 The Classic Presto-Based Implementation"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/fb\/9e\/fbc65d38e7c8b926a79a8e096a92e79e.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As noted in the discussion of Presto's pros and cons, there is a trade-off between query performance and query complexity. To return results within 3 s, the query conditions cannot be too complex and the data volume cannot be too large. It is possible to have both: use ETL jobs to precompute and flatten the data into one wide table, load that wide table into Alluxio memory, and then run very simple queries through Presto. This approach works, but it inevitably introduces more problems:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Long development cycle: the ETL engineers first precompute the data into a wide table, the table is then accelerated with Alluxio, and finally the application team writes the SQL and code. The development cost is high."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Poor flexibility: even a small change can force major rework."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Wasted cluster resources: Presto pulls the same data and repeats the same computation over and over, putting pressure on the cluster and on network bandwidth."}]}]}
]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.3 Apache Kylin vs. Presto"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/70\/7e\/70543b05e3a12b9fa954da8c620b877e.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"1) Query latency comparison"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Against this background, we introduced Kylin. In our tests, Kylin's queries were anywhere from ten-plus times to several hundred times faster than Presto's. In some scenarios Presto could not finish at all because of memory limits or similar reasons, and the query was killed. Kylin, by contrast, performed very well."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"2) Kylin query latency distribution"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We ran simple statistics over the historical query data captured by our internal monitoring system: more than 97% of queries returned within 1 s, 1.35% took between 1 s and 3 s, and only 0.95% took longer than 3 s."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"3) Cluster size comparison"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The tests above were run on clusters of the following sizes. Presto ran on 50-plus machines, each with 64 cores and 256 GB of memory. Kylin was co-located with HBase on 5 nodes in total, 4 of which served queries, with 16 GB of memory allocated to each node."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"4) Summary"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Compared with Presto, Kylin delivered query speedups of up to several hundred times."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The vast majority of queries return in under a second."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Cluster requirements are lower: fewer machines deliver higher query performance."}]}]}
]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"3 Apache Kylin in Practice at ZTO"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After adopting Kylin, how do we use this auspicious mythical beast, and how does it empower ZTO Express? With these questions in mind, let us first look at a case: route parcel volume analysis."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/73\/1f\/7368873ce51fc6970b2fb49d3aacb81f.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.1 Business Description"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Route parcel volume analysis counts the parcel volume, weight, and number of transit centers on a given route. It involves many dimensions, such as the parcel's origin province and the first, second, and final transit centers it passes through, all of which can be viewed as dimensions of the parcel's route. There is also a weight-band dimension, which describes the weight range a parcel falls into, for example 1 to 3 kg or 3 to 5 kg. The measures include parcel count, total weight, and the number of centers passed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This report came with several pain points:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Many dimensions: more than 20."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Slow queries: the existing technical solution could not meet the query requirements."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Strict requirements: results must return within 5 s."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Large data volume: more than 200 million new records per day."}]}]}
]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When we ran this on Presto, query times ranged from 20 s to 60 s depending on the filter conditions, nowhere near the requirement."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.2 How Kylin Helps"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kylin's JDBC interface integrates seamlessly with FanRuan (帆軟). After several rounds of cube optimization, the first query took 2.9 s, and subsequent queries served from cache consistently took under a second, which met the business requirements and resolved the pain points."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.3 The Scale of Apache Kylin at ZTO"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/38\/94\/384b3e77752d1050b2698bb9a7aa4894.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kylin is deployed on 5 nodes: 1 job node and 4 query nodes. HBase runs on 40-plus machines. There are currently 63 cubes with a total size of more than 33 TB, built from more than 80 billion source records. Kylin serves more than 10,000 queries per day, and "},{"type":"text","marks":[{"type":"strong"}],"text":"more than 97% of queries complete in under 1 s"},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.4 Integrating Apache Kylin with the Scheduling System"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/04\/76\/04187f722f9ee1bf05192f8e15e7c376.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kylin provides many convenient REST APIs, which make it easy to integrate with a scheduling system and manage build job instances."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"1) Why integrate with the scheduling system"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Why integrate Kylin jobs into the scheduling system? Because the scheduler handles dependencies between jobs, automatically reruns failed jobs, and raises phone and DingTalk alerts on failure, so problems are noticed right away."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"2) Existing features"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"ZTO's scheduling system currently supports building a specified cube, refreshing (backfilling) a cube, and killing running jobs, which covers the routine needs of build job management."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"3) How the scheduling system integrates with Kylin"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How does the scheduling system integrate with Kylin to manage build jobs? Kylin provides a rich set of REST APIs for integration with third-party systems. The integration has two main steps:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, call the authentication API to authenticate the user; then call the build API to build the cube. This brings Kylin jobs under the scheduling system's management, which is very convenient."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.5 Apache Kylin Monitoring System: Minute-Level Monitoring"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/4b\/56\/4b6caf4f7612110dbf4b8a6d8f975c56.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next is monitoring, an essential part of any big data platform. We built our own monitoring and alerting system tailored to Kylin. Because we could not find a REST API that exposes Kylin query information, we modified the source code so that Kylin pushes query request information to Kafka. The Kylin monitoring system consumes it in real time and writes it to a database for further analysis."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kylin monitoring covers three areas: minute-level query monitoring, day-level monitoring and analysis, and anomaly monitoring. Minute-level monitoring is fine-grained and provides the following main features:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Queries per minute: counts the queries sent to Kylin each minute, which shows how business systems' query volume varies over time and helps locate abnormal traffic quickly."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Failed queries per minute: counts the queries that fail each minute, which helps determine whether Kylin has a problem or whether an application is sending bad queries."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Automatic kill of abnormal SQL: bad queries destabilize the cluster; killing them automatically limits the impact and keeps the cluster stable."}]}]}
]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.6 Apache Kylin Monitoring System: Day-Level Monitoring"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/04\/97\/04af1c166129c7ccdcd2be5deca6d297.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Day-level monitoring and analysis includes the following three features:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Daily query statistics: shows the total number of queries Kylin serves each day, including queries served from cache."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Slow SQL top N: a very useful feature that sorts each day's queries by latency in descending order, making it obvious which queries are bad and whether the corresponding cubes have room for optimization."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Query share by user: reports each application system's share of the daily query volume, which helps analyze how each system uses Kylin."}]}]}
]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.7 Apache Kylin Monitoring System: Anomaly Monitoring"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/b3\/3e\/b3684182bbb5d10b1a434728bc8de53e.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The last area is anomaly monitoring. It has no UI, only alert notifications, and it is the most important part of Kylin monitoring:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Cube expansion rate monitoring: following the official recommendation, a DingTalk alert is sent when a scan finds a cube whose expansion rate exceeds 1000%."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Segment anomaly monitoring: we once had a production cube whose numbers stopped matching. The cause turned out to be a build job that had failed several days earlier, leaving a hole between segments. This monitor catches such gaps early."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Job failure monitoring: pushes an alert whenever a build job fails."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Missing TTL monitoring: catches cubes created without a TTL, so that historical data does not go uncleaned."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kylin process monitoring: watches the Kylin processes around the clock; this is the top priority."}]}]}
]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.8 Apache Kylin Optimization Practices"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/b2\/e3\/b2718d03cf2e2a9yy90a28bd24fcyye3.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next are our Kylin optimization practices at ZTO, which fall into two main areas: tuning HBase-related parameters and tuning MapReduce-related parameters."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"1) HBase tuning"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kylin's default HBase settings do not enable table compression, so storage pressure grows as the number of segments rises. Enabling HBase table compression saved about 70% of disk space overall, a substantial gain."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"kylin.storage.hbase.compression-codec=snappy\nkylin.metadata.hbase-rpc-timeout=60000\nkylin.metadata.hbase-client-scanner-timeout-period=60000\nkylin.metadata.hbase-client-retries-number=10\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The other change adjusts the HBase timeout and retry parameters. During the ETL peak in the early morning hours, the HBase cluster is under heavy load and build jobs time out when reading from or writing to HBase, so we increased the timeouts and the number of retries."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"2) MapReduce tuning"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The second area is MapReduce-related parameters. Kylin's default maximum number of reducers is 500, which becomes a bottleneck in some cases, so we raised that limit (kylin.engine.mr.max-reducer-number). We also lowered the parameter used to compute the number of reducers (kylin.engine.mr.reduce-input-mb) from 500 to 250, which increases the number of reducers. With these changes, some build jobs ran nearly one third faster."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"kylin.engine.mr.mapper-input-rows=500000\nkylin.engine.mr.reduce-input-mb=250\nkylin.engine.mr.max-reducer-number=50000\nkylin.job.max-concurrent-jobs=15\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"3) Data management"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Following the official recommendations, regularly clean up metadata, temporary cube build data, and expired HBase tables, and back up the metadata regularly."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.9 Apache Kylin Source Code and Upgrades"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/d9\/73\/d91a1a9cdb9d0f767ba6773598e62d73.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The last part of our practice concerns the source code and version upgrades. Studying the Kylin source has many benefits: it deepens our understanding of Kylin, helps resolve tricky problems, and enables custom development. So far our changes to the Kylin source are few, focused on meeting requirements and fixing problems. The main changes are sending query information to Kafka and adding a distributed lock around updates to the data dictionary timestamp. We added the lock because of a problem we hit in production: refreshing multiple segments of the same cube at the same time occasionally raised an error. The upper-right part of the figure shows the exception stack trace."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Finally, there is merging important patches and upgrading: in July of this year we completed an upgrade from Kylin 2.5.1 to 3.0.2."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"4 Future Plans"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/06\/yy\/06cecce8d627f5b1feb78580a79437yy.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Looking ahead, we plan to explore the following directions:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kylin intelligent diagnosis: as a complement to the monitoring system, intelligent diagnosis is equally important. It produces a preliminary diagnosis based on preset rules to help users troubleshoot."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Query pushdown to Presto: Kylin already supports query pushdown. We plan to explore using Kylin as the unified query entry point and pushing queries that do not hit a cube down to Presto, so that the two engines complement each other."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Self-service analytics system: we expect Kylin to play an even larger role in this system."}]}]}
]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"About the author"},{"type":"text","text":":"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Wang Chenglong (王成龍) joined ZTO Express in 2016 and leads the OLAP team and the real-time computing platform team. Since 2018 he has driven the adoption of OLAP engines at the company."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"This article is republished from the WeChat official account apachekylin (ID: ApacheKylin)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Original link"},{"type":"text","text":":"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/mp.weixin.qq.com\/s?__biz=MzAwODE3ODU5MA==&mid=2653081667&idx=1&sn=f32e5185970ce704f2c3dc45e68e550b&chksm=80a4aeb2b7d327a4b350f6d9150efe14c2fbe060190cf527ab574878851bd51ece4600304d97&token=1340822333&lang=zh_CN#rd","title":"","type":null},"content":[{"type":"text","text":"Apache Kylin 在中通快遞的實踐"}]}]}
]}