Don't Blink! The 58 Group Kylin Platform Has Already Finished a Query!

Original

2021-03-22 18:35

{"type":"doc","content":[
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A P90 query response time of 0.5 s, 700 Cubes, 122 Projects, more than 16,000 Segments, 500 TB of single-replica storage, 200,000 queries per day, and 20 billion records ingested per day. 58 Group has been using Apache Kylin for nearly five years, since 2016, and more than 20 business lines and subsidiaries now rely on it. Over that time, 58 Group has also made a series of optimizations to Kylin and contributed them back to the community so that more Kylin users can benefit. At the Kylin Meetup that wrapped up last week, we invited Yang Zheng (楊正) from the 58 Group big data platform team to share how Kylin is used and optimized at 58.com (58 同城)."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Talk Outline"}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Overview of the 58 Group Kylin platform"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Practices: Kylin version unification, cross-cluster reads and writes, Cube governance, and more"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kylin performance and usability optimizations"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Future plans"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The transcript of the talk follows 👇 (a video replay is linked at the end!)"}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"The 58 Group Kylin Platform"}]},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/87\/60\/875eba9eb4a1cf692b873f188f509560.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, an introduction to the 58 Group Kylin platform. Kylin has many strong features; here are five of them:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Sub-second response times"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Standard SQL support"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Seamless integration with BI tools"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"High throughput, supporting data at the scale of hundreds of billions of records"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A friendly interface that is simple to use"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Thanks to these features, Kylin is widely used at 58 Group. It currently serves more than 20 business lines and subsidiaries, in scenarios such as BI reports, user behavior analysis, recommendation, and commercial data products."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/ba\/yy\/ba57552fb748d801fffde8ef342810yy.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The figure above shows the architecture of the 58 Group Kylin platform. At the bottom is the storage layer, which uses HDFS and HBase to store Cubes. The second layer from the bottom is the compute layer, which supports building Cubes with both MR and Spark. In the middle are the Kylin services: the Job service, the query service, the Web and REST services, and the metadata management center. Above that is operations and monitoring. We have a Kylin operations platform that covers Kylin work orders; statistics on Kylin jobs, queries, and resources; and the Cube governance feature. The monitoring center covers Kylin process monitoring, query monitoring, and job monitoring. We also provide a job scheduling feature for Kylin. At the top are the Kylin applications: BI reports, recommendation, behavior analysis, commercial data products, and other scenarios."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/04\/yy\/04cd8ac2c5e4100ff2fa469290d883yy.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kylin officially became an Apache top-level project in November 2015, and Kylin 1.5.3 was released in July 2016. At 58 Group, we put Kylin 1.5.3 into production in August 2016 and Kylin 2.6.0 in March 2019. In May 2020 we migrated all Cubes from 1.5 to 2.6.0, which unified our deployment on version 2.6.0. We are now evaluating and testing Kylin 4.0."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/33\/e9\/33e87f98df3edcbc297856ff6fdc53e9.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This is the current state of our platform: 700 Cubes, 122 Projects, more than 16,000 Segments, 500 TB of single-replica storage, and 200,000 queries per day. The average response time is 0.4 s, with P90 and P99 at 0.5 s and 4 s respectively, and the daily input is 20 billion records."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Kylin Practices and Optimizations at 58 Group"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Below are some of our practices and optimization experience."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Version Unification"}]},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/c0\/54\/c0d3e59ae73e593a97526ae194742054.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, version unification. The background is that 58 Group ran several Kylin clusters on different versions, which made maintenance costly. Kylin 1.5 was also quite old and lacked many features: it does not support building with Spark, does not support snowflake models, and does not support advanced features such as automatic pruning with Cube Planner. Its overall performance is also lower than that of Kylin 2.6. We therefore decided to migrate every Cube from Kylin 1.5 to Kylin 2.6."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Our plan was to migrate the metadata of all Cubes from the 1.5 cluster to the 2.6 cluster and then rebuild the historical data on the 2.6 cluster. The first step was the metadata migration. Kylin provides a metadata migration tool, but the metadata of Kylin 1.5 and Kylin 2.6 is incompatible. So we first cloned the Kylin 1.5 metadata to a local directory, ran it through a metadata adapter tool we developed to convert it into metadata files that Kylin 2.6 can recognize, and then put the result into the Kylin 2.6 cluster. Once all the metadata was on the 2.6 cluster, we optimized the Cubes based on it. Common optimizations include switching to the Spark build engine and tuning settings such as dimension order, dimension encoding, sharding, aggregation groups, TTL, merge thresholds, and concurrency granularity."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After optimizing the Cubes, we started building them, which splits into building historical data and building incremental data. For historical data we built Segments that span as long a time range as possible, which in effect merges the Segments of the original Cube. After the build, we compared the Segment date ranges of the same Cube on the two clusters and ran verification queries to make sure the query results were consistent. We then asked users to switch to the new Kylin domain name, and after a period of trouble-free operation we cleaned up the data in Kylin 1.5. With this process, it took us one month to migrate more than 400 Cubes from Kylin 1.5 to the Kylin 2.6 cluster."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Cross-Cluster Storage and Query"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next is Kylin cross-cluster storage and query. The background is that the data center hosting the HBase cluster used by Kylin had reached its physical limit on the number of racks, so the cluster could no longer be expanded."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/c1\/4e\/c1cd646a25b8c1116bcd69fc2b66cd4e.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We have several HBase clusters in different data centers, and the idea was to use them together as the Cube storage for a single Kylin cluster. The figure above shows the metadata of two Segments of one of our Cubes; you can see that they are stored in different HBase clusters."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/d8\/b4\/d817ccf4b6fc51d1e340d89fc2d33db4.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next is how cross-cluster storage works in Kylin. Before getting into that, let me briefly describe the Cube build and merge flows."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Cube build flow is as follows:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Extract the data from Hive or Kafka into a temporary table;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Build the dictionaries, which include dimension dictionaries and global dictionaries;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Create the HBase table;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"Build the Cuboids, that is, the aggregated values for each combination of dimensions, and then convert them into HFiles, the underlying file format of HBase;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"Finally, import the HFiles into HBase with the BulkLoad program."}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That is the Cube build flow. A Cube merge works from the dictionary files and Cuboid data saved by earlier Cube builds: it first merges the dictionaries, then creates the HBase table, then merges the Cuboid data. The remaining steps are the same as in the build flow."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With the build and merge flows understood, we listed every step in them that connects to HBase:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Creating the HBase table;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Converting Cuboid data into HFiles;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The BulkLoad job;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Storing and querying Lookup tables;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Deleting unused HTables during garbage cleanup in the Merge job;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Obtaining the HDFS storage path of the HBase cluster"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To implement cross-cluster storage, before each of these steps runs we fetch the HBase storage configuration of the Cube being built, use it to select the corresponding HBase cluster and create a connection, and then carry out the operation, such as creating the HBase table or running the BulkLoad job. That is how storage across HBase clusters is achieved."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Our cross-cluster storage supports configuration at the Cube level, the Project level, and the global level. In other words, different Cubes can be stored in different clusters, all Cubes under a given Project can be stored in a cluster of their own, and there can also be a global setting, so configuration is quite flexible. The rule is that Cube-level configuration takes precedence over Project-level configuration, which in turn takes precedence over the system configuration."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/23\/a0\/23d688e0453575e3f37a52ac86d3ffa0.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next is how cross-cluster query works. First, a brief look at the Kylin query flow. Step one uses Calcite to parse the SQL into an abstract syntax tree. The abstract syntax tree is then converted into a logical plan, the logical plan is optimized, and Calcite generates and compiles code. Step five wraps the HBase Scan requests. A Kylin query may span Segments and therefore read several HBase tables, so a single query may send multiple Scan requests to different tables. When a RegionServer receives a Scan request, it uses a coprocessor to scan, filter, and aggregate the data and then returns the result to Kylin. Kylin decodes the data it receives and hands it to Calcite for further iterative computation. That is the whole Kylin query flow."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To query across HBase clusters, at the point where the HBase Scan requests are wrapped and before a request is sent for a given Segment, we look up in the metadata which HBase cluster the Segment belongs to, create a Connection to that cluster, and then send the request. The multiple Scan requests thus go to multiple HBase clusters. Once the HBase clusters have finished processing, the results are returned together to Kylin, which carries out the subsequent processing. That is how Kylin cross-cluster query is implemented."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Query Control and Diagnosis Optimizations"}]},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/c5\/44\/c5ef3ddfd0yy10358faf2e46a128fe44.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This part covers our optimizations to query control and diagnosis in Kylin. Our first step was to divide the query path into stages, five in total."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"The Calcite processing stage, which groups the Calcite parsing, conversion, and optimization steps described earlier into a single stage;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"The stage that wraps the HBase Scan requests;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"The stage that starts multiple threads to send the requests;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"The stage in which the RegionServers scan, filter, and aggregate the data;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"The stage in which the Kylin node performs a second aggregation and a merge sort on the results returned by HBase."}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We record the time spent in each stage. If a user later reports that a query is slow, we can quickly diagnose which stage had the problem and locate the performance issue."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We did the following work:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"First, we enriched the query information"},{"type":"text","text":". This includes not only the per-stage timings just mentioned but also other details. For example, by default Kylin only saves the id of the Cuboid that the SQL hit; we also save the id of the Cuboid that corresponds to the user's SQL, and during diagnosis we compare it with the Cuboid that was actually hit. We also convert this id from decimal to binary, which shows clearly which dimensions the user queried and where those dimensions sit in the Rowkey."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Second, we collect slow queries"},{"type":"text","text":". We save slow query information in an HBase table and show it in the Kylin logs, on the slow query page, and in the Kylin daily report. By default, Kylin collects slow queries through periodic inspection, which introduces some uncertainty and causes some queries to be missed. We changed this logic so that a query is handled proactively as soon as it reaches a configured latency threshold. For example, we configured queries to be collected as slow queries once their latency exceeds 10 seconds and to be interrupted once it exceeds 60 seconds."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/34\/66\/3487567447ce6018c5ed741ff6463666.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The figure above is the Kylin slow query page. We added information such as the name of the Cube that was hit, Cuboid information, and the time spent in each query stage. This extra information makes slow queries faster to diagnose, makes query governance easier, and helps improve query performance."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/d2\/25\/d2c5410b97252345df4aa5aed081d925.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We also placed some limits on queries. Kylin provides several parameters for limiting queries. The first two parameters shown above deserve special mention because their default value is 0, meaning no limit. With no limit, an unreasonable query from a user may scan an entire HBase table, which is a dangerous operation that can easily bring down HBase nodes or Kylin nodes. Configuring the parameters above effectively prevents an unreasonable Cube design or query from hurting the performance and stability of the cluster."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next is our work on Kylin query monitoring. We expose the Query Metrics held in memory on each Kylin node through an HTTP API, a solution we have also contributed to the community. A tcollect program then writes the data from JMX into OpenTSDB in real time; OpenTSDB is a high-performance, high-throughput time series database that is also built on HBase. Finally, we configure the various monitoring metrics in Grafana."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/01\/31\/0193969fbd37878058de8596fdda3c31.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This figure is our monitoring page, which covers every Kylin node and Cube. It includes metrics such as P99 and P95 for each node and each Cube, as well as the Scan time and data volume for each Cube."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Build Performance Optimizations"}]},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/4b\/44\/4ba3849c4d31f3b86c9350f3c62f9d44.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next are some of our optimizations to build performance. First, we encourage users to prefer the Spark build engine. The build algorithm commonly used in Kylin is the layered (by-layer) build algorithm. With MR, every layer launches an MR job, and data has to be written to disk and read back many times between jobs. With Spark, the transformations between RDDs happen in memory, so builds can be much faster. The figure on the right shows some of our tests: building with Spark improved build performance by about 25%."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We also optimized the global dictionary. First, why is a global dictionary needed? Kylin supports both exact and approximate count distinct. Exact count distinct in Kylin is implemented with Bitmaps, and a Bitmap only accepts Int values. To run exact count distinct on string data, a global dictionary is needed to map each string to an Int value."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/2c\/cc\/2cdd1c5d0415c04b25c75e18571ccbcc.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kylin has two global dictionary implementations: one based on a Trie and one based on Hive tables."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Build mode: a Trie-based global dictionary can only be built on a single node, and multiple global dictionaries can only be built serially, whereas Hive-based global dictionaries can be built in parallel on multiple nodes;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Elasticity: with UHC (ultra-high-cardinality columns) enabled, multiple Trie-based global dictionaries can be built in parallel through MR, but each global dictionary can still only be assigned to a single Reduce task, whereas a Hive-based global dictionary can be sped up by adding resources;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Resource usage: the Trie approach uses relatively few resources, but those are Kylin instance resources, which are scarce. The Hive-based approach runs multiple MR jobs and Hive SQL statements with several Shuffle stages in between, so its resource overhead is fairly high, but it does not consume Kylin resources."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Bottlenecks: with high-cardinality columns and multiple count-distinct columns, memory easily becomes the bottleneck for a Trie-based global dictionary. A Hive-based global dictionary can run into data skew in some cases, which slows down the overall build."}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Trie-based global dictionary was released in Kylin 1.5.3 and the Hive global dictionary in Kylin 3.1.0. Our comparison showed that the Hive-based global dictionary has clear advantages, so we merged this Kylin 3.1.0 feature into our internal 2.6.0 version. Incidentally, Kylin 4.0 has a new global dictionary implementation; we will compare it once we have 4.0 in production."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The optimizations for the Trie-based global dictionary are as follows:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Split into small dictionaries at build time"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Enable UHC and increase Reduce memory"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Consider Segment-level global dictionaries"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Reuse global dictionaries"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Split the Cube"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With these optimizations, the Trie global dictionary covers most scenarios in our group. In some scenarios, however, a Trie-based build is extremely slow or runs out of memory (OOM) and cannot complete. In those cases we recommend the Hive global dictionary. The Hive global dictionary has its own optimizations: reusing a global dictionary across Cubes to avoid wasting resources, using Map join to deal with data skew, and adding resources to raise concurrency."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/9e\/8e\/9e72baa52f982a48648a4cd0ec6d148e.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The figure above compares build times for the two kinds of global dictionary. In our two test cases, using the Hive global dictionary instead of the Trie global dictionary improved Cube build efficiency by about 40%."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Job Scheduling"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next is job scheduling for Kylin. Kylin itself does not provide job scheduling; it only provides build-related RESTful APIs. Our earliest solution was based on Crontab plus Shell scripts, which had many drawbacks: jobs were hard to manage, and the Shell scripts were hard to maintain. Later we implemented Kylin job scheduling on top of the group's internal scheduling system."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/9d\/87\/9d73dbe5099f082fe26cfb73413ff187.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The figure on the left is the work order for adding a scheduled job. The user submits the work order, and once an administrator approves it the job can be created. Jobs can be managed visually. Besides time-based scheduling, dependencies on other jobs can be configured so that reports are produced as early as possible, and there are retry and alerting mechanisms that keep job scheduling stable and reliable."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Cube Governance"}]},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/0b\/78\/0b0cafcdc0903b746525b1b929112b78.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next is the Cube governance feature, which lets us quickly locate unreasonable Cubes. Using the filters above, such as online status, days since last build, days since last query, Cube storage size, query latency, and expansion rate, we can quickly pick out unreasonable Cubes and then govern them accordingly. We evaluate the state of all Segments of each Cube: whether there are holes between Segments, whether there are empty Segments, how large the Segments are, and whether the merge thresholds are set properly. Combining these factors, we give each Cube a status score and corresponding governance suggestions."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Multi-Tenancy Optimizations"}]},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/7e\/f2\/7e8a4ff95fceb08db94e4c95d78a64f2.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next are our multi-tenancy optimizations for Kylin. The background is that Kylin can only use storage and compute resources as a default user, so in a setting that supports many businesses it cannot isolate resources or account for costs effectively."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Our solution uses HBase's UGI and Hadoop's proxy mechanism to connect Kylin to multi-tenancy at 58 Group. As the figure shows, different users are isolated in HDFS, YARN, and HBase. In HDFS, different users have different directories with corresponding permissions. In YARN, each user maps to a resource queue for compute isolation. In HBase, each user has a corresponding Namespace for permission isolation and an RSGroup for physical isolation. HDFS, YARN, and HBase each also have a Quota mechanism, which makes it easy for us to limit resources and account for costs per user."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Finally, our optimizations on HBase. First, we use RSGroup for physical isolation, which protects the query performance of key businesses. Second, we periodically clean up Kylin garbage data and merge small Segments, which reduces the number of small tables. Third, we enabled short-circuit reads, which effectively reduce network IO when HBase table locality is high. Fourth, we enabled the Hedged Read feature, which effectively reduces read latency spikes. Fifth, we disabled the BlockCache for all Kylin Scans. Sixth, we changed BulkLoad of HFiles from copy mode to move mode, which avoids unnecessary disk IO. Lastly, we separated reads and writes in HBase to avoid contention and starvation between read and write resources."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/41\/cb\/41ab9eb7856bea6a6e5ce05785a2eecb.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These are some of our other optimizations, which we have also contributed to the Kylin community."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"The Future of Kylin"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Finally, our outlook for the future."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The first item is putting Kylin 4.0 into production. We are currently evaluating and testing based on the Kylin 4.0.0-alpha version."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kylin 4.0 simplifies the Kylin architecture. It uses Spark for both building and querying, removes HBase entirely, and stores data in Parquet, which makes it more lightweight. Because Parquet is a columnar format, it is friendly to both query performance and storage footprint. Beyond that, there is dynamic Schema update: today, adding or removing a dimension on a Cube requires backfilling the data, which is costly, so we are looking forward to this feature."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The second direction is self-service governance. Cube governance today still needs some human involvement, and we hope it can become fully automatic. Third, the Kylin 4.0 architecture is well suited to the cloud, and we will consider moving to the cloud in the future. Lastly, the version we use now, Kylin 2.6.0, does not support real-time OLAP, and we hope to support Kylin real-time OLAP after Kylin 4.1 is released."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Watch the video replay here 👇"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/www.bilibili.com\/video\/BV1Th41117u4?from=search&seid=328426979158923842","title":"","type":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Full video"}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"About the author"},{"type":"text","text":":"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Yang Zheng (楊正) is a big data platform engineer at 58 Group. He is responsible for building the group's platforms for real-time storage and offline analysis of massive data, providing massive data storage and offline analysis to the group's business lines and subsidiaries on top of foundational components such as HBase and Kylin."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"This article is reprinted from the WeChat official account apachekylin (ID: Apachekylin)."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Original article"},{"type":"text","text":":"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/mp.weixin.qq.com\/s?__biz=MzAwODE3ODU5MA==&mid=2653082143&idx=1&sn=f1521d85f57c962bd71cabd6cf7895a3&chksm=80a4aceeb7d325f85240d7e506f25c1dd1d872e3cfb7e7a69e4486c4184f14cb89e4a1ef5622&token=1845978438&lang=zh_CN#rd","title":"","type":null},"content":[{"type":"text","text":"Don't Blink! The 58 Group Kylin Platform Has Already Finished a Query!"}]}]}
]}