The Evolution of Beike's OLAP Platform Architecture

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This article is compiled from a transcript of the talk given by Xiao Zan (肖贊), a senior engineer at Beike Zhaofang (貝殼找房), at the 2020 conference \"Engineering Architecture Practice for AI Technology\"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"1 Evolution of the Beike OLAP Platform Architecture"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/9f\/28\/9fc134cee7b2aa82eef2866a5ea3bf28.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the figure above shows, the evolution of the Beike OLAP platform architecture falls roughly into three stages:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The first stage, from 2015 to 2016, was the initial Hive-to-MySQL stage, in which the platform was built from scratch;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The second stage, from 2016 to early 2019, was the Kylin-based OLAP platform stage, in which the OLAP platform was built around the Apache Kylin engine;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The third stage, from early 2019 to the present, is the multi-engine OLAP platform stage, which removed the platform's tight coupling to Kylin and lets it flexibly support a variety of OLAP engines besides Kylin."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/dc\/90\/dc85c2ffc332740dc08928823015b590.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, Stage 0 - the initial Hive2MySQL stage"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this stage the data processing flow was fairly simple. Data such as logs and database logs (Dblog) was ingested into HDFS on the big data platform, either in batch through Sqoop or in real time through Kafka. After ETL processing on the big data platform, the scheduling system Oozie wrote the results into the relational database MySQL on a daily schedule, and the various reports were then produced from the data in MySQL."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This was a process of building from nothing, and many companies start out the same way. The advantage of this approach is simplicity: it can be up and running quickly. But the problems are also obvious:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"The whole platform is limited by what MySQL can do. MySQL cannot support fast query and analysis over large data volumes, and it generally struggles once the data reaches several million to tens of millions of records;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"This approach accumulates no shared capabilities: user requirements are handled case by case, so each requirement takes a long time to develop."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This is also why we call it Stage 0: little platform work was done in this stage, and strictly speaking it did not yet amount to a platform."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/8e\/d4\/8e97fe996fc865533d15649e08cdd6d4.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the company's business grew rapidly and demand for data applications surged, the problems with Hive2MySQL became increasingly prominent, and upgrading this primitive architecture became inevitable. The goals of the upgrade were straightforward: first, solve MySQL's inability to support analytical queries over massive data; second, build a real platform that accumulates shared capabilities."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
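The Stage 0 flow described in the article (ETL output written to MySQL on a daily schedule, with reports read back from MySQL) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Beike's implementation: Python's built-in sqlite3 module stands in for MySQL so the snippet runs anywhere, and the table name `daily_city_report`, its columns, the function names, and the sample rows are all hypothetical.

```python
import sqlite3
from datetime import date


def create_schema(conn):
    """Create the (hypothetical) reporting table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_city_report ("
        "dt TEXT NOT NULL, city TEXT NOT NULL, deals INTEGER NOT NULL, "
        "PRIMARY KEY (dt, city))"
    )
    conn.commit()


def load_daily_partition(conn, day, rows):
    """Replace one day's aggregates so that rerunning the daily job is safe.

    `rows` is an iterable of (city, deals) pairs, standing in for the
    result of the upstream ETL job.
    """
    dt = day.isoformat()
    with conn:  # delete + insert commit together, or roll back together
        conn.execute("DELETE FROM daily_city_report WHERE dt = ?", (dt,))
        conn.executemany(
            "INSERT INTO daily_city_report (dt, city, deals) VALUES (?, ?, ?)",
            [(dt, city, deals) for city, deals in rows],
        )


def report(conn, day):
    """Read one day's report rows, largest `deals` first."""
    cur = conn.execute(
        "SELECT city, deals FROM daily_city_report "
        "WHERE dt = ? ORDER BY deals DESC, city",
        (day.isoformat(),),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the MySQL instance
    create_schema(conn)
    day = date(2016, 1, 1)
    sample = [("Beijing", 120), ("Shanghai", 95)]  # made-up sample values
    load_daily_partition(conn, day, sample)
    load_daily_partition(conn, day, sample)  # simulated rerun of the daily job
    for city, deals in report(conn, day):
        print(city, deals)
```

Replacing the day's rows inside one transaction keeps a rerun of the daily job from duplicating data. Each new report built this way still needs its own table and load job, which is the case-by-case cost the article points to.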