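Interface requirements | Hive | MySQL | Redis | Latency requirement | Request QPS | >1s | <1 | √ | √ | √ | <1s | <10 | √ | √ | <500ms | >100 | √ | <100ms | >100 | √"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As a result, the Redis memory demand of the flight big data department has surged. By our current usage statistics, the department runs several dozen Redis clusters with over ten terabytes of memory in total. In return, interface performance is faster and more efficient than ever, with response times of roughly 10 ms in most cases."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.2 How to Query"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Redis offers essentially one way to query: look up a value by its unique key. That works well for simple single-key lookups, but when the same data source must be queried by several different keys, we have to maintain multiple copies of it. For example, the price trend interface serves many combinations: domestic, international, one-way, round-trip, by route, and by flight. Storing this in Redis means keeping the same data under several key layouts, which wastes a great deal of storage."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Another problem with Redis is filtering by time range. In the same price trend interface, a query must return, based on the query time, historical prices from the same period for a given range of departure times. We therefore have to store data for many dates (a structure such as a set could also be used, but then deleting expired data becomes a problem), and at query time we have to loop over the dates to fetch the prices in the requested range."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.3 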
How to Maintain"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"1) Interface maintenance"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The big data infrastructure platform team maintains several hundred interfaces, about 1\/3 of which serve data to callers. Some of these only perform simple queries, yet even simple queries require us to provide massive data storage and fast, accurate lookups. Before going live, every interface has to go through resource requests (machines and staff), data synchronization, development, and testing. The whole process takes 2-3 person-days, and nearly all of it is repetitive work. Freeing up these people and machines has become urgent."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"2) Data synchronization"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Most of the data we serve externally is stored in Hive. When it is not accessed through the Presto API, we have to import the Hive data into Redis or MySQL for the interfaces to read. On the Zeus platform we have built a variety of data import workflows. How can these simple, highly repetitive workflows be automated?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The overall interface architecture is shown below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/c3\/d6\/c36256d24c1152de3515d8f8772367d6.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Figure 1 
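Architecture with Redis\/MySQL as the primary storage"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"3. The Journey of the Flight Big Data Interfaces"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After carefully studying the performance of the callers themselves, we found that they mostly call third-party interfaces asynchronously. If all the third-party interfaces a caller depends on are seen as the staves of a barrel, the interface of the flight big data infrastructure team is one stave among them. As long as it is not the shortest stave, we can relax its response-time target without hurting the caller's overall performance (this is not a technical step backward, but a matter of choosing a suitable solution). After the storage comparison above, we found that no existing storage option fits the 100ms-500ms latency band."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We evaluated several NoSQL databases and, weighing storage, query capability, and other criteria, found that CrateDB best matched our actual needs. A comparison of the storage options follows:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":" |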
|
---|
CrateDB in Practice at Ctrip Flight BI
{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"1. Introduction"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the internet's traffic dividend nears its end, the major companies, while working hard to attract new customers, are also putting great effort into serving their existing ones, and scenarios aimed at better customer experience and personalized service keep emerging."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In practice, the Ctrip flight big data department has to synchronize data, choose engines to store and process it, and expose model results to production through interfaces. Our data storage journey therefore covers "},{"type":"text","marks":[{"type":"strong"}],"text":"the current state of our interfaces, the interface journey, installation and deployment, data synchronization, production use, and the future trend of containerization"},{"type":"text","text":". Along the way we ran into many problems and solved many of them. This article shares the flight big data platform's practical experience with data storage."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"2. Current State of the Flight Big Data Interfaces"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The interface group of the Ctrip flight big data platform faced three questions:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How to store"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How to query"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How to maintain"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.1 How to Store"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Before 2018, the interface group of the flight big data infrastructure team stored data mainly in Hive, MySQL, and Redis. Our current storage choices are as follows:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"