Unveiling How Hologres Supports Ultra-High-QPS Online Serving (Point Query) Scenarios

{"type":"doc","content":[
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Hologres (Chinese name: 交互式分析, Interactive Analytics) is a one-stop real-time data warehouse developed in-house by Alibaba Cloud. This cloud-native system unifies real-time serving and big-data analytics, is fully compatible with the PostgreSQL protocol, and integrates seamlessly with the big data ecosystem. A single data architecture supports real-time writes, real-time queries, and federated analysis across real-time and offline data. It simplifies business architecture, enables real-time decision-making, and helps big data deliver greater business value.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This installment explains how Hologres supports point queries at ultra-high QPS.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Traditional OLAP systems usually play a fairly static role in a business: they analyze massive amounts of data to produce insights (such as precomputed views and models), and the results are then served online through a separate system (such as HBase, Redis, or MySQL). Serving and analytics are thus two disconnected processes. In practice, however, business decision-making is a continuously optimized online process. Serving generates large volumes of new data, which in turn requires complex analysis. Feeding the resulting insights back into serving in real time makes decisions more timely and creates greater business value.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Hologres is positioned as a one-stop real-time data warehouse that combines analytical and serving capabilities in one system, reducing data fragmentation and data movement. This article focuses on the serving capabilities of Hologres, whose core is the point query: what those capabilities are and how they are implemented.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A point query usually means a key/value lookup, a pattern widely used in online services. Because the demand is so broad, many KV databases target high-throughput, low-latency point queries. A well-known example is HBase, which exposes point queries through its own custom API and works well in many business scenarios. In practice, however, HBase also has drawbacks that have led many workloads to migrate from HBase to Hologres, mainly the following:","attrs":{}}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Once the data volume grows beyond a certain scale, HBase performance degrades and can no longer satisfy large-scale point-query workloads. Stability also suffers, and experienced operations staff are needed to keep it running.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"HBase exposes a custom API, which has a learning cost. Hologres serves high-throughput, low-latency point queries directly through SQL, and a SQL interface is simpler to use than the custom APIs of other KV systems.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"HBase uses a schema-free design without data types, which makes checking and repairing data quality more complex: errors are hard to find and hard to fix. Hologres supports almost all mainstream data types compatible with Postgres, and data can be inspected and updated with standard INSERT/SELECT/UPDATE/DELETE statements.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"In Hologres, a point query is a query on a row-oriented table by its primary key (PK).","attrs":{}}]}]}],"attrs":{}},
{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"-- Create a row-oriented table\nBEGIN;\nCREATE TABLE public.holotest (\n  \"a\" text NOT NULL,\n  \"b\" text NOT NULL,\n  \"c\" text NOT NULL,\n  \"d\" text NOT NULL,\n  \"e\" text NOT NULL,\n  PRIMARY KEY (a, b)\n);\nCALL SET_TABLE_PROPERTY('public.holotest', 'orientation', 'row');\nCALL SET_TABLE_PROPERTY('public.holotest', 'time_to_live_in_seconds', '3153600000');\nCOMMIT;\n\n-- Point queries through SQL (tbl and pk stand for a row-oriented table and its primary key)\nSELECT * FROM tbl WHERE pk = ?; -- look up a single key\nSELECT * FROM tbl WHERE pk IN (?, ?, ?, ?, ?); -- look up several keys in one query\n","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Technical Challenges of Point Queries","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Normally, a SQL statement is parsed by the SQL parser into an AST (abstract syntax tree), the query optimizer turns the AST into a plan (an executable plan), and executing the plan produces the result. To provide a high-throughput, low-latency, and stable point-query service through SQL, the following challenges must be overcome:","attrs":{}}]},
{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"How can a SQL interface reach high QPS without breaking the PostgreSQL ecosystem?","attrs":{}}]}]}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How to reduce or even avoid the overhead of SQL parsing and the optimizer","attrs":{}}]}]}],"attrs":{}},
{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"How should an efficient client SDK interact with the backend storage?","attrs":{}}]}]}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How to sustain highly concurrent interaction at low cost","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How to reduce the overhead of message passing","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How to sense backend load and coordinate with it to achieve the best throughput and latency","attrs":{}}]}]}],"attrs":{}},
{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"How can the backend storage stay stable while delivering high performance?","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"How to make the most of CPU resources","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"How to reduce memory allocations and copies, and avoid instability caused by issues such as hot keys","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"How to reduce the impact of I/O on cold data","attrs":{}}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Once these three categories of challenges are overcome, the overall workflow becomes very simple: the access layer (FrontEnd) talks to the backend storage directly through the client SDK.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/ba/ba54d4cfa7bb58ce28e6e7b51d542eaf.png","alt":"點查1.png","title":"點查1.png","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The following sections describe how Hologres overcomes these three challenges to deliver high-throughput, low-latency point queries.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Reducing and Avoiding SQL Parsing and Optimizer Overhead","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Short Cut in the Query Optimizer","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because point queries are simple enough, the Hologres query optimizer takes a short cut for them: a point query does not go through the full optimizer pipeline. After the query enters the FrontEnd, it is handed to the Fixed Planner, which generates the corresponding Fixed Plan (the physical plan for a point query). The Fixed Planner is very lightweight: it skips equivalence transformations, logical optimization, and physical optimization, and only performs some simple analysis of the AST to build the Fixed Plan, avoiding as much optimizer overhead as possible.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Prepared Statement","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Although the query optimizer takes a short cut for point queries, the cost of parsing the query once it reaches the FrontEnd remains, and the optimizer overhead is not eliminated entirely.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Hologres is compatible with Postgres, whose frontend/backend communication protocol offers two query modes, the simple protocol and the extended protocol:","attrs":{}}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Simple protocol: a one-shot exchange. The client sends the SQL text to the server each time; the server parses and executes it and returns the result to the client. Under the simple protocol, the server must at least parse each incoming SQL statement to understand its semantics.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Extended protocol: the interaction between client and server takes multiple steps, which fall roughly into two phases.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Phase one: the client defines a named statement on the server, and the server generates the generic plan for that statement (a general plan that is not bound to any specific parameters).","attrs":{}}]}]}],"attrs":{}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/96/96d5f18e65730735775357e846e23f43.png","alt":"點查2.png","title":"點查2.png","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Phase two: the client executes the statement defined in phase one by sending concrete parameters. Phase two can be repeated many times; each request carries the statement name from phase one together with the parameters needed for execution, and runs the generic plan generated in phase one. Because phase two reuses the prepared generic plan by statement name and parameters, its overhead on the FrontEnd is almost zero.","attrs":{}}]}]}],"attrs":{}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Hologres therefore supports prepared statements on top of the Postgres extended protocol, bringing the FrontEnd overhead of a point query close to zero.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"High-Performance Internal Communication","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"BHClient is an efficient private client SDK implemented by Hologres for communicating directly with the backend storage. Its main advantages are:","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"1) Reactor model with fully lock-free asynchronous operations","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"BHClient works much like the reactor model: each target shard has its own event loop that processes the requests for that shard in an endless loop. Because HOS (Hologres Operation System) abstracts the unit of scheduling and execution, the baseline cost of this approach stays low even when there are many shards.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"2) Efficient data exchange protocol: binary row","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A custom internal data communication protocol, binary row, reduces memory allocations and copies along the entire request path.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"3) Backpressure and batching","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"BHClient senses backend load and applies adaptive backpressure and batching, raising system throughput without hurting latency.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Stable and Reliable Backend Storage","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"1) LSM (Log-Structured Merge Tree)","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Hologres row-oriented tables are stored in an LSM tree. Compared with a traditional B+ tree, an LSM tree provides higher write throughput because it performs no random writes: append-only operations ensure that data is only ever written to disk sequentially.","attrs":{}}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A row-oriented tablet has one memtable and multiple immutable memtables.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"All updates are written to the memtable. When the memtable is full it becomes an immutable memtable, which is then flushed to a key-ordered SST (Sorted String Table) file. Once created, an SST file cannot be modified, so no random writes occur.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"SST files are organized into levels in the file system. Files on level 0 are unordered relative to one another and may overlap; on every other level, files are ordered and do not overlap. A lookup therefore has to check the level-0 files one by one, whereas on other levels it can use binary search. Compaction merges SST files from a lower level into new SST files on a higher level, so data on lower levels is newer than data on higher levels; once a matching key is found on some level, there is no need to search the higher levels.","attrs":{}}]}]}],"attrs":{}},
{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"2) Fully asynchronous development in C++","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Hologres is not the only system that organizes data with an LSM tree. Since Google's Bigtable paper popularized the LSM design, many systems, HBase among them, have adopted it. Hologres is written in C++; compared with Java, a native language lets us pursue more extreme performance. It is also developed in a purely asynchronous style on top of the asynchronous interfaces provided by HOS (Hologres Operation System). HOS manages CPU scheduling itself through the ExecutionContext abstraction, which makes the most of the hardware and maximizes throughput.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"3) I/O optimization and a rich set of caches","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Hologres implements a rich set of caches, including row cache, block cache, iterator cache, and meta cache, to speed up lookups of hot data, reduce I/O, and avoid new memory allocations. When I/O is unavoidable, Hologres merges concurrent I/O requests and uses a wait/notice mechanism to ensure that the I/O is issued only once, reducing the amount of I/O processed. It also builds file-level dictionaries and applies compression to cut physical storage cost and I/O.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Summary","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Hologres aims to be a one-stop real-time data warehouse. Beyond handling complex OLAP analytics, it also supports online point-query serving at ultra-high QPS. With a standard Postgres SDK, applications get low-latency, high-throughput online serving through SQL, which lowers the learning cost and improves development efficiency.","attrs":{}}]}
]}
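The LSM read path described in the backend storage section can be illustrated with a small Python sketch. This is not Hologres code: the names `sst_get`, `Level`, and `lsm_get` are hypothetical, memtables are modelled as plain dictionaries, SST files as sorted lists of key/value pairs, and deletions, caches, and on-disk formats are left out. It only shows the lookup order: memtable first, then immutable memtables, then every level-0 file, then a binary search on each deeper level, stopping at the first hit.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# An SST file is modelled as a list of (key, value) pairs sorted by key.
SST = List[Tuple[str, str]]


def sst_get(sst: SST, key: str) -> Optional[str]:
    """Binary-search a single key-ordered SST file."""
    # (key,) sorts just before any (key, value) pair with the same key.
    i = bisect_left(sst, (key,))
    if i < len(sst) and sst[i][0] == key:
        return sst[i][1]
    return None


@dataclass
class Level:
    """A level above 0: non-empty files, ordered by key range, no overlap."""
    files: List[SST]
    first_keys: List[str] = field(init=False)

    def __post_init__(self) -> None:
        # Smallest key of each file, used to locate the candidate file.
        self.first_keys = [f[0][0] for f in self.files]

    def get(self, key: str) -> Optional[str]:
        # Last file whose smallest key is <= key; only it can hold the key.
        i = bisect_right(self.first_keys, key) - 1
        if i < 0:
            return None
        return sst_get(self.files[i], key)


def lsm_get(
    key: str,
    memtable: Dict[str, str],
    immutables: List[Dict[str, str]],  # newest first
    level0: List[SST],                 # newest first, may overlap
    levels: List[Level],               # level 1, level 2, ...
) -> Optional[str]:
    """Return the newest value for key, or None if it is absent."""
    if key in memtable:                # newest data lives in the memtable
        return memtable[key]
    for table in immutables:           # then the immutable memtables
        if key in table:
            return table[key]
    for sst in level0:                 # level 0: every file must be checked
        value = sst_get(sst, key)
        if value is not None:
            return value
    for level in levels:               # deeper levels: binary search per level
        value = level.get(key)
        if value is not None:
            return value               # lower levels are newer, so stop here
    return None
```

Because each layer is consulted from newest to oldest, the first match is the current value, which is why hot keys served from the memtable or level 0 never touch the deeper levels.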