Building the World's Largest Kafka Deployment: Uber's Multi-Region Disaster Recovery Practice

{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Uber的Kafka生態系統"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Uber擁有世界上最大的Kafka集羣,每天處理數萬億條消息和幾個PB的數據。如圖1所示,Kafka現在成了Uber技術棧的基石,我們基於這個基石構建了一個複雜的生態系統,爲大量不同的工作流提供支持。其中包含了一個用於傳遞來自乘客和司機App事件數據的發佈\/訂閱消息總線、爲流式分析平臺(如Apache Samza、Apache Flink)提供支持、將數據庫變更日誌流到下游訂閱者,並將各種數據接收到Uber的Hadoop數據湖中。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/c8\/c8b2b551d79b5f2fd1fdc1ca9be2c221.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"圖1:Uber的Kafka生態系統"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"爲了能夠基於Kafka構建一個可伸縮、可靠、高性能、易於使用的消息傳遞平臺,我們克服了許多挑戰。在這篇文章中,我們將着重介紹在進行災難恢復(因集羣宕機導致)時所面臨的一個挑戰,並分享我們如何構建一個多區域的Kafka基礎設施。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Uber的Kafka多區域部署"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"提供業務彈性和連續性是Uber的首要任務。我們制定了詳細的災難恢復計劃,儘量減少自然和人爲災難(如停電、災難性軟件故障和網絡中斷)對業務的影響。我們採用多區域部署策略,將服務與備份一起部署在分佈式的數據中心中。當一個區域的物理基礎設施不可用時,服務仍然可以在其他區域運行。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們構建了一個多區域Kafka架構,實現了數據冗餘,爲區域故障轉移提供支持。Uber技術棧中的很多服務都依賴Kafka來實現區域級故障轉移。這些服務是Kafka的下游,並假定Kafka中的數據是可用且可靠的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"圖2描繪了多區域Kafka架構。我們有兩種集羣:生產者在本地向區域集羣發佈消息,將來自區域集羣的消息複製到聚合集羣,以此來提供全局視圖。爲簡單起見,圖2只顯示了兩個區域的集羣。"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/03\/03970708e68953ea322a4b84c986a3bf.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"圖2:兩個區域之間的Kafka複製拓撲"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在每個區域,生產者總是在本地生產消息,以便獲得更好的性能,當Kafka集羣不可用時,生產者會轉移到另一個區域,然後向該區域的區域集羣生產消息。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這個架構中的一個關鍵部分是消息複製。消息從區域集羣異步複製到其他區域的聚合集羣。我們開發了uReplicator("},{"type":"link","attrs":{"href":"https:\/\/eng.uber.com\/ureplicator","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/eng.uber.com\/ureplicator"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":")——Uber的Kafka數據複製解決方案,健壯且可靠。uReplicator擴展了Kafka的MirrorMaker,專注於可靠性、零數據丟失保證和易維護性。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"從多區域Kafka集羣消費消息"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"從多區域集羣消費消息比生產消息更爲複雜。多區域Kafka集羣支持兩種類型的消費模式。"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"雙活模式"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"一種常見的類型是雙活(Active\/Active)消費模式,消費者在各自區域中消費聚合集羣的主題。Uber的很多應用程序使用這種模式消費多區域Kafka集羣裏的消息,而不是直接連接到其他區域。當一個區域發生故障時,如果Kafka流在兩個區域都可用,並且包含了相同的數據,那麼消費者就會切換到另一個區域。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"例如,圖3顯示了Uber的動態定價服務(即峯時定價)如何使用雙活模式來構建災備計劃。價格是根據附近地區最近一系列打車數據來計算的。所有的打車事件都被髮送到Kafka區域集羣,然後聚合到聚合集羣中。然後,在每個區域,一個複雜的、佔用大量內存的Flink作業負責計算不同區域的價格。接下來,一個全活服務負責協調各個區域的更新服務,並分配一個區域作爲主區域。主區域的更新服務將定價結果保存到雙活數據庫中,以便進行快速查詢。"}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/fc\/0d\/fcd9dcc99894593a40f63c198c45be0d.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"圖3:雙活消費模式架構"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"當主區域發生災難時,雙活服務會將另一個區域作爲主區域,峯時價格計算會轉移到另一個區域。需要注意的是,Flink作業的計算狀態規模太大了,無法在區域之間同步複製,因此必須使用聚合集羣的輸入消息來計算其狀態。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們從實踐中獲得了一個很關鍵的經驗,可靠的多區域基礎設施服務(如Kafka)可以極大地簡化應用程序針對業務連續性計劃的開發工作。應用程序可以將狀態存儲在基礎設施層中,從而變成無狀態的,將狀態管理的複雜性(如跨區域的同步和複製)留給基礎設施服務。"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"主備模式"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"另一種多區域消費模式是主備模式(Active\/Passive):一次只允許一個消費者(通過唯一名稱標識)從一個區域(即主區域)的聚合集羣中消費消息。多區域Kafka集羣跟蹤主區域的消費進度(用偏移量表示),並將偏移量複製到其他區域。在主區域出現故障時,消費者可以故障轉移到另一個區域並恢復消費進度。主備模式通常被支持強一致性的服務(如支付處理和審計)所使用。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在使用主備模式時,區域間消費者的偏移量同步是一個關鍵問題。當用戶故障轉移到另一個區域時,它需要重置偏移量,以便恢復消費進度。由於Uber的很多服務不能接受數據丟失,所以消費者無法從高水位(即最新消息)恢復消費。另外,爲了避免過多的積壓,消費者也不能從低水位(即最早的消息)恢復消費。此外,從區域集羣聚合到聚合集羣的消息可能會變得無序。由於跨區域複製延遲,消息從區域集羣複製到本地聚合集羣的速度比遠程聚合集羣要快。因此,聚合集羣中的消息順序可能會不一樣。例如,在圖4a中,消息A1、A2、B1、B2幾乎是同時發佈到區域A和區域B的區域集羣中,但經過聚合後,它們在兩個聚合集羣中的順序是不一樣的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/d7\/d7a9c8ba5126b2cd249f16e351b516f0.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}
],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"圖4:a.跨區域消息複製 b.消息複製檢查點"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"爲了管理這些區域的偏移量映射,我們開發了一個複雜的偏移量管理服務,架構如圖5所示。當uReplicator將消息從源集羣複製到目標集羣時,它會定期檢查從源到目標的偏移量映射。例如,圖4b顯示了圖4a消息複製的偏移量映射。表的第一行記錄了區域A區域集羣的消息A2(在區域集羣中的偏移量是1)映射到區域A聚合集羣的消息A2(在聚合集羣中的偏移量是1)。同樣,其餘行記錄了其他複製路線的檢查點。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"偏移量管理服務將這些檢查點保存在雙活數據庫中,並用它們來計算給定的主備消費者的偏移量映射。同時,一個偏移量同步作業負責定期同步兩個區域之間的偏移量。當一個主備消費者從一個區域轉移到另一個區域時,可以獲取到最新的偏移量,並用它來恢復消費。"}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/53\/ac\/53830d954bbbbbea9b751346c59c73ac.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"圖5:偏移量管理服務架構"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"偏移量映射算法的工作原理如下:在活躍的消費者正在消費的聚合集羣中找到每個區域集羣的最近檢查點。
Then, for the source offset of each of those checkpoints, find the corresponding checkpoint in the other region's aggregate cluster. Finally, take the smallest of the resulting offsets in the other region's aggregate cluster."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In Figure 6, suppose the active consumer has currently progressed to message A3 (offset 6) in the aggregate cluster of Region B. According to the checkpoint table on the right, the two most recent checkpoints are A2 at offset 3 (blue) and B4 at offset 5 (red), which correspond to offset 1 in regional cluster A (blue) and offset 3 in regional cluster B (red). These source offsets map to offset 1 (blue) and offset 7 (red) in the aggregate cluster of Region A. Following the algorithm, the passive consumer (black) takes the smaller of the two, offset 1."}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/dd\/c9\/dd6418e99448b841c97f1579565cc6c9.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Figure 6: An active\/passive consumer failing over from one region to another"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Conclusion"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"At Uber, business continuity depends on efficient, uninterrupted data flow across services, and Kafka plays a key role in the company's disaster recovery plan. In this article we gave a brief overview of the overall architecture of Uber's multi-region Kafka clusters and of the failover strategies for different regions when disaster strikes. More challenging work remains, however: we are currently working on finer-grained recovery strategies that tolerate the failure of a single cluster without a region failover."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"英文原文鏈接:"},{"type":"link","attrs":{"href":"https:\/\/eng.uber.com\/kafka\/","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/eng.uber.com\/kafka\/"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]}]}]}