Kafka Made Simple: Kafka Without ZooKeeper

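{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"At the heart of Apache Kafka® sits the log. The log is a simple data structure that works closely with the underlying hardware through sequential reads and writes. The log-centric design takes advantage of efficient disk buffering, CPU caches, prefetching, zero-copy data transfer, and many other features, which yields the high efficiency and throughput Kafka is known for. For those new to Kafka, these topics, along with its underlying implementation as a commit log, are often the first things they learn about Apache Kafka."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"But the code for the log itself makes up a comparatively small part of the system as a whole. A much larger share of Kafka's codebase is responsible for arranging partitions (that is, logs) across the brokers in a cluster, assigning leadership, handling failures, and so on. This is the code that makes Kafka a reliable and trusted distributed system."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"Historically, Apache ZooKeeper was a critical part of how this distributed code worked. ZooKeeper provided the authoritative metadata store that held the most important facts about the system: where partitions live, which replica is the leader, and so on. Using ZooKeeper made sense early in the project because it is a powerful and proven tool. Ultimately, however, ZooKeeper is a somewhat peculiar filesystem\/trigger API on top of a consistent log. Kafka is a publish\/subscribe API on top of a consistent log. As a result, operators have to tune, configure, monitor, secure, and reason about communication and performance across two log implementations, two network layers, and two security implementations, each with its own tools and monitoring hooks. It becomes unnecessarily complex. This inherent and unavoidable complexity prompted a recent initiative to replace ZooKeeper with an internal quorum service that runs entirely inside Kafka itself."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"Replacing ZooKeeper is, of course, a sizable piece of work. Last April we started a community initiative to accelerate progress and deliver a working system by the end of the year."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/2f\/2f675fdbdcdb9f5278ff0ae888b7c9c3.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"I just sat down with Jason, Colin, and the KIP-500 team and went through the complete lifecycle of a Kafka server, producing and consuming, all ZooKeeper-free. Pretty sweet!"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/twitter.com\/benstopford?ref_src=twsrc^tfw|twcamp^tweetembed|twterm^1338931076979372038|twgr^|twcon^s1_&ref_url=https%3A%2F%2Fwww.confluent.io%2Fblog%2Fkafka-without-zookeeper-a-sneak-peek%2F","title":null,"type":null},"content":[{"type":"text","marks":[{"type":"underline"}],"text":"Ben 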
Stopford@benstopford"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"So we are pleased to say that early access to the KIP-500 code has been committed to trunk and is expected to be included in the upcoming 2.8 release. For the first time, you can run Kafka without ZooKeeper. We call this Kafka Raft metadata mode, usually shortened to KRaft (pronounced like craft) mode."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"Note that some features are not available in this early-access release. ACLs and other security features are not yet supported, nor are transactions. Partition reassignment and JBOD are also unsupported in KRaft mode (these are expected to arrive in an Apache Kafka release later this year). The quorum controller should therefore be considered experimental, and we do not recommend subjecting it to production workloads. If you do try the software, however, you will find a number of new advantages: it is simpler to deploy and operate, you can run Kafka as a single process, and it can accommodate many more partitions per cluster (see the measurements below)."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#1A1A1A","name":"user"}}],"text":"The quorum controller: event-driven consensus"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"If you choose to run Kafka with the new quorum controller, all metadata responsibilities previously handled by the Kafka controller and ZooKeeper are merged into this one new service, which runs inside the Kafka cluster itself. The quorum controller can also run on dedicated hardware if your use case calls for it."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/10\/109860232d5405aadf758a512d35500c.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"Internally, though, it gets interesting. The quorum controllers use the new KRaft protocol to ensure that metadata is accurately replicated across the quorum. The protocol resembles ZooKeeper's ZAB protocol and Raft in many ways, but with some important differences, one notable and fitting one being that it uses an event-driven architecture."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"The quorum controller stores its state using an event-sourced storage model, which ensures that the internal state machines can always be accurately recreated. The event log used to store this state (also known as the metadata topic) is periodically compacted by snapshots so that the log cannot grow indefinitely. The other controllers in the quorum follow the active controller by responding to the events it creates and stores in its log. So if a node pauses because of a partitioning event, for example, it can quickly catch up on any events it missed by reading the log when it rejoins. This significantly shrinks the unavailability window and improves the system's worst-case recovery time."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/03\/0365ed50557e411cc4343d7ebbc89d1a.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Event-driven internal consensus"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs"
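:{"color":"#333333","name":"user"}}],"text":"The event-driven nature of the KRaft protocol means that, unlike the ZooKeeper-based controller, the quorum controller does not need to load state from ZooKeeper before it becomes active. When leadership changes, the new active controller already has all committed metadata records in memory. In addition, the same event-driven mechanism used in the KRaft protocol is used to track metadata across the cluster. Tasks that were previously handled with RPCs now benefit from being event driven and from using an actual log for communication. A pleasant consequence of these changes is that Kafka can now support many more partitions than before. Let's look at that in more detail."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#1A1A1A","name":"user"}}],"text":"Scaling Kafka up: supporting millions of partitions"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"The number of partitions a Kafka cluster can support is determined by two properties: the per-node partition limit and the cluster-wide partition limit. Both are interesting, but to date metadata management has been the main bottleneck for the cluster-wide limit. Earlier Kafka Improvement Proposals (KIPs) have raised the per-node limit, although there is always more that can be done. But Kafka's scalability comes primarily from adding nodes to gain capacity. This is where the cluster-wide limit matters, because it defines the upper bound of scalability within the system."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"The new quorum controller is designed to handle a much larger number of partitions per cluster. To evaluate this, we ran tests similar to those run in 2018 to publicize Kafka's inherent partition limits. These tests measure the time taken by shutdown and recovery, which is an O(#partitions) operation for the old controller. It is this operation that places an upper bound on the number of partitions Kafka can support in a single cluster."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"The previous implementation, as Jun Rao explained in the post mentioned above, could reach 200K partitions; the limiting factor was the time taken to move critical metadata between the external consensus system (ZooKeeper) and the internal leader management (the Kafka controller). With the new quorum controller, both roles are served by the same component. The event-driven approach means that controller failover is now near-instantaneous. Below are the results for a cluster running 2 million partitions (10 times the previous upper bound) in our lab:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/19\/19fb676d80f6d4b7e4bfeda377c89158.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"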

With ZooKeeper-Based Controller

With Quorum Controller

Controlled Shutdown Time (2 million partitions)

135 sec.

32 sec.

Recovery from Uncontrolled Shutdown (2 million partitions)

503 sec.

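37 sec."}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"Both the controlled and the uncontrolled shutdown measurements matter. Controlled shutdown affects common operational scenarios such as rolling restarts: the standard procedure for deploying software changes while maintaining availability throughout. Recovery from uncontrolled shutdown is arguably more important, because it sets the recovery time objective (RTO) of the system after an unexpected failure, such as a VM or pod crashing or a data center becoming unavailable. While these measurements are only indicators of broader system performance, they directly measure the well-known bottleneck that the use of ZooKeeper imposes."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"Note that the controlled and uncontrolled measurements are not directly comparable. The uncontrolled shutdown case includes the time taken to elect new leaders, whereas the controlled case does not. This difference is deliberate, to keep the controlled case consistent with Jun Rao's original measurements."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#1A1A1A","name":"user"}}],"text":"Scaling Kafka down: running Kafka as a single process"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"Kafka is often perceived as heavyweight infrastructure, and the complexity of managing ZooKeeper, which is a separate distributed system in its own right, is a big reason that perception exists. It often leads projects to choose a more lightweight message queue when they start out, say a traditional queue such as ActiveMQ or RabbitMQ, and to move to Kafka when scale demands it."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"This is unfortunate, because the abstraction Kafka provides, formed around a commit log, is as applicable to the small-scale workloads you might see at a startup as it is to the high-throughput ones at Netflix or Instagram. What's more, if you want to add stream processing, you need Kafka and its commit log abstraction, whether you use Kafka Streams, ksqlDB, or another stream processing framework. But because of the complexity of managing two separate systems, Kafka and ZooKeeper, users often felt they had to choose between scale and ease of getting started."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"That is no longer the case. KIP-500 and KRaft mode provide a great, lightweight way to get started with Kafka, or to use it as an alternative to monolithic brokers such as ActiveMQ or RabbitMQ. The lightweight, single-process deployment is also better suited to edge scenarios and to those that use lightweight hardware. The cloud adds an interesting angle to the same problem. Managed services such as Confluent Cloud remove the operational burden entirely. So whether you want to run your own cluster or have one run for you, you can start small and grow to a (potentially) massive scale as the underlying use case expands, all on the same infrastructure. Let's see what a single-process deployment looks like."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#1A1A1A","name":"user"}}],"text":"Taking ZooKeeper-free Kafka for a spin"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"The new quorum controller is available in trunk today as an experimental feature and is expected to be included in the upcoming Apache Kafka 2.8 release. So what can you do with it? 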
As mentioned above, one simple but very cool new capability is creating a single-process Kafka cluster, as the short demo below shows."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"Demo recording: "},{"type":"link","attrs":{"href":"https:\/\/asciinema.org\/a\/403794\/embed","title":null,"type":null},"content":[{"type":"text","marks":[{"type":"underline"}],"text":"https:\/\/asciinema.org\/a\/403794\/embed"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"Of course, if you want to scale that out to support higher throughput and add replication for fault tolerance, you only need to add new broker processes. As you know, this is an early-access release of the KRaft-based quorum controller. Please do not use it in heavily loaded production environments. Over the coming months we will be adding the final missing pieces, performing TLA+ modeling of the protocol, and hardening the quorum controller in Confluent Cloud."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"You can try the new quorum controller for yourself right now. "},{"type":"link","attrs":{"href":"https:\/\/github.com\/apache\/kafka\/blob\/6d1d68617ecd023b787f54aafc24a4232663428d\/config\/kraft\/README.md","title":null,"type":null},"content":[{"type":"text","marks":[{"type":"underline"}],"text":"See the full description on GitHub"}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":". "},{"type":"link","attrs":{"href":"https:\/\/github.com\/apache\/kafka\/blob\/6d1d68617ecd023b787f54aafc24a4232663428d\/config\/kraft\/README.md","title":null,"type":null},"content":[{"type":"text","marks":[{"type":"underline"}],"text":"Try it now"}]}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#1A1A1A","name":"user"}}],"text":"The team behind it"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"This has been (and continues to be) a huge effort, and it would not have been possible without the Apache Kafka community and a group of distributed systems engineers who worked tirelessly during the pandemic to take it from zero to a working system in about nine months. We would like to extend special thanks to Colin McCabe, Jason Gustafson, Ron Dagostino, Boyang Chen, David Arthur, Jose Garcia Sancio, Guozhang Wang, Alok Nikhil, Deng Zi Ming, Sagar Rao, Feyman, Chia-Ping Tsai, Jun Rao, Heidi Howard, and all the members of the Apache Kafka community for helping make this happen."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Original article"},{"type":"text","text":":"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"https:\/\/www.confluent.io\/blog\/kafka-without-zookeeper-a-sneak-peek\/"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Translator:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"韓欣"},{"type":"text","text":", Technical Director of the Microservice Product Center in Tencent Cloud Middleware, leads middleware products including the microservice platform TSF, the message queues CKafka \/ TDMQ, and the microservice observability platform TSW. Responsible for the planning, architecture, and delivery of these products, with more than thirteen years of research, development, and architecture experience. Currently focused on cloud computing middleware, working to integrate PaaS technology resources and build a microservice-based technology middle platform that provides foundational support for the digital transformation of enterprises."}]}]}
