Kafka降本實用指南

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#000000","name":"user"}},{"type":"strong"}],"text":"本文最初發佈於leevs.dev網站,經原作者授權由InfoQ中文站翻譯並分享。"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/57\/5781e0bb1f349eecdc2f4df490026115.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"根據Gartner的"},{"type":"link","attrs":{"href":"https:\/\/techblog.comsoc.org\/2021\/05\/03\/gartner-global-public-cloud-spending-to-reach-332-3-billion-in-2021-23-1-yoy-increase\/","title":null,"type":null},"content":[{"type":"text","text":"預測"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",預計在2021年,全球終端用戶在公共雲服務上的支出將在2020年的2700億美元基礎上增長23%,達到3320億美元。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Kafka的市場增長趨勢也是一樣的。世界各地的組織都在使用Kafka作爲主要的流處理平臺來大規模收集、處理和分析數據。隨着組織的發展和壯大,"},{"type":"link","attrs":{"href":"https:\/\/www.networkworld.com\/article\/3325397\/idc-expect-175-zettabytes-of-data-worldwide-by-2025.html","title":null,"type":null},"content":[{"type":"text","text":"數據規模"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"也在增長,隨之而來的雲成本同樣日益上升。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"co
ntent":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"所以我們能做些什麼呢?爲了削減成本,我們是否可以實施一些容易實現的策略?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"本文就此主題列舉了一些可能會有所幫助的提示和KIP(Kafka改進建議)!"}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"免責聲明"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這篇文章不會涵蓋那些旨在降本的託管服務方法。在某些用例中,像Confluent Cloud這樣的託管服務可能會起到作用。要獲取更多信息,你可以參考Confluent網站上的成本效益"},{"type":"link","attrs":{"href":"https:\/\/www.confluent.io\/project-metamorphosis\/cost-effective\/","title":null,"type":null},"content":[{"type":"text","text":"頁面"}]},{"type":"text","text":"。"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在開始之前,我們先來了解一些基礎知識。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"我們的錢都花在了什麼地方?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我可以試着列出我們在使用各種雲服務時花錢購買的所有組件,但這樣還不如來看Gwen 
Shapira在她的文章“"},{"type":"link","attrs":{"href":"https:\/\/www.confluent.io\/blog\/guide-to-kafka-pricing-and-diy-open-source-costs\/","title":null,"type":null},"content":[{"type":"text","text":"Apache Kafka的成本:爲DIY運維定價的工程師指南"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"”中總結的內容。"}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們從一些相當明顯且易於量化的費用開始分析。如果你打算在AWS上運行Kafka,你需要爲"},{"type":"text","marks":[{"type":"strong"}],"text":"EC2實例"},{"type":"text","text":"支付費用來運行你的broker。如果你使用像EKS這樣的Kubernetes服務,你要爲"},{"type":"text","marks":[{"type":"strong"}],"text":"節點"},{"type":"text","text":"和服務本身("},{"type":"text","marks":[{"type":"strong"}],"text":"Kubernetes主節點"},{"type":"text","text":")付費。大多數相關的EC2實例類型都使用EBS存儲,Kubernetes僅支持EBS作爲一等磁盤選項,這意味着"},{"type":"text","marks":[{"type":"strong"}],"text":"除了EBS數據卷之外,你還需要爲EBS根卷付費"},{"type":"text","text":"。還要記得,在合併KIP-500之前Kafka不只有broker:我們還需要運行Apache ZooKeeper,給它"},{"type":"text","marks":[{"type":"strong"}],"text":"算上三到五個節點及其存儲的費用"},{"type":"text","text":"。我們的Kafka是在一個"},{"type":"text","marks":[{"type":"strong"}],"text":"負載均衡器"},{"type":"text","text":"(部分充當一個NAT層)之後運行的,並且由於每個broker都需要單獨尋址,因此你需要"},{"type":"text","marks":[{"type":"strong"}],"text":"爲“引導”(bootstrap)路由和每個broker的路由付費"},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"所有這些都是固定成本,你一個字節都不向Kafka發送也得花這些錢。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"除此之外還有"},{"type":"text","marks":[{"type":"strong"}],"text":"網絡成本"},{"type":"text","text":"。將數據導入EC2需要花錢,並且根據你的網絡設置(VPC、私有鏈接或公共互聯網),你可能"},{"type":"text","marks":[{"type":"strong"}],"text":"在發送和接收數據時都要付費"},{"type":"text","text":"。如果你"},{"type":"text","marks":[{"type":"strong"}],"text":"在不同地區或區域之間複製數據"},{"type":"text","text":",請一定要考慮到這些成本。如果你"},{"type":"text","marks":[{"type":"strong"}],"text":"通過ELB路由流量"},{"type":"text","text":",你將爲此流量支付額外費用。不要忘了算上"},{"type":"text","marks":[{"type":"strong"}],"text":"入口和出口"},{"type":"text","text":",並且要記住,使用Kafka時,你讀取的數據往往是寫入量的3-5倍。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"現在我們算上了運行軟件、攝取數據、存儲和讀取數據的成本,我們就快完成了。你還需要"},{"type":"text","marks":[{"type":"strong"}],"text":"監控Kafka"},{"type":"text","text":",對吧?一定要考慮到監控(Kafka有許多重要的指標)——無論是使用服務還是自託管,你還需要一種"},{"type":"text","marks":[{"type":"strong"}],"text":"收集日誌"},{"type":"text","text":"並搜索它們的方法。最後這些可能成爲系統中最昂貴的部分,特別是如果你有許多分區的時候,這會顯著增加指標的數量。"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我加粗了帖子中的要點,還有來自Kafka生態系統的其他一些組件在文中沒有直接提到,如Schema Registry、Connect 
workers,以及CMAK或Cruise-Control等工具,但這些都適用於同樣的三個要素——"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"機器、存儲和網絡"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"雖然有更多要素是難以衡量的,例如員工工資、停機時間,甚至"},{"type":"link","attrs":{"href":"https:\/\/sre.google\/sre-book\/dealing-with-interrupts\/","title":null,"type":null},"content":[{"type":"text","text":"處理中斷"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"(也就是將系統維持在可用狀態所必須完成的工作),但上面這三點是我們在使用雲提供商時花錢購買的主要要素。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"我們可以做什麼?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"除了適用於幾乎所有分佈式系統的一些基本概念之外,隨着時間的推移,Kafka的提交者引入了一些直接或間接影響Kafka TCO的KIP和特性。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"但我們還是會從顯而易見的東西開始分析——"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"你是否使用了正確的實例類型?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"AWS上有很多實例類型可供選擇(當然,這裏提到的方法也可以用在其他雲提供商上)。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Kafka可以在廉價的貨架硬件上輕鬆運行,並且不會出什麼顯眼的問題。如果你在谷歌上搜索生產級Kafka集羣的推薦實例類型,你會發現人們建議用r4、d2甚至c5與GP2\/3或IO2存儲搭配用於一般用途。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"每種組合都有自己的優點和缺點,你可能需要根據各種權衡來找到最適合自己的選項——這些權衡包括更長的恢復時間(HDD磁盤)、存儲性價比、網絡吞吐量,甚至EBS在極端情況下的"},{"type":"link","attrs":{"href":"https:\/\/www.honeycomb.io\/blog\/kafka-migration-lessons-learned\/","title":null,"type":null},"content":[{"type":"text","text":"性能下降"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"但隨着時間的推移,i3和i3en機器得到了人們的青睞;根據我的經驗,它們是迄今爲止獲推薦最多的高回報率大規模部署實例類型,就算你考慮到了使用非永久存儲器帶來的運維開銷也是如此。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"使用Kafka時,i3和i3en在無數基準測試中都提供了更好的性能,並充分利用了非永久驅動器的優勢(10gbps磁盤帶寬,對比EBS優化的C級實例的875mbps)。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"ScyllaDB的這篇很棒的"},{"type":"link","attrs":{"href":"https:\/\/www.scylladb.com\/2019\/05\/28\/aws-new-i3en-meganode\/","title":null,"type":null},"content":[{"type":"text","text":"文章"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",研究了在主要瓶頸因素是存儲容量的系統上使用i3en機器帶來的顯著成本優勢(稍後會詳細介紹):"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/64\/64430a8f19c8791691a842c6701a6486.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"作爲降低成本的第一步,你需要重新評估你的實例類型決策:你的集羣是否"},{"type":"link",
"attrs":{"href":"http:\/\/www.brendangregg.com\/usemethod.html","title":null,"type":null},"content":[{"type":"text","text":"飽和"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"?在什麼情況下飽和?是否存在其他實例類型,可能比你第一次創建集羣時選擇的類型更合適?EBS優化實例與GP2\/3或IO2驅動器的混合是否真的比i3或i3en機器(及其帶來的優勢)有更好的性價比?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"如果你不熟悉這款"},{"type":"link","attrs":{"href":"https:\/\/instances.vantage.sh\/","title":null,"type":null},"content":[{"type":"text","text":"工具"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",你應該瞭解一下。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"壓縮"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"壓縮在Kafka中並不新鮮,大多數用戶已經知道了自己可以在GZIP、Snappy和LZ4之間做出選擇。但自從"},{"type":"link","attrs":{"href":"https:\/\/cwiki.apache.org\/confluence\/display\/KAFKA\/KIP-110%3A+Add+Codec+for+ZStandard+Compression","title":null,"type":null},"content":[{"type":"text","text":"KIP-110"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"被合併進Kafka,並添加了用於Zstandard壓縮的壓縮器後,它已實現了顯著的性能改進,並且是降低網絡成本的完美方式。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/facebook.github.io\/zstd\/","title":null,"type":null},"content":[{"type":"text","text":"Zstandard"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"是Facebook的壓縮算法;與其他壓縮算法相比,它旨在實現更小、更快的數據壓縮。"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/f7\/f79fc9b561e2bdbc0d8da7f519c45dfc.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"例如,使用zstd(Zstandard)後,"},{"type":"link","attrs":{"href":"https:\/\/github.com\/apache\/kafka\/pull\/2267#issuecomment-366149315","title":null,"type":null},"content":[{"type":"text","text":"Shopify"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"獲得了4.28倍的壓縮率。關於zstd優勢的另一個很好的例子是Cloudflare的“"},{"type":"link","attrs":{"href":"https:\/\/blog.cloudflare.com\/squeezing-the-firehose\/","title":null,"type":null},"content":[{"type":"text","text":"擠一擠firehose:從Kafka壓縮中獲得最大收益"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"”這篇文章。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這和降低成本又有什麼關係呢?以生產者端略高的CPU使用率爲代價,你將獲得更高的壓縮率並在線上“擠進”更多信息。"}]},{"type":"paragraph","attrs
":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/amplitude.engineering\/reducing-kafka-costs-with-z-tandard-88e0e489850f","title":null,"type":null},"content":[{"type":"text","text":"Amplitude"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在他們的帖子中介紹,在切換到Zstandard後,他們的帶寬使用量減少了三分之二,僅在處理管道上就可以節省每月數萬美元的數據傳輸成本。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"畢竟你還要支付數據傳輸費用,記得嗎?"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"從最近的副本中獲取數據"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"爲了容錯需求,在同一AWS區域內的幾個可用區上分佈集羣是很常見的做法。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"遺憾的是,我們無法通過協調讓消費者的分佈與他們需要消費的分區leader完美對齊,以避免跨區域流量和成本。參閱"},{"type":"link","attrs":{"href":"https:\/\/cwiki.apache.org\/confluence\/display\/KAFKA\/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica","title":null,"type":null},"content":[{"type":"text","text":"KIP-392"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"2015年的"},{"type":"link","attrs":{"href":"https:\/\/cwiki.apache.org\/confluence\/display\/KAFKA\/KIP-36+Rack+aware+replica+assignment","title":null,"type":null},"content":[{"type":"text","text":"KIP-36"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"合併之後,機架感知就可以啓用了,並且只需在配置文件中添加一行代碼就可以輕鬆完成:"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"broker.rack= # For example, AWS AZ 
ID"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在實現"},{"type":"link","attrs":{"href":"https:\/\/cwiki.apache.org\/confluence\/display\/KAFKA\/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica","title":null,"type":null},"content":[{"type":"text","text":"KIP-392"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"之前,此設置僅控制副本位置(將AZ視爲機架)。而這個KIP正好解決了這個麻煩,並允許你利用局部性來減少昂貴的跨可用區(跨dc)流量:在broker上將replica.selector.class設置爲org.apache.kafka.common.replica.RackAwareReplicaSelector,並在消費者上設置client.rack,消費者就可以從同一機架(AZ)內的副本讀取數據。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"一些"},{"type":"link","attrs":{"href":"https:\/\/github.com\/Shopify\/sarama\/pull\/1696\/files","title":null,"type":null},"content":[{"type":"text","text":"客戶端"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"已經實現了這一更改,如果你的客戶端還不支持它,這裏有一個開源項目適合你週末來研究!🙂"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"集羣均衡"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這一點可能看起來沒那麼明顯,但如果你的集羣不平衡,也可能會影響集羣成本。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"不平衡的集羣可能會損害集羣性能:某些broker比其他broker的負載更大,響應延遲更高,在某些情況下這些broker的資源還會飽和,從而導致不必要的擴容。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"不平衡集羣面臨的另一個風險是在一個broker出故障後出現更長的MTTR(平均恢復時間,例如當該broker不必要地持有更多分區時),以及更高的數據丟失風險(想象一個複製因子爲2的主題,其中一個節點由於啓動時要加載的segment過多,於是難以啓動)。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"不要因爲忽視這項相對容易完成的任務而白白浪費金錢。你可以使用"},{"type":"link","attrs":{"href":"https:\/\/github.com\/yahoo\/CMAK","title":null,"type":null},"content":[{"type":"text","text":"CMAK"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"、"},{"type":"link","attrs":{"href":"https:\/\/github.com\/DataDog\/kafka-kit\/tree\/master\/cmd\/topicmappr","title":null,"type":null},"content":[{"type":"text","text":"Kafka-Kit"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"等工具,更好的選擇是"},{"type":"link","attrs":{"href":"https:\/\/github.com\/linkedin\/cruise-control","title":null,"type":null},"content":[{"type":"text","text":"Cruise-Control"}],"marks":[{"type":"col
or","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",它允許你根據多個目標(容量違規、副本計數違規、流量分佈等)"},{"type":"link","attrs":{"href":"https:\/\/www.youtube.com\/watch?v=jdo6F21gI8g","title":null,"type":null},"content":[{"type":"text","text":"自動執行"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這些任務。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"微調你的配置——生產者、broker、消費者"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"聽起來很明顯吧?其實在友好的Kafka集羣幕後,你可以啓用或更改大量設置。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這些設置會極大地影響集羣的工作方式:資源利用率、集羣可用性、保證、延遲等等。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"例如,行爲不端的客戶端會影響你的broker資源利用率(CPU、磁盤等)。更改你的客戶端batch.size和linger.ms(與業務邏輯相適應)可以"},{"type":"link","attrs":{"href":"https:\/\/youtu.be\/tjjeaCtsw_M?t=531","title":null,"type":null},"content":[{"type":"text","text":"顯著降低"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"集羣LoadAvg和CPU使用率。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/www.instaclustr.com\/support\/documentation\/kafka\/monitoring-information\/message-conversions\/","title":null,"type":null},"content":[{"type":"text","text":"消息轉換"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"會帶來處理開銷,因爲Kafka客戶端和broker之間的消息需要轉換才能被雙方理解,從而導致更高的CPU使用率。升級你的生產者和消費者可以解決這一問題,將本來不需要用在這上面的資源釋放出來。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這樣的例子數不勝數。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"通過適當的微調,你可以讓集羣更好地運作、提供更多服務、增加吞吐量、釋放資源,從而避免不必要的擴容,甚至可以將集羣縮小到適合你需求的大小。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"閱讀和處理每一行文檔是一項不可能完成的任務,因此你可以先閱讀其中一篇關於調整客戶端和broker的帖子。我會推薦"},{"type":"link","attrs":{"href":"https:\/\/strimzi.io\/","title":null,"type":null},"content":[{"type":"text","text":"Strimzi"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"的系列相關文章——"},{"type":"link","attrs":{"href":"https:\/\/strimzi.io\/blog\/2021\/06\/08\/broker-tuning\/","title":null,"type":null},"content":[{"type":"text","text":"Broker"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"、"},{"type":"link","attrs":{"href":"https:\/\/strimzi.io\/blog\/2020\/10\/15\/producer-tuning\/","title":null,"type":null},"content":[{"type":"text","text":"生產者"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"和"},{"type":"link","attrs":{"href":"https:\/\/strimzi.io\/blog\/2021\/01\/07\/consumer-tuning\/","title":null,"type":null},"content":[{"type":"text","text":"消費者"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。"}]},{"type":"heading","attrs":{"a
lign":null,"level":2},"content":[{"type":"text","text":"未來"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"壓縮級別"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"如前所述,使用某種壓縮算法(最好是zstd)可以極大地提高性能並節省數據傳輸費用。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"處理壓縮時,你需要面對一種新的權衡決策——CPU vs IO(壓縮後文件的大小)。大多數算法提供了一些壓縮級別可供選擇,這會影響所需的處理能力和壓縮後的數據集大小。"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/3b\/3b9f24086e272a471760aa9200fc36a7.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"壓縮器支持的壓縮級別"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"到目前爲止,Kafka僅支持每種壓縮器的默認壓縮級別。這個問題將在"},{"type":"link","attrs":{"href":"https:\/\/cwiki.apache.org\/confluence\/pages\/viewpage.action?pageId=97550583","title":null,"type":null},"content":[{"type":"text","text":"KIP-390"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"上得到解決,KIP-390將隨Kafka3.0.0一起提供。"}]},{"type":"paragr
aph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"實現這個KIP後,我們只需在配置中添加一行,就能調整壓縮級別(請選擇合適的級別,記住更高的級別意味着更多的資源開銷):"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"compression.type=gzip\n# NEW: Compression level to be used.\ncompression.level=4"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這個新特性針對真實世界的一個數據集(29218個JSON文件,平均大小爲55.25 KB)進行了測試;結果因壓縮器(codec)和壓縮級別而異,由此產生的延遲也不同,如下所示:"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/fa\/fa2b6b1ccfb16e7b0e829d6eba2e93eb.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"分層存儲"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"集羣上所需的總存儲量與主題和分區的數量、消息速率以及最重要的保留期限成正比。集羣上的每個Kafka 
broker通常都有大量磁盤,於是單個集羣上一般會有數十TB的磁盤存儲。Kafka已經成爲組織中所有數據的主要入口點,讓客戶端不僅可以消費"},{"type":"link","attrs":{"href":"https:\/\/leevs.dev\/apache-kafka-lag-monitoring-for-human-beings\/","title":null,"type":null},"content":[{"type":"text","text":"最近的事件"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",還可以根據主題保留來使用較舊的數據。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"人們很可能會添加更多broker或磁盤,以便根據客戶的需求在集羣上保存更多數據。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"另一種"},{"type":"link","attrs":{"href":"https:\/\/twitter.com\/gunnarmorling\/status\/1384903606407729153","title":null,"type":null},"content":[{"type":"text","text":"常見模式"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"是將你的數據管道拆分爲一些更小的集羣,並在上游集羣上設置更長的保留期限,以便在發生故障時進行數據恢復。使用這種方法時,你能完全“停止”管道,修復管道中的錯誤,然後流回所有“丟失”的事件,而不會丟失任何數據。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這兩種情況都需要你爲集羣增加容量,而且通常還會向集羣添加不必要的內存和CPU,於是與將舊數據存儲在外部存儲的方案相比,它們的總體存儲成本效率較低。具有更多節點的更大集羣也增加了部署的複雜性並增加了運營成本。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"爲了解決這個問題,"},{"type":"link","attrs":{"href":"https:\/\/cwiki.apache.org\/confluence\/display\/KAFKA\/KIP-405%3A+Kafka+Tiered+Storage#KIP405:KafkaTieredStorage-Solution-TieredstorageforKafka","title":null,"type":null},"content":[{"type":"text","text":"KIP-405"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"分層存儲應運而生。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"使用分層存儲時,Kafka集羣配置有兩層存儲:本地和遠程。本地層和現在是一樣的,也就是使用本地磁盤——而新的遠程層會使用外部存儲層(如AWS S3或HDFS)來存儲完整的日誌segment,這會比本地磁盤便宜得多。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"你能爲每一層設置多個保留期限,因此對延遲更敏感且使用實時數據的服務可以從本地磁盤提供,而需要舊數據進行回填,或在事故後恢復的服務可以經濟地從外部層加載數週甚至數月時間。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"除了明顯的磁盤成本下降(gp3每GB約0.08美元,S3約0.023美元)之外,你還可以根據服務需求(主要基於計算能力而非存儲)進行更準確的容量規劃,獨立於內存和CPU來擴展存儲,並節省不必要的broker和磁盤的費用。"}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"節省的成本是非常可觀的。"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"你的broker恢復速度會更快,因爲它們在啓動時需要加載的本地數據要少得多。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"分層存儲已在Confluent Platform 
6.0.0上可用,並將添加到Kafka3.0版本中。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"KIP-500:Kafka不需要Keeper"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"可能你還沒有聽說過,在未來的版本中,Kafka將移除其對ZooKeeper管理集羣元數據的依賴,並移至基於Raft的治理模式。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這不僅將提供一種更具擴展性和健壯性的元數據管理方式,還能簡化Kafka的部署和配置(去除外部組件),而且還消除了Zookeeper部署的成本。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"如前所述,Kafka不僅僅是一堆broker,我們需要爲Zookeeper運行三五個節點;除了實例之外,你還需要算上它們的存儲和網絡的成本,以及難以衡量的運營開銷(例如監控、警報、升級、事件和關注)。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"總結"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"要降低雲成本可以有很多方法。其中一些是可以在幾分鐘內實現的唾手可得的成果,而另一些則需要更深入的理解、嘗試和試錯。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這篇文章僅涵蓋了降低成本的一小部分方法,但更重要的是它試圖強調一個事實,即瞭解在雲上運行Kafka時我們需要支付哪些費用是很重要的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們需要意識到,有時雲成本可能會超出人們的想象。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"原文鏈接:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/leevs.dev\/kafka-cost-reduction\/","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/leevs.dev\/kafka-cost-reduction\/"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]}]}]}