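Kafka Series 10: Is It Necessary to Understand the Principles Behind Interview Questions? I Think We Should Get to the Bottom of Them (Part 2)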

{"type":"doc","content":[
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Preface"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Before today's article begins, I'd like to share some good news with my readers: I have stayed committed to writing every article as "},{"type":"text","marks":[{"type":"strong"}],"text":"original"},{"type":"text","text":" work, and that effort has earned recognition from a small group of readers and support from the InfoQ writing platform. Many thanks to my readers for their support, and to the InfoQ editors for their recognition. I will keep working hard, stay true to my original intent, and put care into every article."}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/25/259044ff007181eab9c2b679afc35973.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"I have also been busy with moving house and with work lately, which left no spare time for new posts; thank you for bearing with me. Now, let's continue working through common Kafka interview questions."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Article Overview"}]},
{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Do you know how Kafka's delay queue works?"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"How does Kafka implement idempotence?"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Do you know what ISR, OSR, and AR are?"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Do you know what HW, LEO, and LSO mean in Kafka?"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Are Kafka messages ordered? If so, how is the ordering guaranteed?"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":6,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Have you ever run into duplicate message consumption? How did you solve it?"}]}]}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Do you know how Kafka's delay queue works?"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kafka's delay queue is implemented with a structure called a “timing wheel”. The name sounds impressive, so let's go straight to the diagram."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/bc/bc25192d526999747cf15d70bf978bd6.png","alt":null,"title":"Timing wheel diagram","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"boxShadow"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"TickMs: the time unit; each slot of the wheel covers one such time span."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"CurrentTime: the current time, i.e. the position the wheel has advanced to. When CurrentTime reaches a slot, the tasks in that slot's list are due for processing."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"TaskList: a doubly linked list of tasks; each Task in it is an actual task to execute."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"WheelSize: the capacity of the wheel, i.e. its number of slots."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Interval: the total time span of the wheel (TickMs * WheelSize), i.e. the longest delay it can hold."}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the diagram shows, a timing wheel consists of the clock-like dial on the left and the task lists on the right. The dial is a circular queue backed by an array. This raises a question: if the time unit is 1 millisecond and we need a delay queue covering 30 minutes, the wheel would need an array of 30 * 60 * 1000 = 1800000 slots."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Few people would want to allocate an array that large, so “second-level” and “third-level” timing wheels were introduced. Call the wheel in the diagram the “first-level” wheel: each slot spans 1ms, so its total span is 1ms * 8 = 8ms. Each slot of the second-level wheel spans the total span of the first-level wheel, so the second-level wheel spans 8ms * 8 = 64ms and can hold tasks due within 64ms. Likewise, each slot of the third-level wheel spans the total span of the second-level wheel, so the third-level wheel spans 64ms * 8 = 512ms and can hold tasks due within 512ms. Now suppose WheelSize is 100 at every level with a 1ms tick: three levels span 100 * 100 * 100 = 1000000ms (about 16.7 minutes), so a 30-minute delay needs a fourth level, and four wheels with 400 slots in total can hold every task."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Note that the second-level and third-level wheels do not execute tasks directly. When CurrentTime reaches one of their slots, the tasks in that slot are demoted: tasks on the third-level wheel move down to the second-level wheel, and tasks on the second-level wheel move down to the first-level wheel. All tasks are therefore actually executed on the first-level wheel."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"How does Kafka implement idempotence?"}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"How Idempotence Works"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kafka had no notion of idempotence before version 0.11. Starting with 0.11, Kafka introduced the PID (producer ID), together with a per-Partition sequence number, to provide idempotence within a single Partition."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The producer attaches its PID and an increasing sequence number to every message it sends, and the Broker caches the last sequence number written for each PID and Partition. When a message arrives, the Broker compares the two sequence numbers, which leads to one of three cases."}]},
{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"[Sequence number sent by the producer - sequence number cached on the Broker <= 0]: the message is treated as a retransmission and is not written again."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"[Sequence number sent by the producer - sequence number cached on the Broker > 1]: messages in between are treated as lost, and the Broker rejects the message with an out-of-order sequence error."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"[Sequence number sent by the producer - sequence number cached on the Broker = 1]: the sequence is increasing as expected, and the message is written normally."}]}]}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Enabling Idempotence"}]},
{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"import java.util.Properties;\nimport org.apache.kafka.clients.producer.KafkaProducer;\nimport org.apache.kafka.clients.producer.ProducerConfig;\nimport org.apache.kafka.clients.producer.ProducerRecord;\nimport org.apache.kafka.common.serialization.StringSerializer;\n\n// Fragment: place it inside a method that handles the checked exceptions thrown by get().\nProperties props = new Properties();\nprops.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, \"localhost:9092\");\n// Note: with idempotence enabled, acks must be set to \"all\"; otherwise an error is raised\nprops.put(ProducerConfig.ACKS_CONFIG, \"all\");\nprops.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());\nprops.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());\n// Enable idempotence\nprops.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, \"true\");\n\nKafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(props);\nkafkaProducer.send(new ProducerRecord<>(\"truman_kafka_center\", \"1\", \"hello world.\")).get();\nkafkaProducer.close();"}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Do you know what ISR, OSR, and AR are?"}]},
{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"ISR: short for In-Sync Replicas, the list of replicas currently in sync with the leader replica. The leader maintains this list. Followers in it periodically fetch data from the leader to stay in sync; a follower that has not caught up within the configured time (replica.lag.time.max.ms) is considered out of sync, removed from the ISR, and moved to the OSR."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"OSR: short for Out-of-Sync Replicas, the list of follower replicas that are not in sync, including followers that have fallen behind and replicas that have just been added. Replicas in the OSR keep fetching from the leader and rejoin the ISR once they have caught up; by default they are not eligible to be elected leader."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"AR: short for Assigned Replicas, the list of all replicas of a Partition, leader and followers alike. AR = ISR + OSR."}]}]}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Do you know what HW, LEO, and LSO mean in Kafka?"}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/bf/bfceaa15f88f7eaaef44656942ff6608.png","alt":null,"title":"Offset position diagram","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"boxShadow"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"HW: short for High Watermark. Only messages before the HW can be consumed; messages at or after the HW cannot be consumed yet."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"LEO: short for Log End Offset, the position where the next message will be written. The LEO is updated after every write, so it points not at the latest message but at the position of the next message to be written."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"LSO: short for Last Stable Offset. Messages before the LSO belong to transactions that have already been decided (committed or aborted), while messages at or after the LSO may belong to transactions that are still open. It is used mainly for transactions: a consumer with isolation.level=read_committed reads only up to the LSO."}]}]}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Are Kafka messages ordered? If so, how is the ordering guaranteed?"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kafka cannot guarantee global ordering of messages; it guarantees ordering only within a Partition. To get ordered messages you therefore have to work at the Partition level. There are generally two approaches."}]},
{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Single Partition, single Consumer. This forces every message into the same Partition, but it also sacrifices Kafka's high throughput, so it is rarely used."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Multiple Partitions, multiple Consumers. Give each message a key and use a fixed hash strategy so that messages land in a specific Partition, which keeps messages with the same key in order. This approach has drawbacks too: when the number of Partitions changes, messages with the same key can land in a different Partition, so the Partition count should not be changed once it has been chosen."}]}]}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Have you ever run into duplicate message consumption? How did you solve it?"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Yes, it has happened twice. We noticed that a user's \"xx\" count was occasionally higher than the real value. The logs showed, roughly, that a heartbeat check failed before the offset commit could be submitted, so the Partition was reassigned to another Consumer, which then consumed the same messages again."}]},
{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Solution 1: lower the consumer's consumption rate and lengthen the heartbeat detection interval."}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After tuning the parameters as in Solution 1, duplicate consumption still occurred, only less often."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"numberedlist","attrs":{"start":"2","normalizeStart":"2"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Solution 2: add Redis at the business layer. Within a given time window, messages with the same key are treated as the same message: if the key is not yet in Redis, the message is consumed normally; otherwise it is discarded."}]}]}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Summary"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That wraps up this round of common Kafka interview questions, though there are many more. This article will keep collecting and adding common questions, and readers are welcome to leave comments with questions they have met in interviews."}]},
{"type":"horizontalrule"},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Recommended Articles"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/s?__biz=MzIzNTIzNzYyNw==&mid=2247484048&idx=1&sn=2de06942948b02842fd042e7783cbc61&chksm=e8eb7b04df9cf2121a98984fae0207d422b9d643a2c3ff4d7c91cabfe2e8b4d530a62c376035&token=1378636505&lang=zh_CN&scene=21#wechat_redirect","title":null},"content":[{"type":"text","text":"Is It Necessary to Understand the Principles Behind Interview Questions? I Think We Should Get to the Bottom of Them (Part 1)"}],"marks":[{"type":"strong"}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/s?__biz=MzIzNTIzNzYyNw==&mid=2247483919&idx=1&sn=ee4c393dabadcecd51b34380425d227a&chksm=e8eb7b9bdf9cf28da20a8083d772d6744850256e7b338aa79225889c69f8395ab6019808d946&token=1378636505&lang=zh_CN&scene=21#wechat_redirect","title":null},"content":[{"type":"text","text":"Things You Must Know About How the Cluster Works Internally!"}],"marks":[{"type":"strong"}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/s?__biz=MzIzNTIzNzYyNw==&mid=2247483881&idx=1&sn=6624157a86dea60abd8a6842c863ba4e&chksm=e8eb787ddf9cf16b85561dcd25144a73304709a14f615291a5374f9ca2bf05a2284b181be1af&token=1378636505&lang=zh_CN&scene=21#wechat_redirect","title":null},"content":[{"type":"text","text":"Do You Really Know How Messages Are Stored and Read on the Server Side?"}],"marks":[{"type":"strong"}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/s?__biz=MzIzNTIzNzYyNw==&mid=2247483827&idx=1&sn=44cfb50953b0f745b2719977453c3a42&chksm=e8eb7827df9cf131ea8aa1496e9ec2d1ed22ed245225dad0613511006b9428fffb16c92cdd6a&token=1378636505&lang=zh_CN&scene=21#wechat_redirect","title":null},"content":[{"type":"text","text":"Understand the \"Tricks\" Behind Consumers in One Article"}],"marks":[{"type":"strong"}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Search for the WeChat official account "},{"type":"text","marks":[{"type":"strong"}],"text":"【z小趙】"},{"type":"text","text":" for more articles in this series."}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/86/862b51ac70a151f66fa21406f2affc39.gif","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}
]}