Kafka和RocketMQ底層存儲之那些你不知道的事

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"大家好,我是yes。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們都知道 RocketMQ 和 Kafka 消息都是存在磁盤中的,那爲什麼消息存磁盤讀寫還可以這麼快?有沒有做了什麼優化?都是存磁盤它們兩者的實現之間有什麼區別麼?各自有什麼優缺點?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"今天我們就來一探究竟。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"存儲介質-磁盤"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"一般而言消息中間件的消息都存儲在本地文件中,因爲從效率來看直接放本地文件是最快的,並且穩定性最高。畢竟要是放類似數據庫等第三方存儲中的話,就多一個依賴少一份安全,並且還有網絡的開銷。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"那對於將消息存入磁盤文件來說一個流程的瓶頸就是磁盤的寫入和讀取。我們知道磁盤相對而言讀寫速度較慢,那通過磁盤作爲存儲介質如何實現高吞吐呢?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"順序讀寫"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"答案就是"},{"type":"text","marks":[{"type":"strong"}],"text":"順序讀寫"},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"首先了解一下"},{"type":"text","marks":[{"type":"strong"}],"text":"頁緩存"},{"type":"text","text":",頁緩存是操作系統用來作爲磁盤的一種緩存,減少磁盤的I/O操作。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在寫入磁盤的時候其實是寫入頁緩存中,使得對磁盤的寫入變成對內存的寫入。寫入的頁變成髒頁,然後操作系統會在合適的時候將髒頁寫入磁盤中。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在讀取的時候如果頁緩存命中則直接返回,如果頁緩存 miss 則產生缺頁中斷,從磁盤加載數據至頁緩存中,然後返回數據。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"並且在讀的時候會"},{"type":"text","marks":[{"type":"strong"}],"text":"預讀"},{"type":"text","text":",根據局部性原理當讀取的時候會把相鄰的磁盤塊讀入頁緩存中。在寫入的時候會"},{"type":"text","marks":[{"type":"strong"}],"text":"後寫"},{"type":"text","text":",寫入的也是頁緩存,這樣存着可以將一些小的寫入操作合併成大的寫入,然後再刷盤。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"而且根據磁盤的構造,順序 I/O 的時候,磁頭幾乎不用換道,或者換道的時間很短。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"根據網上的一些測試結果,順序寫盤的速度比隨機寫內存還要快。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當然這樣的寫入存在數據丟失的風險,例如機器突然斷電,那些還未刷盤的髒頁就丟失了。不過可以調用 "},{"type":"codeinline","content":[{"type":"text","text":"fsync"}]},{"type":"text","text":" 強制刷盤,但是這樣對於性能的損耗較大。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"因此"},{"type":"text","marks":[{"type":"strong"}],"text":"一般建議通過多副本機制來保證消息的可靠,而不是同步刷盤"},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"可以看到順序 I/O 適應磁盤的構造,並且還有預讀和後寫。 RocketMQ 和 Kafka 都是順序寫入和近似順序讀取。它們都採用文件追加的方式來寫入消息,只能在日誌文件尾部寫入新的消息,老的消息無法更改。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"mmap-文件內存映射"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"從上面可知訪問磁盤文件會將數據加載到頁緩存中,但是頁緩存屬於內核空間,用戶空間訪問不了,因此數據還需要拷貝到用戶空間緩衝區。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/b5/b5363dc3f31688d6952e7487f0f975b7.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"可以看到數據需要從頁緩存再經過一次拷貝程序才能訪問的到,因此還可以通過"},{"type":"codeinline","content":[{"type":"text","text":"mmap"}]},{"type":"text","text":"來做一波優化,利用內存映射文件來避免拷貝。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"簡單的說"},{"type":"text","marks":[{"type":"strong"}],"text":"文件映射就是將程序虛擬頁面直接映射到頁緩存上,這樣就無需有內核態再往用戶態的拷貝,而且也避免了重複數據的產生"},{"type":"text","text":"。並且也不必再通過調用"},{"type":"codeinline","content":[{"type":"text","text":"read"}]},{"type":"text","text":"或"},{"type":"codeinline","content":[{"type":"text","text":"write"}]},{"type":"text","text":"方法對文件進行讀寫,"},{"type":"text","marks":[{"type":"strong"}],"text":"可以通過映射地址加偏移量的方式直接操作"},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/4c/4cd8afdde5276528896a3e508c7183c9.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"sendfile-零拷貝"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"既然消息是存在磁盤中的,那消費者來拉消息的時候就得從磁盤拿。我們先來看看一般發送文件的流程是如何的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/39/39b78c605ad408bd0547b621e694e665.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"簡單說下"},{"type":"codeinline","content":[{"type":"text","text":"DMA"}]},{"type":"text","text":"是什麼,全稱 Direct Memory Access ,它可以"},{"type":"text","marks":[{"type":"strong"}],"text":"獨立地直接讀寫系統內存"},{"type":"text","text":",不需要 CPU 介入,像顯卡、網卡之類都會用"},{"type":"codeinline","content":[{"type":"text","text":"DMA"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"可以看到數據其實是冗餘的,那我們來看看"},{"type":"codeinline","content":[{"type":"text","text":"mmap"}]},{"type":"text","text":"之後的發送文件流程是怎樣的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/08/088735092f6fd8c77cdb6d20e7aeed38.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"可以看到上下文切換的次數沒有變化,但是數據少拷貝一份,這和我們上文提到的"},{"type":"codeinline","content":[{"type":"text","text":"mmap"}]},{"type":"text","text":"能達到的效果是一樣的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"但是數據還是冗餘了一份,這不是可以直接把數據從頁緩存拷貝到網卡不就好了嘛?"},{"type":"codeinline","content":[{"type":"text","text":"sendfile"}]},{"type":"text","text":"就有這個功效。我們先來看看Linux2.1版本中的"},{"type":"codeinline","content":[{"type":"text","text":"sendfile"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/98/986fcc0b3f99eff79d713d3a2ee9a0dd.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"因爲就一個系統調用就滿足了發送的需求,相比 "},{"type":"codeinline","content":[{"type":"text","text":"read + write"}]},{"type":"text","text":" 或者 "},{"type":"codeinline","content":[{"type":"text","text":"mmap + write"}]},{"type":"text","text":" 上下文切換肯定是少了的,但是好像數據還是有冗餘啊。是的,因此 Linux2.4 版本的 sendfile + 帶 「分散-收集(Scatter-gather)」的DMA。實現了真正的無冗餘。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/80/80aac751a2a9e704683d5893f3aa3901.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這就是我們常說的零拷貝,在 Java 中"},{"type":"codeinline","content":[{"type":"text","text":"FileChannal.transferTo() "}]},{"type":"text","text":"底層用的就是"},{"type":"codeinline","content":[{"type":"text","text":"sendfile"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"接下來我們看看以上說的幾點在 RocketMQ 和 Kafka中是如何應用的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"RocketMQ 和 Kafka 的應用"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"RocketMQ "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"採用"},{"type":"codeinline","content":[{"type":"text","text":"Topic混合追加方式"}]},{"type":"text","text":",即一個 CommitLog 文件中會包含分給此 Broker 的所有消息,不論消息屬於哪個 Topic 的哪個 Queue 。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/26/26761503e77de61922a2df5139b39957.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"所以所有的消息過來都是"},{"type":"text","marks":[{"type":"strong"}],"text":"順序追加寫入到 CommitLog 中"},{"type":"text","text":",並且建立消息對應的 CosumerQueue ,然後消費者是通過 CosumerQueue 得到消息的真實物理地址再去 CommitLog 獲取消息的。可以將 CosumerQueue 理解爲消息的索引。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在 RocketMQ 中不論是 CommitLog 還是 CosumerQueue 都採用了 mmap。 "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/c5/c54b4d1ee8fe649ae4ef27dc656da60b.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在發消息的時候默認用的"},{"type":"text","marks":[{"type":"strong"}],"text":"是將數據拷貝到堆內存中,然後再發送"},{"type":"text","text":"。我們來看下代碼。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/55/55332a265a692727426adf2f3e55fd29.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"可以看到這個配置 "},{"type":"codeinline","content":[{"type":"text","text":"transferMsgByHeap"}]},{"type":"text","text":" 默認是 true ,那我們再看消費者拉消息時候的代碼。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/14/1418e71dcdf305ab64b24264c5e29646.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"可以看到 RocketMQ 默認把消息拷貝到堆內 Buffer 中,再塞到響應體裏面發送。但是可以通過參數配置不經過堆,不過也並沒有用到真正的零拷貝,而是通過mapedBuffer 發送到 SocketBuffer 。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"所以 RocketMQ 用了順序寫盤、mmap。並沒有用到 sendfile ,還有一步頁緩存到 SocketBuffer 的拷貝。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"然後拉消息的時候嚴格的說對於 CommitLog 來說讀取是隨機的,因爲 CommitLog 的消息是混合的存儲的,"},{"type":"text","marks":[{"type":"strong"}],"text":"但是從整體上看,消息還是從 CommitLog 順序讀的,都是從舊數據到新數據有序的讀取。"},{"type":"text","text":"並且一般而言消息存進去馬上就會被消費,因此消息這時候應該還在頁緩存中,所以不需要讀盤。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/b7/b7d848fda1c152a4834ba90b46322694.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"而且我們在上面提到,"},{"type":"text","marks":[{"type":"strong"}],"text":"頁緩存會定時刷盤,這刷盤不可控,並且內存是有限的,會有swap等情況"},{"type":"text","text":",而且"},{"type":"text","marks":[{"type":"strong"}],"text":"mmap其實只是做了映射,當真正讀取頁面的時候產生缺頁中斷,纔會將數據真正加載到內存中,"},{"type":"text","text":"這對於消息隊列來說可能會產生監控上的毛刺。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"因此 RocketMQ 做了一些優化,有:"},{"type":"text","marks":[{"type":"strong"}],"text":"文件預分配和文件預熱"},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"文件預分配"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"CommitLog 的大小默認是1G,當超過大小限制的時候需要準備新的文件,而 RocketMQ 就起了一個後臺線程 "},{"type":"codeinline","content":[{"type":"text","text":"AllocateMappedFileService"}]},{"type":"text","text":",不斷的處理 "},{"type":"codeinline","content":[{"type":"text","text":"AllocateRequest"}]},{"type":"text","text":",AllocateRequest其實就是預分配的請求,會提前準備好下一個文件的分配,防止在消息寫入的過程中分配文件,產生抖動。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"文件預熱"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"有一個"},{"type":"codeinline","content":[{"type":"text","text":"warmMappedFile"}]},{"type":"text","text":"方法,它會把當前映射的文件,每一頁遍歷多去,寫入一個0字節,然後再調用"},{"type":"codeinline","content":[{"type":"text","text":"mlock"}]},{"type":"text","text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"madvise(MADV_WILLNEED)"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/05/059c84abb7182d1bd9ff4df3b7b3be82.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們再來看下"},{"type":"codeinline","content":[{"type":"text","text":"this.mlock"}]},{"type":"text","text":",內部其實就是調用了"},{"type":"codeinline","content":[{"type":"text","text":"mlock"}]},{"type":"text","text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"madvise(MADV_WILLNEED)"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/7e/7e7cb08ea28f3b439c25a816ecd10c94.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"mlock:可以將進程使用的部分或者全部的地址空間鎖定在物理內存中,防止其被交換到swap空間。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"madvise:給操作系統建議,說這文件在不久的將來要訪問的,因此,提前讀幾頁可能是個好主意。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"RocketMQ 小結"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"順序寫盤,整體來看是順序讀盤,並且使用了 mmap,不是真正的零拷貝。又因爲頁緩存的不確定性和 mmap 惰性加載(訪問時缺頁中斷纔會真正加載數據),用了文件預先分配和文件預熱即每頁寫入一個0字節,然後再調用"},{"type":"codeinline","content":[{"type":"text","text":"mlock"}]},{"type":"text","text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"madvise(MADV_WILLNEED)"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Kafka "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kafka 的日誌存儲和 RocketMQ 不一樣,它是一個分區一個文件。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/9a/9a139bec68755f3e867c26ca3efe309a.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kafka 的消息寫入對於單分區來說也是順序寫,如果分區不多的話從整體上看也算順序寫,它的日誌文件並沒有用到 mmap,而索引文件用了 mmap。但發消息 Kafka 用到了零拷貝。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"對於消息的寫入來說 mmap 其實沒什麼用,因爲消息是從網絡中來。而對於發消息來說 sendfile 對比 mmap+write 我覺得效率更高,因爲少了一次頁緩存到 SocketBuffer 中的拷貝。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"來看下Kafka發消息的源碼,最終調用的是 "},{"type":"codeinline","content":[{"type":"text","text":"FileChannel.transferTo"}]},{"type":"text","text":",底層就是 sendfile。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/f3/f338037e04eedda052f54159aa744096.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"從 Kafka 源碼中我沒看到有類似於 RocketMQ的 mlock 等操作,我覺得原因是首先日誌也沒用到 mmap,然後 swap 其實可以通過 Linux 系統參數 "},{"type":"codeinline","content":[{"type":"text","text":"vm.swappiness"}]},{"type":"text","text":" 來調節,這裏建議設置爲1,而不是0。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"假設內存真的不足,設置爲 0 的話,在內存耗盡的情況下,又不能 swap,則會突然中止某些進程。設置個 1,起碼還能拖一下,如果有良好的監控手段,還能給個機會發現一下,不至於突然中止。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"RocketMQ & Kafka 對比"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"首先"},{"type":"text","marks":[{"type":"strong"}],"text":"都是順序寫入,不過 RocketMQ 是把消息都存一個文件中,而 Kafka 是一個分區一個文件"},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"每個分區一個文件在"},{"type":"text","marks":[{"type":"strong"}],"text":"遷移或者數據複製層面上來說更加得靈活"},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"但是"},{"type":"text","marks":[{"type":"strong"}],"text":"分區多了的話,寫入需要頻繁的在多個文件之間來回切換,對於每個文件來說是順序寫入的,但是從全局看其實算隨機寫入,並且讀取的時候也是一樣,算隨機讀"},{"type":"text","text":"。而就一個文件的 RocketMQ 就沒這個問題。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"從發送消息來說 RocketMQ 用到了 mmap + write 的方式,並且通過預熱來減少大文件 mmap 因爲缺頁中斷產生的性能問題。而 Kafka 則用了 sendfile,相對而言我覺得 kafka 發送的效率更高,因爲少了一次頁緩存到 SocketBuffer 中的拷貝。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"並且 swap 問題也可以通過系統參數來設置。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"最後"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這篇文章中間寫 RocketMQ 卡殼了,源碼還是不太熟,有點繞。 多虧丁威大佬的點撥。不然我就陷入了死衚衕出不來了。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"最後再推薦下丁威大佬和周繼鋒大佬的《RocketMQ技術內幕:RocketMQ架構設計與實現原理》。對 RocketMQ 有興趣的同學可以看看。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"文章如果哪裏有紕漏請抓緊聯繫我,感謝!"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"horizontalrule"},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"我是 yes,從一點點到億點點,我們下篇見"},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"往期推薦:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"http://mp.weixin.qq.com/s?_biz=Mzg2MjEyNTk1Ng==&mid=2247484130&idx=1&sn=0a936cc7f03074f1eb2e0a8d35de1423&chksm=ce0de809f97a611f4452160478b5ba9361f4e9514362eaf68352edad0c8021bb108e162d4282&scene=21#wechatredirect","title":""},"content":[{"type":"text","text":"消息隊列面試熱點一鍋端"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"http://mp.weixin.qq.com/s?_biz=Mzg2MjEyNTk1Ng==&mid=2247484110&idx=1&sn=ab523ef71a05e2cb1e9c481b933ce320&chksm=ce0de825f97a6133104dc915e5436ded697021855f8883f915576f1bc904b2935c1053445a78&scene=21#wechatredirect","title":""},"content":[{"type":"text","text":"圖解+代碼|常見限流算法以及限流在單機分佈式場景下的思考"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/s?_biz=Mzg2MjEyNTk1Ng==&mid=2247484214&idx=1&sn=3b0c6a5a7a4028b5776ed48e8553de42&chksm=ce0de9ddf97a60cb3cd2b6c15495b3d19234f1e690ae0422dc17781ae44b5124ba3458b76a9e&token=907972017&lang=zhCN#rd","title":""},"content":[{"type":"text","text":"表弟面試被虐我教他緩存連招"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"http://mp.weixin.qq.com/s?_biz=Mzg2MjEyNTk1Ng==&mid=2247484067&idx=1&sn=d6e578992cedcc421c0058edeaa876e2&chksm=ce0de848f97a615ee4591f320e0b56fd56b39ffc2af2769a7e724c6c3a5f5bbcbf87f2c84dc4&scene=21#wechatredirect","title":""},"content":[{"type":"text","text":"面試官:說說Kafka處理請求的全流程"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://juejin.im/post/5efdeae7f265da22d017e58d","title":""},"content":[{"type":"text","text":"Kafka索引設計的亮點"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://juejin.im/post/5ef6b94ae51d4534a1236cb0","title":""},"content":[{"type":"text","text":"Kafka日誌段讀寫分析"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章