Pulsar 和 Kafka 基準測試報告(上)

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"爲了更全面地瞭解 Pulsar 和 Kafka,我們“復現”了 Confluent 對 Pulsar 和 Kafka 基準測試。重複這一基準測試的原因有兩個,一是 Confluent 的測試方法存在一些問題;二是 Confluent 的測試範圍和測試場景不夠全面。爲了更準確地對比 Pulsar 和 Kafka,我們在測試中不僅修復了 Confluent 測試中的問題,還擴大了測試範圍,納入更多性能衡量標準,模擬更多實際場景。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"和 Confluent 的測試相比,我們的測試主要有三項改進:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"包含 Pulsar 和 Kafka 支持的所有持久性級別。在同等持久性級別下,對比二者的吞吐量和延遲。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"引入影響性能的其他因素和測試條件,如分區數量、訂閱數量、客戶端數量等。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"測試的混合負載同時包含寫入、追趕讀和追尾讀,模擬實際使用場景。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們進行了最大吞吐量測試、發佈和端到端延遲測試★、追趕讀測試和混合工作負載測試。由於篇幅有限,本篇主要介紹發佈和端到端延遲測試詳情。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"最大吞吐量測試"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"測試目標:觀測在處理髮布和追尾讀工作負載時 Pulsar 和 Kafka 可實現的最大吞吐量。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"測試設置:通過調整分區數量,觀測分區數量對吞吐量的影響。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"測試策略:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"將所有消息都複製三次,確保容錯;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"改變 ack 數量,測試在不同持久性保證下,Pulsar 和 Kafka 的最大吞吐量;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"啓用 Pulsar 和 Kafka 的批處理,爲不超過 10 ms 的響應延遲設置最大批處理爲 1 MB 數據;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"改變分區數量(1、100、2000 個分區),分別測試最大吞吐量;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當分區數量爲 100 和 2000 時,使用 2 個 producer 和 2 個 consumer;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當分區數量爲 1 時,改變 producer 和 consumer 的數量,觀測吞吐量變化;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"消息大小爲 1 KB;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在所有測試場景中,改變持久性級別,觀測最大吞吐量。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"測試結果"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"#1 100 個分區,1 個訂閱,2 個生產者\/2 個消費者"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"分區數量爲 100,改變持久性保證,分別觀測 Pulsar 和 Kafka 的最大吞吐量。在 Pulsar 和 Kafka 中,都使用 1 個訂閱、2 個生產者和 2 個消費者。測試結果如下。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在 1 級持久性保證(同步複製持久性,異步本地持久性)下,Pulsar 的最大吞吐量約爲 300 MB\/s,達到日誌磁盤帶寬物理極限。Kafka 的最大吞吐量約爲 420 MB\/s。值得注意的是,在持久性爲 1 級時,Pulsar 配置一個磁盤爲日誌磁盤進行寫入,另一個磁盤爲 ledger 磁盤進行讀取;而 Kafka 同時使用兩個磁盤進行寫入和讀取。儘管 Pulsar 的設置能夠提供更好的 I\/O 隔離,但單個磁盤最大帶寬(~300 MB\/s)會限制吞吐量。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"將持久性(同步複製持久性和異步本地持久性)配置爲 2 級時,Pulsar 和 Kafka 的最大吞吐量均可達到約 600 MB\/s 。兩個系統都達到了磁盤帶寬的物理極限。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"圖 1 爲 100 個分區,同步本地持久性下,Pulsar 和 Kafka 的最大吞吐量。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/78\/26\/78502bf0363371176d35dee17b76ee26.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖1 有 100 個分區時,Pulsar 和 Kafka 的最大吞吐量(同步本地持久性)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"圖 2 爲 100 個分區,異步本地持久性下,Pulsar 和 Kafka 的最大吞吐量。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/f5\/1a\/f5e4219a0f017cfa138aaf084352461a.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖2 100 個分區時,Pulsar 和 Kafka 的最大吞吐量(異步本地持久性)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"#2 2000 個分區,1 個訂閱,2 個生產者\/2 個消費者"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"分區數量從 100 增加到 2000,持久性保證不變(acks = 2),分別觀測 Pulsar 和 Kafka 的最大吞吐量。在 Pulsar 和 Kafka 中,都使用 1 個訂閱、兩個生產者和兩個消費者。測試結果如下。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在 1 級持久性保證下,Pulsar 的最大吞吐量保持在約 300 MB\/s,在 2 級持久性保證下則增加到約 600 MB\/s;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"單獨爲每條消息刷新數據時(kafka-ack-all-sync),Kafka 的最大吞吐量從 600MB\/s (100 個分區) 降到了 300MB\/s 左右;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"使用系統默認的持久性設置(kafka-ack-all-nosync)時,Kafka 的最大吞吐量從約500 MB\/s(100 個分區)下降到約 300 MB\/s。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"爲了瞭解 Kafka 吞吐量下降的原因,我們繪製了 Kafka 和 Pulsar 在每一持久性保證下的平均發佈延遲圖。圖 3 表明,分區數量增加到 2000 時,Kafka 的平均發佈延遲增加到 200 毫秒,P99 發佈延遲增加到 1200 毫秒。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/33\/83\/33a0c8c907747yycfa18319f8a6f8c83.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖3 2000 個分區時,Pulsar 和 Kafka 的最大吞吐量"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"發佈延遲通常會對吞吐量造成顯著影響。但由於 Pulsar 客戶端充分利用了 Netty 強大的異步網絡框架,Pulsar 的吞吐量沒有受到影響。而 Kafka 客戶端使用同步實現,Kafka 的吞吐量的確受到影響。把 producer 數量增加一倍可以提高 Kafka 的吞吐量。如果生產者數量增加到 4 個,Kafka 的吞吐量能達到約 600 MB\/s。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"圖 4:2000 個分區時,Pulsar 和 Kafka 的發佈延遲"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/48\/a8\/48d31fd9bef30fe2c89d79f52e78cba8.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖4 2000 個分區時,Pulsar 和 Kafka 的發佈延遲"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"#3 1 個分區,1 個訂閱,2 個生產者\/2 個消費者"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"增加更多 broker 和分區有助於提高 Pulsar 和 Kafka 的吞吐量。爲了更深入地瞭解這兩個系統的效率,我們把分區數量設爲 1,觀測 Pulsar 和 Kafka 的最大吞吐量。在 Pulsar 和 Kafka 中,都使用 1 個訂閱、2 個生產者和 2 個消費者。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"測試結果如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在所有持久性級別,Pulsar 的最大吞吐量都達到了約 300 MB\/s;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在異步複製持久性下,Kafka 的最大吞吐量達到了約 300 MB\/s,但在同步複製持久性下只有約 160 MB\/s。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"圖 5:在 1 個分區,同步本地持久性下,Pulsar 和 Kafka 的最大吞吐量"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/39\/71\/3916879b64d6db3dafd3002c7b596871.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖5 1 個分區時,Pulsar 和 Kafka 的最大吞吐量(同步本地持久性)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"圖 6:在 1 個分區,異步本地持久性下,Pulsar 和 Kafka 的最大吞吐量"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/85\/79\/85295819ecb358c0e6079d867e0ca779.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖6 1 個分區時,Pulsar 和 Kafka 的最大吞吐量(異步本地持久性)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"#4 1 個分區,1 個訂閱,1 個生產者\/1 個消費者"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"分區數量爲 1,訂閱數量爲 1(與上個測試相同),分別觀測 Pulsar 和 Kafka 的最大吞吐量。在 Pulsar 和 Kafka 中,只使用 1 個生產者和 1 個消費者。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"測試結果如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在所有持久性級別,Pulsar 的最大吞吐量都保持在約 300 MB\/s;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在異步複製持久性下,Kafka 的最大吞吐量從約 300 MB\/s(測試#3中)下降到約 230 MB\/s;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在同步複製持久性下,Kafka 的吞吐量從約 160 MB\/s(測試#3中)下降到約 100 MB\/s。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"圖 7 爲在同步本地持久性,一個分區、一個生產者和一個消費者的情況下,Pulsar 和 Kafka 的最大吞吐量"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/e7\/f0\/e7acd0eb629cc1efd0d7c70413yye3f0.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖7 一個分區、一個生產者和一個消費者時,Pulsar 和 Kafka 的最大吞吐量(同步本地持久性)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"爲了瞭解 Kafka 吞吐量下降的原因,我們繪製了 Kafka 和 Pulsar 在不同持久性保證下的平均發佈延遲圖(圖8)和端到端延遲圖(圖9)。從下圖中可以看到,即使只有一個分區,Kafka 的發佈延遲和端到端延遲也從幾毫秒上升到了幾百毫秒。減少生產者和消費者數量會對 Kafka 吞吐量造成顯著影響。相比之下,Pulsar 的延遲始終保持在幾毫秒。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/d4\/36\/d4970b2b13524b6223a91134ac419436.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖8 一個分區、一個生產者和一個消費者時,Pulsar 和 Kafka 的發佈延遲(同步持久性)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/7c\/96\/7c2a12b793579678ce6c85cedda57096.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖9 一個分區、一個生產者和一個消費者時,Pulsar 和 Kafka 的端到端延遲(同步持久性)"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"發佈和端到端延遲測試"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"該測試詳情參見 ★。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"追趕讀測試"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"測試目標:觀測在處理僅包括追趕讀工作負載時 Pulsar 和 Kafka 可實現的最大吞吐量。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"測試策略:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"將所有消息都複製三次,確保容錯;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"改變 ack 數量,測試在不同持久性保證下,Pulsar 和 Kafka 的吞吐量變化;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"啓用 Pulsar 和 Kafka 的批處理,爲不超過 10 ms 的響應延遲設置最大批處理爲 1 MB 數據;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在 100 個分區的情況下,對兩個系統進行基準測試;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"共運行四個客戶端--兩個生產者和兩個消費者;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"消息大小爲 1 KB。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"測試一開始,生產者開始以 200K\/s 的固定速率發送消息。隊列中積累了 512GB 數據後,消費者開始處理數據,首先從頭讀取累積數據,然後在數據到達時繼續處理傳入數據。在測試期間,生產者持續以相同速率發送消息。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們評估了每個系統讀取 512GB 累積數據的速度,在不同持久性設置下對 Kafka 和 Pulsar進行了比較。各項測試結果如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"#1 啓用 Pulsar 繞過 Journal 寫入的異步本地持久性"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本項測試中,我們在 Pulsar 和 Kafka 上使用了同等的異步本地持久性保證。我們啓用了新的 Pulsar 繞過 Journal 寫入,來匹配 Kafka 默認 fsync 設置中的本地持久性保證。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"下圖 33 給出了測試結果:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"只處理追趕讀時,Pulsar 的最大吞吐量達到了 370 萬條信息\/秒(3.5 GB\/s)。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kafka 的最大吞吐量只達到 100 萬條消息\/秒(1GB\/s)。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pulsar 處理追趕讀的速度比 Kafka 快 75%。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/09\/fc\/094b7d7797e2bb598b98c528730a5afc.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖33 Pulsar 和 Kafka 的追趕讀吞吐量(繞過 Journal 寫入)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"#2 未啓用 Pulsar 繞過 Journal 寫入的異步本地持久性"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本項測試中,我們在 Pulsar 和 Kafka 上使用了異步本地持久性保證,但並未啓用 Pulsar 繞過 Journal 寫入。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"下圖 34 給出了測試結果:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"處理追趕讀時,Pulsar 的最大吞吐量達到 180 萬條信息\/秒(1.7 GB\/s)。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kafka 的最大吞吐量僅達到 100 萬條消息\/秒(1GB\/s)。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pulsar 處理追趕讀的速度是 Kafka 的兩倍。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/c6\/57\/c6334194d8ab0f09e8d8cb2d6b23c157.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖34 Pulsar 和 Kafka 的追趕讀吞吐量(數據同步)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"#3 同步本地持久性"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本項測試中,我們在同等同步本地持久性保證下對 Kafka 和 Pulsar 進行了比較。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"下圖 35 給出了測試結果:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"只處理追趕讀時,Pulsar 的最大吞吐量達到 180 萬條\/秒(1.7 GB\/s)。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kafka 的最大吞吐量只達到 100 萬條消息\/秒(1GB\/s)。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pulsar 處理追趕讀的速度是 Kafka 的兩倍。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/23\/49\/231d46c43b439c734f27194db87de849.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖35 Pulsar 和 Kafka 的追趕讀吞吐量(無數據同步)"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"混合工作負載測試"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"測試目標:評估追趕讀對混合工作負載中的發佈和追尾讀的影響。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"測試策略:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"將所有消息都複製三次,確保容錯;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"啓用 Pulsar 和 Kafka 的批處理,爲不超過 10 ms 的響應延遲設置最大批處理爲 1 MB 數據;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在 100 個分區的情況下,對兩個系統進行基準測試;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在不同耐久性設置下對 Kafka 和 Pulsar 進行比較;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"共運行四個客戶端--兩個生產者和兩個消費者;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"消息大小爲 1 KB;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"測試一開始,兩個生產者均開始以 200K\/s 的固定速率發送數據,兩個消費者中有一個立即開始處理追尾讀。隊列中積累了 512GB 數據後,另一個(追趕讀)消費者開始從頭讀取累積數據,然後在數據到達時繼續處理傳入數據。在測試期間,兩個生產者持續以相同的速率發佈,而追尾讀消費者繼續以相同的速率消費數據。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"測試結果如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"#1 啓用 Pulsar 繞過 Journal 寫入的異步本地持久性"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本項測試中,我們在同等異步本地持久性保證下對 Kafka 和 Pulsar 進行了比較。我們啓用了新的 Pulsar 繞過 Journal 寫入,來匹配 Kafka 默認 fsync 設置的本地持久性保證。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"下圖 36 顯示,追趕讀對 Kafka 寫入造成顯著影響,但對 Pulsar 影響很小。Kafka 的 P99 發佈延遲增加到 1-3 秒,而 Pulsar 的延遲則穩定在幾毫秒到幾十毫秒間。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/7f\/da\/7fa6bca90955eb9d792e0b521aeba5da.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖36 追趕讀對 Pulsar 和 Kafka 發佈延遲的影響(繞過 Journal 寫入)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"#2 未啓用 Pulsar 繞過 Journal 寫入的異步本地持久性"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本項測試中,我們在 Pulsar 和 Kafka 上使用了異步本地持久性保證,但未啓用 Pulsar 繞過 Journal 寫入。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"下圖 37 顯示,追趕讀對 Kafka 寫入造成顯著影響,但對 Pulsar 影響很小。Kafka 的 P99 發佈延遲增加到 2-3 秒,而 Pulsar 的延遲則穩定在幾毫秒到幾十毫秒之間。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/94\/92\/947a072c2b4bc8e08f0257154577f792.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖37 追趕讀對 Pulsar 和 Kafka 發佈延遲的影響(數據同步)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"#3 同步本地持久性"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本項測試中,我們在同等同步本地持久性保證下對 Kafka 和 Pulsar 進行了比較。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"下圖 38 顯示,追趕讀對 Kafka 寫入造成顯著影響,但對 Pulsar 影響很小。Kafka P99 發佈延遲增加到約 1.2 至 1.4 秒,而 Pulsar 的延遲則穩定在幾毫秒到幾十毫秒之間。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/cd\/3e\/cdef233b57da1d2d99d435751c89543e.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖38 追趕讀對 Pulsar 和 Kafka 發佈延遲的影響(無數據同步)"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"結論"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"基於基準測試結果,我們得出以下結論:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"糾正配置和調優錯誤後,Pulsar 與 Kafka 在 Confluent 有限測試用例中實現的端到端延遲基本一致。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在同等持久性保證下,Pulsar 在模擬實際應用場景的工作負載上性能優於 Kafka。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"以上測試中,改變了持久性保證設置、訂閱、分區和客戶端數量,結果表明 Pulsar 在延遲和 I\/O 隔離方面均明顯優於 Kafka。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"原文鏈接:https:\/\/streamnative.io\/en\/blog\/tech\/2020-11-09-benchmark-pulsar-kafka-performance-report"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"相關閱讀鏈接:https:\/\/www.infoq.cn\/article\/xeyeEeNNY5CG0PGyxeVD"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pulsar 用戶調研:https:\/\/forms.office.com\/Pages\/ResponsePage.aspx?id=2zjkx2LkIkypCsNYsWmAs96ZDwmey39DhXAvi6EqbJpUNlZWQzRPMlVWNTc1WUcwUE5CWFMyUlI3QS4u"}]}]}
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章