Pulsar and Kafka Benchmark: Which Performs Better?

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Summary: "},{"type":"text","text":"To build new products and services, many companies are now turning to real-time data streaming applications. Before choosing the technology that best fits its business needs, an organization must first understand the strengths of, and differences between, the available event streaming systems. Benchmarking is one way to compare and measure the performance of different technologies. For a benchmark to be useful, it must be conducted correctly and report accurate information. Unfortunately, many factors can undermine a benchmark's accuracy."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Confluent recently ran a benchmark comparing the throughput and latency of Kafka, Pulsar, and RabbitMQ. According to the Confluent blog, Kafka delivers the “best throughput” at “low latency”, while RabbitMQ achieves “low latency” at “lower throughput”. Overall, the benchmark presents Kafka as the clear winner in terms of “speed”."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kafka is a mature, well-established technology, yet many companies today, from multinationals to innovative startups, choose Pulsar first. At Splunk's recent conference, conf20, Sendur Sellakumar, Splunk's Chief Product Officer, announced the decision to adopt Pulsar in place of Kafka:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"“... we have made Apache Pulsar our foundational stream. We are betting the company's future on a long-term architecture for enterprise-grade multi-tenant streaming.”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"-- Sendur Sellakumar, Chief Product Officer, Splunk "}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Splunk is just one of many companies using Pulsar. These companies choose Pulsar because, in modern elastic cloud environments such as Kubernetes, it scales horizontally to handle massive data volumes cost-effectively, with no single point of failure. Pulsar also has many built-in features, such as automatic data rebalancing, multi-tenancy, geo-replication, and durable tiered storage, that simplify operations and make it easier for teams to focus on business goals."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Developers ultimately choose Pulsar because these distinctive features and its performance make it a cornerstone of streaming data."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With this in mind, Confluent's benchmark setup and conclusions deserve a closer look. We found two highly questionable issues. First, Confluent's limited knowledge of Pulsar is the largest source of inaccuracy in its conclusions. Without understanding Pulsar, one cannot measure its performance against the right yardstick."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Second, Confluent's performance tests are based on a narrow set of test parameters. This limits the applicability of the results and fails to give readers accurate results that match different workloads and real-world use cases."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To give the community more accurate results, we decided to address these issues and repeat the tests. The key changes were:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"We adjusted the benchmark setup to include all the durability levels supported by Pulsar and Kafka, and compared their throughput and latency at the same durability level."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"We fixed the OpenMessaging Benchmark (OMB) framework to eliminate the variance introduced by using different instances, and corrected configuration errors in the OMB Pulsar driver."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Finally, we measured additional performance factors and conditions, such as different numbers of partitions and mixed workloads containing writes, tailing reads, and catch-up reads, to get a more complete view of performance. "}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With this work done, we repeated the tests. The results show that Pulsar significantly outperforms Kafka in scenarios that more closely resemble real-world workloads, and matches Kafka's performance in the basic scenario that Confluent used in its tests. "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The following sections highlight the most important findings from our tests. The StreamNative benchmark results section describes the test setup and the test report in detail."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Summary of StreamNative Benchmark Results"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"With the same durability guarantees as Kafka, Pulsar achieves 605 MB\/s publish and end-to-end throughput (the same as Kafka) and 3.5 GB\/s catch-up read throughput (3.5 times that of Kafka). Pulsar's throughput is not affected by increasing the number of partitions or changing the durability level, whereas Kafka's throughput is severely affected by changes in either."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"

 

Durability level

Partitions

Pulsar

Kafka

 

 

Maximum publish and tailing-read throughput (MB\/s)

 

Level 1 durability

1

300 MB\/s

160 MB\/s

100

300 MB\/s

420 MB\/s

2000

300 MB\/s

300 MB\/s

Level 2 durability

1

300 MB\/s

180 MB\/s

100

605 MB\/s

605 MB\/s

2000

605 MB\/s

300 MB\/s

Maximum publish and catch-up read throughput (GB\/s)

Level 1 durability

100

1.7 GB\/s

1 GB\/s

Level 2 durability

100

3.5 GB\/s

1 GB\/s"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Table 1: Throughput of Pulsar and Kafka under different workloads and durability guarantees"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":2,"normalizeStart":2},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Across the test cases (varying numbers of subscriptions, numbers of topics, and durability guarantees), Pulsar's latency is significantly lower than Kafka's. "},{"type":"text","text":"Pulsar's P99 latency ranges from 5 to 15 milliseconds. Kafka's P99 latency can reach several seconds and is heavily affected by the number of topics, the number of subscriptions, and the durability guarantee."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"

 

Partitions and subscriptions

Local durability

Replication durability

Pulsar

Kafka

End-to-end P99 latency (ms)

 

(publish + tailing reads)

100 partitions, 1 subscription

Sync

Ack-1

5.86

18.75

Ack-2

11.64

64.62

Async

Ack-1

5.33

6.94

Ack-2

5.55

10.43

100 partitions, 10 subscriptions

Sync

Ack-1

7.12

145.10

Ack-2

14.65

1599.79

Async

Ack-1

6.84

89.80

Ack-2

6.94

1295.78"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Table 2: End-to-end P99 latency of Pulsar and Kafka with different numbers of subscriptions and durability guarantees"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"

 

Local durability

Replication durability

Partitions

Pulsar

Kafka

End-to-end P99 latency (ms)

 

(publish + tailing reads)

Sync

Ack-1

100

5.86

18.75

5000

6.26

79236

10000

6.67

187840

Ack-2

100

11.64

64.62

5000

14.38

157960

10000

15.78

197140

Async

Ack-1

100

5.33

6.94

5000

5.75

86641

10000

6.64

184513

Ack-2

100

5.55

10.43

5000

6.20

116028

10000

7.50

200793"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Table 3: End-to-end P99 latency of Pulsar and Kafka with different numbers of topics and durability guarantees"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":3,"normalizeStart":3},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Pulsar provides significantly better I\/O isolation than Kafka. While consumers are catching up on historical data, Pulsar's P99 publish latency stays at around 5 milliseconds. Kafka's latency, by contrast, is severely affected by catch-up reads: its P99 publish latency can rise from a few milliseconds to several seconds."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"

 

Local durability

Replication durability

Pulsar

Kafka

Publish P99 latency (ms)

 

 

(mixed workload)

Sync

Ack-1

5.89

13.48

Ack-2

15.39

2091.31

Async

Ack-1

10.44

9.51

Ack-2

35.51

1014.95"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Table 4: P99 publish latency of Pulsar and Kafka under catch-up reads"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"All of our benchmarks are "},{"type":"link","attrs":{"href":"https:\/\/github.com\/streamnative\/openmessaging-benchmark","title":"","type":null},"content":[{"type":"text","marks":[{"type":"underline"}],"text":"open source"}]},{"type":"text","marks":[{"type":"underline"}],"text":" (GitHub: "},{"type":"link","attrs":{"href":"https:\/\/github.com\/streamnative\/openmessaging-benchmark","title":"","type":null},"content":[{"type":"text","text":"https:\/\/github.com\/streamnative\/openmessaging-benchmark"}]},{"type":"text","text":"). Interested readers can reproduce the results themselves or dig deeper into the results and the metrics provided in the repository. "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Although our benchmark is more accurate and comprehensive than Confluent's, it does not cover every scenario. Ultimately, no benchmark can replace testing on your own hardware with your own real workloads. We encourage readers to evaluate other variables and scenarios and to run tests with their own setups and environments. "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"A Closer Look at the Confluent Benchmark"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Confluent used the "},{"type":"link","attrs":{"href":"http:\/\/openmessaging.cloud\/docs\/benchmarks\/","title":"","type":null},"content":[{"type":"text","marks":[{"type":"underline"}],"text":"OpenMessaging Benchmark (OMB) framework"}]},{"type":"text","text":" as the basis for its benchmark, with some modifications. In this section, we describe the problems we found in Confluent's benchmark and explain how they affect the accuracy of its results."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Confluent's Setup Issues"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The conclusions of Confluent's benchmark are incorrect because Pulsar was configured with unsuitable parameters. We discuss these issues in detail in the StreamNative benchmark section. Beyond the Pulsar tuning problems, Confluent configured Pulsar and Kafka with different durability guarantees. Because the durability level affects performance, a comparison is meaningful only when both systems use the same durability settings."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Confluent's engineers used Pulsar's default durability guarantee, which is stronger than Kafka's. Raising the durability level severely affects latency and throughput, so Confluent's test demanded more of Pulsar than of Kafka. The Pulsar version Confluent used did not yet support lowering durability to the same level as Kafka, but an upcoming Pulsar release does, and we used that level in our tests. Had Confluent's engineers used the same durability settings on both systems, the comparison shown in their results would have been accurate. We certainly do not fault them for not using an unreleased feature. However, their write-up does not provide the necessary context and treats the results as if the durability settings were equivalent. This article supplies that additional context."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"OMB Framework Issues"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Confluent's benchmark followed the OMB framework guidelines, which recommend using the same instance type across event streaming systems. In our tests, however, we found large variance between different instances of the same type, especially with respect to disk I\/O. To minimize this variance, we used the same instances for every Pulsar and Kafka run. We found that doing so greatly improved the accuracy of the results, because small differences in disk I\/O performance can produce large differences in overall system performance. We propose updating the OMB framework guidelines to adopt this recommendation in the future."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Issues with Confluent's Methodology"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Confluent's benchmark covered only a few limited scenarios. Real workloads, for example, include writes, tailing reads, and catch-up reads. A tailing read occurs when a consumer reads the latest messages near the “tail” of the log; this is the only read scenario Confluent tested. A catch-up read, by contrast, occurs when a consumer has a large backlog of historical messages and must consume them to “catch up” to the tail of the log, a common and critical task in real systems. Catch-up reads can severely affect the latency of writes and tailing reads, so leaving them out of a benchmark hides that impact. Because Confluent's benchmark focused only on throughput and end-to-end latency, it did not give a complete picture of expected behavior under varied workloads. To bring the results closer to real-world use, we also consider it essential to benchmark with different numbers of subscriptions and partitions. Few enterprises care only about a handful of topics with a few partitions and consumers; they need to accommodate large numbers of distinct consumers across many topics and partitions that map to their business use cases."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The table below summarizes the specific issues with Confluent's methodology."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"

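Tested parameter

Excluded

Limitation

Writes and tailing reads

Catch-up reads

Maximum throughput and end-to-end latency help illustrate the basic performance characteristics of an event streaming system, but restricting the study to these two parameters gives a one-sided picture.

1 subscription

Varying numbers of subscriptions\/consumer groups

Does not show how the number of subscriptions affects throughput and latency.

100 partitions

Varying numbers of partitions

Does not show how the number of partitions affects throughput and latency."}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Table 5: Issues with the methodology of Confluent's benchmark"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Many of the problems in Confluent's benchmark stem from limited knowledge of Pulsar. To help others avoid these problems in future benchmarks, we share some technical insights into Pulsar below."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To run an accurate benchmark, you need to understand Pulsar's durability guarantees. We use this as our starting point: we first give a general overview of durability in distributed systems, then explain how the durability guarantees of Pulsar and Kafka differ. "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"An Overview of Durability in Distributed Systems"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Durability is a system's ability to maintain consistency and availability in the face of external problems such as hardware or operating system failures. Single-node storage systems such as an RDBMS rely on fsync-ing writes to disk to ensure maximum durability. The operating system normally caches writes, which can be lost on failure, but fsync ensures the data is written to physical storage. In distributed systems, durability typically comes from data replication, in which multiple copies of the data are distributed to different nodes that can fail independently. Local durability (fsync-ing data) should not be confused with replication durability, however; the two serve different purposes. Below we explain why each matters and how they differ."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Replication Durability and Local Durability"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Distributed systems usually provide both replication durability and local durability. Each type is controlled by a separate mechanism, and the mechanisms can be combined flexibly to set the durability level required."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Replication durability is achieved by an algorithm that creates multiple copies of the data, so the same data is stored in several locations, improving both availability and accessibility. The number of replicas, N, determines the system's fault tolerance; many systems require a “quorum” of N\/2 + 1 nodes to acknowledge a write. Some systems can continue to serve existing data as long as any single replica remains available. This replication mechanism is essential for handling the complete loss of an instance's data, because a new instance can re-replicate the data from the existing replicas. It is also essential for availability and consensus, which are beyond the scope of this section."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Local durability, by contrast, determines what an acknowledgment means at the level of each individual node. It requires data to be fsync-ed to persistent storage, so that no data is lost even in the event of a power outage or hardware failure. Fsync-ing the data ensures that when a machine recovers from a brief failure, the node still holds all the data it previously acknowledged."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Durability Modes: Sync and Async"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Different types of systems offer different levels of durability guarantees. In general, a system's "},{"type":"text","marks":[{"type":"italic"}],"text":"overall durability is affected by the following factors"},{"type":"text","text":":"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Whether the system fsyncs data to the local disk"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Whether the system replicates data to multiple locations"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When the system acknowledges replication to its peers"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When the system acknowledges a write to the client"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These choices vary widely between systems, and not every system lets users control them. Systems that lack some of these mechanisms (replication in a non-distributed system, for example) offer weaker durability. "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We can define two durability modes, “sync” and “async”, each of which governs when the system acknowledges a write, both internally for replication and to the client. The two modes operate as follows."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Sync durability: "},{"type":"text","text":"The system returns a write response to its peers\/the client only after the data has been successfully fsync-ed to the local disk (local durability) or replicated to multiple locations (replication durability)."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Async durability: "},{"type":"text","text":"The system returns a write response to its peers\/the client before the data has been successfully fsync-ed to the local disk (local durability) or replicated to multiple locations (replication durability)."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Durability Levels: Measuring Durability Guarantees"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Durability guarantees come in many forms, depending on the following variables:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Whether the data is stored locally, replicated to multiple locations, or both"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When writes are acknowledged (sync\/async)"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As with durability modes, we define four durability levels to distinguish between distributed systems. Table 6 lists the levels from highest to lowest durability."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"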

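Level

Replication

Local

Behavior

1

Sync

Sync

The system returns a write response to the client only after the data has been replicated to multiple (at least a majority of) locations and each replica has been successfully fsync-ed to its local disk.

2

Sync

Async

The system returns a write response to the client only after the data has been replicated to multiple (at least a majority of) locations, but does not guarantee that each replica has been successfully fsync-ed to its local disk.

3

Async

Sync

The system returns a write response to the client once one replica has been successfully fsync-ed to its local disk, but does not guarantee that the data has been replicated to other locations.

4

Async

Async

The system returns a write response to the client immediately after dispatching the data to multiple locations, with no guarantee of either replication or local durability."}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Table 6: Durability levels of distributed systems"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Most distributed relational database management systems (NewSQL databases, for example) guarantee the highest level of durability, so we classify them as Level 1."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Like these databases, Pulsar is a Level 1 system and provides the highest level of durability by default. Pulsar also allows the required durability level to be customized for each application. By contrast, most production Kafka deployments are configured at Level 2 or Level 4. Kafka can reportedly also reach Level 1 by setting flush.messages=1 and flush.ms=0, but these two settings severely affect throughput and latency, as we discuss in detail in the benchmark."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Below we examine the durability of each system in detail, starting with Pulsar."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Pulsar's Durability"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pulsar provides durability guarantees at every level: it can replicate data to multiple locations and fsync data to the local disk. Pulsar supports both durability modes described above, sync and async. Users can tune the settings to their use case, using either mode on its own or combining the two."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pulsar controls replication durability with a Raft-equivalent, quorum-based replication protocol. The replication durability mode can be changed by adjusting the ack-quorum-size and write-quorum-size parameters. Table 7 lists the settings for these parameters, and Table 8 lists the durability levels that Pulsar supports. (Pulsar's replication protocol and consensus algorithm are beyond the scope of this article; we will explore them in depth in a future blog post.)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"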

Location

Configuration setting

Durability mode

Replication

ackQuorumSize = 1

Async

 

Sync

Local

(default)

journalWriteData = true

journalSyncData = true

Sync

journalWriteData = true

journalSyncData = false

Async

journalWriteData = false

journalSyncData = false

Async"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Table 7: Pulsar durability configuration settings"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"

Durability level

Replication durability
