Urgently Needing to Reduce System Complexity, We Migrated from Kafka to Pulsar

{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Key Takeaways"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Distributed messaging systems support two kinds of semantics: streaming and queueing. Each is best suited to different use cases."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pulsar is unusual in that it supports both streaming and queueing use cases."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pulsar's multi-layered architecture makes it easy to scale the number and size of topics, and makes this more convenient to operate than in other messaging systems."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pulsar strikes a good balance between scalability, reliability, and features. This made it a suitable replacement for the RabbitMQ messaging system that Iterable used, and eventually for other messaging systems such as Kafka and Amazon SQS."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Iterable sends large volumes of marketing messages on behalf of its customers every day, including email, push notifications, SMS, and in-app messages, and processes an even larger number of user data updates, events, and custom workflow states each day. Many of the messages Iterable handles can trigger further operations in the system, which makes the system ever more complex and the product harder to use. As the customer base keeps growing, reducing system complexity has become urgent."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Iterable can use a distributed messaging system in parts of its architecture, mainly to store the messages that consumers need to process and to track each consumer's progress through them. This reduces system complexity and lets consumers 
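focus on processing messages."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Iterable uses work queues to execute customer-defined marketing workflows, webhooks, and other kinds of job scheduling and processing. Other components, such as user and event ingestion, use a streaming model to process ordered streams of messages. Distributed messaging systems generally support two kinds of semantics, streaming and queueing, and each is best suited to different use cases."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Streams and Queues"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In a streaming system, producers append data to a set of append-only message streams. Within each stream, messages must be processed in a specific order, and consumers mark their position in the stream. Messages can be partitioned with some strategy, such as hashing the user ID, so that each partition becomes a separate stream and parallelism increases. Because the data in each stream is immutable and only the offset entry is stored, messages are not skipped during processing. Streams suit use cases where message order matters, such as data ingestion. "},{"type":"link","attrs":{"href":"https:\/\/kafka.apache.org\/","title":"","type":null},"content":[{"type":"text","text":"Kafka"}]},{"type":"text","text":" and "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/kinesis\/","title":"","type":null},"content":[{"type":"text","text":"Amazon Kinesis"}]},{"type":"text","text":" both process messages with streaming semantics."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In a queueing system, a queue may have many producers and consumers. Producers send messages to the queue and consumers receive messages from it. After receiving a message, a consumer processes it and sends an ack to the queueing system once processing is finished. Because many consumers share one queue and message order does not matter, queue-based systems make it easy to scale consumers. Queueing systems suit work queues whose tasks need not run in any particular order, for example sending the same email to many recipients. "},{"type":"link","attrs":{"href":"https:\/\/www.rabbitmq.com\/","title":"","type":null},"content":[{"type":"text","text":"RabbitMQ"}]},{"type":"text","text":" and "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/sqs\/","title":"","type":null},"content":[{"type":"text","text":"Amazon SQS"}]},{"type":"text","text":" 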
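are both queue-based messaging systems."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Queueing systems generally make message-level errors easier to handle. For example, after an error RabbitMQ can easily route a message to a special queue, hold it there for a set time, and then send it back to the original queue to be retried. RabbitMQ also supports negative acknowledgements, so a message can be redelivered after a failure. Most message queues do not keep messages in the backlog once they are acknowledged, so the system cannot find messages that need to be sent again, which makes debugging and disaster recovery harder."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Stream-based systems such as Kafka can also be used for queueing use cases, but doing so is somewhat awkward. Kafka offers a rich feature set, and many teams choose it for queueing. However, because messages in a stream must be processed strictly in order, this creates a lot of extra work for developers. If a consumer cannot consume a message, so that processing slows down or the message has to be consumed again, the processing rate of other messages on the same stream suffers as well. A common workaround is to publish the message to another topic for retry, but this adds state for the application to manage and increases complexity."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Why Iterable Needed a New Messaging System"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Iterable has long relied on RabbitMQ features to process large volumes of internal messages. We use custom Time-to-Live (TTL) values to control retries and to implement explicit delays in message processing. For example, we may delay a marketing email so that it is sent when the recipient is most likely to read it. We also rely on negative acknowledgements to decide which failed messages to requeue."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Iterable 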
uses the simplified architecture shown below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/3c\/4f\/3c2f1f9238520a7a0a1cdd746b7cc24f.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When we started evaluating Pulsar, we were using Kafka for ingestion and RabbitMQ for all of the queues described above. Kafka provides the performance and ordering guarantees that ingestion needs, but it lacks the queueing semantics our other use cases require. RabbitMQ features such as delays are critical to us, which made finding an alternative harder."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As we scaled the system, we ran into the following problems with RabbitMQ:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Under high load, RabbitMQ frequently ran into flow control problems. When memory or other resources are constrained and the broker falls behind producers, the flow control mechanism slows the producers down. This hurts the producers and causes service delays and request failures in other areas of the system. For example, we found that flow control kicked in more often when the TTLs of a large number of messages expired at the same time. In that situation RabbitMQ tries to deliver all of the expired messages to their destination queue at once, which sharply increases memory usage on the RabbitMQ instance, triggers flow control, and blocks producers from publishing."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"RabbitMQ brokers do not keep messages once they are acknowledged, which makes debugging harder. In other words, there is no way to set a retention period for messages on the broker."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"RabbitMQ 
replication components were not sufficient for our use cases, which made messages hard to replicate, so RabbitMQ became a single point of failure for message state."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"RabbitMQ struggles with large numbers of queues. Many of our use cases need dedicated queues, and we often need more than 10,000 queues at a time. At that scale, the RabbitMQ management interface and API frequently ran into problems."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Evaluating Apache Pulsar"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Overall, "},{"type":"link","attrs":{"href":"https:\/\/pulsar.apache.org\/","title":"","type":null},"content":[{"type":"text","text":"Apache Pulsar"}]},{"type":"text","text":" supports all of the features we need. Although comparisons between Pulsar and Kafka, from Pulsar cloud providers and users alike, tend to emphasize Pulsar's streaming features, we found that Pulsar is also very well suited to queueing. Pulsar's shared subscription mode lets a topic be used as a queue, so one topic can provide multiple virtual queues to the consumers subscribed to it. Pulsar also supports delayed message delivery natively. When we first started testing Pulsar, few systems supported these features."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Beyond these features, Pulsar's layered architecture also makes it simpler to scale the number and size of topics."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/0c\/8d\/0cb4da73eb60b529dfdbd988a0a6458d.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The top layer of Pulsar consists of brokers, which accept messages from producers 
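and deliver them to consumers, but do not store them. Each topic partition is handled by a single broker, but because brokers hold no topic state, ownership of a topic can move freely from one broker to another. Users can therefore add brokers to increase throughput and start using a new broker immediately. This is also how Pulsar handles broker failures."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The bottom layer of Pulsar is "},{"type":"link","attrs":{"href":"https:\/\/bookkeeper.apache.org\/","title":"","type":null},"content":[{"type":"text","text":"BookKeeper"}]},{"type":"text","text":", which stores topic data in segments distributed across the cluster. When more storage is needed, BookKeeper nodes (bookies) can be added to the cluster and used to store new segments. Brokers coordinate with bookies to update the state of each topic. Using BookKeeper lets Pulsar store a very large number of topics, which matters a great deal for Iterable's current use cases."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After evaluating several messaging systems, we chose Pulsar because it strikes the right balance between scalability, reliability, and features, and can replace messaging systems such as Kafka and Amazon SQS."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"First Pulsar Use Case: Message Sends"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"One of the main jobs of the Iterable platform is to send scheduled marketing emails on behalf of customers. We create separate queues for different customers, publish the messages to the corresponding queues, and then check and send them. The queues that Pulsar provides are what ultimately led us to move away from RabbitMQ."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We chose marketing sends as our first test of Pulsar for two reasons. First, sending messages is what we mainly used RabbitMQ for. Second, it is one of the more complex use cases we handled with RabbitMQ. For Iterable this was not a low-risk test case, but after testing Pulsar thoroughly we found it better suited to Iterable 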
queueing workloads."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Iterable platform mainly handles the following three kinds of marketing sends:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Sending a campaign to all recipients at the same time. Suppose a customer wants to email every user who has been active in the past month. We query "},{"type":"link","attrs":{"href":"https:\/\/www.elastic.co\/","title":"","type":null},"content":[{"type":"text","text":"ElasticSearch"}]},{"type":"text","text":" for the list of users, schedule the sends, and publish the messages to the appropriate Pulsar topic."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Sending at a time specified per recipient. The send time may be fixed, such as 9 a.m. in the recipient's time zone, or it may be chosen by our send-time optimization algorithm. Either way, the queued message must be sent at the specified time, that is, its processing must be delayed."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"User-triggered sends. A send is triggered when a user goes through a custom workflow or completes a transaction such as an online purchase."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In these scenarios the number of messages sent at any one time can vary widely, so we need a messaging system that can scale the number of consumers up and down as required."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Migrating to Apache Pulsar"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Although Pulsar performed well in load tests, we were not sure whether Pulsar 
could sustain high load in real production. This was a particular concern because we wanted to use some of Pulsar's newer features, such as "},{"type":"link","attrs":{"href":"https:\/\/pulsar.apache.org\/docs\/en\/concepts-messaging\/#negative-acknowledgement","title":"","type":null},"content":[{"type":"text","text":"Nack"}]},{"type":"text","text":" and "},{"type":"link","attrs":{"href":"https:\/\/pulsar.apache.org\/docs\/en\/2.5.0\/concepts-messaging\/#delayed-message-delivery","title":"","type":null},"content":[{"type":"text","text":"delayed message delivery"}]},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To measure Pulsar's performance, we built a parallel pipeline that published messages to both RabbitMQ and Pulsar, with consumers configured to ack messages without actually processing them. We also simulated consumption delays to understand how Pulsar would behave under production conditions. We used consumer-level feature flags for both the test topics and the production topics, so we could migrate consumers one at a time for testing and, eventually, for production."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"During testing we found several bugs in Pulsar, for example a "},{"type":"link","attrs":{"href":"https:\/\/github.com\/apache\/pulsar\/pull\/5499","title":"","type":null},"content":[{"type":"text","text":"race condition related to delayed messages"}]},{"type":"text","text":" that caused consumers to hang and messages to pile up. It was the most serious problem we found, but with help from the Pulsar developers these issues were all identified and fixed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We also found that Pulsar producers enable batching by default, which had some side effects. For example, Pulsar's backlog metrics report the number of batches rather than the number of messages, which makes it harder to set alert thresholds for message backlog. Later we found a more serious "},{"type":"link","attrs":{"href":"https:\/\/github.com\/apache\/pulsar\/issues\/5969","title":"","type":null},"content":[{"type":"text","text":"bug"}]},{"type":"text","text":" in the interaction between Nack and batching, which the Pulsar team fixed promptly. In the end we decided not to use batching. Disabling producer batching in Pulsar is simple, and performance without it still met our needs. Pulsar 
will probably include the fixes mentioned above in newer releases."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Delayed delivery and Nack were new Pulsar features at the time and we expected some problems in using them, so in the initial phase we published only to test topics and then migrated to Pulsar gradually over several months. That way, if something went wrong we could identify and fix it quickly without affecting customers. The full migration of marketing sends took about six months, during which Pulsar performed as expected, and we are very satisfied with the result."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After the migration was complete, we found that we could add consumers and grow the business while operating costs dropped by half. Our costs before the migration were probably higher because we over-provisioned RabbitMQ instances to get better performance. So far our Pulsar cluster has been running for more than six months without any problems."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Implementation and Tooling"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On the backend, Iterable mainly uses "},{"type":"link","attrs":{"href":"https:\/\/www.scala-lang.org\/","title":"","type":null},"content":[{"type":"text","text":"Scala"}]},{"type":"text","text":", so we needed Scala tooling that supports Pulsar. We have been using the "},{"type":"link","attrs":{"href":"https:\/\/github.com\/sksamuel\/pulsar4s","title":"","type":null},"content":[{"type":"text","text":"pulsar4s"}]},{"type":"text","text":" library and have contributed to it to support new features such as delayed message delivery. We also contributed a connector "},{"type":"link","attrs":{"href":"https:\/\/doc.akka.io\/docs\/akka\/current\/stream\/index.html","title":"","type":null},"content":[{"type":"text","text":"based on Akka Streams"}]},{"type":"text","text":" that consumes messages as a source and supports acks."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For example, this is how we can consume, in a given namespace, all 
topics."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"\/\/ Assumes client, parallelism and handleMessage are defined elsewhere.\n\/\/ Create a consumer on all topics in this namespace\nval createConsumer = () => client.consumer(ConsumerConfig(\n topicPattern = \"persistent:\/\/email\/project-123\/.*\".r,\n subscription = Subscription(\"email-service\")\n))\n\n\/\/ Create an Akka streams `Source` stage for this consumer\nval pulsarSource = committableSource(createConsumer, Some(MessageId.earliest))\n\n\/\/ Materialize the source and get back a `control` to shut it down later.\nval control = pulsarSource.mapAsync(parallelism)(handleMessage).to(Sink.ignore).run()\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Subscribing with a regular expression means consumers do not need to know the specific topic partitioning strategy and automatically subscribe to newly created topics. Because Pulsar supports a large number of topics and can create new topics automatically on publish, it is easy to create a new topic for a new message type or even for individual messages. This also makes it easier to rate-limit different consumers and message types."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Conclusion"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pulsar is a fast-moving open source project, so we have to keep up with its development and understand it in depth. Pulsar's documentation is still incomplete, and we often had to turn to the community for help. The community has been very welcoming, and we are glad to take part in Pulsar development and contribute to new features."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With its layered architecture, Pulsar is highly scalable, highly available, and low-latency, and it supports both streaming and queueing, so it can replace several of the distributed messaging systems currently used in Iterable's architecture. Pulsar covers our Kafka, RabbitMQ, and SQS use cases. After migrating, we can concentrate on a single unified system and only need to learn Pulsar 
operations and tooling."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We started working with Pulsar in early 2019. Since then Pulsar has made great progress, especially in getting-started documentation and training. New tools have also appeared, such as "},{"type":"link","attrs":{"href":"https:\/\/github.com\/apache\/pulsar-manager","title":"","type":null},"content":[{"type":"text","text":"Pulsar Manager"}]},{"type":"text","text":" for managing clusters. Several companies now offer hosted and managed Pulsar services, which makes it easier for startups and small teams to get started."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"All in all, Iterable's migration to Pulsar was very successful, though not without challenges. Iterable's use case is still uncommon. We expected to run into problems, but testing caught most of them and kept the impact on customers to a minimum. We are confident in Pulsar and plan to use it for other components of the Iterable platform, both new and existing."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Original article:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"https:\/\/www.infoq.com\/articles\/pulsar-customer-engagement-platform\/"}]}]}