A Kafka Replacement? Pinterest Introduces MemQ, an Efficient, Scalable Cloud-Native System

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The logging platform powers all data ingestion and data transportation at "},{"type":"link","attrs":{"href":"https:\/\/www.pinterest.com\/","title":"xxx","type":null},"content":[{"type":"text","text":"Pinterest"}]},{"type":"text","text":". At the core of the Pinterest logging platform is a distributed PubSub system that helps users transport\/buffer data and consume it asynchronously."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/fb\/fba0365beb03f166df3a5bbcb754491b.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this post we introduce MemQ (pronounced mem-queue), an efficient, scalable PubSub system developed at Pinterest for the cloud. It has powered our near-real-time data transportation use cases since mid-2020, and it complements Kafka while being up to 90% more cost efficient."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"History"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For nearly a decade, Pinterest has relied on "},{"type":"link","attrs":{"href":"https:\/\/kafka.apache.org\/","title":"xxx","type":null},"content":[{"type":"text","text":"Apache Kafka"}]},{"type":"text","text":" as its sole "},{"type":"link","attrs":{"href":"https:\/\/zh.wikipedia.org\/zh-cn\/%E5%8F%91%E5%B8%83\/%E8%AE%A2%E9%98%85","title":"xxx","type":null},"content":[{"type":"text","text":"PubSub"}]},{"type":"text","text":" system. As Pinterest grew, so did the volume of data, and with it the challenges of operating a very large-scale distributed PubSub platform. Operating Apache Kafka at scale gave us deep insight into how to build a scalable PubSub system. After a deep investigation of PubSub 
operability and scalability, we arrived at the following key takeaways:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Not every dataset needs sub-second latency; latency and cost should be inversely proportional (lower latency should cost more)."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"The storage and serving components of a PubSub system must be separated to allow independent, resource-based scaling."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Ordering on read rather than on write gives specific consumer use cases the flexibility they need (different applications can have different needs for the same dataset)."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"In most cases strict partition ordering is not necessary at Pinterest, and it often causes scalability problems."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"Rebalancing in Kafka is expensive, often degrades performance, and hurts customers on a saturated cluster."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":6,"align":null,"origin":null},"content":[{"type":"text","text":"Running custom replication in a cloud environment is expensive."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In 2018 we experimented with a new type of PubSub system that would natively leverage the cloud. In 2019 we began formally exploring options to address Pinterest's PubSub scalability challenges, and evaluated several PubSub technologies by their operating cost and by the cost of re-engineering existing technologies. We concluded that we needed a PubSub technology that builds on the lessons of Apache Kafka, "},{"type":"link","attrs":{"href":"https:\/\/pulsar.apache.org\/","title":"xxx","type":null},"content":[{"type":"text","text":"Apache 
Pulsar"}]},{"type":"text","text":", and Facebook LogDevice, and that is built for the cloud."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/d5\/d59cbd017154cb5e5931ad1546e2b808.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"MemQ is a new PubSub system that augments Kafka at Pinterest. It uses a decoupled storage and serving architecture similar to "},{"type":"link","attrs":{"href":"https:\/\/pulsar.apache.org\/","title":null,"type":null},"content":[{"type":"text","text":"Apache Pulsar"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" and "},{"type":"link","attrs":{"href":"https:\/\/logdevice.io\/","title":null,"type":null},"content":[{"type":"text","text":"Facebook LogDevice"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"; however, it relies on a pluggable replicated storage layer, namely object store\/DFS\/NFS, to store data. The net result is a PubSub system that:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Handles GB\/s 
traffic"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Scales writes and reads independently"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Needs no expensive rebalancing to handle traffic growth"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Is up to 90% more cost effective than our Kafka footprint"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/db\/db2aefebdcdb82246e3118ed2acd3714.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Secret Sauce"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The secret of MemQ is that it uses micro-batching and immutable writes to create an architecture in which the input\/output operations per second (IOPS) required of the storage layer drop dramatically, which makes it cost effective to use a cloud-native object store such as Amazon S3. The approach is analogous to packet switching in networks (as opposed to circuit switching, that is, a single large contiguous store of data such as a Kafka partition)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MemQ breaks the continuous log stream into blocks (objects), similar to ledgers in Pulsar, except that they are written as objects and are immutable. The size of these “packets”\/“objects”, known in MemQ as a Batch, plays a major role in determining end-to-end (E2E) latency. The smaller the packets, the faster they can be written, but the IOPS cost rises as well. MemQ therefore allows end-to-end latency to be tuned at the cost of higher 
IOPS. A key performance benefit of this architecture is that read and write hardware can be separated at the underlying storage layer, so writes and reads scale independently as packets are spread across the storage layer."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This also removes a constraint found in Kafka, where recovering a replica requires re-replicating a partition from the beginning. In MemQ, the underlying replicated storage only needs to recover the specific Batches whose replica count dropped because of a storage failure. And because MemQ at Pinterest runs on Amazon S3, storage recovery, sharding, and scaling are handled by AWS with no manual intervention from Pinterest."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Components of MemQ"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Client"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The MemQ client discovers the cluster using a seed node, then connects to that seed node to discover metadata and the Brokers hosting the TopicProcessors for a given topic or, for consumers, the address of the notification queue."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Broker"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Like other PubSub systems, MemQ has the concept of a Broker. A MemQ Broker is part of the cluster and is primarily responsible for handling metadata and write requests."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Note: read requests in MemQ can be served directly by the storage layer unless read Brokers are used."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Cluster Governor"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Governor is the leader of the MemQ cluster and is responsible for automated rebalancing and TopicProcessor assignment. Any Broker in the cluster can be elected Governor. The Governor communicates with Brokers through Zookeeper, and Zookeeper 
is also used for Governor election."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Governor makes assignment decisions using a pluggable assignment algorithm. The default algorithm evaluates the available capacity on each Broker. The Governor also uses this capability to handle Broker failures and restore capacity for topics."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Topic and TopicProcessor"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Like other PubSub systems, MemQ uses the logical concept of a topic. A MemQ topic on a Broker is handled by a module called a TopicProcessor. A Broker can host one or more TopicProcessors, and each TopicProcessor instance handles one topic. Topics have write partitions and read partitions. Write partitions are used to create TopicProcessors (a 1:1 relationship), while read partitions determine how much parallelism consumers need to process the data. The number of read partitions equals the number of partitions of the notification queue."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Storage"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MemQ storage is made up of two parts:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Replicated Storage (Object Store \/ DFS)"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Notification Queue (Kafka, Pulsar, etc.)"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"1. 
Replicated Storage"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MemQ supports pluggable storage handlers. So far we have implemented a storage handler for Amazon S3. Amazon S3 offers a cost-effective solution for fault-tolerant, on-demand storage. MemQ uses the following prefix format on S3 to create a high-throughput, scalable storage layer:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"s3:\/\/\/\/\/topics\/"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"(a) = used for partitioning inside S3 to handle higher request rates if needed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"(b) = name of the MemQ cluster."}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"Availability and Fault Tolerance"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because S3 is a highly available, web-scale object store, MemQ relies on its availability as the first line of defense. To accommodate future S3 re-partitioning, MemQ adds a two-digit hex hash at the first prefix level, creating 256 base prefixes that can, in theory, be handled by independent S3 partitions. This is purely to make the layout future-proof."}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"Consistency"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The consistency of MemQ is dictated by the consistency of the underlying storage layer. In the case of S3, every write (PUT) to S3 Standard is guaranteed to be replicated to at least three availability zones before it is acknowledged (Availability 
Zones, AZs)."}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"2. Notification Queue"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MemQ uses a notification system to deliver pointers to the data's location to consumers. Currently we use an external notification queue in the form of Kafka. Once data is written to the storage layer, the storage handler generates a notification message that records the attributes of the write, including its location, size, and topic. Consumers use this information to retrieve the data (a Batch) from the storage layer. MemQ Brokers can also proxy Batches for consumers, at the expense of efficiency. The notification queue also provides clustering\/load balancing for consumers."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"MemQ Data Format"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/b4\/b4173c111ad909c40cc4d21f6108aaae.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MemQ uses a custom storage\/network transmission format for Messages and Batches."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The lowest unit of transmission in MemQ is called a LogMessage. It is similar to a Pulsar Message or a Kafka ProducerRecord."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The wrappers around LogMessage allow MemQ 
to batch at different levels. The hierarchy of units is:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Batch (unit of persistence)"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Message (unit of producer upload)"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"LogMessage (unit the application interacts with)"}]}]}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Producing Data"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/66\/66eb2a88da708dae02dd41ad10e15701.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The MemQ producer is responsible for sending data to Brokers. It uses an asynchronous dispatch model that allows non-blocking sends without waiting for acknowledgments."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This model is critical for hiding the upload latency of the underlying storage layer while still maintaining storage-level acknowledgments. Because existing PubSub protocols rely on synchronous acknowledgments, we could not reuse them and had to implement a custom MemQ protocol and client. MemQ supports three types of acks: ack=0 (producer fire and forget), ack=1 (Broker received), and ack=all (storage received). With ack=all, the replication factor (RF) is determined by the underlying storage layer (for example, RF=3 across three AZs on S3 Standard). If an acknowledgment fails, the MemQ 
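producer can trigger retries explicitly or implicitly."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Storing Data"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The MemQ TopicProcessor is conceptually a RingBuffer. This virtual ring is subdivided into Batches, which simplifies writes. As Messages arrive over the network, they are queued into the currently available Batch until the Batch fills up or a time-based trigger fires. Once a Batch is finalized, it is handed to the StorageHandler for upload to the storage layer (such as S3). If the upload succeeds, a notification is sent via the notification queue and, if producers requested acks, the AckHandler sends an acknowledgment for each Message in the Batch to its respective producer."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Consuming Data"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The MemQ consumer lets applications read data from MemQ. The consumer uses the Broker metadata APIs to discover the pointer to the notification queue. We expose a poll-based interface to applications: each poll request returns an iterator of LogMessages that reads through all the LogMessages in a Batch. These Batches are discovered through the notification queue and retrieved directly from the storage layer."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Other Features"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Data loss detection: "},{"type":"text","text":"Migrating workloads from Kafka to MemQ required strict validation against data loss. MemQ therefore has a built-in auditing system that efficiently tracks the end-to-end delivery of every Message and publishes metrics in near real time."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Batch and streaming unification: "},{"type":"text","text":"Because MemQ uses an external storage system, batch processing can run directly on raw MemQ data without converting it to another format. This lets users run ad hoc inspections on MemQ without much concern about seek performance, as long as the storage layer can scale reads and writes independently. MemQ 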
consumers can use the storage engine to fetch concurrently, enabling faster backfills for certain streaming cases."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Performance"}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"Latency"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/8e\/8e2a6d3391d9af6efe86b33fde6288f6.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MemQ supports both size-based and time-based flushes to the storage layer, which puts a hard limit on maximum tail latency, in addition to several optimizations that curb jitter. So far we have achieved a p99 E2E latency of 30 seconds with AWS S3 storage, and we are actively working to improve MemQ latency, which increases the number of use cases that can migrate from Kafka to MemQ."}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"Cost"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MemQ on S3 Standard has proven to be up to 90% cheaper (about 80% on average) than an equivalent Kafka deployment with three replicas across three AZs on i3 instances. These savings come from several factors, such as:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Reduced 
IOPS"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Removal of ordering constraints"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Decoupling of compute and storage"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Reduced replication cost from eliminating compute hardware"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Relaxed latency constraints"}]}]}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"Scalability"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MemQ with S3 scales on demand according to write and read throughput requirements. The MemQ Governor rebalances in real time to ensure sufficient write capacity as long as compute can be provisioned. Brokers scale linearly: add more Brokers and update the traffic capacity requirements. If consumers need additional parallelism to process the data, the read partitions are updated manually."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/b0\/b003cdba5ec5d1c5ba901ec78415ac0d.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At Pinterest, we run 
MemQ directly on EC2 and scale the clusters according to traffic and new use case requirements."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Future Work"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We are actively working on the following areas:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Reducing the E2E latency of MemQ (under 5 seconds) to support more use cases"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Enabling native integrations with streaming and batch systems"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Key ordering on read"}]}]}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Conclusion"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MemQ offers a flexible, low-cost, cloud-native approach to PubSub. Today MemQ collects and transports all machine learning training data at Pinterest. We are actively researching how to expand it to other datasets and further optimize latency. Beyond solving PubSub, MemQ storage also makes PubSub data available for batch processing without a significant performance impact, enabling low-latency batch processing."}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"Original article:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"https:\/\/medium.com\/pinterest-engineering\/memq-an-efficient-scalable-cloud-native-pubsub-system-4402695dd4e7"}]}]}
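The write path described under "Storing Data" (queue messages into the current Batch, flush when the Batch is full or a time trigger fires, upload, then notify) can be illustrated with a small sketch. This is only an illustration under stated assumptions, not MemQ's actual implementation: the class name `TopicProcessorSketch`, the `upload` and `notify` callables, and the `max_bytes` and `max_age_s` parameters are all hypothetical, and a real TopicProcessor would also handle acks, retries, and concurrency.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Batch:
    """One unit of persistence: messages buffered until a flush trigger fires."""
    opened_at: float
    messages: List[bytes] = field(default_factory=list)
    size_bytes: int = 0


class TopicProcessorSketch:
    """Hypothetical stand-in for a TopicProcessor; not the real MemQ class."""

    def __init__(self, max_bytes: int, max_age_s: float,
                 upload: Callable[[List[bytes]], str],
                 notify: Callable[[dict], None],
                 clock: Callable[[], float] = time.monotonic):
        self.max_bytes = max_bytes
        self.max_age_s = max_age_s
        self.upload = upload    # assumed stand-in for a storage handler
        self.notify = notify    # assumed stand-in for the notification queue
        self.clock = clock
        self.current: Optional[Batch] = None

    def write(self, message: bytes) -> None:
        """Append a message, then flush if the size or time trigger fired."""
        if self.current is None:
            self.current = Batch(opened_at=self.clock())
        self.current.messages.append(message)
        self.current.size_bytes += len(message)
        self.maybe_flush()

    def maybe_flush(self) -> bool:
        """Flush when the batch is full or older than max_age_s."""
        batch = self.current
        if batch is None:
            return False
        full = batch.size_bytes >= self.max_bytes
        aged = self.clock() - batch.opened_at >= self.max_age_s
        if not (full or aged):
            return False
        location = self.upload(batch.messages)
        # Notify only after the upload returned, so consumers never receive
        # a pointer to data that is not yet in storage.
        self.notify({"location": location,
                     "size": batch.size_bytes,
                     "count": len(batch.messages)})
        self.current = None
        return True
```

In this sketch a smaller `max_bytes` or `max_age_s` lowers end-to-end latency but produces more uploads, which mirrors the latency versus IOPS trade-off the article describes. The time trigger only fires when `maybe_flush` is called, so a caller would need to invoke it periodically.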