Making Pinterest's Search System Realtime: Challenges and Implementation Practices

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Manas, Pinterest's in-house search engine, is a general-purpose information retrieval platform. As discussed in our "},{"type":"link","attrs":{"href":"https:\/\/medium.com\/pinterest-engineering\/manas-a-high-performing-customized-search-system-cf189f6ca40f","title":null,"type":null},"content":[{"type":"text","text":"previous post"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", Manas was designed as a search framework that combines high performance, availability, and scalability. Today, Manas powers search for most Pinterest products, including Ads, Search, Homefeed, Related Pins, Visual, and Shopping."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"One of the key metrics of a search system is indexing latency: the time it takes to update the search index to reflect a change. As the system gains capabilities and new use cases keep arriving, the ability to index new documents instantly has become increasingly important. Manas already supported incremental indexing, which provides indexing latency on the order of tens of minutes. Unfortunately, that does not meet the growing business requirements from Ads and following feeds. We decided to build a new module in Manas to further reduce indexing latency to a fraction of a second."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In this post, we describe the architecture of this system and its main challenges, and go into detail on the trade-offs we made."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"This article was originally published by Michael Mi on medium.com and was translated and shared by InfoQ China with permission."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Challenges"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"New requirements bring new challenges. Here are the main ones we faced."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Indexing latency"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"For open-source projects such as Lucene and Vespa, the tiny batch approach (also known as near-realtime) is the most popular choice. With this approach, newly written documents become searchable only when an index commit is called. As a result, you have to trade off indexing latency against throughput. Unfortunately, this approach could not get our indexing latency down to the level of seconds."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Index refresh-ability"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"One drawback of realtime serving is the lack of agility in refreshing the index. With a batch pipeline, it is straightforward to rerun the indexing job and pick up all schema changes at once. With a realtime serving pipeline, however, supporting efficient index refresh becomes complicated."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Scaling out for constantly changing data"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To avoid over-provisioning, the system uses auto-scaling to adjust the number of replicas to the actual query load. If the index is immutable, bringing up a new replica is relatively easy: you simply copy the index to the new node. The difficulty lies in handling a constantly changing index: how do we make sure all replicas end up with the same index?"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Error recovery"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Manas is a data-intensive service in which each host may serve an index of up to several hundred GB. Manas is also a stateful system, so a bad binary can introduce data problems that even a rollback cannot fix. We need a system that supports both fault tolerance and error recovery, so that we can recover from binary bugs and data corruption."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"From static to realtime"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/33\/331877a31a66aee2639f8a14e2169617.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Let us briefly look at the differences between conventional static serving and realtime serving. As the figure above shows, the main work of realtime serving is to move the indexing pipeline from offline to online."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"t
ype":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"For static serving, indices are generated offline by a batch workflow and then copied to the Leaf servers for online serving. Because of the high framework overhead of a batch workflow, it is almost impossible to build a servable index within a fraction of a second. Instead of using an offline workflow, realtime serving processes all writes on the fly inside the service. In addition, the realtime indexing pipeline handles writes using the same index format as the static indexing pipeline, which lets us reuse the entire index-reading logic. With that in mind, let us continue with how realtime serving works."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Indexing interface"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Instead of using RPC directly, we use Kafka as our high-write-throughput stream. Leaf servers continuously pull mutations to build incremental indices. This decision turned out to simplify our system greatly in several ways:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Data replication and write failures are handled by Kafka."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"With its ability to rewind, the Kafka queue also serves as a "},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Write-ahead_logging","title":null,"type":null},"content":[{"type":"text","text":"WAL"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"o
rigin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"With a strict ordering guarantee within each partition, the system can blindly apply deletions without worrying about correctness."}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Architecture overview"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Since the serving logic can be reused thanks to the shared index format, we focus on the indexing data flow."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In essence, a realtime Manas leaf is an "},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Log-structured_merge-tree","title":null,"type":null},"content":[{"type":"text","text":"LSM"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" engine: it converts random IO writes into sequential IO and serves both read-amplification and write-amplification applications efficiently. As shown below, the whole indexing process consists of three key steps. Let us go through them one by one."}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/00\/002c204fecab1479c446ab302da5c6a1.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Realtime segment build"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In addition to the existing static segments, we introduce realtime segments. As shown above, there are two kinds of realtime segments in the system: active realtime segments and sealed realtime segments."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"
color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The active realtime segment is the only mutable component; it accumulates the mutations (adds\/deletes) pulled from Kafka. It is worth noting that once a document has been added to a realtime segment, it becomes searchable immediately after the document-level commit."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Once the active realtime segment reaches a configurable threshold, it is sealed, becomes immutable, and is put into a flush queue. Meanwhile, the system creates a new active realtime segment to keep accumulating mutations."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"If the service restarts, the realtime segments can be rebuilt by replaying messages from Kafka."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Index flush"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Index flush is the process of persisting the in-memory data of a realtime segment into a compact index file. A flush is triggered automatically when a realtime segment is sealed, and it can also be triggered manually with a debug command."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Index flush is a beneficial operation that guarantees data persistence, so we do not have to rebuild in-memory segments from scratch on restart. In addition, by producing a compact, immutable index, flushing reduces a segment's memory footprint and improves serving efficiency."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Index compaction"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Over time, the accumulation of many small segments hurts serving performance. To overcome this, we introduce a background compaction thread that merges small segments into larger ones. Because a delete operation only marks a document as deleted rather than physically removing it, the compaction thread also persists these deleted\/expired documents."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"After each flush and compaction operation, a new index manifest consisting of all static segments is generated. Kafka offsets, which serve as checkpoints, are also added to each manifest. Based on these checkpoints, the service knows where to resume consuming messages after a restart."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Detailed design"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In this section, we cover several key areas in more detail, starting with the most interesting part: the concurrency model."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Concurrency model"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"As mentioned earlier, the realtime segment is the only mutable component, and it has to handle reads and writes at the same time. Unfortunately, the near-realtime approach adopted by those open-source projects cannot meet our business requirements. We chose a different approach that lets us commit a document immediately after it is added to the index, without waiting for an index flush. For performance, we use lock-free techniques in data structures tailored to our usage. Let us dig into the details."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Realtime segment"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Each realtime segment consists of an inverted index and a forward index. The inverted index is logically a mapping from a term to a posting list, that is, a list of document IDs used for retrieval. The forward index stores an arbitrary binary blob used for full scoring and data fetching. We focus only on the realtime inverted index, which is more interesting and more challenging than the forward index."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"At a high level, the main difference between realtime and static segments is mutability. For the realtime inverted index, the mapping from term to posting list must be concurrent. Open-source libraries such as folly's concurrent hash map support this well. What we care about more is the internal representation of the posting list and how it can efficiently support our concurrency model."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Append-only vector"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In general, a single-writer\/multiple-reader model is more efficient and easier to reason about. We chose a data model similar to that of HDFS, with an append-only, lock-free data structure. Let us take a closer look at how readers and writers interact."}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/83\/834d25879480afec00d9c84c4fa45da1.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The writer appends a document ID to the vector and then commits the size to make it accessible to readers"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"A reader takes a snapshot (up to the committed size) before accessing the data"}
]}]}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/71\/719e7200e30112ed683d549257db1969.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To avoid memory-copy overhead as a posting list grows, we manage the data internally as a list of buckets. When we run out of capacity, we simply add a new bucket without touching the old ones. In addition, search engines commonly use skip lists to speed up skip operations. This layout makes it convenient to support a single-level skip list, which is sufficient for a realtime inverted index because it is usually small."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Document atomicity"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"With the append-only vector, we can achieve atomicity for a single posting list. However, a document can contain a list of terms, so we might end up returning an unexpected document whose index is only partially updated. To address this potential issue, we introduce a document-level commit to guarantee document atomicity. An additional filter in the serving pipeline ensures that only committed documents are returned."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Speaking of document atomicity, document updates are another case worth mentioning. We deliberately convert each document update into two operations: adding the new document and then deleting the old one from the index. Although each operation is atomic, the two together are not. We consider it acceptable to return either the old or the new version during a very short time window; even so, we added deduplication logic to the serving pipeline to filter out the old version when both are returned."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Write scaling"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"A natural question is: if your data structures support only a single-writer, multiple-reader concurrency model, what happens when a single thread cannot keep up with all the writes? Blindly adding more shards just to scale write throughput does not seem like a good idea. This is a valid concern, and our design already takes it into account."}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ad\/ad62823a172ba464de72edac9914386e.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"A single-writer, multiple-reader concurrency model for the data structures does not mean we cannot write with multiple threads. We plan to use a term-sharding strategy to support multi-threaded writes. As the figure above shows, for a given document with a list of terms, each term always maps to a fixed thread, so all the data structures tailored for a single writer and multiple readers can be reused directly without any restriction."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Index refresh"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Index refresh is a key capability for our products: it enables fast turnaround and improves development velocity. In general, there are two ways to refresh an index efficiently: backfilling on the fly and reinstating from an offline-built index."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Backfilling the index"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We provide the ability to backfill documents at a reasonable throughput. To avoid hurting freshness in production, backfill traffic needs a separate, lower-priority stream. As a result, two versions of a document may exist across the two streams, and the older version could overwrite the newer one. To overcome this, we need to introduce a versioning mechanism and a conflict resolver in the realtime indexing pipeline to decide which version is fresher."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Reinstating from an offline-built index"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Sometimes backfilling the full dataset at a given rate is too time-consuming. Another, faster way to refresh the index that we support is to build an index offline and then reinstate from it, using a synchronization mechanism between the offline-built index and the Kafka stream."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Failover and auto-scaling"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We sometimes need to bring up new instances for various reasons, such as failover and auto-scaling. For static serving, it is easy to start a new instance with an immutable index downloaded from the index store. For realtime serving, with its constantly changing index, this becomes complicated. How do we make sure a new instance eventually has the same copy of the index as the others?"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/37\/37059202d1be51ccb420ba3626fcb746.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We decided to use leader-based replication, as shown in the figure above."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Our process looks like this:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The leader periodically takes a new snapshot and uploads it to the durable index store"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"By default, a new instance downloads the latest snapshot from the index store"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The new instance resumes consuming messages from Kafka based on the checkpoint in the snapshot index"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Once the new instance has caught up, it starts serving traffic"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"A few key points of this design are worth mentioning:"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Leader election"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The leader's only responsibility is to take snapshots and upload the index periodically. This means we can tolerate having no leader, or multiple leaders, for a short period of time (up to a few hours). We therefore have some flexibility in choosing a leader election algorithm. For simplicity, we chose to elect a leader statically with a cluster maintenance job, which periodically checks whether we have a healthy leader."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Snapshot upload"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Typically, a new instance would simply connect to the leader to download the latest snapshot. With that approach, snapshot downloads from new instances could overload the leader and cause cascading failures. Instead, we chose to upload snapshots periodically to the index store, trading storage space and freshness for stability. The uploaded snapshots are also useful for error recovery, which is covered shortly."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Error recovery"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"As mentioned above, error recovery is another challenge for a realtime serving system. There are a few specific scenarios involving data corruption that we need to handle."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Corrupted input data"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We use Kafka as our input write stream. Unfortunately, its messages are immutable: a producer can only append messages and cannot change the content of existing ones. This means that once corrupted data gets into Kafka messages, it is permanent. Thanks to the uploaded snapshots, we can rewind the index to a point before the corruption, skip the corrupted messages, and then consume new messages with the fix in place."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Data corruption caused by binary bugs"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text"
,"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Although we have a mature index validation pipeline for static clusters, which ensures that neither a new index nor a new binary has problems before it is swapped in, some bugs still sneak into production. Fortunately, we can fix this by rolling back the binary or the index. For realtime serving it is more troublesome, because rolling back the binary does not roll back errors already in the index. With the snapshot upload mechanism, we can roll back the binary together with a rewound index and then replay messages from Kafka to fix the errors in the index."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"What's next"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"As more and more use cases are onboarded to Manas, we need to keep improving the system's efficiency, scalability, and capability. Some interesting items on our roadmap are:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Co-hosting static and realtime clusters to simplify our serving stack"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Optimizing the system to support large datasets"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Building generic embedding-based retrieval to support advanced scenarios"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Acknowledgements: This post summarizes several quarters of work involving multiple teams. Thanks to Tim Koh, Haibin Xie, George Wu, Sheng Chen, Jiacheng Hong, and Zheng Liu for their countless contributions. Thanks to Mukund Narasimhan, Angela Sheu, Ang Xu, Chengcheng Hu, and Dumitru Daniliuc for many meaningful discussions and feedback. Thanks to Roger Wang and Randall Keller for their great leadership."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Original article: "},{"type":"link","attrs":{"href":"https:\/\/medium.com\/pinterest-engineering\/manas-realtime-enabling-changes-to-be-searchable-in-a-blink-of-an-eye-36acc3506843","title":null,"type":null},"content":[{"type":"text","text":"Manas Realtime — Enabling changes to be searchable in a blink of an eye"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]}]}]}