An Industry First: Deploying Resources at the 100,000 Scale in 10 Minutes and Rebuilding the Weibo Backend Off-Site in 1 Hour

{"type":"doc","content":[{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Server room power outages and data center fires have made prolonged site-wide unavailability under extreme conditions a real problem that many companies have to face. Weibo's goal is that, when online data is completely destroyed in an extreme event, the complete Weibo service can be rebuilt at a remote site within 1 hour with data integrity preserved. This is an unprecedented challenge for the whole industry."}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Data is critical in the big data era"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In the data era, 2.3 EB of new data is generated worldwide every day, and the stock of existing data has reached "},{"type":"link","attrs":{"href":"https:\/\/www.datanami.com\/2020\/09\/04\/10-big-data-statistics-that-will-blow-your-mind\/","title":null,"type":null},"content":[{"type":"text","text":"33 ZB"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":". Traditional enterprises and new unicorns alike use big data to support faster and better decisions, incubate new products and services from data, and reduce costs. It is fair to say that "},{"type":"link","attrs":{"href":"https:\/\/www.seagate.com\/files\/www-content\/our-story\/trends\/files\/data-age-2025-white-paper-simplified-chinese.pdf","title":null,"type":null},"content":[{"type":"text","text":"data is productivity"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":". Data loss can be devastating for a company: according to IDC statistics, as many as "},{"type":"link","attrs":{"href":"http:\/\/www.zaibei.net\/ziliao\/0F350212016.html","title":null,"type":null},"content":[{"type":"text","text":"84%"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" of companies that suffered severe data loss exited the market within 2 to 3 years, and this share will rise as companies depend more heavily on data. In the digital age, no organization can walk away unscathed from a data disaster it cannot recover from quickly."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang
.org\/infoq\/57\/57f341b7f929af9fdc3ad1d1ef14f132.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/medium.com\/@syedjunaid.h47\/what-is-big-data-why-is-big-data-important-in-todays-era-8dbc9314fb0a","title":null,"type":null},"content":[{"type":"text","text":"Image source: Medium"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Black swan events have kept occurring over the past few years. Incidents such as "},{"type":"link","attrs":{"href":"https:\/\/www.sohu.com\/a\/150656574_813379","title":null,"type":null},"content":[{"type":"text","text":"large-scale server room power outages"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" and the loss of entire availability zones happen from time to time. On March 10, 2021, Europe's largest public cloud provider, "},{"type":"link","attrs":{"href":"https:\/\/www.ovh.com\/","title":null,"type":null},"content":[{"type":"text","text":"OVH 
Cloud"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" lost an entire IDC to a fire, leaving "},{"type":"link","attrs":{"href":"https:\/\/www.reuters.com\/article\/us-france-ovh-fire-idUSKBN2B20NU","title":null,"type":null},"content":[{"type":"text","text":"millions of websites unavailable"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":". During the 2015 Tianjin explosions, Tencent's largest IDC in Asia was "},{"type":"link","attrs":{"href":"https:\/\/www.sohu.com\/a\/78018744_259978","title":null,"type":null},"content":[{"type":"text","text":"only 1.5 km away from destruction"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", to say nothing of data loss caused by human operational errors."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Prolonged site-wide unavailability under extreme conditions has become a real problem. In data disaster recovery, the only certainty is that data can be lost, so major companies are all building their own disaster recovery systems."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/9a\/9ae513ec5edc4ad180a20c08f4d5e457.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"BACKUP "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"is the golden rule of data disaster recovery: all data is protected by raising redundancy through backups. The importance of the data and the required recovery time determine the backup strategy. Low-priority data such as logs usually gets a single-machine offline cold backup, important data gets multi-replica hot backups, and core data that the company's survival depends on usually follows the 3-2-1 backup strategy:"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"At least 3 copies"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"2 different storage media"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"1 copy offsite"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In 2012, the United States Computer Emergency Readiness Team (US-CERT) "},{"type":"link","attrs":{"href":"https:\/\/us-cert.cisa.gov\/sites\/default\/files\/publications\/data_backup_options.pdf","title":null,"type":null},"content":[{"type":"text","text":"recommended the 3-2-1 backup strategy"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", specifically noting the importance of offsite backups for recovering from natural disasters or severe failures. Strategies such as same-city active-active, cross-region active-active, and combined hot and cold backups are all implementations or variants of the 3-2-1 rule. However, active-active strategies are mostly onsite hot-standby designs, and the industry lacks productized solutions. Weibo has implemented same-city and cross-region active-active for some core services, at a huge cost in staff and resources, so for site-wide disaster recovery Weibo chose rapid off-site rebuilding instead."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Weibo's 1-hour off-site rebuild plan for data disaster recovery"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Because Weibo is a social media business, its tolerance for unavailability caused by data loss is far lower than that of a typical company. For a company this dependent on social media, several days of unavailability is basically equivalent to shutting the site down. For Weibo, therefore, backup is only the means, and fast recovery is the actual requirement. When online data is completely destroyed in an extreme event, we need, within "},{"type":"text","marks":[{"type":"underline"},{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"1 hour"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", to rebuild the complete Weibo service at a remote site while preserving data integrity. This is an unprecedented challenge for the whole industry. To meet it, Weibo built a data recovery center that provides backup and high-speed recovery for all of the site's data."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"As of October 2020, Weibo had "},{"type":"link","attrs":{"href":"https:\/\/finance.sina.cn\/chanjing\/gdxw\/2020-10-20\/detail-iiznezxr7023868.d.html?from=wap","title":null,"type":null},"content":[{"type":"text","text":"523 million"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" monthly active users. The data generated each day has reached the PB level, and the stored data of core online services is at the 100 PB level. This data is spread across hundreds of independent Weibo services and, to serve different display scenarios, exists in many forms, including images, videos, container image files, MySQL, Redis, and raw binary files."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Given Weibo's complex business scenarios and very large data volume, the data recovery center has to balance availability, cost, security, and efficiency. Every design decision centers on recovery time: every step in the data and recovery pipeline is standardized and automated."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1 Standardized data tiering"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"By the 80\/20 principle, not all data is equally important. Restoring 100 PB of data indiscriminately would be unaffordable in both cost and efficiency. Weibo splits data priority along two dimensions, vertical and horizontal:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Vertical: "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Services are divided into core and non-core by business importance. Deciding what counts as core is relatively complex and usually needs sign-off from company-level leadership, but the set of core services changes very rarely. Of the thousand-plus APIs Weibo exposes externally, only a dozen or so are core."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Horizontal: "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Data access is time-sensitive, and even more so on Weibo: data from the last 7 days accounts for more than 98% of accesses. Some data older than 1 year sees throughput in the single digits or even zero. Throughput is therefore used directly as a heat value for data access, and data is tiered a second time by that heat value."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"After vertical and horizontal tiering, the volume of core hot data drops by two orders of magnitude to the PB level, which makes rebuilding the data within 1 hour feasible."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2 
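Automated construction of the data, resource, and service topology"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"With the tiering standard in place, the next step is to determine which resources to back up. Data is produced by services, and every piece of data can be attributed to a specific service, so backing up data becomes backing up services. By tracing dependencies along traffic paths, the service nodes on each path can be discovered and the service network topology graph built. One core rule governs service dependencies in the topology: core services must not depend on non-core services. This keeps the fan-out of the core topology under control so that it does not affect the scope and availability of data backups."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Weibo services register automatically once "},{"type":"link","attrs":{"href":"https:\/\/github.com\/weibo-mesh","title":null,"type":null},"content":[{"type":"text","text":"weibo-mesh"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" has started, which builds the service dependency topology. Open-source or third-party resource services, including data stores such as MySQL and Redis, are discovered by a resource mesh agent and automatically registered into the service pool they belong to."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3 Standardized and automated data backup"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Different data types are backed up in different ways. They include the layer 4 and layer 7 configuration at the traffic entry point, the binary versions (image URLs) of RPC services, and business data in MySQL, Redis, and similar stores. Every service that connects to the data recovery center must provide two APIs, snapshot and streaming. Snapshot generates a snapshot of the data, and streaming provides all operations produced since the last snapshot. Once a service provides these APIs, the data recovery center can back up its data automatically, which decouples backup from business logic. The data recovery center currently ships snapshot and streaming APIs for common data types, so a service is included in backups as soon as it goes live."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"4 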
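Backup as a service: Weibo's 3-2-1 backup mechanism"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The two most basic requirements for data backup are consistency and integrity. The consistency of a single file is verified dynamically in storage using data digests, and error-correcting codes handle silent storage errors such as bit flips."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"For integrity, Weibo combines large-block storage with a Merkle Hash Tree. All data files are split or merged into 1 GB blocks (1 GB is a best-practice value: smaller blocks make network transfer inefficient, while larger blocks take too long to transfer individually, which hurts concurrency). The metadata of a complete backup is a four-tuple. The backup service provides APIs for backing up the entire site or only a specified service and data type. The backup API is independent of both the business and the data type."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Backups use snapshot for full backups and streaming for dynamic incremental backups. We watch for changes, and when the accumulated data reaches a certain size or a timeout expires, the incremental data is written out as a checkpoint. Snapshots are usually taken daily and checkpoints hourly."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Among the resources Weibo uses, one class resembles Redis: read and write volume is very high, so incremental data produces a very large operation 
streaming volume, but a single instance is usually kept at around 10 GB. For this class, snapshots can be taken more often, which reduces the data between two checkpoints and speeds up recovery. The other class is MySQL: a single instance is very large, usually at the TB level, with a lower write volume. Here snapshots can be taken less often and checkpoints more often, to balance storage cost against recovery efficiency."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Through the checkpoint mechanism, the backup service turns data backup into the storage and backup of data blocks."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Weibo's backups follow the 3-2-1 strategy: 1) every backup includes at least 2 hot copies and 2 cold copies; 2) online hot copies are stored on SSDs, and cold copies are stored in a separate OSS cluster; 3) data is also stored offline at a remote site, synchronized and transferred over dedicated lines."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The backup service stores data in object storage (OSS), which is widely used in cloud-native environments. Logically, the recovery center consists of a management side and a storage side, which are independent of each other. The storage side supports multiple physical backends: self-built OSS clusters in each data center as well as OSS services from cloud vendors, all scheduled by the management side. The management side itself also supports cross-region active-active deployment. The self-built storage comprises 8 clusters, each with 7 storage hosts of 20 TB each, giving 140 TB per cluster and, across the 8 clusters, 1120 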
TB. The clusters have 100 Gbps of external network bandwidth. The self-built storage cluster is shown in the diagram below and can support PB-scale data backup and fast reads."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/5f\/5f799ce0a8b7b77940a777fadd103547.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"5 Recovery as a service: off-site rebuild in 1 hour"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"A site-wide off-site rebuild is mainly a response to extreme disasters in which almost all onsite data is unavailable, and all recovery and rebuilding work is organized around the 1-hour target. Every step of rebuilding the service must be automated. The overall rebuild consists of the following stages:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Minute 1: system "},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Bootstrap","title":null,"type":null},"content":[{"type":"text","text":"bootstrap"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Minute 13: infrastructure deployment and preparation"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Minute 53: service startup and data distribution"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Minute 58: service self-check, automatic registration, and load balancer changes"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Minute 60: traffic migration complete"}]}]}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Bootstrapping the recovery service"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"When the event-listener deployed offsite receives a recovery event, it first starts a self-check of the backup data."}]},{"type":"paragraph","attrs":{"indent":
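0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"This mainly includes:"}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Load the backup metadata for the corresponding version. The metadata holds overall statistics about the backup, including service versions, instance counts, specifications, and service dependencies, and is used during recovery to build the topology and speed up distribution."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Restore DCP, the hybrid cloud platform service. DCP is the IaaS layer of the hybrid cloud, and all physical machine resources are subsequently created through it. To reduce DCP's dependencies during recovery and speed it up, all binary packages, storage, and other components involved in deploying DCP support single-machine deployment. All instances are deployed on one high-spec machine (256C\/2TB of memory), so DCP can be restored in under 1 minute."}]}]}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Automated infrastructure deployment"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Once started, DCP acts as the hybrid cloud platform that provisions IaaS machines. It relies on the cloud vendor's ability to scale out 2,000 Shenlong bare-metal servers in 5 minutes, the equivalent of 80,000 ECS servers. DCP reads the specifications and counts of IaaS machines from the backup metadata, quickly provisions the specified number of machines, and initializes components such as Docker and kubelet (initialization takes about 2 minutes)."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Service startup and data distribution"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Full service startup and data distribution consists of several phases: parsing the service topology, pulling and starting service images, and distributing data."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content"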
:[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Service dependency tree parsing:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The offsite parsing module parses the metadata of the services to be recovered and resolves their dependencies into service dependency trees. The trees are recovered in parallel at full speed. Services and resources are deployed close together, or even on the same machine, according to their distance in the stored topology graph, which maximizes bandwidth throughput. When disks are mounted, each service gets its own disk, which raises overall sequential-write disk I\/O bandwidth."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Service recovery flow"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Weibo has built a hybrid cloud container scheduling platform based on technologies such as Kubernetes and Docker, which offers serverless services to business teams. The platform can quickly provide the runtime environment each recovering service needs. To handle the wide variety of requirements for CPU, memory, disk, bandwidth, and so on, we abstract them into specifications, match and lock IaaS-layer nodes by specification, then pull images and start container services on the locked nodes."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Business data distribution"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"During recovery, once containers have started, data that must be distributed many times, such as a service's application executables and images, is distributed over P2P. Distributing from many points in parallel greatly improves efficiency and reduces the load on the storage servers."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Service self-check, automatic registration, and load balancer changes"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"All services and resources managed by weibo-mesh implement a standard set of callback APIs: service registration, health check, warm-up, and heartbeat keep-alive. After startup, a service first runs its health check. Once that passes, the warm-up interface is called to avoid the flood of timeouts a cold start would cause. After the cold start completes, the service instance automatically registers with the distributed configuration center (vintage, a service built in-house at Weibo) and starts heartbeat reporting, which completes registration."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Layer 4 and layer 7 components start up the same way as ordinary services. Taking nginx as an example: after nginx is deployed and started, "},{"type":"link","attrs":{"href":"https:\/\/github.com\/weibocom\/nginx-upsync-module","title":null,"type":null},"content":[{"type":"text","text":"nginx-upsync-module"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" automatically synchronizes the relevant configuration from the distributed configuration center vintage and reloads dynamically, which updates the upstreams and registers the backends."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Traffic migration"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The data recovery center checks the distributed configuration center vintage against the service topology. Once all services have started, it changes the egress DNS to migrate traffic, which completes the off-site rebuild."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Afterword"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The Weibo data recovery center involves close interaction among more than ten independent systems, including:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"1) DCP, the IaaS-layer infrastructure manager, smooths over the differences between heterogeneous infrastructure such as public clouds and self-built IDCs. It delivers 2,000 machines in 5 minutes to the layers above and, building on Shenlong bare metal, will be able to deliver the equivalent of 80,000 machines in 5 minutes;"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"2) All physical machines are treated as one huge pool of CPU, memory, and storage. KRS (an in-house container orchestration and scheduling system based on K8S) schedules services across the whole network and makes effective use of capabilities such as the 100 Gb-class bandwidth of very large machines (currently mainly the 256C\/2TB spec), 
distributing services and data across the whole network at high speed;"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"3) Redis, MySQL, RPC, code, binaries, layer 4 and layer 7 components, and everything else are all treated as resources under a RaaS (Resource as a Service) abstraction. Each resource type implements standard interfaces such as backup, prestop, poststart, and register, so it can connect to the data recovery center automatically and the whole process needs no manual involvement;"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"4) After the system completes its cold start, Weibo's hot-event linkage system automatically scales out quickly in response to traffic changes, completing the rebuild of the business."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Any step in the off-site rebuild can become a bottleneck, including backup reliability, automated deployment and fast provisioning of infrastructure, and fast P2P-based data distribution. Space does not allow a full description here. Because of cost, Weibo has not yet validated off-site rebuild and recovery at full 1:1 site-wide scale, and its infrastructure work over the coming period will focus on this."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"About the authors"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Authors"},{"type":"text","text":" "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":": 姚四芳, 胡云鵬, 臣勇, and 胡春林 of the Infrastructure Department, Weibo R&D Center"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"姚四芳"},{"type":"text","text":" 
"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":": Weibo technical expert and head of infrastructure strategy. "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#262525","name":"user"}}],"text":"Joined Weibo in 2012. Took part in and led several of Weibo's architecture transitions, and designed and supported infrastructure services for hundreds of millions of daily active users, including extreme traffic peaks such as the Spring Festival Gala. Main technical areas are distributed storage and the optimization of highly available services across regions and multiple IDCs. Recently focused on the governance and optimization of large-scale distributed clusters."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"臣勇: "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Senior architecture development engineer at Weibo, joined in 2018. Main technical areas are distributed caching and KV storage services. Currently responsible mainly for the data backup and recovery service."}]}]}
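Section 4 of the article says that backup integrity is handled by splitting data into 1 GB blocks and combining them with a Merkle Hash Tree. The following is a minimal sketch of that idea, not Weibo's implementation. It assumes SHA-256 as the digest and duplicates the last node on odd-sized levels, neither of which the article specifies. The names `block_digests`, `merkle_root`, and `find_corrupt_blocks` are hypothetical.

```python
import hashlib

# Assumption: 1 GiB blocks, following the block size the article describes.
BLOCK_SIZE = 1 << 30


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split the payload into fixed-size blocks and hash each one (the leaves)."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    # An empty payload still gets one leaf so that a root always exists.
    return [hashlib.sha256(block).digest() for block in (blocks or [b""])]


def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold the leaf digests pairwise until a single root digest remains."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            # Assumption: odd levels are padded by repeating the last digest.
            level.append(level[-1])
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def find_corrupt_blocks(expected: list[bytes], actual: list[bytes]) -> list[int]:
    """Return the indexes of blocks whose digest differs from the recorded one."""
    return [i for i, (e, a) in enumerate(zip(expected, actual)) if e != a]
```

In a restore flow built on this sketch, the root recorded at backup time would be compared with the root recomputed from the restored blocks. Only when the roots differ would the per-block digests be compared, so that just the damaged blocks are fetched again.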