How a Change to 300 Documents Took Down a Core Cluster Holding Billions of Records

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"One of our production MongoDB clusters holds a fairly modest amount of data, with its largest collections at the billion-document scale, but it is a core cluster whose availability directly affects company revenue. This article walks through an outage on that cluster from start to finish. It is a classic pitfall of MongoDB sharded clusters, and the contributing factors included inadequate change notification, a flawed deployment architecture, and a change plan that was not thought through carefully enough."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"1. Background"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The cluster in question is a legacy core MongoDB cluster that predates my joining the company. While auditing all of our MongoDB clusters for risks, I found that this one carried some potential for latency jitter. Its architecture and its traffic and latency curves are shown below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/69\/697ca4ab5e24aada90921b48beb48a7f.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/6a\/6ad8df3e24bbb79e98783c912e967d31.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/d3\/d3d4b0c33cde69565f81924eaf687130.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As shown above, the sharded cluster consists of 3 shards. Read and write traffic is fairly low, peaking at roughly 40,000 to 60,000 QPS with an average latency of 1 ms, and each shard is a MongoDB replica set for high availability. The audit turned up the following issues:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","conten
t":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The cluster contains only two user databases, userbucket and feeds_content (written as feeds_xxxxxxx below), and across both databases only feeds_xxxxxxx.collection1 has sharding enabled. The first database, userbucket, stores the application's routing information; the second, feeds_xxxxxxx, stores data at the billion-document scale."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The workload is read-heavy, the reads all target the feeds_xxxxxxx database, and the clients split reads from writes, so nearly all read traffic lands on shard 1. Shards 2 and 3 hold only a small amount of data."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The databases and collections are summarized in the table below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"

| Database | Collection | Description |
| --- | --- | --- |
| userbucket | whitexxx\/expxxx | User routing table, about 300 documents. Clients must fetch their routing data from this table before accessing the feeds_xxxxxxx database |
| feeds_xxxxxxx | feeds_xxx_pool | Billions of documents, sharding not enabled |
|  | hardware_xxx_cost | Small amount of data, sharding not enabled |
|  | news_xxx_profile | Hundreds of millions of documents, sharding not enabled |
|  | resource_xxx_info | Tens of millions of documents, sharding enabled |
|  | resource_xxx_info | Hundreds of millions of documents, sharding not enabled |
|  | resource_xxx_info | Hundreds of millions of documents, sharding not enabled |
|  | ...... | ...... |
"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The description above can be summarized in the following diagram:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/73\/734ccf1abcc9256928a1160f0379bbdd.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the diagram shows, shards 2 and 3 contribute almost nothing. Two of the nodes in shard 3 also run on low-IO SATA disks, which could affect reads and writes on the userbucket database, so I decided to use removeShard to drop shards 3 and 2 from the cluster outright."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"2. The Procedure"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Shard 3 runs on low-IO servers and is a potential source of cluster jitter, and shards 2 and 3 are both essentially wasted. I therefore planned to remove shards 3 and 2 with the removeShard command and free up the idle servers, as shown below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/e6\/e69ccb927f3b3dc0bf65e12c944706ad.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Step 1: Log in to any one of the proxies, say mongos1."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Step 2: Shard 3 (that is, shard_8D5370B4) is the primary shard of the userbucket database, so the removal was rejected with the message \"you need to drop or movePrimary these databases\", meaning the database's primary shard has to be moved to another shard first."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Step 3: Use the movePrimary command to move the primary shard of the userbucket database from shard 3 to shard 1."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Step 4: Log in to the other two proxies on the monitoring list, mongos2 and mongos3, and force a routing refresh with db.adminCommand({\"flushRouterConfig\":1})."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Note: "},{"type":"text","text":"During movePrimary, the other proxies are not made aware that the database's primary shard has changed, so their routing information has to be refreshed by force, or the mongos processes on the other nodes restarted. See the reference below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/2d\/2de6e304e2317d821bee2230637c1c5c.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"3. Users Report That Most Requests Are Failing"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After the change to the userbucket database and its 300 documents, I was calmly working on performance tuning for another cluster when users suddenly reported that some requests to this core cluster were failing. Note that the failures affected access to the billion-document collections as a whole, not just the 300 documents that had been changed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The call came out of the blue. After going through the details with the business team, we were fairly sure the 300-document change was the cause: when the application fetched those 300 documents, some requests succeeded and others failed, which pointed squarely at movePrimary."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"
type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So in addition to forcing a routing refresh with flushRouterConfig on every proxy in the monitoring list, I restarted all of those proxies as well, yet the business team reported that some requests still returned no data. This was puzzling, because when I queried the 300 documents in userbucket through each of those mongos proxies myself, every one of them returned the data."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"I began to suspect that some mongos proxies had not had their routing refreshed, so I logged in to one of the mongos proxies and queried the config.mongos collection. The result was as follows:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/9b\/9b5e7694eefbf8f02d9893babd626aaa.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The config.mongos collection records every proxy in the cluster, along with the time of each proxy's most recent ping to the cluster. It clearly listed far more proxies than the cluster monitoring list did."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the end, once I had forced a routing refresh with flushRouterConfig on every proxy that config.mongos showed as currently online, the service recovered."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"4. Summary"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The analysis above shows what went wrong: the cluster monitoring set up early on had left out some proxies, so those proxies kept the routing information for userbucket from before the movePrimary. They pointed at the wrong shard and could not route requests to the data, as illustrated below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/d6\/d697282520bc3231a84c6ac9e619eed5.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertyp
e","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Why did some reads of the userbucket collections succeed while others failed?"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because some proxies did not have their routing information forcibly refreshed after the movePrimary, requests passing through them were routed to the wrong shard."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Why did stale routing for just 300 documents make part of the traffic to the entire billion-document cluster fail?"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This comes down to how the application is built. Before reading any of the billion-scale data, the application must first fetch its own routing information, and that routing information happens to live in the userbucket collections. If the userbucket lookup fails, the application cannot determine which data in feeds_xxxxxxx to read."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Why were some proxies missed when restarting or forcing the routing refresh?"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For historical reasons, some proxies were configured in the application code but missing from the server-side cluster monitoring metadata, so they were never monitored. It is also possible that mongos proxies were added during a scale-out without being registered in the cluster monitoring list."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":
null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"What is the safest way to perform movePrimary?"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The official guidance is to force a routing refresh or restart mongos after movePrimary succeeds. However, there is a window between movePrimary succeeding and the mongos restart, and if the application reads or writes collections in the migrated database during that window, requests can still be misrouted."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The safest way to run movePrimary is therefore one of the following two approaches:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Approach 1: "},{"type":"text","text":"Shut down every proxy except one, run movePrimary through the remaining proxy, and restart the other mongos proxies only after it succeeds. Be sure not to miss any proxy, or you will hit the same pitfall described in this article; check the config.mongos collection beforehand."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Approach 2: "},{"type":"text","text":"If a database's primary shard is the shard you need to take out with removeShard, enable sharding on that database's collections. Sharded collections carry chunk metadata, and removeShard automatically migrates the chunks on the departing shard to the other shards. Throughout the process every proxy obtains the latest and complete routing information, because all proxies keep their routing current through the chunk version mechanism."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"About the Author"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Yang Yazhou (楊亞洲), "},{"type":"text","text":"formerly an expert engineer at DiDi, now head of MongoDB document databases at OPPO, responsible for kernel development, performance optimization, and operations for MongoDB deployments holding trillions of documents, with a long-standing focus on distributed caching, high-performance servers, databases, and middleware. More will follow in the series \"MongoDB Kernel Source Design, Performance Optimization, and Operations Best Practices\". GitHub: https:\/\/github.com\/y123456yz"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This article is reprinted from the dbaplus community (ID: dbaplus)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Original link: "},{"type":"link","attrs":{"href":"https:\/\/mp.weixin.qq.com\/s\/aPrheGTZHRGXRVIOi-n7bg","title":"xxx","type":null},"content":[{"type":"text","text":"How a Change to 300 Documents Took Down a Core Cluster Holding Billions of Records"}]}]}]}
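
The command sequence from section 2 can be sketched as plain command documents. This is a minimal illustration, not the author's tooling: movePrimary, flushRouterConfig, and removeShard are real MongoDB admin commands, while the helper `buildPrimaryMovePlan`, the `runOn` labels, and the shard name `shard_1` are assumptions made for the example (the article does not give shard 1's real name). In practice removeShard usually has to be issued again until it reports that draining has completed.

```javascript
// Hypothetical helper: lists the admin commands for moving a database's
// primary shard before the old shard is removed. It only builds the command
// documents; sending them (db.adminCommand(...) in the mongo shell) is left out.
function buildPrimaryMovePlan(dbName, fromShard, toShard) {
  return [
    // Run through exactly one mongos.
    { runOn: "one mongos", command: { movePrimary: dbName, to: toShard } },
    // Run on EVERY other mongos, including any that monitoring does not list.
    { runOn: "every other mongos", command: { flushRouterConfig: 1 } },
    // Only after routing is refreshed everywhere: retire the old shard.
    { runOn: "one mongos", command: { removeShard: fromShard } },
  ];
}

// "shard_1" is a placeholder name; shard_8D5370B4 is shard 3 in the article.
const plan = buildPrimaryMovePlan("userbucket", "shard_8D5370B4", "shard_1");
for (const step of plan) {
  console.log(step.runOn + ": " + JSON.stringify(step.command));
}
```

Keeping the plan as data makes it easy to review the order of operations before anything touches the cluster; the step that matters most in this incident is the second one, which must reach every mongos.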

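
The check that finally resolved the incident, comparing config.mongos against the monitoring list, can be sketched as a small pure function. The `_id` (host:port) and `ping` fields are what config.mongos documents carry; the function name `findUnmonitoredMongos`, the sample addresses, and the 60-second freshness threshold are assumptions chosen for illustration only.

```javascript
// Sketch: find mongos proxies that config.mongos reports as recently alive
// but that are absent from the monitoring list.
// mongosDocs mimics documents from config.mongos: { _id: "host:port", ping: Date }.
function findUnmonitoredMongos(mongosDocs, monitored, now, maxAgeMs) {
  const known = new Set(monitored);
  return mongosDocs
    .filter((doc) => now.getTime() - doc.ping.getTime() <= maxAgeMs) // still pinging
    .filter((doc) => !known.has(doc._id)) // not in the monitoring list
    .map((doc) => doc._id)
    .sort();
}

// Made-up sample data; on a live cluster the documents would come from
// db.getSiblingDB("config").mongos.find().toArray() run through any mongos.
const now = new Date("2021-01-01T00:10:00Z");
const docs = [
  { _id: "10.0.0.1:27017", ping: new Date("2021-01-01T00:09:55Z") },
  { _id: "10.0.0.2:27017", ping: new Date("2021-01-01T00:09:50Z") },
  { _id: "10.0.0.9:27017", ping: new Date("2021-01-01T00:09:58Z") },
  { _id: "10.0.0.8:27017", ping: new Date("2020-12-01T00:00:00Z") }, // decommissioned long ago
];
const missing = findUnmonitoredMongos(docs, ["10.0.0.1:27017", "10.0.0.2:27017"], now, 60 * 1000);
console.log(missing); // [ '10.0.0.9:27017' ]
```

Running a comparison like this before a movePrimary would surface proxies that also need flushRouterConfig or a restart; stale entries with an old ping time are filtered out so that long-gone proxies do not raise false alarms.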