Vaccinating Technical Systems: iQIYI's Exploration and Practice with an Attack-Defense Drill Platform
{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在程序員的江湖裏,流傳着一些經典的老梗:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"編程第一法則:","attrs":{}},{"type":"text","text":"如果代碼莫名運行成功了,那就別動了~","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"架構第一法則:","attrs":{}},{"type":"text","text":"穩定運行多年的老系統,千萬不要碰~","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/ce/ce3a39650bb46d34c64e37d2804d6531.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":10}}],"text":"圖片來自網絡","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"初入行的程序員們接受前輩的洗禮,將如上的法則深深印入腦海,並廣泛應用於日後的職業生涯中。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"對於正在運行中的線上生產環境,不僅不敢亂碰,恨不得燒香供奉起來。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"然而,","attrs":{}},{"type":"text","marks
":[{"type":"strong","attrs":{}}],"text":"技術系統的脆弱性來源是多方面的","attrs":{}},{"type":"text","text":",不僅包含","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"硬件的故障","attrs":{}},{"type":"text","text":",","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"代碼的Bug","attrs":{}},{"type":"text","text":",還有","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"架構和邏輯的漏洞","attrs":{}},{"type":"text","text":",","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"線上流量的各種突發性與不確定性","attrs":{}},{"type":"text","text":"等等。隨着各個技術系統向雲原生架構的遷移,","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"架構中引入的依賴和不確定性也越來越多。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如何在日益複雜的技術架構下,實現系統的反脆弱,成爲了一個程序員向架構師提升之路上的","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"必修課。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2010年,NetFlix的工程師提出了","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"混沌工程","attrs":{}},{"type":"text","text":"的概念,","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"通過對線上生產系統主動注入故障,來驗證系統在各種故障場景下的響應行爲,提前識別並修復系統隱患。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"它就像","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"“疫苗”","attrs":{}},{"type":"text","text":"一樣,故意將有害物質注入體內以防止未來疾病。對於複雜的技術系統,工程師們同樣可以通過注入有限可控的故障,提前發現系統的弱點並進行修復,從而規避可能產生嚴重後果的大型故障。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"混沌工程的引入,","attrs":{}},{"type"
:"text","marks":[{"type":"strong","attrs":{}}],"text":"既是一個技術上的革新,也是一個觀念上的突破。","attrs":{}},{"type":"text","text":"它拋棄行業前輩們過去死守的“線上系統千萬不能碰”的老舊維穩觀念,提出了","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"“以攻代守”的穩定性建設的新思路","attrs":{}},{"type":"text","text":",開啓了在生產環境做","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"攻防演練","attrs":{}},{"type":"text","text":"的新時代。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":16}},{"type":"strong","attrs":{}}],"text":"01 混沌工程在愛奇藝的發展","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"愛奇藝的攻防演練工作,","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"最早是由各個業務自行組織開展的","attrs":{}},{"type":"text","text":"。如金融支付團隊,因其業務特點對穩定性要求極高,很早就開展混沌工程相關工作。詳見之前的文章","attrs":{}},{"type":"link","attrs":{"href":"http://mp.weixin.qq.com/s?__biz=MzI0MjczMjM2NA==&mid=2247487884&idx=1&sn=907d1bafd08b5e1ca2c99f7545562492&chksm=e9768dafde0104b9cbf24c0ab9dbdca0fd2fb2c356ce45e60c3bb950d3f0b7e35c44031845d9&scene=21#wechat_redirect","title":null,"type":null},"content":[{"type":"text","text":"《攻防大戰的背後》","attrs":{}}]},{"type":"text","text":"。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在這個階段,各個團隊攻防演練","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"使用的方法和工具都比較分散多樣,沒有形成一個統一的平臺和工具標準。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在2020年初的疫情流量高峯期間,愛奇藝發生了一次播放失敗故障。","attrs":{}}]},{"type":"paragraph",
"attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在覆盤後總結髮現,","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"這個大型故障其實是由一個小的網絡抖動和代碼bug引發的雪崩。","attrs":{}},{"type":"text","text":"這樣的小問題平時","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"隱藏在複雜的技術架構中","attrs":{}},{"type":"text","text":",","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"很難被一般的測試發現","attrs":{}},{"type":"text","text":",然而龐大的系統卻因爲這一個小的異常,觸發","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"巨大的連鎖反應。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"自此,愛奇藝的技術團隊開始","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"大規模","attrs":{}},{"type":"text","text":"地在","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"重點服務的生產環境裏","attrs":{}},{"type":"text","text":",推動落地","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"常態化,標準化的攻防演練工作","attrs":{}},{"type":"text","text":",同時建設了","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"公司級的攻防演練平臺","attrs":{}},{"type":"text","text":"—","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"小鹿亂撞平臺","attrs":{}},{"type":"text","text":",用於支持各業務攻防演練的相關需求,提升攻防演練的安全性和執行效率。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"截止到2021年Q2,愛奇藝內部已有","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"20+重點業務","attrs":{}},{"type":"text","text":"通過小鹿亂撞平臺進行攻防演練,","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text"
:"平均每個重點服務的線上環境都經歷了3到4輪的真實故障攻擊演練。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":16}},{"type":"strong","attrs":{}}],"text":"02 小鹿亂撞平臺的使用介紹","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"小鹿亂撞平臺在愛奇藝承擔着","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"兩個角色","attrs":{}},{"type":"text","text":":","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"(1)業務自測:","attrs":{}},{"type":"text","text":"各業務系統的Owner可以通過小鹿亂撞平臺,對自己系統的生產或測試環境進行故障自測,檢驗自己在服務中預設的各類高可用保障措施(報警/降級/熔斷/災備切流等)是否生效。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"(2)紅藍攻防:","attrs":{}},{"type":"text","text":"由公司成立的獨立的架構評估團隊,從第三方視角,用隨機化的攻擊實驗,驗證重點業務系統的高可用水平,對關鍵服務進行第三方檢驗保障。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/fa/fa847d9557f54cbf2876d249d98f43ba.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":10}}],"text":"小鹿亂撞平臺模塊圖","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align
":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"小鹿亂撞平臺參考了公有","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"雲攻防演練優秀產品","attrs":{}},{"type":"text","text":"的設計,用戶可以通過如下","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"簡單步驟來進行攻防配置","attrs":{}},{"type":"text","text":":","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":12}},{"type":"strong","attrs":{}}],"text":"1.選擇攻擊對象","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/db/db8fe39bcb688151a5546551f0d8d34a.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":12}},{"type":"strong","attrs":{}}],"text":"2.配置攻擊方式","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/ea/eaa26edeefe3d891662e0261ad4b4da8.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":12}},{"type":"strong","attrs":{}}],"text":"3.編排演練方案","attrs":{}}]},{"type":"paragraph","attrs
":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/93/93a09288aca25fd8eb182c791b3e9a8e.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"按照上面編排好的演練方案,","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"提交審批","attrs":{}},{"type":"text","text":"之後就可以執行了。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":12}},{"type":"strong","attrs":{}}],"text":"4.演練觀測","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"小鹿亂撞平臺除了配置和編排演練方案之外,","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"還打通了公司內部各個資源管理和監控觀測系統,","attrs":{}},{"type":"text","text":"解決攻防演練中的","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"權限,流程,可觀測性等問題","attrs":{}},{"type":"text","text":",爲用戶","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"輸出簡潔有效的故障演練報告。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/4a/4a2fbc0d5ddf834367b3c8f1a9fb6684.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":10}}],"text":"攻防演練進行中的觀測:監控和告警","attrs":{}}]},{"typ
e":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":16}},{"type":"strong","attrs":{}}],"text":"03 重點業務攻防演練案例介紹","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":14}},{"type":"strong","attrs":{}}],"text":"1.案例一:視頻播放服務Couchbase緩存故障演練","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":12}},{"type":"strong","attrs":{}}],"text":"(1)演練背景","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在上文提到的播放故障裏,其中一個原因是由於","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"網絡抖動導致播放服務訪問Couchbase的SDK連接使用異常","attrs":{}},{"type":"text","text":",引發後續的連鎖反應。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在故障後,播放服務的同學對架構中的依賴進行了一系列的","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"熔斷加固","attrs":{}},{"type":"text","text":",當服務訪問Couchbase依賴超時會觸發","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"自動熔斷","attrs":{}},{"type":"text","text":",服務請求改成訪問","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"備用KV數據庫","attrs":{}},{"type":"text","text":",並通過攻防演練","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"重放網絡故障","attrs":{}},{"type":"te
xt","text":" and ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"verify that the circuit-breaker hardening works.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":12}},{"type":"strong","attrs":{}}],"text":"(2) Procedure","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the drill, the attack we chose was to inject a ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"point-to-point network fault","attrs":{}},{"type":"text","text":", adding ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"1000 ms of latency","attrs":{}},{"type":"text","text":" between the servers hosting the video playback service and the Couchbase cluster, to simulate a network fault that makes the service's Couchbase accesses time out.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/09/09b21a8ca3a2ec47c6fa986b09a25b7a.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":12}},{"type":"strong","attrs":{}}],"text":"(3) Result","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"The circuit breaker tripped as intended: once Couchbase accesses timed out, requests switched immediately to HiKV.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/50/504ff
a8b37848b596cd758433b85f5ad.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":14}},{"type":"strong","attrs":{}}],"text":"2. Case 2: Redis distributed lock drill for the membership service","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":12}},{"type":"strong","attrs":{}}],"text":"(1) Background","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The iQIYI membership service team moved uniformly to a new ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Redis distributed lock","attrs":{}},{"type":"text","text":" and used attack-defense drills to ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"check the new lock's reliability under a range of extreme failure scenarios.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":12}},{"type":"strong","attrs":{}}],"text":"(2) Procedure","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Redis distributed lock was subjected to ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"three different fault attacks","attrs":{}},{"type":"text","text":", and its behavior was checked under each:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/3d/3daec920c532a82d37e4578c14f71547.png",
"alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"場景1:","attrs":{}},{"type":"text","text":"業務服務和Redis之間的網絡斷開,5分鐘後恢復","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"場景2:","attrs":{}},{"type":"text","text":"Redis主庫故障,觸發主從切換到Redis從庫","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"場景3:","attrs":{}},{"type":"text","text":"Redis主庫故障,不進行主從切換,經過5分鐘後,重啓Redis服務","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":12}},{"type":"strong","attrs":{}}],"text":"(3)演練效果","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"驗證了Redis分佈式鎖在這三種極端場景下的","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"業務影響","attrs":{}},{"type":"text","text":",確定了","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"業務側","attrs":{}},{"type":"text","text":"在分佈式鎖故障期間的","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"等待/重試等機制","attrs":{}},{"type":"text","text":"。根據Redis故障期間的分佈式鎖請求響應狀況,","attrs":{}},{"type"
:"text","marks":[{"type":"strong","attrs":{}}],"text":"指導業務調用方按照各自業務場景設置合理的等待間隔和重試配置等。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":"br"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":14}},{"type":"strong","attrs":{}}],"text":"04 演練中常見的隱患和經驗","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"回顧過去一年,在20多個業務線的生產環境裏做攻防演練的經驗,","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"我們總結了一些具有普遍性或者較大危害性的問題,","attrs":{}},{"type":"text","text":"這裏簡單整理如下:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/d0/d0ce1eed58f59d2f0e9485080dcc121e.png","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":16}},{"type":"strong","attrs":{}}],"text":"05 
Summary","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At iQIYI, ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"after more than a year of evangelizing chaos engineering culture and running systematic drills in production","attrs":{}},{"type":"text","text":", the front-line technical leaders of core businesses have developed a strong attack-defense mindset. ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Here the author sums up two traits shared by the best architects across those businesses:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":12}},{"type":"strong","attrs":{}}],"text":"1. Zero trust","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"No service stands alone","attrs":{}},{"type":"text","text":". Each depends on many infrastructure services, including DNS, load balancing, gateways, VMs and containers, databases, middleware, storage, networking, and caches, as well as many external interfaces. ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Even if every individual dependency is 99.99% available, the combined availability of dozens of dependencies multiplied together is noticeably lower.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A good architect never builds a service on the attitude that “dependency X is a basic service, so my architecture assumes it is 100% available, and if it fails I will just blame it”. Instead, they ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"assume as far as possible that any link can fail, and design their system's high-availability plan around that.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":12}},{"type":"strong","attrs":{}}],"text":"2. A desire to explore","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A good technical lead ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"knows exactly how reliable their service architecture is","attrs":{}},{"type":"text","text":" and can point to any spot on the architecture diagram and say whether a fault there will be absorbed or will take the service down. ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Where they are unsure, they explore proactively through attack-defense drills.","attrs":{}},{"type":"text","text":" Mechanisms verified in ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"test or canary","attrs":{}},{"type":"text","text":" environments, ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"such as disaster recovery, traffic switching, circuit breaking, and degradation, are ones they also dare to attack in production.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Daring to launch real attacks on the production services you own is a new requirement for architects in the cloud-native era, and it reflects a technical leader's confidence in their architecture and their technology.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That confidence comes from a deep grasp of the code's logic, reasoned thinking about architectural principles, and the courage to take responsibility for the business.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the complex architectures of the coming cloud era, holding on to this confidence and ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"taking an active part in attack-defense drills","attrs":{}},{"type":"text","
text":" helps to ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"deepen architectural understanding, find and fix hidden risks in services, open new room for engineers to grow, and safeguard the stable growth of the company's business.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":10}}],"text":"Some images come from the internet; please contact us promptly regarding any copyright concerns.","attrs":{}}]}]}
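The availability arithmetic behind the “zero trust” point can be checked with a few lines of Python. This is only a sketch under two assumptions the article does not state: that the dependencies fail independently, and that every one of them sits on the request's critical path with no fallback. The function names and the dependency counts are illustrative, not taken from the 小鹿亂撞 platform.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def serial_availability(per_dependency: float, count: int) -> float:
    """Availability of a service that needs all `count` dependencies up at once.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return per_dependency ** count


def expected_downtime_minutes(availability: float) -> float:
    """Expected unavailable minutes per (non-leap) year at a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Illustrative dependency counts; 99.99% is the figure quoted in the article.
    for count in (1, 10, 30, 50):
        combined = serial_availability(0.9999, count)
        downtime = expected_downtime_minutes(combined)
        print(f"{count:>2} dependencies: {combined:.4%} available, "
              f"about {downtime:.0f} minutes of downtime per year")
```

Under these assumptions, thirty dependencies at 99.99% each combine to roughly 99.70%, which moves expected downtime from under an hour a year to about 26 hours. Fallbacks such as the circuit breaker in Case 1 take a dependency off the critical path, which is why the simple product overstates the risk for a hardened service.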