過渡架構的作用:一週處理近百起高嚴重性事件,如何重寫這個技術負債系統?

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"本文最初發表於 Medium 博客,經原作者 Zak Islam 授權,InfoQ 中文站翻譯並分享。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"導讀"},{"type":"text","text":":軟件領域的聖經《人月神話:軟件項目管理之道》,由 IBM System\/360 系統之父佛瑞德・布魯克斯所著,全書講解了軟件工程、項目管理相關課題,內容源於作者在 IBM 公司 System\/360 家族和 OS\/360 中的項目管理經驗。這些經驗概括出了第二系統效應,又稱第二系統症候羣,它認爲,在完成一個小型、優雅而成功的系統之後,人們傾向於對下一個計劃有過度的期待,可能因此建造出一個巨大、有各種特色的怪獸系統。第二系統效應可能造成軟件專案計劃過度設計,產生太多變數,過度複雜,無法達成期待,並因而失敗。本文作者反思了他在 AWS 的時光,提醒後來者不要隨意重寫系統,而是要用過渡架構的方法來達到目標。"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"2015 年在 AWS,我接手了一款技術債累累的產品。在一個美好的一週裏,我們共處理了 70~100 起高嚴重性事件。這個系統不具備基本的協議,如 API 速率限制、工作優先級、或者像隔離噪音鄰居這樣的防禦功能,但是它很容易在一秒內處理上百萬個請求。換言之,在它後面有一個巨大的集羣,但是這很有用。在對這個系統進行首次評估的時候,我們真的不知道從何入手。擺在我們面前的挑戰似乎不可逾越。我們知道重寫系統是必要的,但是我們必須獲得重寫系統的權利。AWS 的一個核心"},{"type":"link","attrs":{"href":"https:\/\/twitter.com\/tacertain\/status\/1121429740596785152?lang=en","title":null,"type":null},"content":[{"type":"text","text":"工程原則"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"是:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"……工程師們感謝我們的前輩。我們讚賞工作系統的價值,以及其中所體現的體驗。我們知道,很多問題本質上都不新鮮。"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"簡而言之,它意味着尊重先前的事物。那時,我想我們並沒有真正意識到,隨着我們職業生涯的發展,這個原則將如何影響我們作爲工程師和領導者的思維方式。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"重寫"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在那個時候,重寫系統的想法很有誘惑力。現在回想起來,我真的很高興我們沒有這麼做。而在我後來的職業生涯中,我瞭解到這種誘惑有一個"},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Second-system_effect","title":null,"type":null},"content":[{"type":"text","text":"名字"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":":"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"第二系統效應"},{"type":"text","text":"(又稱"},{"type":"text","marks":[{"type":"strong"}],"text":"第二系統症候羣"},{"type":"text","text":")是指由於期望值過高和過度自信,小型、"},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Elegant","title":null,"type":null},"content":[{"type":"text","text":"優雅"}]},{"type":"text","text":"而成功的系統往往會被過度設計的、"},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Software_bloat","title":null,"type":null},"content":[{"type":"text","text":"臃腫"}]},{"type":"text","text":"的系統所取代。"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"雖然這個系統在當時處於一個不太理想的狀態,但是它並不需要重寫;它只是需要一點愛❤。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"過渡架構"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"過渡架構實際上是迭代開發的一種漂亮說法。這意味着,要小步前進。回到這個系統:我們決定採取迭代的方法來修復我們所繼承的“爛攤子”;將我們重寫它的願望擱置一邊。我們有一座大山要攀登,但我們必須邁出第一步。首先,我們審視了所有的問題,然後創建一個圖來排列這些問題。排列圖看上去像這樣……"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/06\/0620611b43aba40fddab71e336753979.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這張圖給了我們一種有趣的視角(圖中數字系虛構)。一旦我們看到了上圖中的根本原因,我們就知道系統存在兩個大問題。我們需要在系統中加入速率限制,然後想辦法擴展服務 X,這樣它就不會在負載下崩潰。我們把小團隊分成幾個"},{"type":"link","attrs":{"href":"https:\/\/medium.com\/productmanagement101\/spotify-squad-framework-part-i-8f74bcfcd761","title":null,"type":null},"content":[{"type":"text","text":"小組"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",集中力量解決這兩個具體問題。我們承認這個系統的其他部分也是一團糟的,但是我們選擇接受它們,因爲我們知道這些問題爲我們的客戶解決了問題,但是卻給我們帶來了操作上的麻煩。幾次衝刺過去了(感覺似乎是永恆的),我們最終實施了速率限制的第一次迭代。太棒了……是嗎?錯了!"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"迭代 1"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們的速率限制系統的第一次迭代是一個內置於部署到生產中的軟件包的靜態文件。這就是說,我們必須更新這個文件,對其進行簽入,進行代碼審查,進行構建,然後將其部署到生產中——就在事件發生的中間!那聽起來很噁心,但確實有效!雖然我們並沒有減少事件數,但是我們事件的平均修復時間(Mean time to resolution,"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"MTTR"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":")下降了一半。那是個巨大的勝利。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"迭代 2"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"下一步,我們需要確定如何在生產中應用配置更改,而無需構建代碼,然後進行部署。用 S3 桶中的文件的引用替換靜態配置文件。接着,我們構建了一個獨立的工具,它可以把變更從開發人員桌面推到這個文件(經過適當的檢查和平衡),我們已經更新了服務,每隔幾分鐘就重新加載這個文件。這聽起來似乎很簡單,但是我們必須對它進行深入的思考。即使 S3 出現故障,我們也必須確保服務能夠正常運行。我們必須確定服務是否會失敗,是否對所有客戶應用默認限制,最終處理整個集羣的一致性,等等。這是很難解決的問題,但是我們花了很多時間來回答這些問題:因爲我們知道迭代 1 已經用於生產。這樣我們可以贏得很多時間。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"迭代 3"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"最後,我們回答了所有困難的問題,將迭代 2 引入生產。對此,我們的胃口也越來越大了。雖然我們進一步降低了 MTTR,但是工程師們希望我們有一個能自動對傳入的請求進行速率限制的動態系統。感覺到我們需要這個系統。爲了更好地減輕我們的工作負荷,我們必須擁有這個系統。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"爲了構建這樣的系統,我們需要:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"一種低延遲的數據分配協議,用於報告來自整個集羣中主機的按客戶分列的請求速率。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"一種半分散的服務,可以基於啓發式方法作出某些決策。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"更新請求限制的自動化系統。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"對了,這種系統必須具有高度的彈性和容錯能力。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們考慮了這個系統的成本和複雜性後,決定不建立它。大家都認爲,我們在迭代 2 中建立的系統足夠好:它能提供我們需要的東西,運行良好,最重要的是它非常可靠。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"完美的系統"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我不時地反思我在這個團隊中的時間,尤其是那些限制速率的項目。我們知道我們需要在某個時間點上進入迭代 3。如果我們從那裏開始,我們會建立一個高度複雜的系統,它會檢查所有的複選框,花費我們大量的時間,並且只返回我們正在嘗試解決的原始問題的一些價值,同時在此過程中產生新的問題。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們採用類似的迭代方法來擴展各種服務,在服務之間實現重試,並提高性能,以至於代碼庫僅與我們最初集成的代碼庫有相似之處。在管理運營開銷時,我們不大可能重寫系統。最後,我們會面臨第二系統症候羣,一個冗長的、團隊士氣低落的項目,以及新類型的技術挑戰。運用尊重前人的原則鼓勵我們欣賞工作系統的價值及其帶來的經驗。在我的職業生涯中,這是我獲得的最有影響力的經驗之一,我想把它傳授給每一個願意學習的人。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"結論"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我在這個團隊的時光教會了我許多寶貴的經驗。其中最重要的兩條是:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"避免重寫的誘惑。從表面上看,重寫系統似乎是避免固有複雜性的正確做法。當你欣賞工作系統的價值和它們所體現的教訓時,重寫往往就不那麼吸引人了。(有時你必須重寫一個系統。這很正常。當你不得不重寫時,堅持複製有效的東西,要堅持有效的東西,做最小的必要改動。)"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"過渡架構——北極星固然不錯,但往往應該僅僅是一個設想。迭代或過渡架構有助於消除不必要的複雜性,並提供一個足夠好的解決方案。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這是從經驗中學來的,我希望別人不必用艱苦的方法就能學到這些經驗。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"原文鏈接:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"https:\/\/zakisaninja.medium.com\/back-in-2015-at-aws-i-took-over-a-product-that-was-riddled-with-technical-debt-65b3670966e6"}]}]}
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章