聊一聊在阿里做了 8 年研發後,我對打造大型工程研發團隊的再思考

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/5b/5b7312125961e3cc52acd8b1a92c2c06.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"作者|一嘯來源|","attrs":{}},{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/s/AiySBaLRvMuEW-qWNoK4Jw","title":"","type":null},"content":[{"type":"text","text":"爾達 Erda 公衆號","attrs":{}}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"任何大型工程項目的研發都會涉及到兩個非常共通的難題:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"第一個是穩定性問題,越大的項目越難做穩定,“魔鬼在細節裏”;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"第二個是工程研發效率。","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本文我們先聊聊第二個問題,後面再談談 Erda 的穩定性建設。具體談論如何打造大型的工程研發效率之前,先回顧一下我之前在阿里的 8 
年研發經歷,希望藉此形成一個有帶入感的對比。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"我在阿里的經歷","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"DataX","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我剛畢業加入淘寶後,第一次真正接觸的研發工作就是參與 DataX 的開發。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"datax 的工作原理就是全量將某個數據庫或存儲中的數據讀出,然後再全量寫入到另一個數據庫或存儲中,總結起來,就是爲了將數據從一個地方傳輸到另一個地方。當時在淘寶的核心場景是將 MySQL 的表數據傳輸到 hdfs 中進行 mapreduce 的大數據計算,除了 hadoop 計算場景外,還有將 MySQL 數據導入 Oracle RAC 集羣進行分析的場景。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"datax 的設計其實很簡潔:1 個核心框架 + N 個數據庫或存儲插件。當時整個研發團隊也就不過 4、5 個人,我主要負責 Oracle 數據庫的插件開發,很多時候都是在寫插件,因爲插件的運行本身也就是單機程序,所以整個研發工作基本都是自己一個人完成。除了最後聯調的時候,過程中也不需要過多的複雜協作,研發的效率是非常高效的。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Logserver","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"後來,我開始接觸 logserver 的開發,logserver 就是接收從淘寶每一個頁面發送過來的埋點請求,生成下游能夠消費的日誌數據。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"除了上下游的需求團隊外,這個服務的開發只有我一個人全權負責,我的主要工作就是將 logserver 從 apache httpd 的架構遷移到 Nginx 架構上,核心工作就是在 Nginx 上開發一個模塊。logserver 當時在很長時間內都是整個淘寶 qps 
量最大的服務,沒有之一(今天就不清楚了)。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"現在回想一下,大都是當初一個人在慢慢倒騰這個 Nginx 模塊的場景。logserver 和 datax 一樣,都是單機版程序,同樣不需要太多的協作開發。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"dbsync","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"接着,我又投向 dbsync 的開發。dbsync 和 datax 要做的事情本質上差不多,就是想從全量同步數據升級到實時的增量同步。當時主要做的是從 MySQL 實時同步到 HDFS 上,印象中只有 3 個人參與開發,而我負責的工作就是寫一個 C 程序從 MySQL 中實時地將 binlog 日誌訂閱出來。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Timetunnel","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"寫到這裏突然發現:在淘寶的前幾年,我一直在做數據傳輸相關的中間件。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"TT 本質也是用來傳輸數據的,並且它本身也是一個消息中間件,類似 kafka。整個 TT 的開發團隊大概 5、6 人的樣子,我的主要工作就是基於 hbase 來實現消息的存儲引擎。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這也是我在阿里唯一一個用 java 開發的項目,其他大部分都是寫 C 和 Go。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"tengine & cdn","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"後續我轉崗去了 tengine 團隊,這裏有趣度超高。團隊大概有十多人,基本都是各做各的事情(模塊) ,同時也樂在其中。整個團隊頗受外界獵頭喜歡,甚至一度流傳“要挖就挖 tengine 
團隊”。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"最後,整個 tengine 團隊一起去做 cdn 了。在 cdn ,我被安排帶着 3、4 個人去優化 dns 服務器和調度系統,大概情況就是老牌的 bind 性能不行了,於是完全重寫一個 dns 服務器。其實,寫一個 dns 服務器也不難,大多時候都是一個人在看 RFC ,把標準擼到絲毫不差就完事。最終,我重寫的 dns 服務器性能大幅度超越 bind ,成功上線節省了很多機器。後續準備再用 dpdk 幹一版的,還沒來得及幹,陰差陽錯就跑去做容器了。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"容器","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在我去容器團隊前,只有一個小夥伴天天和 docker 做鬥爭,我去之後變成了兩個人,他也不再是孤軍奮戰了。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"然後,我們兩個人把 docker 裏裏外外魔改了一通,將其稱之爲 alidocker。改到最後,老闆們覺得始終還是 docker ,索性就決定做一個自己的容器引擎,也就是現在的 pouch。就這樣,我也算是成了 pouch 的第一個作者,可惜做了第一個版本就跑路了,現在的 pouch 也挺遺憾的。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"補一句題外話(小聲吐槽):容器團隊選擇做 Swarm 而不是 K8s 這件事情也挺遺憾的。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"總結一下","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我在阿里的 8 
年,一直做系統級的基礎軟件,這些軟件有着共同的特點:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"很基礎","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"沒有業務需求","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"迭代速度較慢","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"缺少大型團隊的協作開發","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"代碼級複雜,但服務架構簡單","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"行業內非常地通用且標準","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"對穩定性性能要求很高","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"很容易拿出來吹牛逼 (是不是值得吹,又是另外一個話題了~)","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"總結而言:我在阿里 8 年待過的所有團隊,好像沒有一個用過項目協作工具、持續集成 
CI/CD 、全鏈路調用追蹤等研發效能相關的工具。這可能就是基礎軟件的研發現狀:團隊小、單兵作戰、強調個人能力;甚至,線上 debug、 fix bug 這種事情都是做過的。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"有幸,今天我還能繼續從事基礎軟件的研發工作。不同的是團隊規模更大了,近百人的團隊共同開發一個基礎軟件,一起往前快速迭代是一件非常具有工程挑戰的事情。如果今天繼續採用我過去在阿里這種粗放、散養式的研發方式肯定是行不通的。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Erda 工程實踐","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"接下來聊一聊:離開阿里後,我現在所處的 Erda 團隊是如何實踐大規模工程研發的?","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"里程碑","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/27/2756d7f3d6fcb63254a4b7ece412fd65.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(里程碑管理)","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"里程碑是宏觀層面比較重要的一個工具,我們主要靠它來設定方向、框定產品大圖。如果沒有里程碑設計的話,我們很容易突然迷失方向,陷入各種日常需求和問題中。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們一般會把里程碑做得很大,或者說設計得很遠。比如,未來兩年一個比較粗的時間點要完成的事情都放入里程碑管理。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragrap
h","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們 Erda 設計的里程碑,大概就是這個樣子:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/55/555f3b8c36f85e3238b60331b25414d6.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"里程碑的設計和管理畢竟太 high level ,有點類似頂層設計,所以它並不能成爲日常研發工作內容的安排。這有點類似於 OKR,但沒有 OKR 那麼複雜,我們只是將未來的產品大圖打散到一個比較期望的時間節奏上去,然後讓日常的工作內容儘可能沿着這條時間線往前推進。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"里程碑管理好的話,其實能起到承上啓下的作用:向上,能夠給公司層面跨團隊合作形成很好的同步和信任感;向下,能指導方向,拿到研發結果,不至於一年到頭碌碌無爲。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"需求管理","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/2f/2f8c9f85aeeec6513d99ba12fd73f5b3.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"前面我們提到了里程碑工具,要想支撐里程碑的完成,還得靠日常的具體需求和問題,這裏先一起聊聊需求。實際觀察下來,很多開發同學其實不懂什麼是真正的需求,也不懂如何接需求。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":
"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"下面舉個例子。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"需求方","attrs":{}},{"type":"text","text":":Erda 能不能支持從 Excel 中導入數據?(初一聽,這確實是個需求,就是要支持從 Excel 這種很常見的文件裏導入數據。稍微思考多一點的同學,就可以繼續追問。)","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"開發","attrs":{}},{"type":"text","text":":爲什麼要從 Excel 導入數據,具體的場景是什麼?","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"需求方","attrs":{}},{"type":"text","text":":在 Excel 表格上填寫這些數據很方便,我習慣先在表格上把這些工作做好,再導入進 Erda。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"追問到這裏,其實已經能夠發現:需求方真正的痛點是 Erda 這個功能不如 Excel 方便好用,導致用戶繞了一個彎路。所以,我們真正要做的需求是優先解決這個不好用的痛點。(至於有些人就是要這個功能,那是我們另外要討論的事情了。)","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"最常見的需求就是,能不能加個 XX 
功能。(這裏大家可以細品一下。)","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"綜上,我們要求產品和開發同學對於任何需求都要不斷思考,多追問幾個爲什麼,拿到用戶最原始的需求。始終要記住,用戶給你的極有可能是他認爲的方案,而不是原始需求。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"需求管理對了,後面的開發纔會少走彎路,需求是起點、是根本。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"迭代隊列","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/ff/ff828b0fb9fd79554eba6c963d602c0a.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"一個商業化的大規模工程團隊,一定有 PD 這個角色。PD 是做什麼的呢?除了上面提到的需求管理外,PD 最基本的工作就是設計產品邏輯,這裏就不展開描述了。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在 Erda 團隊裏,PD 最需要核心設計的是迭代隊列,注意:這裏提到的是隊列,這個隊列裏需要長期裝滿 3 個迭代,其中的 1 個迭代裏存放着最優先要解決的需求和問題。爲什麼這裏是 3 個迭代的隊列,而不是 1 個迭代呢?可以思考一下,我們在系統架構中,引入消息隊列中間件的作用。PD 和開發之間完全可以通過這種方式來解耦的,開發只需要從隊列中取設計好的產品任務解決問題就好,PD 
只需要不斷地從需求池裏取內容經過設計後再合理排入到迭代隊列中即可,兩邊的角色都可以實現自我驅動,不需要嚴格同步工作,所以這裏的核心是爲了異步工作。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/90/9053f1db91d313ecbd4fd845257f1ea5.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(迭代管理)","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"那麼,這個隊列爲什麼是迭代隊列,而不直接設計成需求隊列,開發直接取需求而不是取迭代呢?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"需求隊列會存在兩個很大的問題:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"第一,迭代是用來嚴格定義一個版本的時間週期,如果沒有迭代,版本的時間週期節奏會變得很亂,會進入一種比較隨意的狀態。","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"第二,當前實際開發中的需求如果要延期解決的話,這個延期的需求重新排到哪裏去?難道重新放到需求隊列的末尾嗎?放到末尾顯然不合適。如果是迭代,就可以嚴格規定放到下一個迭代或者下下一個迭代。","attrs":{}}]}]}],"attrs":{}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"開發過程","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/75/752630b7d954a310783e91113f540112.png","alt":null,"title
":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"開發過程中,對我們而言非常重要的一個關鍵指標就是:我們能夠合理接受單個需求的延期,但不能接受整個版本的延期(個別的特殊情況不在討論中)。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"很多人可能對於這個指標的理解僅停留在保持發版節奏上,甚至會覺得這只是一種表面工作,版本出來了,需求沒有做完有何用,爲什麼不延期版本把需求全部一起做完?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"其實,這個關鍵指標完全不是一個管理結果,而是技術結果。很多項目在做架構設計、代碼設計的時候,是沒有考慮後續小步快跑的,不能小步快跑地添加新功能、解決新需求的話,就會經常性導致部分需求做了一半要延期的時候,根本停不下來,這個版本根本無法臨時放棄需求而繼續發版;更嚴重的是,需求和需求之間還是強耦合的,一個需求做不完,其他需求也不能正常發版。這都是實實在在的技術問題、代碼問題,而不是管理問題。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/f2/f2d000c3b5194bc965ad272df9e7c7a2.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(需求任務管理)","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"
除了關鍵指標以外,開發過程的管理,主要是基於任務來協同的,每個需求在開發前都已經被拆分成了合理的任務,也就是說一個需求會關聯完成這個需求的所有任務,開發同學只需要每天按照優先級完成,並及時更新反饋任務的情況和進度即可。(這裏提到了及時反饋,反饋又是一個異步工作的關鍵機制。)","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"很多研發主管都喜歡有事沒事在釘釘、微信裏詢問一下具體的工作進度,或者拉個會議對一下進度,所以 “已讀 + ding 一下” 對他們來說是一個很好的功能。我們非常不鼓勵這種依賴即時通信工具或會議的方式來溝通、瞭解開發進度,這種方式非常同步,就和同步調用一樣。正確的做法,應該是開發同學每天根據自己的情況及時在項目管理工具中對自己的任務進行進度更新和總結反饋,研發主管自己按需關注任務的研發進度和情況,彼此沒事不要相互打擾。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"CI 流程","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在今年年初我們決定將 Erda 開源後,就把整個團隊的日常迭代開發全部轉移到 GitHub 上執行了,因此 CI 主要是基於 GitHub Actions 來做的,會把常規性的檢查任務全部放到 CI 中來完成,比如:單元測試、代碼質量檢查、規範等。這個部分比較常規,沒有什麼特別的。但 GitHub Actions 太慢了,我們後續計劃把 CI 從 GitHub Actions 遷移到 Erda 的 CI,這樣我們可以實現 CI 併發更高、跑得更快、效率更高。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"本地+雲端","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"一個很大的產品就是一個很大的軟件工程,這樣大的工程要完全放到個人筆記本電腦上進行本地調試開發,是一件難度不小的事情。就像一個中間件、一個框架、一個 webserver 等,都可以很輕鬆地放到本地電腦上開發調試,但要把整個淘寶裝到本地電腦就有點爲難了,當然我們也正在向這個方向努力。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在不能完全將整個 Erda 輕裝到本地電腦前,我們採用了本地 + 雲端聯合的方式來進行開發調試。雲端本身是一個已經部署好的 Erda 全平臺開發環境,包含所有能夠正常工作的組件;然後使用 telepresence 打通本地和雲端 K8s 
之間的網絡:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本地可以直接使用 K8s 容器網絡內的 DNS,同時支持短域名的 search 機制。","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本地啓動一個服務,劫持 (intercept)雲端 K8s 集羣內的 Pod,當 K8s 集羣內其他服務訪問該 Pod 時,流量會轉發到本地。","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"實現將雲端 K8s 集羣 Pod 中的環境變量全部自動導出到本地。","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"實現將雲端 K8s 集羣 Pod 中掛載的 volume 通過 sshfs 映射到本地的文件系統。","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"簡而言之,本地起一個服務,可以隨意訪問 K8s 集羣內的任意 service;同時,本地的服務也能被 K8s 集羣裏的其他服務訪問到。這樣一來,確實可以大幅度提高開發調試效率,開發人員不用在複雜環境上來來回回的折騰。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"集成環境","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"雖然 Erda 給上層的業務應用提供了很強大的持續集成、部署和運維監控等能力,但 Erda 在很長一段時間內卻不能通過自己部署自己,也就是不能實現 Erda On Erda,所以那段時間一直沒有真正的自動化集成環境,研發質量和效率也不太高。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"由於 Erda 
整個工程項目比較龐大,代碼從構建到部署更新,再加上所有的自動化測試跑一遍,整個流程需要數小時,耗費時間長。所以,我們的集成環境主要採用每天夜間定時運行自動化集成 + 測試,次日上班就可以看到集成結果。集成出來的錯誤結果,當天內必須 fix 掉,這是持續保證集成效果的關鍵。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/6e/6eee50b8caac1e4a5fda6514d61db524.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(基於流水線的自動化集成)","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"自動化集成主要針對的是 Master 主幹分支,我們沒法做到任何一個 PR 觸發自動化集成,核心問題還是項目太大、開發人員多、自動化用例太多,基於 PR 集成的時間成本接受不了。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"可以看出來,我們的日常開發測試和自動化集成兩條線路也不是串行的,而是異步關係。我們不斷地追求、設計整個工程流程的異步化,只有異步才能高效支撐大型工程的研發。即使你的整個工程流程執行速度非常快,沒有時間消耗的成本,同步也會有額外問題產生的;同步的最大問題就是會被異常情況打斷、干擾,哪怕是一些可以延遲處理的異常情況,也會干擾你的工作,你必須花時間先解決異常,這種打斷和干擾對研發團隊的效率影響是致命的。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/1e/1ef65214ae57d661f80ed2c185beef50.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"API 
設計和自動化測試","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"集成環境能夠有效工作的關鍵是什麼呢?很顯然,必定是自動化測試了。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"自動化測試分爲自動化的單元測試、接口測試、性能測試、安全測試、UI 測試等等。日常開發過程中,效率影響最大,最頻繁的顯然是單元測試、接口測試、UI 測試。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"單元測試比較簡單,我們放到了 CI 流程中,每一個 PR 就驅動完成了單元測試。","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"接口測試非常關鍵,我們需要覆蓋那些全鏈路場景的接口,從最頂層的接口出發完成整個系統的功能自動化測試,所以接口的自動化測試就放到了整個集成環境中完成,因爲集成環境有最新、最穩定的代碼,以及異步執行等優勢。","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"UI 的自動化測試成本相對比較高,我們目前還在嘗試階段,沒有完全達到成熟。我一直希望從智能測試的角度,用 AI 算法的思路去解決 UI 的自動化測試。","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這裏先介紹一下接口測試。接口就是 API,要想編寫出好的接口測試用例,必須得先有 API 的設計。API 的設計從哪裏來呢?我們當然不希望從開發工程師的口中來,所以自研了 API 管理平臺,從 API 的設計到測試進行全覆蓋。開發工程師在完成需求前,首先在平臺上完成 API 的設計工作;然後將 API 的設計和需求關聯上,這樣一來,測試工程師就自主從需求裏獲取到具體的 API 
信息,完成接口測試用例的編寫。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/02/02068e0e668f63fc96b839073dc17898.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(API 設計管理)","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"除了異步協同外,打通整個工程流程,也是我們至關重要的一環,這也是我們爲什麼不用 jira,jenkins 等的一個原因。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"另外,我們的自動化測試用例是由測試工程師編寫、開發工程師一起維護,對測試工程師的 OKR 設定就是任何一個版本的迴歸 + 新功能測試必須控制在一個非常小的人/天內。今天 Erda 有着近百人的研發團隊,而測試只有 5 
個人。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/46/466d30f707703fcfa2d49de378fed844.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(基於場景的自動化測試流程)","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"手工測試","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"有了自動化測試,爲什麼還需要手工測試?手工測試顯得似乎沒有那麼高級了。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"對於沒有前端 UI 的系統,自動化掉所有的測試理論上是可行的;一個有着複雜前端 UI 的系統,自動化起來還是會有很多難點,大多數做 UI 
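automated, it mostly concentrates on a few core flows and can hardly cover every edge-case scenario.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the eyes of professional test engineers, manual testing is also an important part of the work, and it needs tools and methods to support it. Manual test cases must be regression-testable in every release, so these manual cases have to be recorded as well.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/bc/bc8dcf57f3a7f303b6825940272fe5ac.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(Manual test case management)","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We will stop here on this topic and not discuss it further.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Q&A and Issue Handling","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As a foundational platform product, Erda has a fairly large user base. If scattered questions and support requests went straight to the product and R&D team, they would consume a great deal of engineering effort. We therefore set up a dedicated “SRE team + rotating product R&D engineers on Q&A duty” to face customers and answer the various questions raised by customers and users. Beyond staffing a professional support function, an even more important step is to record the problems found during support as tickets in the Erda project. The Erda project tickets mentioned here are equivalent to GitHub Issues; they are not a customer-service ticketing system.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/dd/dd8a3978996b0750c483b0528f5ff86e.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(Q&A issue management)","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The project's ticket list is reviewed every Friday. The tickets that are relatively simple and can be resolved quickly are picked out and then copied into the iteration queue with one click. Two points are critical here:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"The review first picks the simple problems that can be resolved quickly.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"They go into the iteration queue, not the requirement pool.","attrs":{}}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Every bug fixed in a software system is one fewer bug, and the system grows more stable the longer it runs, so simple problems should be fixed quickly and treated with zero tolerance. Hard problems need dedicated attention and a dedicated effort; grinding away at a hard problem at the wrong time or with limited resources is most likely not a good idea.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Never put the selected tickets into the requirement pool; they must go straight into iteration planning. Once a ticket enters the requirement pool, nobody knows when its turn will come. Assigning it to a specific iteration with a specific resolution date, so that problems converge quickly, is what matters most.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Release Manager","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Within each iteration cycle, we rotate the role of “release manager”, whose core job is coordination plus tracking. Of course, the release manager is not there to mediate between individual team members, nor to track whether someone is slacking off. Instead, the release manager works with the PD to decide which requirements go into the iteration, takes part in decisions on whether a requirement is deferred to the next iteration, confirms and accepts the deliverables at the key milestones of the release cycle, and is solely responsible for tasks such as freezing the code branch. In short, the “release manager” is accountable for the efficiency and quality of the new release.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Who Should the Tools Serve First","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Finally, a few words on rolling out tools. In my own R&D management work, one thing has struck me deeply: tool rollouts usually come from above, and the motivation is of course efficiency. Yet you will often find that when R&D leads look at a tool, they tend to start from their own management perspective rather than from the perspective of the front-line engineers who actually use it.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For example, should a project management tool be positioned primarily for PMs or R&D leads, or for collaboration among engineers? If you want to build a more efficient, asynchronously collaborating team, then such tools must serve front-line engineers first, so that everyone on the team truly collaborates in the tool and is connected through the tool platform.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Architecture Design","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"By architecture design I do not mean Erda's own architecture. The topic is still engineering efficiency: how the architecture can provide some efficiency safeguards so that a large R&D team can iterate with more ease.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Microservices","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In its early versions, Erda was also one large monolithic application. The team had only a handful of people, much like the teams I had worked in at Alibaba, so collaboration was very efficient and adding features or fixing problems was fast. As headcount grew, however, many collaboration and efficiency problems did emerge, and in the end we split the monolith into microservices. Of course, microservices bring problems of their own.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There are plenty of talks and articles on microservice design methodology, which readers can look up for reference.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Componentized Protocol","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At its peak, the frontend-to-backend ratio in the Erda team reached 1:7, meaning one frontend engineer had to work with multiple backend engineers across multiple teams, and frontend resources became the bottleneck of product iteration.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To solve this, we worked out a componentized protocol framework. The frontend provides the component library and the interaction definitions, and focuses on enriching component capabilities and improving the interaction experience. How to assemble components and supply data to implement business features is left to the backend. Because the frontend no longer needs to care about business logic or negotiate API definitions, a great deal of communication cost is saved, which raises overall product development efficiency.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"db migration","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For Erda, the many versions produced by rapid iteration must all be able to upgrade smoothly. For us, the hardest part of an upgrade is actually the database migration. To address this, we developed our own dbmigration management framework, defined the database baseline on the first product version, and had every later version develop its migration logic on top of the previous version and place it in the framework for centralized management. An Erda upgrade must start from the released version and apply the migrations one by one, in order.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Developing a piece of software may be relatively easy; developing one that can be upgraded continuously is considerably harder. We also verify migration and upgrade repeatedly during the testing phase.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Closing Thoughts","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"R&D management is about managing technology, not people. Many people I have met still think R&D management means assigning a PM to watch over the whole development process, or even taking a “whip” to those who move slowly. That mindset is certainly wrong. Only by bringing management back to the technology itself can you truly ignite the team's enthusiasm.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In addition, no engineering process or management practice, whoever designed it, should chase perfection or 100% precision when it is actually put into practice. That is unrealistic; some error is only natural, and all we need to do is keep the error within a sufficiently small range. Being able to accept imperfection is part of an R&D TL's self-cultivation.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If you would like to learn more about Erda, feel free to add our assistant on WeChat (Erda202106) to join the discussion group, or follow the links below to find out more!","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Erda GitHub repository: https://github.com/erda-project/erda","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Erda Cloud website: https://www.erda.cloud/","attrs":{}}]}]}],"attrs":{}}]}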