Zuoyebang's Cloud-Native Journey and Practice

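{"type":"doc","content":[{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This article is based on a talk by Dong Xiaocong, head of infrastructure at Zuoyebang. It recounts Zuoyebang's cloud-native journey and expands on two solutions: its cloud-native architecture and its multi-cloud architecture."}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Cloud-Native Transformation Reshapes the Technology Stack"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"“At traditional internet companies, engineers had no direct contact with users; users were perceived mostly as UV and PV numbers. Online education is different: through live streaming and similar formats we face individual students, and every stability incident may affect their studies, so Zuoyebang's stability requirements can only be higher.” According to Dong Xiaocong, Zuoyebang faces three major challenges centered on stability:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"When a single machine, a single cluster, or a single cloud fails, can the architecture absorb the impact?"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"When a code change interrupts the business, can the damage be stopped quickly?"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Beyond stability, how can costs be controlled and efficiency improved?"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Zuoyebang chose cloud native to address these problems. Infrastructure takes over the large amount of non-functional logic in business code, delivering elasticity, observability, resilience, automation, and sustainability. The cloud-native architecture solved the deployment-layer problems, and on top of it Zuoyebang built the ability to migrate freely between clouds."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"“Even looking back today, Zuoyebang's decision to adopt a cloud-native architecture was a bold one, because it amounted to rebuilding the technology stack.” Dong Xiaocong said that Zuoyebang has so far completed the cloud-native transformation of about 70% of its business, which places it among the industry leaders. Zuoyebang also makes extensive use of elastic scaling, Serverless, and online/offline workload co-location, and has produced innovative patents in CPU scheduling, GPU scheduling, and multi-cloud management, addressing many problems in open-source community offerings."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"On CPU scheduling: in the first half of 2020, after containerizing one of its core businesses, Zuoyebang found that operations costs had unexpectedly risen. In the virtual-machine model, operations staff only needed to run stability inspections during the evening peak and took few manual actions. After containerization, during the evening peak they had to keep cordoning nodes with high resource load and evicting the heavier Pods from them. Analysis showed that the native "},{"type":"link","attrs":{"href":"https:\/\/kubernetes.io\/#:~:text=Kubernetes%20%28K8s%29%20is%20an%20open-source%20system%20for%20automating,into%20logical%20units%20for%20easy%20management%20and%20discovery.","title":"xxx","type":null},"content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" scheduler still schedules by resource request rather than by actual usage, which causes problems."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Internet businesses all have pronounced "},{"type":"link","attrs":{"href":"https:\/\/www.infoq.cn\/article\/cc65U1HxA8Pzql8E6BQm","title":"xxx","type":null},"content":[{"type":"text","text":"peaks and troughs"}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", and in online education they are even sharper, with possibly two orders of magnitude between them. When developers release during a trough, the release triggers a rescheduling of the containers. For a service with a few dozen Pods, more than ten may land on one machine, because machine utilization is low at that moment and almost any placement looks acceptable."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"At the evening peak, however, each Pod's resource usage climbs: CPU usage rises and throughput rises with it. With those ten or so Pods on the same machine, the machine hits resource bottlenecks. The native scheduler considers only a few simple metrics and does not account for future changes. Zuoyebang therefore built a custom scheduler that predicts the evening peak, uses CPU, memory, and various I/O metrics as factors, and periodically updates its model by running regression over historical data."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"GPUs are a relatively expensive resource. After evaluating several approaches and talking with cloud vendors, the team learned that the commonly recommended approach is GPU virtualization, but it brings at least a 15% performance loss, which was unacceptable. Most GPU services use fairly fixed amounts of each resource. Zuoyebang therefore schedules by compute capacity and GPU memory, which is the classic knapsack problem. It also runs a prediction at night and reschedules, and if a failure occurs along the way, it applies migration policies."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"After the Web services were containerized, the team moved some scheduled tasks onto the container platform, and new problems appeared. Many tasks are compute-intensive, and containers are not a true isolation mechanism; they still share CPU through time slicing. These compute-intensive tasks inevitably affect Web tasks to some degree. They also consume host IP resources: IPs on a node are limited, a scheduled task is assigned an IP when it is scheduled, and the IP is not released immediately when the task is destroyed. If scheduled-task Pods are frequently placed on nodes of the main cluster, the main cluster's Web services can run short of IPs. In addition, creating and reclaiming scheduled tasks at scale can trigger kernel problems; for example, some scheduled tasks use a lot of memory, and reclaiming them at scale causes long stalls in kernel mode."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Zuoyebang made several changes here. It set up three pools: Serverless, a task cluster, and the main cluster. Scheduled tasks are placed on Serverless first; if that fails, they fall back to the task cluster and then to the main cluster. Because Serverless is not a fully reliable computing model, the team introduced resource pre-reservation, similar to the two-phase commit used in finance to guarantee transactions: the resources are requested in advance, and only after the reservation succeeds is the task actually scheduled."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Multi-Cloud Architecture Enables Automatic Failover Within Seconds"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Zuoyebang faced two main challenges in building its multi-cloud architecture. The first was choosing dedicated lines for cross-cloud connectivity: Zuoyebang chose a provider's managed networking solution rather than dark fiber. Dong Xiaocong said the managed option was chosen partly because the provider adds a layer of protection and partly because it can scale elastically. Beyond that, the company built dual links itself, each from a different provider and connected from a different region. Across these two links, "},{"type":"link","attrs":{"href":"https:\/\/networklessons.com\/cisco\/ccie-routing-switching-written\/bgp-multipath-load-sharing-ibgp-and-ebgp#:~:text=Unlike%20most%20routing%20protocols%2C%20BGP%20only%20selects%20a,second%20path%2C%20the%20following%20attributes%20have%20to%20match%3A","title":"xxx","type":null},"content":[{"type":"text","text":"BGP"}]},{"type":"text","text":"+ECMP"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" provides load balancing, and when a single line fails, traffic switches over automatically within seconds."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"“Multi-cloud also brings another big challenge: managing compute resources.” Dong Xiaocong said a single cloud already offers a dozen to several dozen machine types, and multi-cloud directly doubles or triples the workload. Zuoyebang modeled its scenarios into standard workload machines and dedicated high-memory and high-storage machine types, then combined them with network security zones to define concrete business packages."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"“With the network and compute problems solved, Zuoyebang built its own multi-cloud architecture.” Dong Xiaocong said users are routed by DNS\/DoH to different data centers. In normal operation, requests between business applications stay within a single cloud and do not cross clouds. When the secondary data center or a dedicated line fails, DNS\/DoH switches traffic to the primary data center. When the primary data center fails, traffic is rerouted the same way, and in addition the secondary data center's data stores, such as DB and Redis, are promoted to primary, which keeps the multi-cloud setup stable."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"“After the cloud-native and multi-cloud transformation, stability rose from 99.95% to 99.99%, and the impact of a machine failure shrank from minutes to seconds. Deployment quality also improved significantly.” Dong Xiaocong said that Zuoyebang will next focus on the cloud-native transformation of real-time audio and video, advancing boundaryless cloud computing, and enabling coordinated cloud-edge-device applications."}]}]}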