深入理解雲原生下自適應限流技術原理與應用

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"高併發與服務負載是後端領域關係最爲密切的兩個指標。伴隨着流量的升高,後端服務有效應對負載、過載的能力稱之Scalability(可伸縮性)。長期以來,基於流量閾值配置的應對方案已經在大規模雲原生場景下力不從心。同時,也對自適應流量控制方案(Adjective Load Control)提出了需求。本篇文章將聚焦於後端服務負載治理,結合對應用層和傳輸層負載指標的關鍵細節和痛點分析,引出自適應限流技術完 整的理論基礎與實踐解決方案。"}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"背景"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"負載(load),通常與併發關係密切。對於後端服務而言,任意時間內的併發用戶訪問都會提升服務負載,進而進一步消耗計算資源。然而計算資源是有限的,如CPU、memory、network等等,過載將會導致服務性能下降,進而回復滯緩甚至不可用。描述服務應對日益增長的負載的能力稱之爲"},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"Scalability"},{"type":"text","text":", 即可伸縮\/擴展性。以下是DDIA(Designing Data Intensive Application)中的解釋:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Scalability is the term we use to describe a system’s ability to cope with increased load"}]}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"測量負載"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"負載的具體指標"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"爲了提升可擴展性,我們第一步要做的是固定資源下對服務請求接口進行測試,進行服務能力上限評估以及相應優化。顯然,單純將併發用於對負載的測量是不夠具體的。對於後端服務,我們會進一步關注以下兩個指標。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"負載因子(load parameter)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/91\/f7\/91d1ac2678afced1b589ef1e12e7f5f7.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"本服務中最能引起現負載增加的指標"},{"type":"text","text":"。對與web服務而言,顯然就是每秒請求數QPS;對於數據庫等服務來說,可能就是讀寫率的上升。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"關鍵結果(performance number)"}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/c2\/a4\/c29a0e5102a2df363dc78d2yy352f6a4.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"本服務負載增加時最需要關注的指標"},{"type":"text","text":"。對與web服務而言,顯然就是用戶的等待時間即respond time\/latency;對於數據庫等服務來說,就是對吞吐率的影響。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"性能觀測"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"以web服務爲例,我們通過增加請求數來收集對應的時延消耗如下圖。其中X軸表示請求的數量,Y軸表示對應消耗的時間。其中,每次請求所消耗的時間通過灰色柱如所示。從實踐經歷來看,同類型的請求每次消耗的時延會受到多種因素的影響,因而都不是固定的。(後文會詳細解釋)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/8a\/5d\/8aee85c7192219441cce94a2435b8b5d.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"所以,單純將總請求數\/總耗時的平均結果不具代表性。爲此業內廣泛使用百分位數,又稱水位線來表示用戶實際經歷的時延情況。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"p "},{"type":"codeinline","content":[{"type":"text","text":"N"}]},{"type":"text","text":" th"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"其中N表示百分比,例如圖中95th,又稱95線。假設95線對應的時延爲1.5s,總請求數爲100, 那麼"},{"type":"text","marks":[{"type":"strong"}],"text":"p95"},{"type":"text","text":"就表示在100個請求中有95個請求的時延小於1.5s。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在大型後端服務中,工程師會更傾向於關注高水位線如99th下的時延分佈,因爲這是最直接對用戶產生影響的"},{"type":"text","marks":[{"type":"strong"}],"text":"關鍵結果"},{"type":"text","text":"。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"定位優化"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"通過水位線獲得的時延分佈可以針對性的優化相應接口邏輯,以儘可能減少業務層導致的延遲較大的問題。如優化代碼數據結構、緩存部署、數據隔離以提高命中率等等。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"資源與性能瓶頸"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"請求時延的增高不僅與業務邏輯有關,在有限的硬件資源下(cpu、memory、帶寬),負載升高同樣會進一步影響用戶請求時延。在對預期正常流量下的請求表現出的時延進行業務優化後,我們還需要對服務進一步壓測以至"},{"type":"text","marks":[{"type":"strong"}],"text":"過載"},{"type":"text","text":"情況下服務對請求的響應情況,儘可能找出負載上限,保護服務。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"壓測視圖"}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/96\/45\/9608c236106be7648fddb42ea0cb7d45.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當我們使用壓測工具一步步加大強度時,關於負載各項指標的宏觀理想變化基本如上圖所示。其中"},{"type":"text","marks":[{"type":"strong"}],"text":"紅色曲線"},{"type":"text","text":"表示每秒請求被處理的用戶數TPS(throughput per second),"},{"type":"text","marks":[{"type":"strong"}],"text":"藍色曲線"},{"type":"text","text":"表示對應請求時延,X軸表示QPS量。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"可以直觀的看出:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"初始階段"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"隨着QPS的上升,TPS也在逐步提高,時延相對穩定。因爲此時CPU和內存等計算資源相對充裕,請求在系統中無需排隊就可以處理。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":2,"normalizeStart":2},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"瓶頸階段"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當QPS加大到對應TPS曲線頂點附近時,此時計算資源負載接近滿額,服務中的請求處理出現排隊情況,時間非線性上漲。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":3,"normalizeStart":3},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"過載階段"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"此時進一步加壓QPS,計算資源超額負載,進一步導致GC、進程調度、網絡等資源壓力導致TPS迅速下降,同理請求排隊過長導致時延進一步類指數級上升,最終導致服務資源耗盡出現宕機無響應等情況。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"直觀來看,由於固定計算資源的限制,當QPS加壓超過TPS峯值時系統的處理能力會被反噬。從服務保護、資源利用的角度,我們會認爲"},{"type":"text","marks":[{"type":"strong"}],"text":"將TPS頂點附近值作爲對QPS的限制閾值"},{"type":"text","text":"相對理想。"}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"過載保護"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本節將着重介紹基於具體閾值的限流辦法以及相關的侷限性,提出服務穩定前提下的核心預期。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"基於QPS閾值的限流"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在很長一段時間內,工程師都會採用上述閾值作爲服務的兜底QPS限制以防止過載。此時就出現了優秀的限流算法,即令牌桶(Token-bucket)和漏桶(leak-bucket)以及衆多衍生。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/f0\/8c\/f0af273baf57f87b03a71ca42ce2328c.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"互聯網上關於這兩種算法原理的介紹多如牛毛,在這就不再贅述。總的來說,桶算法對流量提供了整形的功能,以防止瞬時流量等極端情況打垮服務。在小規模後端服務而言基於閾值的限流方式簡單高效,能夠應付絕大多數問題。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"基於具體閾值的侷限性"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"上節在進行閾值測量時,對測試的整體描述爲"},{"type":"text","marks":[{"type":"strong"}],"text":"理想情況"},{"type":"text","text":"下。這意味着在大規模分佈式系統中,測量結果從工程的角度忽略了很多必要細節。這其中包括測試方式、硬件資源配額、線上scale、服務環境、閾值可用性等等衆多情況。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"測試之痛"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"現如今後端領域已經進入了雲原生時代,微服務架構早已經是事實標準。其中通訊方面是以典型的事件驅動方式(Event-driven),這表示所有的請求都是通過RPC即以接口調用的方式實現的。一個合理劃分的微服務,在稍具規模的情況下擁有的接口數量保守估計至少是在十位數以上,因此對接口級別、服務級別的閾值獲取將變得相當繁瑣和困難。在Netflix的技術博客中,他們使用"},{"type":"codeinline","content":[{"type":"text","text":"arduous task"}]},{"type":"text","text":"來形容測試的艱鉅。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"接口壓測"}]}]},{"type":"listitem","attrs":{"listStyle":"none"},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如前文提到的,即使是同一個接口,每輪壓測的結果也會有差距。這是因爲機器資源在測試中同時會受到垃圾回收、線程\/進程資源搶佔、磁盤I\/O抖動、網絡帶寬質量等因素的影響。在服務級別的測試、線上環境中,此類問題會被進一步放大。爲此,測試人員不得不多次測量求個均值以求準確。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"服務級壓測"}]}]},{"type":"listitem","attrs":{"listStyle":"none"},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"不同接口顯然對資源的消耗程度不同,因其包含讀、寫、下游扇出等各種情況時延也會產生差異。對於服務級別的限流閾值確定可以說是相當困難,因爲基於接口級的閾值組合會在不同用戶請求不同接口時產生波動。爲此進行流量錄製並全鏈路壓測雖能夠閾值的範圍,但仍無法做到精準,且成本過大。"}]}]}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"維護之痛"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"所有測試以及線上環境中,最關鍵其實是計算資源的配置。所測試機器的CPU核心數和內存等配置時刻影響着測試結果。再者,如今基於容器網絡如k8s的自動資源scale能力,會使得基於具體數字的閾值設定難以維護。再者,秒級的流量限制顯然對於100ms內突發流量洪峯無法做出有效干預,導致服務異常過載。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"干預成本"}]}]},{"type":"listitem","attrs":{"listStyle":"none"},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當無論遇到線上激增流量機器指標異常、自動擴容等導致閾值不再適用,進而要增加或者減小時,工程師從發現問題到上機操作發佈配置的耗時已經錯過了干預的最佳時機。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"認知負擔"}]}]},{"type":"listitem","attrs":{"listStyle":"none"},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"從上述測量流程來看,工程師必須對計算資源、接口業務、容器配合、調度時機等各方面有足夠認知才能應對各種流量突發狀況,顯然會消耗大量的精力和時間。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"綜上所述,基於具體數字的閾值方式在大規模後端服務中會進一步捉襟見肘,可延展和可伸縮性較差。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"過載下的服務預期"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在討論進一步的解決方案之前,我們要明確在負載加大時對服務情況的"},{"type":"text","marks":[{"type":"strong"}],"text":"預期"},{"type":"text","text":"。 即在使服務不垮掉情況下,充分利用計算資源服務用戶。結合具體的負載因素預期如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"儘可能多的處理request,即高TPS"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"儘可能快的處理request,即低時延"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"儘可能使得request不產生堆積,當前計算資源下存在的最佳處理量"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/59\/71\/59597808b5461d102246153e51b01c71.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"很明顯,想要同時獲取上述指標是相悖的。對於高TPS,意味着儘可能要將計算資源利用,請求產生擁塞進而導致"},{"type":"text","marks":[{"type":"strong"}],"text":"關鍵結果"},{"type":"text","text":"——時延的指數上升;對於時延,那麼就儘可能需要服務中處理的請求相對寬鬆。雖然時延低下來了但同時TPS也會很低,造成計算資源的浪費;對於服務進程中的請求最佳處理量,結合各種變化因素也不好確定。那麼有最佳的解決辦法嗎?"}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"傳輸層自適應擁塞控制"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本節將引出傳輸層TCP-BBR算法要解決的問題,以及和我們上述服務預期問題的同質性。進一步挖出共通點。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"傳輸層TCP現狀"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"TCP是面向字節流保序的傳輸層協議。在面對未知的網絡中轉節點以及對端網絡狀態,如果最大化利用網絡帶寬是其要面對的首要問題之一。爲此,應運而生了經典的擁塞控制算法。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/d4\/f8\/d46d552b38134ae575cbac17251974f8.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"上圖是傳統的基於丟包檢測的擁塞控制算法,代表有CUBIC、RENO等實現。詳細解析已經超出本文的討論範圍,此處簡述基本原理以及其核心特點。其控制週期基本上有四個組成部分,分別是慢啓動、擁塞避免、快重傳、快恢復。其中CWND表示擁塞窗口,也就是可以向網絡中一次性發送的數據量,最終窗口大小由其與對端接收窗口中的較小值來決定。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在不考慮對端窗口的影響下,由於網絡狀況未知,算法會依據不同狀態針對性的加大CWND窗口的大小。當出現丟包時CWND大小會大幅度減小,如圖中陡降部分所示。在後續的優化重傳等策略之下,仍然能夠看到當出現丟包時CWND的減幅程度十分明顯。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"基於丟包的控制算法侷限"}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/d4\/dd\/d4a7636bc550340dd691f3c10ea998dd.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"該類算法的核心思想在於對丟包數據的測量,然而在現代網絡設施的發展下某些場景如長肥網絡(高延遲、高帶寬)等已經不再適用。結合上圖,由於路由節點的buffer大小日益增長,同時增加了數據包在隊列中等待時間。再加上多跳路由節點的計算、傳輸等能力的不同會很容易導致CUBIC等算法的認爲丟包,採取銳減CWND窗口的行爲。也就說當開始介入擁塞避免時,擁塞早已發生且會進一步負反饋導致吞吐下降。也就是說,基於檢測"},{"type":"text","marks":[{"type":"strong"}],"text":"丟包"},{"type":"text","text":"信號這單一指標來衡量擁塞達不到最優解。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"基於真實的網絡負載特點分析"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"顯然網絡擁塞的發生同樣是由各個路由節點網絡負載升高產生的,既然是負載,那麼就需要對其具體負載指標的分析。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"負載因子"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"拋開網絡拓撲現狀的複雜性,能夠首要引起網絡負載升高的顯然是網絡流量的增加。那麼,對應的指標就是網絡"},{"type":"text","marks":[{"type":"strong"}],"text":"Bps"},{"type":"text","text":"(bytes per second),即每秒的數據傳輸量。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"關鍵結果"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"面對負載Bps的增加,會影響到的關鍵結果就是網絡傳輸的往返時延"},{"type":"text","marks":[{"type":"strong"}],"text":"RTT"},{"type":"text","text":"(Round-trip time)上升,導致傳輸擁塞。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"傳輸觀測"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"下圖是理想情況下網絡設備在傳輸過程中的負載變化圖,上半部分爲流量傳輸速度(Delivery rate"},{"type":"katexinline","attrs":{"mathString":"\\approx"}},{"type":"text","text":"Bps)和對應時延(RTT)的變化,下半部分爲對應網絡設備的物理狀態變化。分爲三個階段:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/04\/97\/04ac617cd766038815c46354b545fc97.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"初始階段"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"初始階段網絡設備中的隊列爲空,數據包無需排隊。隨着發送速率的增加RTT保持穩定。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":2,"normalizeStart":2},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"瓶頸階段"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"此時排隊已經逐漸形成,設備的傳輸吞吐能力即帶寬達到上限,保持穩定。對於RTT而言,由於排隊的原因逐漸增加。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":3,"normalizeStart":3},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"擁塞階段"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"此時網絡設備的隊列已經被流量完全打滿,多餘的數據包被拒絕,進一步導致RTT大幅上升。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"從圖中紅圈標註的RTT和傳輸率折線圖出看到,RTT在瓶頸階段開始大幅度上升,傳輸速率在該階段同樣達到了瓶頸即帶寬上限。可以看出,從此處開始擁塞就要發生了,然而基於丟包的擁塞控制算法反而在上述第三階段才能夠生效,這顯然滯後太多了。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"最佳預期"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"從上述理想分析,我們相對確定出了在一波數據傳輸過程中RTT的下限,即minRTT;數據傳輸帶寬能達到的上限 max BW(max bandwidth)。對於這兩個網絡層的負載指標,至今已經有了很成熟的衡量體系,在網絡傳輸中其關係如下圖所示。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/28\/ae\/28a120ea85f1292925c9b2bbf0ca66ae.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們可通過 "},{"type":"text","marks":[{"type":"strong"}],"text":"網絡節點每秒傳輸數據的能力"},{"type":"text","text":"即帶寬(bandwidth)"},{"type":"text","marks":[{"type":"strong"}],"text":"乘"},{"type":"text","text":" "},{"type":"text","marks":[{"type":"strong"}],"text":"數據包在節點中的傳輸處理時間時延"},{"type":"text","text":"(即latency"},{"type":"katexinline","attrs":{"mathString":"\\approx"}},{"type":"text","text":"RTT)的結果作爲網絡上正在處理的報文數量,即"},{"type":"text","marks":[{"type":"strong"}],"text":"帶寬時延積BDP"},{"type":"text","text":"。BDP決定了需要向網絡中發送的數據包量。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"綜合上述,我們可以確定理想情況下向網絡傳輸數據的最佳數量,即瓶頸期指標特徵:"},{"type":"codeinline","content":[{"type":"text","text":"BestBDP = maxBW * minRTT"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"顯然,我們應對負載的關鍵指標測量已經不僅要關注"},{"type":"text","marks":[{"type":"strong"}],"text":"關鍵結果"},{"type":"text","text":",還與"},{"type":"text","marks":[{"type":"strong"}],"text":"負載因子"},{"type":"text","text":"進行綜合。相對的,多年來基於丟包的擁塞控制算法都只關注了"},{"type":"text","marks":[{"type":"strong"}],"text":"時延"},{"type":"text","text":"這單一指標。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"雖然有了理論預期,但同樣出現擺在工程師面前的難題:"},{"type":"text","marks":[{"type":"strong"}],"text":"如何在拓撲結構複雜,路由情況多變的網絡環境中測得這兩個指標?"},{"type":"text","text":" 這個測試過程本質上和上節我們提出的業務負載預期指標相悖論如出一轍。那麼該如何解決這一難題呢?接下來就是BBR登場的時刻。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"TCP-BBR算法"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"BBR(Bottleneck Bandwidth and Round-trip propagation time)是Google近年來提出的擁塞控制算法,誕生後大幅度提高了在高延遲等情況下網絡傳輸的吞吐。從命名就可以看到帶寬(Bandwidth)和往返時延(Round-trip time)關鍵字,在上述鋪墊過程中,對應的就是"},{"type":"text","marks":[{"type":"strong"}],"text":"maxBW"},{"type":"text","text":"和"},{"type":"text","marks":[{"type":"strong"}],"text":"minRTT"},{"type":"text","text":"。對於多變的網絡環境BBR大膽的採用了"},{"type":"text","marks":[{"type":"strong"}],"text":"以預期公式驅動,實時交替探測兩個負載指標的辦法"},{"type":"text","text":",下文會對此詳細解釋。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"算法核心解讀"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"BBR認爲:既然網絡多變,且最佳帶寬和時延不好同時測量,那麼就採取實時交替探測的方式。通過滑動窗口細粒度的交替收集一段時間內的每秒最大傳輸量和最小的RTT,通過計算就可以獲得目前最佳的BDP。即"},{"type":"codeinline","content":[{"type":"text","text":"BestBDP = BtlBw (bottleneck bandwidth) * RTprop (round-trip propagation time)"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"BtlBw"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"表示目前網絡中的瓶頸帶寬,也就是上節中的maxBW,是網絡設施傳輸的上限。BBR會取得一段時間內滑動窗口的統計的最大BtlBw值作爲參考。其測量方式簡述爲一段時間內的數據包總量除以他們所抵達花費的時間。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"RTprop"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"表示拋開任何外在噪音,如ack重發耗時,網絡抖動等等導致RTT偏高情況。即在滑動窗口統計中的RTT最小值作爲參考,其測量方式爲數據包發送和回覆耗時。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"inflight"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這指的是,在BBR工作期間內,已經發送至網絡但是還是沒有收到答覆的數據包。也就說真實的,在網絡設備里正在傳輸的數據量,即負載。有了預期公式計算出的BestBDP指導將RTprop與BtlBW相乘,BBR就可以得出"},{"type":"text","marks":[{"type":"strong"}],"text":"當前時刻外界網絡最佳的負載量與實際inflight的關係"},{"type":"text","text":"。有了這樣的簡單的數值比對,算法就可以控制發包的最佳量以進行擁塞控制。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"以下僞代碼直觀的體現了BBR算法在發包和收包時的處理邏輯。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"收包 onAck"}]}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/70\/56\/708a562323efc3ba4744225a0c22ef56.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"發包 send"}]}]}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/5d\/99\/5d42468ff8e5ca3af324abcc91a1bc99.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"算法組成"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"瞭解了核心的算法邏輯,接下來將簡述其運作週期,進一步瞭解算法是如何充分利用未知網絡設備傳輸能力的。算法運行狀態主要分爲啓動階段(Startup)、排空階段(Drain)、帶寬探測(ProbeBW)、時延探測(ProbeRTT)。同樣,本文將不會闡述具體細節,具體細節可以參考文末的reference。我們回到核心關注點,BBR是如何探測以及適應當前網絡設備傳輸能力的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/89\/10\/892440fc2d68f1dbbcb954cbc51af110.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"上圖摘自google BBR論文,展示了在穩定網絡傳輸節點下BBR算法中關鍵指標RTT、BW、inflight的變化圖。其中灰色"},{"type":"text","marks":[{"type":"strong"}],"text":"cycle gain"},{"type":"text","text":"數組,相當於滑動窗口。其中每個元素裝載了帶寬探測時的增益係數,通過與當前最大BW相乘可以實現增加\/減少向網絡中的數據發送,從而實現適應未知網絡傳輸能力目的;同理,對於時延探測,簡單來說BBR同樣會週期性的發送小體量數據包收集最佳RTT。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"可見,基於預期的負載控制算法,即同時集合負載因子和關鍵結果的計算,相比只關注一個指標的實現方式在高吞吐場景下具備一定的優越性。同時,基於滑動窗口細粒度的動態探測極值,使得測量結果更具時效性與說服力。"}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"自適應限流算法"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"經過上面長篇大論的分析和討論,我們終於到了本章最爲核心的自適應限流算法介紹。目的在於找到應用層與傳輸層面對負載、過載情況應對措施的共同點,以及如何應用於後端服務。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"負載在傳輸層的同質性"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"綜合來看,傳輸層在面對網絡負載的測量、治理、實施歷程與應用層負載有很多共同點,如下表所示。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"
傳輸層應用層
負載因子BpsQps
關鍵結果RTTlatency
加壓負載因子時處理端表現delivery Rate(帶寬傳輸速率)上漲直至瓶頸TPS上漲直至瓶頸,由於進程資源搶佔等壓力會進一步反噬下跌
加壓負載因子時關鍵結果表現RTT從理想穩定狀態升高至瓶頸,伴隨擁塞指數上升latency請求等待時延理想情況下至瓶頸期處於穩定狀態,伴隨排隊導致服務過載指數上升
核心預期儘可能大利用網絡傳輸帶寬,即最大傳輸速率maxBW;最小的包等待RTT;不擁塞前提下最佳發包量BestBDP儘可能多的處理請求,即高TPS;儘可能快的處理請求,即低時延;極可能充分的利用服務計算資源,不產生堆積的數量
核心需求能夠得到網絡設施的最佳處理數據量,以避免擁塞且實現高吞吐能夠得到服務進程在當前計算資源下的最佳請求處理量,以避免請求排隊資源搶佔實現高吞吐,進一步保護服務防止過載
難點無法同時測量最佳帶寬和最低時延,基於複雜的網絡拓撲和設施數量難以確定數值無法同時測量最佳TPS和最佳latency,衆多接口耗時不同,且包含進程GC、網絡抖動、資源scale多變的情況下基於服務級別的限流數值不好確定
歷史階段基於RTT耗時檢測,即丟包認爲網絡發生擁塞進行干預基於壓測得到的qps閾值作爲限流標準
現階段TCP-BBRSentinel、Kratos ……"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(我們會在後文介紹sentinel與kratos)"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"利特爾法則"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"通過上表可看出應用層和傳輸層在應對負載時本質需求是相同的,那麼關於應用層的核心預期公式的推導顯然具有相似性,那就是最佳請求數"},{"type":"codeinline","content":[{"type":"text","text":"TW(當前最佳處理任數目) = TPS * latency"}]},{"type":"text","text":"。如下所示"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/f2\/e7\/f2e8fbac967779eabe5e6d2843e0yye7.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"其實這個公式的依據是顯赫有名的利特爾法則little's law,爲通過對工業中平均生產數量和對應耗時提供了理論基礎,以進一步衡量生產能力。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"基於TCP-BBR的自適應算法"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們已經具備充分的理論基礎和傳輸層實現指導,下一步就是因地制宜的實施在應用層後臺服務。在業內最初版目前所知是由阿里的sentinel組件引入,由kratos進行了進一步拓展。在此我們需要搞清楚兩個關鍵問題,才能保證最大化吞吐的同時防止服務過載。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"控制時機"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"BBR作爲數據發送方,需要面臨的問題"},{"type":"text","marks":[{"type":"strong"}],"text":"未知"},{"type":"text","text":"網絡設施傳輸能力。由於網絡設施的傳輸能力、擁塞狀態對發送方是非直接可見的,所以纔有了上文提到的BBR帶寬"},{"type":"text","marks":[{"type":"strong"}],"text":"探測"},{"type":"text","text":"。滑動窗口內通過cycle gain變化,來適應不同時刻的傳輸能力。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"應用層作爲請求處理方,無論是在容器網絡和物理機上部署,計算資源是相對"},{"type":"text","marks":[{"type":"strong"}],"text":"固定"},{"type":"text","text":"的。這意味着存在着最佳處理量上限,我們要保證的是在流量上升或者因爲其他因素導致計算資源緊張時,通過計算出的最佳TW來"},{"type":"text","marks":[{"type":"strong"}],"text":"限制入口流量"},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"控制信號"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"瞭解了控制時機,可就是當計算資源緊張時進行干預。那麼該如何確定資源緊張信號呢?總的來說就是CPU利用率或者操作系統負載,或者內存、磁盤等資源。以入口流量特徵來看(進程RPC調用下游服務按照業務需求進行組裝、計算、返回),無論時內存資源不足導致的GC(依賴CPU)、磁盤I\/O吞吐下降、調度搶佔等等因素,都會導致用戶請求增加、序列化成本增加(CPU)進而時延上升惡性循環。所以在sentinel和kratos的實現中都選擇了適用CPU作爲資源信號限流,只不過前者使用的是cpu load1,後者使用的是服務基於cgroup對CPU的實時採樣使用率。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"兩者使用各有千秋,但我們認爲,基於load1作爲控制信號仍不夠敏感。在linux下操作系統load1表示一分鐘內CPU的平均負載值,對於流量洪峯等過載的發生干預有效性較慢。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"具體方案"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"本部分屏蔽到絕大部分代碼與設施細節,關注應用方式與過程中上線效果、遇到的問題以及優化。基於bbr算法的完整kratos代碼可進入referenc閱讀"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"CPU利用率峯值信號"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"kratos以是當前容器網絡服務CPU利用率的80%作爲控制信號臨界點,通過爲此服務會開啓獨立的goroutine每隔250ms進行基於本服務的cgroup("},{"type":"codeinline","content":[{"type":"text","text":"\/sys\/fs\/cgroup\/cpu\/*"}]},{"type":"text","text":")CPU佔用信息採集,以及系統總cpu tick("},{"type":"codeinline","content":[{"type":"text","text":"proc\/stat"}]},{"type":"text","text":")佔用採集。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"對CPU佔用率的計算本質是間隔內 "},{"type":"text","marks":[{"type":"strong"}],"text":"本進程佔用的CPU時間增量\/系統的總CPU時間佔用增量"},{"type":"text","text":"。顯然CPU的變化是相當迅速的,會受到各種因素的影響來回抖動動。爲此我們採用了滑動均值("},{"type":"link","attrs":{"href":"https:\/\/blog.csdn.net\/m0_38106113\/article\/details\/81542863","title":"","type":null},"content":[{"type":"text","text":"算法原理參考"}]},{"type":"text","text":")的辦法進行降噪穩定。通過確定參考衰退率β(<1),使得最終結果等於:"},{"type":"codeinline","content":[{"type":"text","text":"β*上次的CPU佔用率 + (1-β)*本次的時機測得CPU佔用率"}]},{"type":"text","text":"。如下所示:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/1e\/43\/1e4cba28edc563b87f39dc11b7c41743.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"深藍色折線代表了正常實際測量下CPU的變化折線圖,可以看到抖動十分不穩定。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"橘色公式折線表示了在滑動均值算法下趨於平穩變動的CPU變化圖,但是能看到前提CPU數值較低。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"綠色公式折線是在具體滑動均值算法作用下,對前期數據量不足導致CPU起點低的問題進行的"},{"type":"text","marks":[{"type":"strong"}],"text":"偏差修正"},{"type":"text","text":"。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"最終經過上述修正,我們得到線上具有參考使用價值的CPU佔用率。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pass&RT"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"pass和RT分別表示處理完成請求數和對應請求所消耗的時間,即TPS和Latency。相應的,我們的測量辦法同樣是通過滑動窗口對pass和RT進行統計,如下圖。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/30\/54\/30a4022640160cdcdc9a7d1b765fcc54.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"sample window表示窗口採樣週期,sample bucket表示週期內的採樣批次。假設現在採樣窗口時間爲1000ms,bucket採樣批次時間持續500ms,那麼就表示在前500ms內完成的請求數和這些請求消耗的平均時延都會被原子(atomic)統計在bucket1中。同理,當第501ms會被統計在bucket2中,當第1001ms時會再次回到bucket1,以此類推。可見當bucket足夠多,以及統計間隔足夠小時最能夠得到真實的數據,更有效的應對秒內流量洪峯。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"流量干預"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當CPU利用率過載時,就需要通過預期公式進行干預了。我們會在服務運行期間持續統計當前服務的請求數,即inflight,通過在滑動窗口內的所有buckets中比較得出最多請求完成數maxPass,以及最小的耗時minRT,相乘就得出了預期的最佳請求數maxFlight。通過inflight與maxFlight對比,如果前者大於後者那麼就已經過載,進而拒絕後續到來的請求防止服務過載。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"線上效果與調優"}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/73\/fe\/73cc01ba026e18a7c1ca56cac68f9efe.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"上圖是在線上部署了基於kratos的自適應算法後的效果圖,其中藍色曲線代表了併發訪問的用戶數,深黃色代表對應請求的時延,淺綠色則表示成功處理的請求數。左側爲最終版,右側爲第一版。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"不難看出,第一版時當算法控制後黃色的時延仍然很高,成功處理的請求數也並非穩定。產生這樣結果的原因其實依然是CPU利用率很敏感且粒度很細,當CPU大於80%利用率閾值時算法生效,當微量請求被拒絕時算法便停止了干預。最終的結果便是算法會因爲流量的湧入與拒絕中頻繁開啓與關閉,導致結果不符合預期。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"優化手段"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"爲此我們簡單加入了1s的冷卻時間,也就是說算法開啓後會持續至少1s的冷卻時間,再次期間保持算法開啓。當冷卻時間過後會再次統計當前CPU利用率,並根據閾值對比進行持續或者關閉。最終測試結果如最終版左圖所示,在流量持續湧入的情況下請求的成功處理數和時延都十分穩定。"}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"後記"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本文儘可能通過應對負載的迭代歷程優劣分析,抽絲剝繭完整的引出基於TCP-BBR的自適應算法實現與優勢。自適應過載保護同樣不是完美的,一方面在請求等排隊過長的情況下雖然計算資源未滿載,但是同樣增加了等待時間。爲此,可以選擇結合排隊論相關算法如"},{"type":"link","attrs":{"href":"https:\/\/blog.csdn.net\/dog250\/article\/details\/72849893","title":"","type":null},"content":[{"type":"text","text":"CoDel"}]},{"type":"text","text":"進行干預;另一方面,在極端流量的湧入下可能單純的拒絕回覆成本就會打垮CPU,從而導致宕機,此時可以參考reference中google sre裏的自適應熔斷等策略合力保障服務的可用性。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"另外,對於自適應限流技術的發展,從Netflix最早基於傳輸層vegas的限流手段推出中間件"},{"type":"codeinline","content":[{"type":"text","text":"concurrency-limits"}]},{"type":"text","text":", 再到如今基於TCP-BBR的自適應限流算法改進與實現,都體現了很強的發散性和優化空間。期待讀者的進一步討論!"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Reference"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Google SRE:https:\/\/sre.google\/sre-book\/handling-overload\/"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Alibaba Sentinel:https:\/\/sentinelguard.io\/zh-cn\/docs\/system-adaptive-protection.html"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Netflix 限流技術:https:\/\/netflixtechblog.medium.com\/performance-under-load-3e6fa9a60581"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"TensorFlow滑動窗口簡讀:https:\/\/blog.csdn.net\/m0_38106113\/article\/details\/81542863"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kratos 自適應限流源碼:https:\/\/github.com\/go-kratos\/kratos\/tree\/v1.0.x\/pkg\/ratelimit\/bbr"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"CoDel 排隊論簡讀:https:\/\/blog.csdn.net\/dog250\/article\/details\/72849893"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"TCP-BBR:https:\/\/queue.acm.org\/detail.cfm?id=3022184"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"https:\/\/www.net.in.tum.de\/fileadmin\/bibtex\/publications\/papers\/IFIP-Networking-2018-TCP-BBR.pdf"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"https:\/\/blog.csdn.net\/russell_tao\/article\/details\/98723451"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Golang進階訓練營之微服務可用性設計-毛劍"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"作者簡介: "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"喬卓越"},{"type":"text","text":",於19年畢業,熱愛開源,樂於思考。擁有基礎架構和遊戲領域的一線開發經驗。獨立負責過大規模後端服務的開發與性能測試平臺搭建。"}]}]}
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章