A Deep Dive into the VPA Recommender

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This article gives a source-level analysis of the Recommender (v0.9.2), the core component of VPA (Vertical Pod Autoscaler), together with our hands-on experience. The Recommender strongly shapes VPA's overall design and the way VPA is combined with other scheduling strategies."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The article is organized as follows:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We first give an overview of VPA, covering its goals, its architecture, and the design philosophy behind the Recommender;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We then walk through each stage of the Recommender's workflow in detail;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The third part introduces the sliding-window idea and the decaying exponential histogram used in the prediction model, and shows how to extend them for resource prediction;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The fourth part covers our own practice with the VPA Recommender, including how we extended the CPU and memory metrics to 13 metrics such as disk and network in order to build application profiles, and the performance optimizations we made."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The last part briefly introduces Google's Autopilot and other industrial applications of VPA."}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"1. 
VPA Overview"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"VPA frees users from having to keep the resource requests of the containers in their pods up to date. Once configured, it sets requests automatically based on resource usage (CPU and memory), so that the scheduler can place each pod on a node with an appropriate amount of resources. It can shrink pods that over-request resources and, based on usage over time, grow pods that under-request them."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Goals"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Vertical scaling has two goals:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Reduce maintenance cost by configuring resource requirements automatically."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Improve cluster resource utilization while minimizing the risk of containers running out of memory or suffering CPU 
starvation."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Architecture"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The diagram below is the architecture diagram from the official VPA git repository; its components are described after it:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/77\/ce\/774e0e2a83030f337c11f62d2ee935ce.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"VPA introduces a new API resource: VerticalPodAutoscaler. It consists of a label selector that matches Pods, a resource policy (controls how VPA computes resources), an update policy (controls how changes are applied to Pods), and the recommended Pod resources (an output field)."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The VPA Recommender watches all Pods, continuously computes fresh resource recommendations for them, and stores the recommendations in the VPA objects. It consumes the utilization and OOM events of all Pods in the cluster from the Metrics Server. This component is analyzed in detail later."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"All Pod creation requests pass through the VPA Admission Controller. If a Pod matches any VerticalPodAutoscaler object, the admission controller overrides the resources of the Pod's containers with the recommendation provided by the VPA Recommender. If the Recommender is unavailable, it falls back to the recommendation cached in the VPA 
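object."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"VPA Updater is the component responsible for updating running Pods. If a Pod uses VPA in \"Auto\" mode, the Updater can decide to update it with the Recommender's resources. In the MVP (Minimum Viable Product), this is done simply by evicting the Pod so that it is recreated with the new resources. This approach requires the Pod to belong to a controller (such as a Deployment, or any other controller able to recreate the Pod). Because eviction restarts and reschedules the Pod and is therefore disruptive to the service, in-place updates are an area that vendors adopting VPA commonly optimize."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"History Storage is a storage component (such as Prometheus) that consumes utilization information and OOM events from the API Server (the same data as the Recommender) and stores them persistently. The Recommender can use it to initialize its state at startup. It can be backed by any database."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Recommender design philosophy"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"VPA controls the resource requests (memory and CPU) of containers. In the currently available version it always sets the limit to infinity. At present the request is computed from an analysis of the current and previous runs of containers with the same name across all Pods under the same controller."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The recommendation model (MVP) assumes that memory and CPU utilization are independent random variables whose distribution equals the one observed over the past N days (N=8 is recommended in order to capture weekly peaks). More advanced models may in the future try to detect trends, periodicity, and other time-related patterns."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For CPU, the goal is to keep the fraction of time during which container usage exceeds a high percentage (for example 95%) of the request below a certain threshold (for example 1% of the time). In this model, “CPU 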
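usage” is defined as the average usage measured over a short interval. The shorter the measurement interval, the better the recommendations for spiky, latency-sensitive workloads. The minimum reasonable resolution is 1\/min, and 1\/sec is recommended."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For memory, the goal is to keep the probability that container usage exceeds the request within a given time window below a certain threshold (for example, below 1% within 24 hours). The window must be long (≥ 24 hours) to ensure that evictions caused by OOM do not visibly affect (a) the availability of serving applications and (b) the progress of batch computations (a more advanced model could let users specify an SLO to control this)."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The rest of this article analyzes the VPA Recommender in depth, covering its core components, functional extensions, performance tuning, theoretical basis, and industry applications."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"2. Recommender Workflow"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Recommender is the main component of VPA and is responsible for computing recommended resources. At startup, it fetches from History Storage the historical resource utilization of all Pods (whether or not they use VPA) as well as the history of Pod OOM events. It aggregates this data and keeps it in memory."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"During normal operation, the Recommender receives real-time updates of resource utilization and new events through the Metrics API of the Metrics Server. It also watches all Pods and all VPA objects in the cluster. For every Pod matched by some VPA label selector, the Recommender computes the recommended resources and sets the recommendation on the VPA object."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A VPA object holds a single set of recommendations, so users should use one VPA to control Pods with similar resource usage patterns, typically a set of replicas or the shards of a single workload, such as a Deployment, StatefulSet, or DaemonSet."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.1 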
Main workflow"}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/4d\/a5\/4d0518d3dec99d7e72080fab8a6863a5.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At startup the Recommender loads data from two sources:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The first is the VPA checkpoints: the InitFromCheckpoints method of the ClusterStateFeeder interface loads historical VPA checkpoints into memory."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The second is Prometheus: Prometheus must be configured to scrape the CPU and memory usage reported by the Kubernetes metrics-server, and the data is then loaded into memory through the Prometheus client by the InitFromHistoryProvider method of the ClusterStateFeeder interface. The labels of historical Pods also need to be collected from Prometheus."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After that, the Recommender is driven by a timer (with a default period of 1 min) and performs two steps on each tick: RunOnce and a health check. 
We focus on the RunOnce method. Within one run cycle, the Recommender executes the following six steps:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"LoadVPAs: loads the VPA objects into ClusterState"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"LoadPods: loads the Pod objects into ClusterState"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"LoadRealTimeMetrics: loads the container resource usage aggregated by the metrics server into ClusterState"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"UpdateVPAs: takes the decaying exponential histograms as input, computes the recommended CPU\/memory values, and writes them to the status field of the VPA objects in the Kubernetes cluster"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MaintainCheckpoints: backs up the decaying exponential histograms used by the recommendation model to VPA checkpoints"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"GarbageCollect: reclaims stale in-memory data so that the Recommender does not use too much memory and VPA recommendations stay fresh"}]}]}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.2 The core data structure: ClusterState"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"ClusterState is the Recommender's core data structure. It holds all runtime information in the Kubernetes cluster that is relevant to vertical Pod autoscaling (VPA), such as Pods, containers, and VPA objects, along with aggregated resource usage (such as CPU and memory) and events (such as container OOMs). In other words, all inputs to the VPA 
Recommender's recommendation algorithm are stored in this data structure."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"type ClusterState struct {\n    \/\/ State of every Pod in the cluster\n    Pods map[PodID]*PodState\n    \/\/ All VPA objects in the cluster\n    Vpas map[VpaID]*Vpa\n    \/\/ VPAs that have no recommendation yet, mapped to the first time the missing\n    \/\/ recommendation was noticed or a warning about it was logged\n    EmptyVPAs map[VpaID]time.Time\n    \/\/ All observed VPAs, used to track whether their status needs to be updated\n    ObservedVpas []*vpa_types.VerticalPodAutoscaler\n\n    \/\/ Storage for the aggregated state (AggregateContainerState) of all containers\n    aggregateStateMap aggregateContainerStatesMap\n    \/\/ Label sets of all Pods and containers; acts as a local cache in the\n    \/\/ Recommender to reduce the load on the API server\n    labelSetMap labelSetMap\n}\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The key classes involved are shown in the class diagram below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/fb\/f9\/fbe17e97ac674e1d0e2c943852a262f9.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.3 Loading VPAs and Pods"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"(1) 
LoadVPAs"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In LoadVPAs, the Recommender loads the VPA objects into ClusterState."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A VPA's targetRef records the kind, name, and APIVersion of the workload the VPA targets. The Recommender fetches the ownerReference of that workload to determine whether its controller is a top-level controller; if it is not, no recommendation is computed for the VPA. The Recommender also fetches the workload's selector, uses it as the selector with which the VPA filters containers, and builds the Workload-Pod-Container mapping."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"(2) LoadPods"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"LoadPods loads the Pod objects into ClusterState. The Recommender initializes a PodLister based on client-go, lists all Pods in the cluster on every run, and reads each Pod's Spec and Labels. ClusterState updates the state of the corresponding Pod under the given PodID. If the selector of a VPA object matches the Pod's labels, the VPA is associated with that Pod and its containers. When a Pod's labels change, ClusterState updates the association between the VPA and the Pod and its containers. Note that label-based matching cannot prevent several VPA objects from matching the same Pod. In practice, we run a periodic job named workload-monitor in the Kubernetes cluster that creates VPAs for newly added workloads and removes VPAs whose workload no longer exists, ensuring that only one VPA object matches a given Pod at any time."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.4 Loading real-time metrics"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"LoadRealTimeMetrics"},{"type":"text","text":" loads the container resource usage aggregated by the metrics server into ClusterState. By default, container usage data is fetched from the metrics server, the same source used by the HPA controller (Horizontal Pod 
Autoscaler)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The aggregated container resource usage is stored in aggregateContainerStatesMap, which is essentially a key-value map. The interface of its key is defined as follows:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"type AggregateStateKey interface {\n    Namespace() string\n    ContainerName() string\n    Labels() labels.Labels\n}\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Key struct: AggregateContainerState"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The VPA Recommender aggregates the collected container usage samples under a key made up of the namespace, the container name (ContainerName), and the labels. The aggregated data structure is AggregateContainerState, which also serves as the input for computing resource recommendations."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"AggregateContainerState uses decaying exponential histograms to track the distribution of a container's CPU and memory usage. The decaying exponential histogram is the core of the Recommender's algorithm and is explained in detail in a later section."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"OOM handling"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Recommender uses the watch mechanism to listen for Pod eviction events in the cluster. When an OOM (out of 
memory) event occurs, the Recommender assumes that the container's real memory demand exceeds the observed usage and estimates the real demand with the formula below. The OOM event is modeled by converting it into a memory usage sample: a \"safety margin\" multiplier is applied to the last observed usage, taking the larger of the results computed with OOMMinBumpUp and OOMBumpUpRatio, so that the VPA recommendation does not end up too small and cause the container to OOM repeatedly."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"katexinline","attrs":{"mathString":"memoryNeeded = \\max\\left( memoryUsed + 100\\text{MB},\\ memoryUsed \\times 1.2 \\right)"}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.5 Updating VPAs"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"UpdateVPAs"},{"type":"text","text":" takes the decaying exponential histograms as input, computes the recommended CPU\/memory values, and writes them to the VPA status."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"estimator"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The recommendation logic is implemented by estimators. The ResourceEstimator interface provides a method that computes a recommendation from an AggregateContainerState; by default only CPU and memory are considered:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"type ResourceEstimator interface {\n    GetResourceEstimation(s *model.AggregateContainerState) model.Resources\n}\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Recommender provides several estimators, such as constEstimator, percentileEstimator, marginEstimator, minResourcesEstimator, 
and confidenceMultiplier, and combines them by composition to implement more complex computation logic. The role of each estimator is as follows:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"constEstimator returns a fixed, immutable value as the recommendation"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"percentileEstimator returns the requested percentile of the histogram"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"marginEstimator multiplies the recommendation by a safety margin factor"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"minResourcesEstimator enforces a global minimum so that the recommendation does not become too small"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"confidenceMultiplier computes a confidence metric from the number of days between the first and the last sample in the AggregateContainerState"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Computing the recommendation"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A percentile is a value that splits the probability distribution of a random variable at a given fraction. The decaying exponential histogram computes percentiles in the same way as an ordinary histogram. For the lowerBound, target, and upperBound of a VPA recommendation, the percentiles are 50%, 90%, and 95% respectively. Taking CPU as an example, suppose the decaying exponential histogram in AggregateContainerState yields base90 for the 90th percentile, base50 for the 50th percentile, and base95 for the 95th percentile, and the Pod has N containers. The Recommender 
then produces the following recommendations:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"katexinline","attrs":{"mathString":"UncappedTarget = \\max\\left( base90 \\times \\left( 1 + safetyMarginFraction \\right),\\ podMinCPUMillicores \\times 0.001 \/ N \\right)"}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"katexinline","attrs":{"mathString":"Target = \\max\\left( base90 \\times \\left( 1 + safetyMarginFraction \\right),\\ podMinCPUMillicores \\times 0.001 \/ N \\right)"}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"katexinline","attrs":{"mathString":"LowerBound = \\max\\left( base50 \\times \\left( 1 + 0.001 \/ historyLengthInDays \\right)^{-2},\\ podMinCPUMillicores \\times 0.001 \/ N \\right)"}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"katexinline","attrs":{"mathString":"UpperBound = \\max\\left( base95 \\times \\left( 1 + 1 \/ historyLengthInDays \\right),\\ podMinCPUMillicores \\times 0.001 \/ N \\right)"}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Target, LowerBound, and UpperBound are then further adjusted according to the ContainerResourcePolicy that the VPA specifies for each container, which keeps the recommendations from becoming too large or too small."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"VPA defines a ContainerResourcePolicy for each container to control the container's vertical scaling mode, the minimum and maximum allowed recommendation, which resources may be adjusted, and so on. It is defined as follows:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"type ContainerResourcePolicy struct {\n    \/\/ Name of the container\n    ContainerName string `json:\"containerName,omitempty\" protobuf:\"bytes,1,opt,name=containerName\"`\n    \/\/ Whether the container is scaled vertically; either Auto or Off\n    Mode *ContainerScalingMode `json:\"mode,omitempty\" protobuf:\"bytes,2,opt,name=mode\"`\n    \/\/ Minimum value allowed for the recommendation\n    MinAllowed v1.ResourceList `json:\"minAllowed,omitempty\" protobuf:\"bytes,3,rep,name=minAllowed,casttype=ResourceList,castkey=ResourceName\"`\n    \/\/ Maximum value allowed for the recommendation\n    MaxAllowed v1.ResourceList `json:\"maxAllowed,omitempty\" protobuf:\"bytes,4,rep,name=maxAllowed,casttype=ResourceList,castkey=ResourceName\"`\n    \/\/ Which resources of the container VPA may control; defaults to CPU and Memory\n    ControlledResources *[]v1.ResourceName `json:\"controlledResources,omitempty\" patchStrategy:\"merge\" protobuf:\"bytes,5,rep,name=controlledResources\"`\n    \/\/ Which values VPA may scale: RequestsAndLimits or RequestsOnly; defaults to RequestsAndLimits\n    ControlledValues *ContainerControlledValues `json:\"controlledValues,omitempty\" protobuf:\"bytes,6,rep,name=controlledValues\"`\n}\n"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.6 Updating checkpoints and garbage collection"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"(1) 
MaintainCheckpoints: maintaining checkpoints"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The fifth step of RunOnce is MaintainCheckpoints, which maintains the VPA checkpoints in the cluster:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"func (r *recommender) MaintainCheckpoints(ctx context.Context, minCheckpointsPerRun int) {\n    now := time.Now()\n    if r.useCheckpoints {\n        \/\/ Write at least minCheckpointsPerRun checkpoints, more if ctx allows.\n        if err := r.checkpointWriter.StoreCheckpoints(ctx, now, minCheckpointsPerRun); err != nil {\n            klog.Warningf(\"Failed to store checkpoints. Reason: %+v\", err)\n        }\n        \/\/ Periodically remove checkpoints that are no longer needed.\n        if time.Now().Sub(r.lastCheckpointGC) > r.checkpointsGCInterval {\n            r.lastCheckpointGC = now\n            r.clusterStateFeeder.GarbageCollectCheckpoints()\n        }\n    }\n}\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The decaying exponential histogram used by the Recommender supports saving to and loading from checkpoints, which include the referenceTimestamp, the weight of each bucket, and the total weight. Each checkpoint is stored as a Kubernetes VerticalPodAutoscalerCheckpoint custom resource. The Recommender periodically writes checkpoints to the Kubernetes API server and removes stale ones. When a large number of checkpoints need to be written, MaintainCheckpoints guarantees that at least 10 (the default) are written per run. As long as the timeout (1 min) has not been reached, it keeps writing as many checkpoints as possible until all of them have been written or updated. MaintainCheckpoints also deletes checkpoints in the Kubernetes cluster that do not match any VPA."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"(2) 
GarbageCollect"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The last step of RunOnce is GarbageCollect, in which the Recommender tries to reclaim stale data held in memory:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"func (r *recommender) GarbageCollect() {\n    gcTime := time.Now()\n    \/\/ Run the collection at most once per AggregateContainerStateGCInterval.\n    if gcTime.Sub(r.lastAggregateContainerStateGC) > AggregateContainerStateGCInterval {\n        r.clusterState.GarbageCollectAggregateCollectionStates(gcTime)\n        r.lastAggregateContainerStateGC = gcTime\n    }\n}\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"ClusterState likewise supports garbage collection: it removes expired AggregateContainerState entries so that the Recommender does not use too much memory and VPA recommendations stay fresh. An AggregateContainerState is considered expired when:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"It has no samples and there is no active Pod that could contribute CPU\/memory usage samples to it"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Its last sample is too old (more than 8 days by default) to produce a meaningful recommendation"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"It has no samples and was created more than 8 days ago"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":
0,"align":null,"origin":null},"content":[{"type":"text","text":"If any of the conditions above holds, the corresponding AggregateContainerState is removed."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"3. Sliding Window and Decaying Exponential Histogram"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.1 The sliding-window model in the VPA Recommender"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"(1) Histogram"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Recommender's algorithm is mainly inspired by Google's moving-window recommender. It assumes that CPU and memory consumption are independent random variables whose distribution equals the one observed over the past N days (N=8 is recommended in order to capture the weekly peaks of business containers). The Recommender collects real-time consumption data and stores it in the checkpoint of the corresponding resource object. A checkpoint custom resource is essentially a histogram. The recommended container resources are computed from the histogram's distribution so that the fraction of time during which CPU and memory consumption stays below the recommendation remains above a given threshold."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The histogram in the Recommender is defined as follows:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/41\/7a\/41ebee2cf8aab31847ef0a3d90061a7a.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The file pkg\/recommender\/util\/histogram.go 
defines the unified interface exposed by histograms, as shown below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/b4\/fe\/b49d3f033d5aba37ed70e4f897d2bbfe.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"(2) Exponential histogram"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In an exponential histogram, bucket sizes grow at an exponential rate. If the first bucket has size firstBucketSize and each bucket is ratio times larger than the previous one, the n-th bucket (counting from 0) starts at:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"katexinline","attrs":{"mathString":"firstBucketSize \\times \\left( 1 + ratio + ratio^{2} + \\dots + ratio^{n-1} \\right) = firstBucketSize \\times \\frac{ratio^{n} - 1}{ratio - 1}"}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The number of buckets needed to cover values up to maxValue is:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"katexinline","attrs":{"mathString":"numBucket = \\left\\lceil \\log_{ratio}\\left( \\frac{maxValue \\times \\left( ratio - 1 \\right)}{firstBucketSize} + 1 \\right) \\right\\rceil + 1"}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Taking CPU\/memory as an example, the ranges covered by the Recommender's exponential histograms are:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"
Resource | minBucket | maxBucket | rate
cpu | 0.01 cores | 1000 cores | 5%
memory | 10MB | 1TB | 5%"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"(3) Half-life and weights"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The decaying exponential histogram extends the above with a half-life and a reference time for the \"age\" of samples. It is defined as follows:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"type decayingHistogram struct {\n    histogram\n    \/\/ Half-life of the sample weights.\n    halfLife time.Duration\n    \/\/ Reference time that determines the relative age of samples;\n    \/\/ always a multiple of the half-life.\n    referenceTimestamp time.Time\n}\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Recommender obtains weighted samples, such as per-container CPU and memory usage, from the metrics server or Prometheus. Each sample's weight is multiplied by the factor "},{"type":"codeinline","content":[{"type":"text","text":"2^((sampleTime - referenceTimestamp) \/ halfLife)"}]},{"type":"text","text":", so that newer samples receive higher weights while the weights of older samples decay over time. By default the half-life is 24h, which means that every 24h the weight (importance) of all samples already in the histogram is halved."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because only the relative weights of the samples in the histogram matter, the reference time for sample age can be changed at any moment. In fact, when the exponent becomes too large, referenceTimestamp has to be moved forward to avoid overflow in the floating-point multiplication."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"AggregateContainerState handles CPU and memory usage data differently. For example, when samples are added to the decaying exponential histogram, the weight of a CPU usage sample is derived from the container's CPU request. When the CPU 
request增加時,對應的權重也隨之增加。舊的樣本數據權重將相對減少,有助於推薦模型快速應對CPU使用”尖刺“問題,減緩CPU\"飢餓等待\"機率。而Memory使用量樣本對應的權重固定爲1.0。由於內存爲不可壓縮資源,recommender劃分了memory使用量統計窗口,默認爲24h。在當前窗口內只關注資源使用量峯值,添加到對應的半衰指數直方圖中。同時這也表示,針對memory每24h recommender中只保存一個採樣點。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.2 場景延伸:不同半衰期對推薦值的影響"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在vpa recommender CPU和Memory推薦模型中,半衰期設置會嚴重影響到容器預測指標與真實指標的擬合度,例如半衰期設置較長可能導致指標預測值偏向於平直,這種情況更傾向於反應容器長週期的資源利用率情況,半衰期設置較短可能導致指標預測值偏向于波峯波谷明顯,這種情況更傾向於反應容器短週期的資源利用率情況。總結來說,不同長度的半衰期配置適用於不同的場景:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"半衰期較長的指標預測比較適合私用雲場景下的"},{"type":"text","marks":[{"type":"strong"}],"text":"用戶資源申請量推薦"},{"type":"text","text":"、"},{"type":"text","marks":[{"type":"strong"}],"text":"動態調度"},{"type":"text","text":"、"},{"type":"text","marks":[{"type":"strong"}],"text":"垂直伸縮"},{"type":"text","text":"等場景,這些場景的特點是基於指標預測值完成一次決策後應用負載在較長時間內保持穩定;"}]}]},{"type":"listitem","attrs":{"listStyle":"none"},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/d4\/yy\/d436955336a3b121009d8220fe03a1yy.jpg","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}}]}]},{"type":"listitem","attrs":{"listStyle":"none"},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/7f\/9e\/7f2af94243c4ee0130fea1d19bafb69e.jpg","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}}]}]},{"type":"listitem","att
rs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Predictions with a shorter half-life suit private-cloud scenarios such as "},{"type":"text","marks":[{"type":"strong"}],"text":"co-locating online and offline workloads"},{"type":"text","text":". Here the system is sensitive to the peaks and troughs of application load and aims to cut costs by shaving peaks and filling valleys."}]}]},{"type":"listitem","attrs":{"listStyle":"none"},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/78\/52\/7854d9c39c607aa6c0dcff34bac19d52.jpg","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}}]}]},{"type":"listitem","attrs":{"listStyle":"none"},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/9e\/82\/9e767b68e6ddf10d63c85ae1069c5c82.jpg","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Problems introduced by a short half-life:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The stock vpa recommender is deployed as a centralized monolith: it predicts metrics for every pod in the cluster and writes the results to the vpa status. The half-life setting can therefore trigger a performance bottleneck. A short half-life causes a large number of vpa status changes within a short period, all of which are written to etcd, and this can put considerable pressure on the apiserver and etcd."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Design points for fitting the predicted values:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In practice, we take the following measures to mitigate the performance problems caused by the half-life setting:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"lengthen the vpa recommender's execution interval (recommender-interval);"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"limit the vpa recommender's QPS (kube-api-qps)."}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"4. Recommender in Practice"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"4.1 Extending the metrics"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the Kubernetes team of a private cloud, when managing and operating cluster resources we care not only about CPU and Memory but also about container metrics such as disk IO and network IO. Our in-house scheduling and operations components make "},{"type":"text","marks":[{"type":"strong"}],"text":"metric-based decisions"},{"type":"text","text":" across several dimensions of container metrics. Raw container monitoring metrics are generally very jittery and cannot directly support such decisions. The vpa recommender's CPU and Memory prediction offers an option here: it can readily be extended to predict the utilization of disk IO, network IO, and other resources."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The stock Recommender uses metrics server as its monitoring data source, so its recommendations are limited to cpu and memory and it is not very extensible. We use Prometheus instead of metrics server as the data source. It includes the cpu\/memory resource metrics provided by cAdvisor and also lets us flexibly add usage metrics for extended resources such as ephemeral-storage, hostPath, Disk I\/O, and Net I\/O."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Below we walk through two key flows, adding metrics to the exponential histogram described above and configuring metrics, to show how we extended the recommender to these more complex metrics."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Flow 1: adding metrics"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The vpa recommender's recommendation goals for CPU and Memory differ slightly:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For CPU, the goal is to keep the fraction of time during which the container's CPU usage exceeds a high percentage (e.g. 95%) of its CPU request below a given threshold (for example, the container's CPU usage is above 95% of its CPU request only 1% of the time)."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For memory, the goal is to keep the probability that the container's memory usage exceeds its memory request within a given time window below a given threshold (for example, below 1% within 24 hours)."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Key issue:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the actual implementation, when container CPU samples are added to the exponential histogram, the bucket weight is set from the container's CPU request so that CPU usage spikes are handled quickly; when container Memory samples are added, the peak Memory usage within a given time window is taken into account to lower the OOM risk. Kubernetes does not model disk IO and network IO as requestable resource types, and these resources carry no system risk comparable to OOM, so the flow for adding their samples to the exponential histogram differs slightly from that for CPU and Memory."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Key points of the flow for adding extended metrics to the exponential histogram:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Extended metrics such as disk IO and network IO are fetched from Prometheus;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"p
aragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"The extended metric types for disk IO and network IO are defined in the apis types; disk IO and network IO each include a read type and a write type;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"When extended metrics such as disk IO and network IO are added to the exponential histogram, their bucket weight is the same as for Memory, namely 1;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"As with CPU, the lastSampleStart, firstSampleStart, and totalSamplesCount of the extended metrics strongly affect the confidence calculation, so we keep them in a dictionary (map)."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/b1\/f3\/b1aa2083b24106f9e529960a703ed3f3.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Flow 2: metric configuration"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Key issues:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"The vpa recommender's model treats CPU and Memory as independent random variables whose distribution equals the distribution observed over the past N days. When extending the metrics, we likewise treat disk IO, network IO, and the other extended metrics as independent random variables."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Unlike vpa's native CPU and Memory metrics, extended metrics such as disk IO and network IO fall into different metric dimensions, each containing several concrete metrics. Dimensions differ considerably from one another, while metrics within one dimension are strongly correlated. Our model design therefore follows this principle:"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":1,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"The concrete metrics within one extended metric dimension share the same configuration."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Key points of the histogram configuration:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"The histogram configuration gains additional settings, such as the half-life, for extended metrics like disk IO and network IO;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"These new settings are configured per metric dimension; when extended metrics are predicted, each concrete metric uses the configuration of the dimension it belongs to. In the figure below, apa is the name of the custom resource (k8s CRD) we created by extending vpa."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/ec\/92\/eccefd709f9846b2186daf4b686b3f92.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"4.2 Performance optimization"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Optimization 1: memory usage during cold start"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In practice we found that on a cold start the recommender has to load historical resource usage from Prometheus and tends to use an excessive amount of memory for a short time. Our analysis showed that the recommender loads 8 days of monitoring data by default, which takes a long time. Only after the historical data (ClusterHistory) has been aggregated and stored in ClusterState as decaying exponential histograms, and the Go garbage collector has reclaimed the unused memory, does memory usage return to normal. In a Kubernetes cluster with more than 10,000 containers, peak memory usage reached tens of GB. In addition, frequent timeouts and retries meant that a cold start took roughly 40 to 50 minutes."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We optimized this logic by fetching the Prometheus data in time slices (12h of data per load by default) and storing each slice in the histograms after aggregation, one slice after another. We did not load slices in parallel for two reasons: the recommender's decaying exponential histogram is not concurrency-safe, and we did not want to put too much concurrent load on the cluster's Prometheus. Even so, performance improved: the recommender's peak memory usage dropped to a few GB, and with fewer Prometheus query timeouts and retries, the cold start time fell to about 10 to 15 minutes."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Optimization 2: VPA update bottleneck"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For services with a fairly stable load, the VPA recommendation does not need to be updated in every RunOnce cycle. The stock implementation decides whether a VPA needs updating by checking whether two consecutive recommendations are exactly equal."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"VPA recommendations are expressed in millicores (1\/1000 of a CPU core) for CPU and in bytes for Memory. Given the model's formulas, two recommendations from consecutive RunOnce cycles are almost never exactly equal, so nearly every VPA object in the cluster has to be updated in every RunOnce cycle, which puts considerable pressure on the API server in a large cluster. In our tests with workloads numbering in the thousands, more than a hundred VPAs were updated in each round, and the stock recommender could not finish updating all of them within the default cycle (1 minute). VPA updates piled up, the latest Metrics were loaded less often, and the VPA recommendations ultimately could not be kept up to date."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To address this, we made VPA updates coarser: a VPA is updated only if the recommendations of two adjacent cycles differ by at least 0.1 core of CPU or 100MB of Memory. The values computed by the recommender itself are left unchanged. In the same tests, the number of VPAs updated per round dropped from over a hundred to a dozen or a few dozen, roughly an order of magnitude fewer, so updates complete within the default cycle (1 minute) and the VPA recommendations stay accurate."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Optimization 3: caching for non-native workload resources"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For native k8s workloads such as DaemonSet, Deployment, ReplicaSet, StatefulSet, ReplicationController, Job, and CronJob, the recommender has built-in client-go informers that it initializes at startup. Custom, non-native workloads, however, have to be fetched directly from the API Server, which puts heavy pressure on the API Server in large Kubernetes clusters. The community addressed this by having custom workload resources expose a scale subresource and by adding a corresponding controllerFetcher and controllerCache."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Stale data in VPA caused by checkpoint garbage collection"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We also found that the MaintainCheckpoints step deletes Checkpoints in the Kubernetes cluster that do not match any VPA. 
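However, the stock recommender does not delete Checkpoints whose VPA still exists but whose container no longer does."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Redeploying a workload may remove existing containers. We fixed the relevant logic so that the entries for containers that no longer exist are reclaimed. This prevents Checkpoints from ending up both outside the recommender's management and impossible to delete (a situation in which the VPA eventually shows information about containers that have already been removed), and it improves the usability of VPA."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"5. Industry applications"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Autopilot"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Autopilot is the autoscaling tool Google uses on its internal cloud. Google uses Autopilot to configure resources automatically, adjusting both the number of concurrent tasks in a job (horizontal scaling) and the CPU\/memory limits of individual tasks (vertical scaling), in order to raise resource utilization while preserving quality of service."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Autopilot mainly uses two algorithms for vertical scaling. The first relies on an exponentially smoothed sliding window over historical usage. The second is based on an algorithm borrowed from reinforcement learning and, for each task, selects from several sliding-window algorithms the one that performs best on historical data. In essence, Autopilot's algorithms are based on sliding time windows, much like the VPA recommender's algorithm."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Autopilot uses 5-minute windows and records a task's resource usage in a histogram for each window. For CPU, the histogram contains one usage sample per second. For memory, Autopilot considers only the peak usage within the 5-minute window, to reduce OOMs. The VPA recommender's algorithm is similar. The difference is that the VPA recommender fetches cpu\/memory samples from metrics server once per minute and uses a 24h aggregation window for memory, that is, it considers only the peak usage within each 24h."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To make recommendations rise quickly when usage increases but fall slowly when usage decreases, and so avoid overreacting to a temporary drop in load, Autopilot weights usage samples with exponentially decaying weights. The weight of each sample equals the load b[k], the value (boundary) of the k-th histogram bin in which the sample falls. This differs slightly from the VPA recommender, where a CPU sample's weight is derived from the container's CPU request and a Memory sample's weight is fixed at 1.0. Both approaches nevertheless respond smoothly to load spikes."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In addition, in the decaying exponential histograms Autopilot uses, the weight half-life is 12 hours for CPU and 48 hours for memory. The VPA recommender's algorithm is similar, but, as noted earlier, its default half-life for both cpu and memory is 24 hours, while it loads 8 days of history by default."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Autopilot computes its recommendation from a percentile Pj of the decaying exponential histogram, just as the VPA recommender does. For CPU, the percentile depends on the job type:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Batch jobs: 50%, on the assumption that batch jobs can tolerate some CPU throttling and still run normally"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Serving jobs: 95% or 90%, depending on how latency-sensitive the workload is"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For memory, Autopilot uses different percentiles depending on the job's tolerance for OOMs, which falls into three classes (medium, low, and lowest):"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"50% or 60% for jobs with medium OOM tolerance"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"98% for jobs with low OOM tolerance"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"100% for jobs with the lowest OOM tolerance"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the VPA recommender, the recommendation consists of UpperBound, Target, and LowerBound, which correspond to the 95th, 90th, and 50th percentiles of the decaying exponential histogram respectively. VPA uses lowerBound and upperBound to decide whether a pod needs updating: if the pod's resource request is below lowerBound or above upperBound, VPA evicts the pod and the Admission Controller sets its request to the target recommendation."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"VPA at cloud vendors"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The current version of k8s vpa still does not support in-place updates, which makes it intrusive for workloads. In industry, the major container cloud vendors have therefore focused their vpa optimizations on hot-updating pod request values. The techniques involved include, but are not limited to, hot updates of resource requests at the workload level (Deployment\/Statefulset, etc.) and hot updates of Kubelet resource limits (Cgroup, etc.) at the node level; Alibaba Cloud and Tencent Cloud, for example, both use such techniques internally. Since this article focuses on the VPA's resource prediction component, these techniques are outside its scope and are not covered here."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The VPA Recommender has also been reused, via its sliding-window model, for short-term resource prediction. Caelus, Tencent Cloud's component for co-locating online and offline workloads, builds on existing pod and container resource usage profiles and uses the VPA recommendation algorithm to add node-level resource usage profiles. It predicts the actual resource usage of all online services on each node and dynamically adjusts the amount of extended resources it reports; these extended resources are then made available to the co-located offline jobs."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"About the authors"},{"type":"text","text":":"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"王麗婧, senior development engineer at Sogou, focusing on resource management in cloud-native environments."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"程振京, development engineer at Sogou, focusing on scheduling and GPU management in cloud-native environments."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"劉雲飛, development engineer at Sogou, focusing on scheduling and storage in cloud-native environments."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"劉建, senior architect at Sogou and head of the Sogou container cloud platform, currently focusing on cloud native."}]}]}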