How HPA Works
The kubectl scale command lets us scale the number of Pods up or down, but it is an entirely manual operation. To cope with the many complex situations that arise in production, we need to sense the workload automatically and scale in response. For this, Kubernetes provides a dedicated resource object: the Horizontal Pod Autoscaler (HPA for short). The HPA monitors the load of all Pods managed by a controller and decides whether the replica count needs to be adjusted; this is the basic principle behind it:
We can create an HPA resource object with a simple kubectl autoscale command. The HPA controller periodically queries the resource utilization of the Pods belonging to the target resource (the interval is set by the --horizontal-pod-autoscaler-sync-period flag of kube-controller-manager, 15 seconds by default), compares it with the target value and metrics configured at creation time, and scales the replica count automatically.
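That comparison boils down to the well-known HPA formula, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). A minimal Python illustration of it (simplified: this sketch ignores Pod readiness, missing metrics, and the scale-down stabilization window):

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified HPA calculation: scale by the ratio of current to target usage."""
    ratio = current_metric / target_metric
    # Within the tolerance band (10% by default) the HPA does not scale at all.
    if abs(1.0 - ratio) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))

# One replica at 338% CPU with a 10% target wants ceil(33.8) = 34 replicas,
# which gets clamped to the configured maximum of 10.
print(desired_replicas(1, 338, 10, 1, 10))  # 10
```

The 338%/10% numbers match the CPU experiment later in this article, where the HPA indeed scales straight to its maximum of 10 replicas.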
Metrics Server
In the first version of HPA we needed Heapster to provide CPU and memory metrics; from HPA v2 onward, Metrics Server must be installed instead. Metrics Server exposes monitoring data through the standard Kubernetes API, so once it is in place we can fetch whatever monitoring data we want with a standard Kubernetes API call:
https://10.96.0.1/apis/metrics.k8s.io/v1beta1/namespaces/<namespace-name>/pods/<pod-name>
For example, when we access the API above we get the resource data for that Pod, which is in fact collected from the kubelet's Summary API. Note that being able to fetch monitoring data through the standard API does not mean Metrics Server is part of the APIServer: it runs independently of the APIServer and is wired in through the Kubernetes Aggregator plugin.
The Aggregation API
The Aggregator lets developers write a service of their own and register it with the Kubernetes APIServer, so that their API can be used just like the APIs the native APIServer provides. We run our service inside the Kubernetes cluster, and the Kubernetes Aggregator forwards requests to it by Service name. This aggregation layer brings several benefits:
- It makes the API extensible: developers can write their own API services to expose the APIs they want.
- It enriches the API: the core Kubernetes team has rejected many new API proposals; by letting developers expose their APIs as separate services, lengthy community review is no longer required.
- It lets experimental APIs be developed in stages: a new API can be developed in a separate aggregated service, and once it stabilizes, merging it back into the APIServer is easy.
- It ensures new APIs follow Kubernetes conventions: without this mechanism, community members might be forced to roll their own solutions, which could easily diverge from community conventions.
Installing Metrics Server
So to use HPA we now need to install the Metrics Server service in the cluster, and installing Metrics Server requires the Aggregator to be enabled, because that proxy is how Metrics Server extends the API. Our cluster was built with kubeadm, which enables it by default; if the cluster was installed from binaries, kube-apiserver must be configured with the following flags:
--requestheader-client-ca-file=<path to aggregator CA cert>
--requestheader-allowed-names=aggregator
--requestheader-extra-headers-prefix=X-Remote-Extra-
--requestheader-group-headers=X-Remote-Group
--requestheader-username-headers=X-Remote-User
--proxy-client-cert-file=<path to aggregator proxy cert>
--proxy-client-key-file=<path to aggregator proxy key>
If kube-proxy is not running on the same host as the APIServer, make sure the following kube-apiserver flag is also enabled:
--enable-aggregator-routing=true
For how to generate these certificates, see the official documentation: https://github.com/kubernetes-sigs/apiserver-builder-alpha/blob/master/docs/concepts/auth.md.
Once the Aggregator layer is up, we can install Metrics Server. We can fetch the official installation manifests from its repository:
$ git clone https://github.com/kubernetes-incubator/metrics-server
$ cd metrics-server
$ kubectl apply -f deploy/1.8+/
Before deploying, change the image address in metrics-server/deploy/1.8+/metrics-server-deployment.yaml to:
containers:
- name: metrics-server
  image: gcr.azk8s.cn/google_containers/metrics-server-amd64:v0.3.6
Once the deployment finishes, check the Pod logs to see whether everything is working:
$ kubectl get pods -n kube-system -l k8s-app=metrics-server
NAME READY STATUS RESTARTS AGE
metrics-server-6886856d7c-g5k6q 1/1 Running 0 2m39s
$ kubectl logs -f metrics-server-6886856d7c-g5k6q -n kube-system
......
E1119 09:05:57.234312 1 manager.go:111] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:ydzs-node1: unable to fetch metrics from Kubelet ydzs-node1 (ydzs-node1): Get https://ydzs-node1:10250/stats/summary?only_cpu_and_memory=true: dial tcp: lookup ydzs-node1 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:ydzs-node4: unable to fetch metrics from Kubelet ydzs-node4 (ydzs-node4): Get https://ydzs-node4:10250/stats/summary?only_cpu_and_memory=true: dial tcp: lookup ydzs-node4 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:ydzs-node3: unable to fetch metrics from Kubelet ydzs-node3 (ydzs-node3): Get https://ydzs-node3:10250/stats/summary?only_cpu_and_memory=true: dial tcp: lookup ydzs-node3 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:ydzs-master: unable to fetch metrics from Kubelet ydzs-master (ydzs-master): Get https://ydzs-master:10250/stats/summary?only_cpu_and_memory=true: dial tcp: lookup ydzs-master on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:ydzs-node2: unable to fetch metrics from Kubelet ydzs-node2 (ydzs-node2):
Get https://ydzs-node2:10250/stats/summary?only_cpu_and_memory=true: dial tcp:
lookup ydzs-node2 on 10.96.0.10:53: no such host]
We can see errors of the form xxx: no such host in the Pod logs, which usually indicates a DNS resolution failure. Metrics Server fetches data from the kubelet on port 10250 using the node's hostname. When we set up the cluster we added hostname-to-IP mappings in each node's /etc/hosts, but the Metrics Server Pod has no such hosts entries, so it cannot resolve the hostnames. There are two ways to fix this. The first is to add hostname resolution to the cluster's internal DNS service; since our cluster uses CoreDNS, we can edit the CoreDNS ConfigMap and add the hosts entries:
$ kubectl edit configmap coredns -n kube-system
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health
        hosts {  # add the cluster nodes' hostname-to-IP mappings
            10.151.30.11 ydzs-master
            10.151.30.57 ydzs-node3
            10.151.30.59 ydzs-node4
            10.151.30.22 ydzs-node1
            10.151.30.23 ydzs-node2
            fallthrough
        }
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            upstream
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        proxy . /etc/resolv.conf
        cache 30
        reload
    }
kind: ConfigMap
metadata:
  creationTimestamp: 2019-05-18T11:07:46Z
  name: coredns
  namespace: kube-system
With that, hostnames resolve to the corresponding IPs from inside the cluster. The other approach is to change the kubelet-preferred-address-types argument in metrics-server's startup parameters, as follows:
args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-preferred-address-types=InternalIP
We use the second approach here, then reinstall:
$ kubectl get pods -n kube-system -l k8s-app=metrics-server
NAME READY STATUS RESTARTS AGE
metrics-server-6dcfdf89b5-tvdcp 1/1 Running 0 33s
$ kubectl logs -f metric-metrics-server-58fc94d9f-jlxcb -n kube-system
......
E1119 09:08:49.805959 1 manager.go:111] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:ydzs-node3: unable to fetch metrics from Kubelet ydzs-node3 (10.151.30.57): Get https://10.151.30.57:10250/stats/summary?only_cpu_and_memory=true: x509: cannot validate certificate for 10.151.30.57 because it doesn't contain any IP SANs, unable to fully scrape metrics from source kubelet_summary:ydzs-node4: unable to fetch metrics from Kubelet ydzs-node4 (10.151.30.59): Get https://10.151.30.59:10250/stats/summary?only_cpu_and_memory=true: x509: cannot validate certificate for 10.151.30.59 because it doesn't contain any IP SANs, unable to fully scrape metrics from source kubelet_summary:ydzs-node2: unable to fetch metrics from Kubelet ydzs-node2 (10.151.30.23): Get https://10.151.30.23:10250/stats/summary?only_cpu_and_memory=true: x509: cannot validate certificate for 10.151.30.23 because it doesn't contain any IP SANs, unable to fully scrape metrics from source kubelet_summary:ydzs-master: unable to fetch metrics from Kubelet ydzs-master (10.151.30.11): Get https://10.151.30.11:10250/stats/summary?only_cpu_and_memory=true: x509: cannot validate certificate for 10.151.30.11 because it doesn't contain any IP SANs, unable to fully scrape metrics from source kubelet_summary:ydzs-node1: unable to fetch metrics from Kubelet ydzs-node1 (10.151.30.22): Get https://10.151.30.22:10250/stats/summary?only_cpu_and_memory=true:
x509: cannot validate certificate for 10.151.30.22 because it doesn't contain any IP SANs]
Because the node IPs were not signed into the CA certificates when the cluster was set up, when Metrics Server connects by IP it reports that the certificate does not contain the IP (error: x509: cannot validate certificate for 10.151.30.22 because it doesn't contain any IP SANs). We can add the --kubelet-insecure-tls flag to skip certificate verification:
args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
Reinstall and it should now succeed. Verify with the following commands:
$ kubectl apply -f deploy/1.8+/
$ kubectl get pods -n kube-system -l k8s-app=metrics-server
NAME READY STATUS RESTARTS AGE
metrics-server-5d4dbb78bb-6klw6 1/1 Running 0 14s
$ kubectl logs -f metrics-server-5d4dbb78bb-6klw6 -n kube-system
I1119 09:10:44.249092 1 serving.go:312] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I1119 09:10:45.264076 1 secure_serving.go:116] Serving securely on [::]:4443
$ kubectl get apiservice | grep metrics
v1beta1.metrics.k8s.io kube-system/metrics-server True 9m
$ kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
{"kind":"NodeMetricsList","apiVersion":"metrics.k8s.io/v1beta1","metadata":{"selfLink":"/apis/metrics.k8s.io/v1beta1/nodes"},"items":[{"metadata":{"name":"ydzs-node3","selfLink":"/apis/metrics.k8s.io/v1beta1/nodes/ydzs-node3","creationTimestamp":"2019-11-19T09:11:53Z"},"timestamp":"2019-11-19T09:11:38Z","window":"30s","usage":{"cpu":"240965441n","memory":"3004360Ki"}},{"metadata":{"name":"ydzs-node4","selfLink":"/apis/metrics.k8s.io/v1beta1/nodes/ydzs-node4","creationTimestamp":"2019-11-19T09:11:53Z"},"timestamp":"2019-11-19T09:11:37Z","window":"30s","usage":{"cpu":"167036681n","memory":"2574664Ki"}},{"metadata":{"name":"ydzs-master","selfLink":"/apis/metrics.k8s.io/v1beta1/nodes/ydzs-master","creationTimestamp":"2019-11-19T09:11:53Z"},"timestamp":"2019-11-19T09:11:38Z","window":"30s","usage":{"cpu":"350907350n","memory":"2986716Ki"}},{"metadata":{"name":"ydzs-node1","selfLink":"/apis/metrics.k8s.io/v1beta1/nodes/ydzs-node1","creationTimestamp":"2019-11-19T09:11:53Z"},"timestamp":"2019-11-19T09:11:39Z","window":"30s","usage":{"cpu":"1319638039n","memory":"2094376Ki"}},{"metadata":{"name":"ydzs-node2","selfLink":"/apis/metrics.k8s.io/v1beta1/nodes/ydzs-node2","creationTimestamp":"2019-11-19T09:11:53Z"},"timestamp":"2019-11-19T09:11:36Z","window":"30s","usage":{"cpu":"320381888n","memory":"3270368Ki"}}]}
$ kubectl top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
ydzs-master 351m 17% 2916Mi 79%
ydzs-node1 1320m 33% 2045Mi 26%
ydzs-node2 321m 8% 3193Mi 41%
ydzs-node3 241m 6% 2933Mi 37%
ydzs-node4 168m 4% 2514Mi 32%
Now the kubectl top command returns resource data, which proves Metrics Server has been installed successfully.
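Note the units: the raw API reports CPU in nanocores (the n suffix) and memory in KiB (the Ki suffix), while kubectl top rounds them to millicores and MiB. A quick sketch of the conversion (simplified: real Kubernetes quantities support many more suffixes than handled here):

```python
def cpu_nanocores_to_millicores(quantity):
    """Convert a CPU quantity like '240965441n' to whole millicores."""
    assert quantity.endswith("n"), "simplified: nanocore quantities only"
    nanocores = int(quantity[:-1])
    return round(nanocores / 1_000_000)  # 1 millicore = 1e6 nanocores

def memory_ki_to_mi(quantity):
    """Convert a memory quantity like '3004360Ki' to whole MiB."""
    assert quantity.endswith("Ki"), "simplified: Ki quantities only"
    return int(quantity[:-2]) // 1024

# These match the kubectl top figures for ydzs-node3 shown above.
print(cpu_nanocores_to_millicores("240965441n"))  # 241
print(memory_ki_to_mi("3004360Ki"))               # 2933
```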
Autoscaling Based on CPU
Now let's use a Deployment to create an Nginx Pod and then autoscale it with an HPA. The manifest is shown below: (hpa-demo.yaml)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-demo
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
Then create the Deployment directly:
$ kubectl apply -f hpa-demo.yaml
deployment.apps/hpa-demo created
$ kubectl get pods -l app=nginx
NAME READY STATUS RESTARTS AGE
hpa-demo-85ff79dd56-pz8th 1/1 Running 0 21s
Now let's create an HPA resource object; we can use the kubectl autoscale command:
$ kubectl autoscale deployment hpa-demo --cpu-percent=10 --min=1 --max=10
horizontalpodautoscaler.autoscaling/hpa-demo autoscaled
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-demo Deployment/hpa-demo <unknown>/10% 1 10 1 16s
This command creates an HPA bound to the hpa-demo resource, with a minimum of 1 Pod replica and a maximum of 10. The HPA will increase or decrease the number of Pods dynamically based on the configured CPU utilization target (10%).
Of course, we can still create the HPA resource object from a YAML file instead. If we're not sure how to write one, we can look at the YAML of the HPA we just created on the command line:
$ kubectl get hpa hpa-demo -o yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"True","lastTransitionTime":"2019-11-19T09:15:12Z","reason":"SucceededGetScale","message":"the HPA controller was able to get the target''s current scale"},{"type":"ScalingActive","status":"False","lastTransitionTime":"2019-11-19T09:15:12Z","reason":"FailedGetResourceMetric","message":"the HPA was unable to compute the replica count: missing request for cpu"}]'
  creationTimestamp: "2019-11-19T09:14:56Z"
  name: hpa-demo
  namespace: default
  resourceVersion: "3094084"
  selfLink: /apis/autoscaling/v1/namespaces/default/horizontalpodautoscalers/hpa-demo
  uid: b84d79f1-75b0-46e0-95b5-4cbe3509233b
spec:
  maxReplicas: 10
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-demo
  targetCPUUtilizationPercentage: 10
status:
  currentReplicas: 1
  desiredReplicas: 0
Based on the YAML above we can now write our own YAML-based HPA definition. But notice that the output contains some failure messages; let's describe the HPA object to inspect them:
$ kubectl describe hpa hpa-demo
Name: hpa-demo
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 19 Nov 2019 17:14:56 +0800
Reference: Deployment/hpa-demo
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): <unknown> / 10%
Min replicas: 1
Max replicas: 10
Deployment pods: 1 current / 0 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
ScalingActive False FailedGetResourceMetric the HPA was unable to compute the replica count: missing request for cpu
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetResourceMetric 14s (x4 over 60s) horizontal-pod-autoscaler missing request for cpu
Warning FailedComputeMetricsReplicas 14s (x4 over 60s) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: missing request for cpu
The events include the error failed to get cpu utilization: missing request for cpu. This is because the Pod we created declares no resource requests, so the HPA cannot read a CPU utilization figure. For the HPA to work, the target Pods must declare requests; let's update the manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-demo
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: 50Mi
            cpu: 50m
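The reason the request is mandatory becomes obvious once you see how the percentage is computed: utilization is the measured usage divided by the declared request, so with no request the value is undefined. A minimal sketch (the 169m usage figure is illustrative, chosen to line up with the 338% reading we will see under load):

```python
def cpu_utilization_percent(usage_millicores, request_millicores):
    """CPU utilization as the HPA sees it: usage relative to the Pod's request."""
    if request_millicores is None:
        # This is the "missing request for cpu" situation from the events above.
        raise ValueError("missing request for cpu")
    return round(usage_millicores / request_millicores * 100)

# A Pod using 169m of CPU against a 50m request is at 338% utilization.
print(cpu_utilization_percent(169, 50))  # 338
```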
Then update the Deployment and recreate the HPA object:
$ kubectl apply -f hpa.yaml
deployment.apps/hpa-demo configured
$ kubectl get pods -o wide -l app=nginx
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
hpa-demo-69968bb59f-twtdp 1/1 Running 0 4m11s 10.244.4.97 ydzs-node4 <none> <none>
$ kubectl delete hpa hpa-demo
horizontalpodautoscaler.autoscaling "hpa-demo" deleted
$ kubectl autoscale deployment hpa-demo --cpu-percent=10 --min=1 --max=10
horizontalpodautoscaler.autoscaling/hpa-demo autoscaled
$ kubectl describe hpa hpa-demo
Name: hpa-demo
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 19 Nov 2019 17:23:49 +0800
Reference: Deployment/hpa-demo
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): 0% (0) / 10%
Min replicas: 1
Max replicas: 10
Deployment pods: 1 current / 1 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True ScaleDownStabilized recent recommendations were higher than current one, applying the highest recent recommendation
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events: <none>
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-demo Deployment/hpa-demo 0%/10% 1 10 1 52s
Now the HPA resource object is healthy. Let's increase the load to test it: we create a busybox Pod and request the Pod above in a loop:
$ kubectl run -it --image busybox test-hpa --restart=Never --rm /bin/sh
If you don't see a command prompt, try pressing enter.
/ # while true; do wget -q -O- http://10.244.4.97; done
In the output below we can see that the HPA has started working:
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-demo Deployment/hpa-demo 338%/10% 1 10 1 5m15s
$ kubectl get pods -l app=nginx --watch
NAME READY STATUS RESTARTS AGE
hpa-demo-69968bb59f-8hjnn 1/1 Running 0 22s
hpa-demo-69968bb59f-9ss9f 1/1 Running 0 22s
hpa-demo-69968bb59f-bllsd 1/1 Running 0 22s
hpa-demo-69968bb59f-lnh8k 1/1 Running 0 37s
hpa-demo-69968bb59f-r8zfh 1/1 Running 0 22s
hpa-demo-69968bb59f-twtdp 1/1 Running 0 6m43s
hpa-demo-69968bb59f-w792g 1/1 Running 0 37s
hpa-demo-69968bb59f-zlxkp 1/1 Running 0 37s
hpa-demo-69968bb59f-znp6q 0/1 ContainerCreating 0 6s
hpa-demo-69968bb59f-ztnvx 1/1 Running 0 6s
We can see that many new Pods have been spun up automatically, settling at the 10 Pods we configured as the maximum; the replica count of hpa-demo has gone from the original 1 to 10:
$ kubectl get deployment hpa-demo
NAME READY UP-TO-DATE AVAILABLE AGE
hpa-demo 10/10 10 10 17m
Describe the HPA resource object to see how it worked:
$ kubectl describe hpa hpa-demo
Name: hpa-demo
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 19 Nov 2019 17:23:49 +0800
Reference: Deployment/hpa-demo
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): 0% (0) / 10%
Min replicas: 1
Max replicas: 10
Deployment pods: 10 current / 10 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True ScaleDownStabilized recent recommendations were higher than current one, applying the highest recent recommendation
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
ScalingLimited True TooManyReplicas the desired replica count is more than the maximum replica count
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulRescale 5m45s horizontal-pod-autoscaler New size: 4; reason: cpu resource utilization (percentage of request) above target
Normal SuccessfulRescale 5m30s horizontal-pod-autoscaler New size: 8; reason: cpu resource utilization (percentage of request) above target
Normal SuccessfulRescale 5m14s horizontal-pod-autoscaler New size: 10; reason: cpu resource utilization (percentage of request) above target
Now let's stop the busybox Pod to drop the load, wait a while, and look at the HPA and Deployment objects again:
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-demo Deployment/hpa-demo 0%/10% 1 10 1 14m
$ kubectl get deployment hpa-demo
NAME READY UP-TO-DATE AVAILABLE AGE
hpa-demo 1/1 1 1 24m
Starting with Kubernetes v1.12, the --horizontal-pod-autoscaler-downscale-stabilization flag of the kube-controller-manager component sets how long the HPA must wait after the current operation completes before it may perform another scale-down. The default is 5 minutes, so by default automatic scale-down only begins after a 5-minute wait.
We can see the replica count has gone from 10 back down to 1. So far we have only demonstrated the CPU utilization metric; in later lessons we will learn how to autoscale Pods on custom monitoring metrics.
Autoscaling Based on Memory
HorizontalPodAutoscaler is a resource in the Kubernetes autoscaling API group. The current stable version, autoscaling/v1, only supports scaling on the CPU metric; the beta versions (autoscaling/v2beta2) introduce scaling on memory and on custom metrics. So here we need to use a beta API.
Now let's use a Deployment to create an Nginx Pod and autoscale it with an HPA. The manifest is as follows: (hpa-mem-demo.yaml)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-mem-demo
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      volumes:
      - name: increase-mem-script
        configMap:
          name: increase-mem-config
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        volumeMounts:
        - name: increase-mem-script
          mountPath: /etc/script
        resources:
          requests:
            memory: 50Mi
            cpu: 50m
        securityContext:
          privileged: true
This differs a little from the plain application earlier: we mount a ConfigMap named increase-mem-config into the container. It holds the script we will use later to drive up the container's memory usage: (increase-mem-cm.yaml)
apiVersion: v1
kind: ConfigMap
metadata:
  name: increase-mem-config
data:
  increase-mem.sh: |
    #!/bin/bash
    mkdir /tmp/memory
    mount -t tmpfs -o size=40M tmpfs /tmp/memory
    dd if=/dev/zero of=/tmp/memory/block
    sleep 60
    rm /tmp/memory/block
    umount /tmp/memory
    rmdir /tmp/memory
Because the script uses the mount command, the container must run in privileged mode, hence the securityContext.privileged=true setting. Now we simply create the resource objects above:
$ kubectl apply -f increase-mem-cm.yaml
$ kubectl apply -f hpa-mem-demo.yaml
$ kubectl get pods -l app=nginx
NAME READY STATUS RESTARTS AGE
hpa-mem-demo-66944b79bf-tqrn9 1/1 Running 0 35s
Then we need to create a memory-based HPA resource object: (hpa-mem.yaml)
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-mem-demo
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: 60
Note that the apiVersion here is autoscaling/v2beta1, and the metrics section specifies memory. Create the resource object:
$ kubectl apply -f hpa-mem.yaml
horizontalpodautoscaler.autoscaling/nginx-hpa created
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
nginx-hpa Deployment/hpa-mem-demo 2%/60% 1 5 1 12s
This shows the HPA resource object deployed successfully. Next we stress the application to push its memory usage up, simply by running the increase-mem.sh script we mounted into the container:
$ kubectl exec -it hpa-mem-demo-66944b79bf-tqrn9 /bin/bash
root@hpa-mem-demo-66944b79bf-tqrn9:/# ls /etc/script/
increase-mem.sh
root@hpa-mem-demo-66944b79bf-tqrn9:/# source /etc/script/increase-mem.sh
dd: writing to '/tmp/memory/block': No space left on device
81921+0 records in
81920+0 records out
41943040 bytes (42 MB, 40 MiB) copied, 0.584029 s, 71.8 MB/s
Then open another terminal and watch the HPA resource object change:
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
nginx-hpa Deployment/hpa-mem-demo 83%/60% 1 5 1 5m3s
$ kubectl describe hpa nginx-hpa
Name: nginx-hpa
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"nginx-hpa","namespace":"default"...
CreationTimestamp: Tue, 07 Apr 2020 13:13:59 +0800
Reference: Deployment/hpa-mem-demo
Metrics: ( current / target )
resource memory on pods (as a percentage of request): 3% (1740800) / 60%
Min replicas: 1
Max replicas: 5
Deployment pods: 2 current / 2 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True ScaleDownStabilized recent recommendations were higher than current one, applying the highest recent recommendation
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from memory resource utilization (percentage of request)
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetResourceMetric 5m26s (x3 over 5m58s) horizontal-pod-autoscaler unable to get metrics for resource memory: no metrics returned from resource metrics API
Warning FailedComputeMetricsReplicas 5m26s (x3 over 5m58s) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get memory utilization: unable to get metrics for resource memory: no metrics returned from resource metrics API
Normal SuccessfulRescale 77s horizontal-pod-autoscaler New size: 2; reason: memory resource utilization (percentage of request) above target
$ kubectl top pod hpa-mem-demo-66944b79bf-tqrn9
NAME CPU(cores) MEMORY(bytes)
hpa-mem-demo-66944b79bf-tqrn9 0m 41Mi
We can see memory usage has exceeded our 60% threshold, and the HPA has triggered a scale-up to two replicas:
$ kubectl get pods -l app=nginx
NAME READY STATUS RESTARTS AGE
hpa-mem-demo-66944b79bf-8m4d9 1/1 Running 0 2m51s
hpa-mem-demo-66944b79bf-tqrn9 1/1 Running 0 8m11s
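The 83% reading lines up with the kubectl top output: for a resource metric, utilization is usage divided by the Pod's request, here roughly 41.5Mi against the 50Mi request (41.5Mi is an illustrative figure consistent with the rounded 41Mi shown above):

```python
def memory_utilization_percent(usage_mi, request_mi):
    """Memory utilization as the HPA computes it: usage / request."""
    return round(usage_mi / request_mi * 100)

# ~41.5Mi in use against a 50Mi request is ~83%, crossing the 60% target.
print(memory_utilization_percent(41.5, 50))  # 83
```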
Once the memory is released, the controller-manager scales back down after the default 5 minutes. That completes memory-based HPA.
Autoscaling Based on Custom Metrics
Besides scaling on CPU and memory, we can also scale on custom monitoring metrics. For that we need the Prometheus Adapter: Prometheus monitors both application load and the cluster's own metrics, and the Prometheus Adapter lets us take the metrics Prometheus collects and use them to build scaling policies. These metrics are exposed through the APIServer, and HPA resource objects can consume them directly.
First, we deploy a sample application against which to test Prometheus-metric-based autoscaling. The manifest is as follows: (hpa-prome-demo.yaml)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-prom-demo
spec:
  selector:
    matchLabels:
      app: nginx-server
  template:
    metadata:
      labels:
        app: nginx-server
    spec:
      containers:
      - name: nginx-demo
        image: cnych/nginx-vts:v1.0
        resources:
          limits:
            cpu: 50m
          requests:
            cpu: 50m
        ports:
        - containerPort: 80
          name: http
---
apiVersion: v1
kind: Service
metadata:
  name: hpa-prom-demo
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "80"
    prometheus.io/path: "/status/format/prometheus"
spec:
  ports:
  - port: 80
    targetPort: 80
    name: http
  selector:
    app: nginx-server
  type: NodePort
The application exposes nginx-vts metrics on port 80 at the /status/format/prometheus endpoint. We already configured Endpoints auto-discovery in Prometheus earlier, so we just add the annotations on the Service object and Prometheus will scrape the metric data. For easier testing we use a NodePort-type Service. Now create the resource objects:
$ kubectl apply -f hpa-prome-demo.yaml
deployment.apps/hpa-prom-demo created
service/hpa-prom-demo created
$ kubectl get pods -l app=nginx-server
NAME READY STATUS RESTARTS AGE
hpa-prom-demo-755bb56f85-lvksr 1/1 Running 0 4m52s
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hpa-prom-demo NodePort 10.101.210.158 <none> 80:32408/TCP 5m44s
......
After the deployment we can use the following commands to verify that the application works and that the metrics endpoint responds:
$ curl http://k8s.qikqiak.com:32408
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
$ curl http://k8s.qikqiak.com:32408/status/format/prometheus
# HELP nginx_vts_info Nginx info
# TYPE nginx_vts_info gauge
nginx_vts_info{hostname="hpa-prom-demo-755bb56f85-lvksr",version="1.13.12"} 1
# HELP nginx_vts_start_time_seconds Nginx start time
# TYPE nginx_vts_start_time_seconds gauge
nginx_vts_start_time_seconds 1586240091.623
# HELP nginx_vts_main_connections Nginx connections
# TYPE nginx_vts_main_connections gauge
......
Among these metrics, the one we care most about is nginx_vts_server_requests_total, the total request count. It is a Counter-type metric, and we will use its value to decide whether to autoscale our application.
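Since a Counter only ever goes up, its absolute value is useless for a scaling decision; what matters is its per-second rate, the same idea as PromQL's rate() computed from consecutive samples. A toy illustration over two samples:

```python
def per_second_rate(sample_old, sample_new):
    """Approximate rate() from two (timestamp, counter_value) samples."""
    (t0, v0), (t1, v1) = sample_old, sample_new
    assert t1 > t0 and v1 >= v0, "counters are monotonic between resets"
    return (v1 - v0) / (t1 - t0)

# 600 requests observed over a 60-second window -> 10 requests per second,
# exactly the threshold the HPA below will use.
print(per_second_rate((0, 1000), (60, 1600)))  # 10.0
```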
Next we install the Prometheus Adapter into the cluster and add a rule to track the Pods' requests. Any Prometheus metric can be used for HPA, as long as you can retrieve it with a query (both the metric name and its value). Here we define the following rule:
rules:
- seriesQuery: 'nginx_vts_server_requests_total'
  seriesFilters: []
  resources:
    overrides:
      kubernetes_namespace:
        resource: namespace
      kubernetes_pod_name:
        resource: pod
  name:
    matches: "^(.*)_total"
    as: "${1}_per_second"
  metricsQuery: (sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>))
This is a parameterized Prometheus query, where:

- seriesQuery: the query sent to Prometheus; every metric matched by it can be used for HPA.
- seriesFilters: the matched series may include metrics we don't need; this filters them out.
- resources: associates the metric's labels with Kubernetes resource types. seriesQuery only discovers metrics; to query the metric of a specific Pod, its name and namespace must appear as labels in the query, and resources makes that association, most commonly for pod and namespace. There are two ways to add the labels: overrides and template.
  - overrides: maps labels in the metric to Kubernetes resources. In the example above, the kubernetes_pod_name and kubernetes_namespace labels are associated with the Kubernetes pod and namespace resources; since pod and namespace belong to the core API group, no group needs to be specified. When we query a Pod's metric, the Pod's name and namespace are automatically added to the query as labels. For instance, nginx: {group: "apps", resource: "deployment"} would associate the nginx label in the metric with the deployment resource of the apps API group.
  - template: uses a Go template. For example, template: "kube_<<.Group>>_<<.Resource>>" means that if <<.Group>> is apps and <<.Resource>> is deployment, the kube_apps_deployment label in the metric is associated with the deployment resource.
- name: renames the metric. Renaming is needed because some metrics only ever increase, such as those ending in total; these are meaningless for HPA as-is, so we usually compute their rate and use that as the value, at which point the name should no longer end in total.
  - matches: a regular expression matched against the metric name; it may contain capture groups.
  - as: defaults to $1, the first capture group; leaving as empty means using the default.
- metricsQuery: the actual Prometheus query. While seriesQuery discovers the HPA metrics, this query is what fetches a metric's value. Note that it applies a rate and a grouping, which solves the ever-increasing-counter problem mentioned above.
  - Series: the metric name.
  - LabelMatchers: additional labels, currently only pod and namespace, which is why we associated them earlier with resources.
  - GroupBy: the pod name, again associated via resources.
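To make the placeholder substitution concrete, here is a rough sketch of what the final PromQL might look like once the adapter fills in <<.Series>>, <<.LabelMatchers>>, and <<.GroupBy>> for one Pod (a simplification: the real adapter uses Go templates, and the label values here are taken from our demo Pod purely for illustration):

```python
def render_metrics_query(template, series, label_matchers, group_by):
    """Substitute the adapter's template placeholders into a PromQL query."""
    return (template
            .replace("<<.Series>>", series)
            .replace("<<.LabelMatchers>>", label_matchers)
            .replace("<<.GroupBy>>", group_by))

query = render_metrics_query(
    "(sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>))",
    series="nginx_vts_server_requests_total",
    label_matchers='kubernetes_namespace="default",kubernetes_pod_name="hpa-prom-demo-755bb56f85-lvksr"',
    group_by="kubernetes_pod_name",
)
print(query)
```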
Next we deploy the Prometheus Adapter with its Helm chart. Create a hpa-prome-adapter-values.yaml file to override the default values, with the following content:
rules:
  default: false
  custom:
  - seriesQuery: 'nginx_vts_server_requests_total'
    resources:
      overrides:
        kubernetes_namespace:
          resource: namespace
        kubernetes_pod_name:
          resource: pod
    name:
      matches: "^(.*)_total"
      as: "${1}_per_second"
    metricsQuery: (sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>))

prometheus:
  url: http://thanos-querier.kube-mon.svc.cluster.local
Here we add a single rule and point the adapter at Prometheus; our Prometheus cluster is deployed with Thanos, so we use the Querier address. Install it with one command:
$ helm install prometheus-adapter stable/prometheus-adapter -n kube-mon -f hpa-prome-adapter-values.yaml
NAME: prometheus-adapter
LAST DEPLOYED: Tue Apr 7 15:26:36 2020
NAMESPACE: kube-mon
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
prometheus-adapter has been deployed.
In a few minutes you should be able to list metrics using the following command(s):
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
Wait a little while; once the installation finishes, check that it has taken effect:
$ kubectl get pods -n kube-mon -l app=prometheus-adapter
NAME READY STATUS RESTARTS AGE
prometheus-adapter-58b559fc7d-l2j6t 1/1 Running 0 3m21s
$ kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1" | jq
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "namespaces/nginx_vts_server_requests_per_second",
      "singularName": "",
      "namespaced": false,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "pods/nginx_vts_server_requests_per_second",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}
We can see the nginx_vts_server_requests_per_second metric is available. Now let's check its current value:
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/nginx_vts_server_requests_per_second" | jq .
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/nginx_vts_server_requests_per_second"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "default",
        "name": "hpa-prom-demo-755bb56f85-lvksr",
        "apiVersion": "/v1"
      },
      "metricName": "nginx_vts_server_requests_per_second",
      "timestamp": "2020-04-07T09:45:45Z",
      "value": "527m",
      "selector": null
    }
  ]
}
Output like the above means the configuration works. Next we deploy an HPA resource object targeting this custom metric, as follows: (hpa-prome.yaml)
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-custom-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-prom-demo
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metricName: nginx_vts_server_requests_per_second
      targetAverageValue: 10
If the request rate exceeds 10 per second, the application will be scaled up. Create the resource object directly:
$ kubectl apply -f hpa-prome.yaml
horizontalpodautoscaler.autoscaling/nginx-custom-hpa created
$ kubectl describe hpa nginx-custom-hpa
Name: nginx-custom-hpa
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"nginx-custom-hpa","namespace":"d...
CreationTimestamp: Tue, 07 Apr 2020 17:54:55 +0800
Reference: Deployment/hpa-prom-demo
Metrics: ( current / target )
"nginx_vts_server_requests_per_second" on pods: <unknown> / 10
Min replicas: 2
Max replicas: 5
Deployment pods: 1 current / 2 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 2
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulRescale 7s horizontal-pod-autoscaler New size: 2; reason: Current number of replicas below Spec.MinReplicas
We can see the HPA object has taken effect: it applies the minimum replica count of 2, so a second Pod replica is added:
$ kubectl get pods -l app=nginx-server
NAME READY STATUS RESTARTS AGE
hpa-prom-demo-755bb56f85-s5dzf 1/1 Running 0 67s
hpa-prom-demo-755bb56f85-wbpfr 1/1 Running 0 3m30s
Next we stress the application in the same way:
$ while true; do wget -q -O- http://k8s.qikqiak.com:32408; done
Open another terminal and watch the HPA object change:
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
nginx-custom-hpa Deployment/hpa-prom-demo 14239m/10 2 5 2 4m27s
$ kubectl describe hpa nginx-custom-hpa
Name: nginx-custom-hpa
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"nginx-custom-hpa","namespace":"d...
CreationTimestamp: Tue, 07 Apr 2020 17:54:55 +0800
Reference: Deployment/hpa-prom-demo
Metrics: ( current / target )
"nginx_vts_server_requests_per_second" on pods: 14308m / 10
Min replicas: 2
Max replicas: 5
Deployment pods: 3 current / 3 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True ReadyForNewScale recommended size matches current size
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from pods metric nginx_vts_server_requests_per_second
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulRescale 5m2s horizontal-pod-autoscaler New size: 2; reason: Current number of replicas below Spec.MinReplicas
Normal SuccessfulRescale 61s horizontal-pod-autoscaler New size: 3; reason: pods metric nginx_vts_server_requests_per_second above target
We can see the nginx_vts_server_requests_per_second metric has exceeded the threshold and triggered a scale-up, bringing the replica count to 3. Scaling further is difficult, though: our while loop simply isn't fast enough, and 3 replicas are enough to keep each Pod under the 10 requests-per-second threshold.
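The jump to 3 replicas follows directly from the HPA ratio formula applied to this Pods metric (14239m is Kubernetes quantity notation for an average of 14.239 requests per second):

```python
import math

# desired = ceil(currentReplicas * currentAverageValue / targetAverageValue)
current_replicas = 2
current_avg = 14.239   # the 14239m reading from `kubectl get hpa`
target_avg = 10.0

desired = math.ceil(current_replicas * current_avg / target_avg)
print(desired)  # 3, matching the "New size: 3" event above
```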
For better testing we can use load-testing tools such as ab or fortio. After we stop the test, the HPA scales back down after the default 5 minutes:
$ kubectl describe hpa nginx-custom-hpa
Name: nginx-custom-hpa
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"nginx-custom-hpa","namespace":"d...
CreationTimestamp: Tue, 07 Apr 2020 17:54:55 +0800
Reference: Deployment/hpa-prom-demo
Metrics: ( current / target )
"nginx_vts_server_requests_per_second" on pods: 533m / 10
Min replicas: 2
Max replicas: 5
Deployment pods: 2 current / 2 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True ReadyForNewScale recommended size matches current size
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from pods metric nginx_vts_server_requests_per_second
ScalingLimited True TooFewReplicas the desired replica count is less than the minimum replica count
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulRescale 23m horizontal-pod-autoscaler New size: 2; reason: Current number of replicas below Spec.MinReplicas
Normal SuccessfulRescale 19m horizontal-pod-autoscaler New size: 3; reason: pods metric nginx_vts_server_requests_per_second above target
Normal SuccessfulRescale 4m2s horizontal-pod-autoscaler New size: 2; reason: All metrics below target
That completes autoscaling an application on a custom metric. If Prometheus is installed outside our Kubernetes cluster, we only need to make sure the query endpoint is reachable from the cluster and update it in the adapter's deployment manifest. In more complex scenarios, multiple metrics can be fetched and combined to build the scaling policy.