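In day-to-day operations we sometimes need to know how many resources each service in a running cluster actually consumes. For this, Kubernetes offers the cluster resource collection service Metrics-Server, which gathers CPU and memory usage for Nodes and Pods.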
[Figure: metrics-server architecture diagram]
I. Deploying metrics-server
1. Download the resource files
$ git clone -b release-0.3 https://github.com/kubernetes-incubator/metrics-server.git
2. Edit the configuration file
$ cd metrics-server/deploy/1.8+/
$ vim metrics-server-deployment.yaml
After editing, the file looks like this:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
---
# Note: extensions/v1beta1 Deployments were removed in Kubernetes 1.16;
# on newer clusters use apps/v1 here instead.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        image: mirrorgooglecontainers/metrics-server-amd64:v0.3.2
        # image: k8s.gcr.io/metrics-server-amd64:v0.3.2
        imagePullPolicy: IfNotPresent
        command:
        - /metrics-server
        # reach each kubelet through the node's InternalIP
        - --kubelet-preferred-address-types=InternalIP
        # skip verification of the kubelet serving certificate
        - --kubelet-insecure-tls
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
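Note: three things are changed. (1) The image, because k8s.gcr.io is blocked on some networks, so a mirror is used instead; (2) the image pull policy; (3) the command and its arguments are added: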
      containers:
      - name: metrics-server
        image: mirrorgooglecontainers/metrics-server-amd64:v0.3.2
        # image: k8s.gcr.io/metrics-server-amd64:v0.3.2
        imagePullPolicy: IfNotPresent
        command:
        - /metrics-server
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-insecure-tls
Save the file and exit once the changes are made.
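Before applying the manifest, it can be worth confirming that the three edits actually made it into the file. The following is a minimal sketch, not part of the metrics-server project: `check_manifest` is a hypothetical helper name, and it only looks for the exact strings used in this article.

```shell
# check_manifest (hypothetical helper): report whether the edits described
# above are present in the manifest.
#   $1 = path to the manifest (defaults to metrics-server-deployment.yaml)
check_manifest() {
  local file="${1:-metrics-server-deployment.yaml}" missing=0 pattern
  for pattern in \
    'image: mirrorgooglecontainers/metrics-server-amd64:v0.3.2' \
    'imagePullPolicy: IfNotPresent' \
    '--kubelet-preferred-address-types=InternalIP' \
    '--kubelet-insecure-tls'
  do
    # -F: fixed string, -q: quiet, -e: allows a pattern that starts with a dash
    if ! grep -Fq -e "$pattern" "$file"; then
      echo "missing: $pattern" >&2
      missing=1
    fi
  done
  # 0 when every expected string was found, 1 otherwise
  return "$missing"
}

# Usage, from metrics-server/deploy/1.8+/ :
#   check_manifest metrics-server-deployment.yaml && echo "manifest looks edited"
```

The check is purely textual, so it does not validate YAML indentation; it only catches a forgotten edit.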
3. Apply all the configuration files to the cluster
$ kubectl apply -f .
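After a minute or two (pulling the image and collecting the first metrics both take time), check the status of metrics-server with the following commands: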
[root@k8s-master 1.8+]# kubectl get po -n kube-system
NAME                                 READY   STATUS    RESTARTS   AGE
calico-node-b78m4                    1/1     Running   0          176m
calico-node-r5mlj                    1/1     Running   0          3h6m
calico-node-z5tdh                    1/1     Running   0          176m
coredns-fb8b8dccf-6mgks              1/1     Running   0          3h21m
coredns-fb8b8dccf-cbtlx              1/1     Running   0          3h21m
etcd-k8s-master                      1/1     Running   0          3h20m
kube-apiserver-k8s-master            1/1     Running   0          3h20m
kube-controller-manager-k8s-master   1/1     Running   0          3h20m
kube-proxy-c9xd2                     1/1     Running   0          3h21m
kube-proxy-fp2r2                     1/1     Running   0          176m
kube-proxy-lrsw7                     1/1     Running   0          176m
kube-scheduler-k8s-master            1/1     Running   0          3h20m
metrics-server-7579f696d8-pgcc4      1/1     Running   0          99s
[root@k8s-master 1.8+]# kubectl top node
NAME         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
k8s-master   179m         8%     1660Mi          43%
k8s-node1    81m          4%     908Mi           23%
k8s-node2    78m          3%     1036Mi          26%
metrics-server is now in the Running state and can retrieve metrics for the nodes.
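Instead of waiting a minute or two and checking by hand, the wait can be scripted. The sketch below assumes the Deployment is named metrics-server in kube-system, as in the manifest above; `wait_for_metrics`, its default attempt count, and the 120s timeout are illustrative choices rather than anything prescribed by metrics-server.

```shell
# wait_for_metrics (hypothetical helper): block until metrics-server is rolled
# out and `kubectl top node` returns data, or give up after some attempts.
#   $1 = maximum attempts (default 30)
#   $2 = seconds to sleep between attempts (default 10)
wait_for_metrics() {
  local attempts="${1:-30}" delay="${2:-10}" i=1
  # Wait for the Deployment created by `kubectl apply -f .` to finish rolling out.
  kubectl -n kube-system rollout status deployment/metrics-server --timeout=120s || return 1
  # The first metrics show up a little after the pod is Running, so poll.
  while [ "$i" -le "$attempts" ]; do
    if kubectl top node >/dev/null 2>&1; then
      echo "metrics available after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "metrics still unavailable after $attempts attempts" >&2
  return 1
}

# Usage:
#   wait_for_metrics && kubectl top node
```

Polling `kubectl top node` itself, rather than only the pod status, matters because the pod can report Running before any metrics have been scraped.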
II. Monitoring with metrics-server
1. Monitor all nodes
$ kubectl top node
2. Monitor a specific node
$ kubectl top node dev
3. Monitor Pods in a specific namespace
$ kubectl top po -n oas-dev
4. Monitor Pods in all namespaces
$ kubectl top po -A
5. Nodes by memory utilization (MEMORY%), descending
$ kubectl top node | sort -n -r -k 5
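6. Pods in all namespaces by memory usage, descending (with -A the first column is NAMESPACE, so memory is column 4)
$ kubectl top po -A | sort -n -r -k 4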
7. Pods in a specific namespace by memory usage, descending
$ kubectl top po -n oas-dev | sort -n -r -k 3
8. Pods in a specific namespace by CPU usage, descending
$ kubectl top po -n oas-dev | sort -n -r -k 2
9. Pods in a specific namespace by CPU usage, ascending
$ kubectl top po -n oas-dev | sort -n -k 2
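The sort pipelines above feed the header line through sort along with the data, so it ends up at the top or bottom depending on the direction, and `-k N` without an end field uses everything from column N to the end of the line as the key. A small wrapper can keep the header in place and restrict the key to a single column. This is only a sketch: `sort_top` is a hypothetical helper, and it relies on `sort -n` reading the leading number and ignoring suffixes such as `m`, `Mi` or `%`, so values reported in mixed units would not be ordered correctly.

```shell
# sort_top (hypothetical helper): sort `kubectl top` output read from stdin,
# keeping the header line first.
#   $1 = 1-based column number to sort on
#   $2 = "desc" (default) or "asc"
sort_top() {
  local col="$1" order="${2:-desc}" flags="-n" header
  [ "$order" = "desc" ] && flags="-nr"
  {
    # Pass the header line through untouched.
    IFS= read -r header && printf '%s\n' "$header"
    # Sort the remaining rows numerically on the chosen column only.
    sort "$flags" -k "${col},${col}"
  }
}

# Usage:
#   kubectl top node             | sort_top 5       # MEMORY%, highest first
#   kubectl top po -n oas-dev    | sort_top 3       # memory, highest first
#   kubectl top po -n oas-dev    | sort_top 2 asc   # CPU, lowest first
#   kubectl top po -A            | sort_top 4       # memory; NAMESPACE is column 1
```

Recent kubectl releases also accept `--sort-by=cpu` or `--sort-by=memory` on `kubectl top`; whether that flag is available depends on the client version, which is why the pipe-based approach is shown here.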
This concludes the introduction to metrics-server, the cluster resource monitoring tool.