Comparing Ways to Collect Kubernetes Monitoring Metrics

Comparison

node-exporter collects host-level runtime metrics such as loadavg, filesystem, and meminfo. It covers the same ground that zabbix-agent covers in traditional host monitoring.

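metrics-server and its predecessor Heapster collect resource metrics such as CPU and memory usage. Heapster pulled them from the API server and the nodes and forwarded them to a storage backend such as InfluxDB or a cloud vendor's monitoring service. metrics-server scrapes each kubelet, keeps only the latest values in memory, and serves them through the Metrics API (metrics.k8s.io). Its core role today is to supply decision metrics to components such as the HPA.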

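kube-state-metrics focuses on the current state of Kubernetes objects such as Deployments and DaemonSets.

For example:

How many replicas did I schedule, and how many are available right now?

How many Pods are in the running, stopped, or terminated state?

How many times has a Pod restarted?

How many Jobs are currently running?

All of these metrics are provided by kube-state-metrics.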
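
As a rough illustration of how such questions map onto kube-state-metrics output, the sketch below filters a metrics scrape (Prometheus text format, read from stdin) for one Deployment-level series. The function name ksm_deployment_values is hypothetical, and the sketch assumes each sample line carries a deployment label and no trailing timestamp.

```shell
# Print "<deployment> <value>" for one kube-state-metrics series read from stdin.
# $1 is the metric name, e.g. kube_deployment_status_replicas_available.
# Hypothetical helper: assumes Prometheus text format without timestamps.
ksm_deployment_values() {
  awk -v metric="$1" '
    # Only sample lines that start with exactly "<metric>{" are considered.
    index($0, metric "{") == 1 {
      # Extract the value of the deployment="..." label.
      if (match($0, /deployment="[^"]*"/)) {
        name = substr($0, RSTART + 12, RLENGTH - 13)
        print name, $NF
      }
    }'
}
```

Assuming a scrape has been saved to a file named ksm.txt (a hypothetical name), "ksm_deployment_values kube_deployment_spec_replicas < ksm.txt" lists the desired replica count per Deployment, and the same call with kube_deployment_status_replicas_available lists how many are available.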

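kube-state-metrics was not folded into metrics-server because the two have fundamentally different concerns.

Heapster, and metrics-server after it, only fetches and formats data that already exists and passes it on (Heapster to a storage sink, metrics-server to the Metrics API). In essence it is a pipeline for existing resource metrics.

kube-state-metrics keeps a snapshot of the cluster's state in memory and continuously derives new metrics from it, but it is not responsible for pushing them anywhere. It exposes them on an HTTP /metrics endpoint and leaves collection to a scraper such as Prometheus.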

Deploying metrics-server

Download the metrics-server deployment manifest to the local machine:

wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.7/components.yaml

Pull the metrics-server image locally:

# docker pull zhaoqinchang/metrics-server:0.3.7
0.3.7: Pulling from zhaoqinchang/metrics-server
9ff2acc3204b: Pull complete 
9d14b55ff9a0: Pull complete 
Digest: sha256:c0efe772bb9e5c289db6cc4bc2002c268507d0226f2a3815f7213e00261c38e9
Status: Downloaded newer image for zhaoqinchang/metrics-server:0.3.7
docker.io/zhaoqinchang/metrics-server:0.3.7

Edit components.yaml so that it reads as follows:

# cat components.yaml 
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:aggregated-metrics-reader
  labels:
    rbac.authorization.k8s.io/aggregate-to-view: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
rules:
- apiGroups: ["metrics.k8s.io"]
  resources: ["pods", "nodes"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  service:
    name: metrics-server
    namespace: kube-system
  group: metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        image: zhaoqinchang/metrics-server:0.3.7    # changed to the image pulled above
        imagePullPolicy: IfNotPresent
        args:
          - --cert-dir=/tmp
          - --secure-port=4443
        command:                 # add this command block (the three lines below)
            - /metrics-server
            - --kubelet-preferred-address-types=InternalIP
            - --kubelet-insecure-tls
        ports:
        - name: main-port
          containerPort: 4443
          protocol: TCP
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
      nodeSelector:
        kubernetes.io/os: linux
---
apiVersion: v1
kind: Service
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    kubernetes.io/name: "Metrics-server"
    kubernetes.io/cluster-service: "true"
spec:
  selector:
    k8s-app: metrics-server
  ports:
  - port: 443
    protocol: TCP
    targetPort: main-port
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - nodes/stats
  - namespaces
  - configmaps
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
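
The image change in the manifest above can also be scripted rather than made by hand. The following is a minimal sketch, not part of the original procedure: set_manifest_image is a hypothetical helper that rewrites every "image:" line of a manifest read from stdin, which is only safe when the file contains a single container image, as this one does. It drops any trailing comment on the rewritten line.

```shell
# Rewrite the container image in a manifest read from stdin.
# $1 is the replacement image reference (must not contain "|" or "&").
# Hypothetical helper: it replaces ALL "image:" lines, keeping indentation.
set_manifest_image() {
  sed "s|^\([[:space:]]*image:[[:space:]]*\).*|\1$1|"
}
```

For example, "set_manifest_image zhaoqinchang/metrics-server:0.3.7 < components.yaml > components.patched.yaml" writes a patched copy (the output file name is an assumption). The kubelet flags still have to be added separately.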

Deploy metrics-server:

# kubectl apply  -f components.yaml 
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created

Check that metrics.k8s.io appears in the cluster's list of API groups:

# kubectl api-versions | grep metrics 
metrics.k8s.io/v1beta1
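
The group can be listed before the backing Pod is actually serving, so it may be worth checking the APIService status as well. The sketch below reads the output of "kubectl get apiservice v1beta1.metrics.k8s.io" from stdin and succeeds only when the AVAILABLE column is True. The function name is hypothetical, and the NAME, SERVICE, AVAILABLE, AGE column order is an assumption about the default table output.

```shell
# Succeed (exit 0) only if the metrics APIService row reports AVAILABLE=True.
# Reads `kubectl get apiservice ...` table output from stdin.
# Hypothetical helper: assumes the third column is AVAILABLE.
metrics_api_available() {
  awk '$1 == "v1beta1.metrics.k8s.io" { found = 1; ok = ($3 == "True") }
       END { exit !(found && ok) }'
}
```

A possible use is "kubectl get apiservice v1beta1.metrics.k8s.io | metrics_api_available && echo ready". A value such as "False (MissingEndpoints)" in that column usually means the metrics-server Pod is not running yet.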

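Usage

The kubectl top command shows resource usage for nodes and Pods. It relies on the cluster's resource metrics API for its data. It has two subcommands, node and pod, which show resource usage for Node objects and Pod objects respectively.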
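
The same data can be read directly from the Metrics API with "kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes", which returns JSON. CPU usage there is a Kubernetes quantity string, often expressed in nanocores, whereas kubectl top prints millicores. The sketch below shows the conversion. The function name is hypothetical and only the n, u, and m suffixes and plain core counts are handled.

```shell
# Convert a Kubernetes CPU quantity to whole millicores (truncated).
# Handles nanocores (n), microcores (u), millicores (m), and plain cores.
# Hypothetical helper for illustration only.
cpu_to_millicores() {
  awk -v q="$1" 'BEGIN {
      n = q
      sub(/[num]$/, "", n)          # strip the unit suffix, if any
      if      (q ~ /n$/) m = n / 1000000
      else if (q ~ /u$/) m = n / 1000
      else if (q ~ /m$/) m = n
      else               m = n * 1000
      printf "%d\n", m
  }'
}
```

For instance, a made-up raw value of 282481792n corresponds to 282 millicores, the unit used in the CPU(cores) column.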

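The syntax for node usage is "kubectl top node [-l label | NAME]". The example below lists all nodes and shows, for each one, CPU usage in millicores and as a percentage, and memory usage in bytes and as a percentage. The percentages are relative to the node's allocatable capacity. You can also name a specific node on the command line or filter nodes with a label selector.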

[root@master metric]# kubectl top nodes
NAME      CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
master    282m         14%    1902Mi          51%       
node-02   70m          3%     1371Mi          37%       
node-03   121m         1%     892Mi           11%
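
Output in this shape is easy to post-process. As a small sketch, the hypothetical function below reads "kubectl top nodes" output from stdin and prints the nodes whose CPU% exceeds a threshold. It assumes the header line and column order shown above.

```shell
# Print "<node> <cpu%>" for nodes whose CPU% is above the limit in $1.
# Reads `kubectl top nodes` output (with header line) from stdin.
# Hypothetical helper: assumes CPU% is the third column.
nodes_over_cpu() {
  awk -v limit="$1" 'NR > 1 {
      pct = $3
      sub(/%$/, "", pct)            # "14%" -> "14"
      if (pct + 0 > limit + 0) print $1, pct "%"
  }'
}
```

With the output above, "kubectl top nodes | nodes_over_cpu 10" would print only the master node.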

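Pod usage is scoped to a namespace, so the invocation differs slightly: you normally specify the namespace and, where needed, filter the target Pods with a label selector. For example, the following shows resource usage for Pods in the kube-system namespace: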

[root@master metric]# kubectl top pods -n kube-system
NAME                              CPU(cores)   MEMORY(bytes)   
etcd-master                       32m          300Mi           
kube-apiserver-master             86m          342Mi           
kube-controller-manager-master    30m          48Mi            
kube-flannel-ds-l5ghn             5m           10Mi            
kube-flannel-ds-rqlm2             4m           12Mi            
kube-flannel-ds-v92r9             4m           14Mi            
kube-proxy-7vjcv                  18m          15Mi            
kube-proxy-xrz8f                  13m          21Mi            
kube-proxy-zpwn6                  1m           14Mi            
kube-scheduler-master             7m           17Mi            
metrics-server-5549c7694f-7vb66   2m           14Mi
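
To get a single figure for a namespace, the per-Pod values can be summed. The sketch below is a hypothetical helper that adds up the CPU(cores) column of "kubectl top pods" output read from stdin. It assumes single-namespace output (no NAMESPACE column) with a header line, and treats values without an m suffix as whole cores.

```shell
# Sum the CPU(cores) column of `kubectl top pods -n <ns>` output from stdin.
# Prints the total in millicores, e.g. "119m".
# Hypothetical helper: assumes CPU is the second column.
total_pod_cpu_millicores() {
  awk 'NR > 1 {
      cpu = $2
      if (cpu ~ /m$/) { sub(/m$/, "", cpu); total += cpu }
      else            { total += cpu * 1000 }
  }
  END { printf "%dm\n", total }'
}
```

A possible invocation is "kubectl top pods -n kube-system | total_pod_cpu_millicores".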

kubectl top gives a quick, concise view of the resource usage of Node and Pod objects and is one of the everyday commands for running and maintaining a cluster.
