Kube-prometheus Monitoring

Introduction

Prometheus Operator is often described as the ultimate monitoring solution for Kubernetes clusters, but the operator alone no longer ships the full feature set; the complete solution is now kube-prometheus. Project address:

https://github.com/coreos/kube-prometheus

kube-prometheus is a complete monitoring solution that uses Prometheus to collect cluster metrics and Grafana for visualization. It contains the following components:

Component  Description
The Prometheus Operator  Makes it very simple to deploy a Prometheus service in a Kubernetes cluster, provides monitoring of the cluster itself, and manages Prometheus configuration
Highly available Prometheus  A highly available monitoring system
Highly available Alertmanager  A highly available alerting component; it receives alerts sent by Prometheus, supports a rich set of notification channels, and makes deduplicating, silencing and grouping alerts straightforward
node-exporter  Collects host-level runtime metrics such as loadavg, filesystem and meminfo, similar in scope to a traditional host agent like zabbix-agent
Prometheus Adapter for Kubernetes Metrics APIs (k8s-prometheus-adapter)  Exposes Prometheus data through the Kubernetes metrics APIs (metrics.k8s.io and custom.metrics.k8s.io)
kube-state-metrics  Exposes the state of Kubernetes cluster objects as metrics, on which alerting rules can be built
grafana  Visualizes large volumes of metric data; the most popular tool for displaying time-series data in infrastructure and application analytics

Because k8s-prometheus-adapter implements the metrics.k8s.io and custom.metrics.k8s.io APIs on top of Prometheus, there is no need to also deploy metrics-server. (metrics-server discovers all nodes through kube-apiserver, then calls the kubelet APIs over HTTPS to obtain CPU, memory and other resource usage for each node and pod. Starting with Kubernetes 1.12 the install scripts dropped Heapster, and since 1.13 Heapster support has been removed entirely; Heapster is no longer maintained.)
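Once the stack is running, the aggregated metrics API served by prometheus-adapter can be queried directly. A quick sketch, assuming kubectl is already configured against this cluster:

```shell
# raw queries against the resource metrics API implemented by prometheus-adapter
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/kube-system/pods
```

This is the same API that kubectl top consumes, so it is a useful way to confirm the adapter is wired up.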

1. Deployment

1.1 Download the source code

cd /etc/kubernetes
git clone https://github.com/coreos/kube-prometheus.git

1.2 Install

[root@k8s-m01 kube-prometheus]# pwd
/etc/kubernetes/kube-prometheus
# install the Prometheus Operator and its CRDs
[root@k8s-m01 kube-prometheus]# kubectl apply -f manifests/setup
# install the remaining components (Prometheus, Alertmanager, exporters, adapter, Grafana)
[root@k8s-m01 kube-prometheus]# kubectl apply -f manifests/
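The two apply steps can race: the CRDs created by manifests/setup must be established by the API server before the custom resources in manifests/ can be created. A sequenced sketch (the bounded wait loop is one way to do it):

```shell
# apply the CRDs and the operator first
kubectl apply -f manifests/setup
# give the API server up to ~30s to establish the new CRDs
for i in $(seq 1 30); do
  kubectl get servicemonitors --all-namespaces >/dev/null 2>&1 && break
  sleep 1
done
# then apply everything else
kubectl apply -f manifests/
```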

1.3 Check the resources

[root@k8s-m01 kube-prometheus]# kubectl get pod,svc,ep -n monitoring
NAME                                       READY   STATUS              RESTARTS   AGE
pod/alertmanager-main-0                    0/2     ContainerCreating   0          7s
pod/alertmanager-main-1                    0/2     ContainerCreating   0          7s
pod/alertmanager-main-2                    0/2     ContainerCreating   0          7s
pod/grafana-5c55845445-wvtrj               0/1     Pending             0          5s
pod/kube-state-metrics-957fd6c75-whw8r     0/3     Pending             0          5s
pod/node-exporter-gqsrh                    0/2     ContainerCreating   0          5s
pod/node-exporter-qmv8h                    0/2     Pending             0          5s
pod/prometheus-adapter-5cdcdf9c8d-whln4    0/1     Pending             0          5s
pod/prometheus-k8s-0                       0/3     Pending             0          4s
pod/prometheus-k8s-1                       0/3     Pending             0          4s
pod/prometheus-operator-6f98f66b89-46fj6   2/2     Running             0          29s

NAME                            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
service/alertmanager-main       ClusterIP   10.109.157.168   <none>        9093/TCP                     7s
service/alertmanager-operated   ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   7s
service/grafana                 ClusterIP   10.110.108.88    <none>        3000/TCP                     6s
service/kube-state-metrics      ClusterIP   None             <none>        8443/TCP,9443/TCP            6s
service/node-exporter           ClusterIP   None             <none>        9100/TCP                     6s
service/prometheus-adapter      ClusterIP   10.102.225.171   <none>        443/TCP                      6s
service/prometheus-k8s          ClusterIP   10.100.29.234    <none>        9090/TCP                     5s
service/prometheus-operated     ClusterIP   None             <none>        9090/TCP                     6s
service/prometheus-operator     ClusterIP   None             <none>        8443/TCP                     30s

NAME                              ENDPOINTS         AGE
endpoints/alertmanager-main       <none>            7s
endpoints/alertmanager-operated   <none>            7s
endpoints/grafana                 <none>            6s
endpoints/kube-state-metrics      <none>            6s
endpoints/node-exporter                             6s
endpoints/prometheus-adapter      <none>            6s
endpoints/prometheus-k8s          <none>            5s
endpoints/prometheus-operated     <none>            6s
endpoints/prometheus-operator     10.244.2.3:8443   30s


CRDs created by kube-prometheus:

[root@k8s-m01 kube-prometheus]# kubectl get crd -o wide
NAME                                    CREATED AT
alertmanagers.monitoring.coreos.com     2020-05-24T10:11:01Z
podmonitors.monitoring.coreos.com       2020-05-24T10:11:01Z
prometheuses.monitoring.coreos.com      2020-05-24T10:11:01Z
prometheusrules.monitoring.coreos.com   2020-05-24T10:11:02Z
servicemonitors.monitoring.coreos.com   2020-05-24T10:11:02Z
thanosrulers.monitoring.coreos.com      2020-05-24T10:11:02Z
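Of these CRDs, ServiceMonitor is the one you will touch most often: it tells Prometheus which Services to scrape. A minimal sketch for a hypothetical application (the names example-app and the port name web are assumptions for illustration, not part of kube-prometheus):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app            # hypothetical application
  namespace: monitoring
  labels:
    k8s-app: example-app
spec:
  namespaceSelector:
    matchNames:
    - default                  # namespace to look for matching Services in
  selector:
    matchLabels:
      app: example-app         # must match the target Service's labels
  endpoints:
  - port: web                  # the *name* of the Service port to scrape
    interval: 30s
```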

The prometheus resource defines how the Prometheus service should run:

[root@k8s-m01 kube-prometheus]# kubectl -n monitoring get prometheus,alertmanager
NAME                                   VERSION   REPLICAS   AGE
prometheus.monitoring.coreos.com/k8s   v2.17.2   2          16s

NAME                                      VERSION   REPLICAS   AGE
alertmanager.monitoring.coreos.com/main   v0.20.0   3          17s

Both prometheus and alertmanager are backed by StatefulSet controllers:

[root@k8s-m01 kube-prometheus]# kubectl get statefulset -o wide -n monitoring
NAME                READY   AGE   CONTAINERS                                                       IMAGES
alertmanager-main   0/3     22s   alertmanager,config-reloader                                     quay.io/prometheus/alertmanager:v0.20.0,jimmidyson/configmap-reload:v0.3.0
prometheus-k8s      0/2     21s   prometheus,prometheus-config-reloader,rules-configmap-reloader   quay.io/prometheus/prometheus:v2.17.2,quay.io/coreos/prometheus-config-reloader:v0.39.0,jimmidyson/configmap-reload:v0.3.0
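Because these are StatefulSets, readiness can be waited on rather than polled by eye; a sketch:

```shell
# wait for both StatefulSets to become Ready (with a 5-minute cap)
kubectl -n monitoring rollout status statefulset/prometheus-k8s --timeout=300s
kubectl -n monitoring rollout status statefulset/alertmanager-main --timeout=300s
```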

View node and pod resource usage in the Kubernetes dashboard:

https://10.0.0.61:6443/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/

View node and pod resource usage from the command line:

[root@k8s-m01 kube-prometheus]# kubectl top node
NAME      CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
k8s-m01   151m         7%     1516Mi          39%       
k8s-m02   136m         6%     1299Mi          33% 

[root@k8s-m01 kube-prometheus]# kubectl top pod
NAME             CPU(cores)   MEMORY(bytes)   
nginx-ds-9dfb7   0m           1Mi 

[root@k8s-m01 kube-prometheus]# kubectl top pod -n kube-system
NAME                              CPU(cores)   MEMORY(bytes)   
coredns-66bff467f8-gpz5f          2m           16Mi            
coredns-66bff467f8-pfzc6          2m           14Mi            
etcd-k8s-m01                      23m          99Mi            
kube-apiserver-k8s-m01            51m          488Mi           
kube-controller-manager-k8s-m01   9m           54Mi            
kube-flannel-ds-amd64-xgln9       1m           10Mi            
kube-proxy-nx2jf                  2m           16Mi            
kube-scheduler-k8s-m01            2m           21Mi  
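On newer kubectl versions the top output can also be sorted to surface the heaviest consumers (treat the --sort-by flag on kubectl top as an assumption; it is not present in very old clients):

```shell
# pods across all namespaces, heaviest memory users first
kubectl top pod -A --sort-by=memory | head -n 10
# nodes sorted by CPU usage
kubectl top node --sort-by=cpu
```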

Cleanup

kubectl delete --ignore-not-found=true -f manifests/ -f manifests/setup
# force delete a pod
kubectl delete pod prometheus-k8s-1 -n monitoring --force --grace-period=0

Summary of the components above:

  • metrics-server: an aggregator of cluster resource usage data for in-cluster consumers such as kubectl, the HPA and the scheduler
  • Prometheus Operator: a toolkit that deploys and manages the monitoring and alerting stack
  • node-exporter: exposes key metric data for each node
  • kube-state-metrics: collects data on Kubernetes cluster objects, on which alerting rules are defined
  • Prometheus: pulls metrics over HTTP from the apiserver, scheduler, controller-manager, kubelet and other components
  • Grafana: a platform for data visualization and monitoring

2. Access Methods

1. NodePort

The default NodePort range for Kubernetes Services is 30000-32767. When that limit does not fit, the range can be customized as follows:

Edit /etc/kubernetes/manifests/kube-apiserver.yaml and add the flag --service-node-port-range=20000-50000:

vim /etc/kubernetes/manifests/kube-apiserver.yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 10.0.0.61:6443
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=10.0.0.61
    - --service-node-port-range=20000-50000  # add this line
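kube-apiserver runs as a static pod, so the kubelet restarts it automatically once the manifest file changes. One way to verify the new flag took effect (a sketch, assuming cluster access):

```shell
# print the container command of the running kube-apiserver pod
# and confirm the node-port-range flag is present
kubectl -n kube-system get pod -l component=kube-apiserver \
  -o jsonpath='{.items[0].spec.containers[0].command}' \
  | tr ',' '\n' | grep node-port-range
```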

Modify grafana-service.yaml:

cd /etc/kubernetes/kube-prometheus/
cat >manifests/grafana-service.yaml<<EOF
apiVersion: v1
kind: Service
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  type: NodePort
  ports:
  - name: http
    port: 3000
    targetPort: http
    nodePort: 33000
  selector:
    app: grafana
EOF
kubectl apply -f manifests/grafana-service.yaml

Modify prometheus-service.yaml:

cd /etc/kubernetes/kube-prometheus/
cat >manifests/prometheus-service.yaml<<EOF
apiVersion: v1
kind: Service
metadata:
  labels:
    prometheus: k8s
  name: prometheus-k8s
  namespace: monitoring
spec:
  type: NodePort
  ports:
  - name: web
    port: 9090
    targetPort: web
    nodePort: 39090
  selector:
    app: prometheus
    prometheus: k8s
  sessionAffinity: ClientIP
EOF
kubectl apply -f manifests/prometheus-service.yaml

Modify alertmanager-service.yaml:

cd /etc/kubernetes/kube-prometheus/
cat >manifests/alertmanager-service.yaml<<EOF
apiVersion: v1
kind: Service
metadata:
  labels:
    alertmanager: main
  name: alertmanager-main
  namespace: monitoring
spec:
  type: NodePort
  ports:
  - name: web
    port: 9093
    targetPort: web
    nodePort: 39093
  selector:
    alertmanager: main
    app: alertmanager
  sessionAffinity: ClientIP
EOF
kubectl apply -f manifests/alertmanager-service.yaml
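After applying all three Services, the assigned NodePorts can be confirmed in one command:

```shell
# each Service should show TYPE NodePort and its port mapping (3000:33000, etc.)
kubectl -n monitoring get svc grafana prometheus-k8s alertmanager-main
```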

2. Prometheus web UI

Open the Prometheus web UI: http://10.0.0.61:39090/
Expand the Status menu and open Targets: only two scrape jobs (kube-controller-manager and kube-scheduler) have no corresponding targets, which is tied to the ServiceMonitor resources.
Root cause:

A ServiceMonitor selects Services by label within the specified namespace (kube-system), and no Service there carries the matching labels. kube-apiserver works because it is exposed through the default kubernetes Service in the default namespace; the remaining control-plane components live in kube-system and need dedicated Services.

Solution:

# inspect how the ServiceMonitors select Services
[root@k8s-m01 ~]# cd /etc/kubernetes/kube-prometheus/
[root@k8s-m01 kube-prometheus]# grep -2 selector manifests/prometheus-serviceMonitorKube*
manifests/prometheus-serviceMonitorKubeControllerManager.yaml-    matchNames:
manifests/prometheus-serviceMonitorKubeControllerManager.yaml-    - kube-system
manifests/prometheus-serviceMonitorKubeControllerManager.yaml:  selector:
manifests/prometheus-serviceMonitorKubeControllerManager.yaml-    matchLabels:
manifests/prometheus-serviceMonitorKubeControllerManager.yaml-      k8s-app: kube-controller-manager
--
manifests/prometheus-serviceMonitorKubelet.yaml-    matchNames:
manifests/prometheus-serviceMonitorKubelet.yaml-    - kube-system
manifests/prometheus-serviceMonitorKubelet.yaml:  selector:
manifests/prometheus-serviceMonitorKubelet.yaml-    matchLabels:
manifests/prometheus-serviceMonitorKubelet.yaml-      k8s-app: kubelet
--
manifests/prometheus-serviceMonitorKubeScheduler.yaml-    matchNames:
manifests/prometheus-serviceMonitorKubeScheduler.yaml-    - kube-system
manifests/prometheus-serviceMonitorKubeScheduler.yaml:  selector:
manifests/prometheus-serviceMonitorKubeScheduler.yaml-    matchLabels:
manifests/prometheus-serviceMonitorKubeScheduler.yaml-      k8s-app: kube-scheduler
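The component: kube-scheduler and component: kube-controller-manager selectors used in the Services below come from the labels kubeadm puts on the control-plane static pods; they can be confirmed with a sketch like:

```shell
# show the labels on the control-plane pods the new Services must select
kubectl -n kube-system get pods --show-labels \
  | grep -E 'kube-(scheduler|controller-manager)'
```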


# create prometheus-kubeSchedulerService.yaml
$ cd /etc/kubernetes/kube-prometheus/
$ cat >manifests/prometheus-kubeSchedulerService.yaml<<EOF
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    k8s-app: kube-scheduler
spec:
  selector:
    component: kube-scheduler
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251
    protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: kube-system
subsets:
- addresses:
  - ip: 10.0.0.61
  ports:
  - name: http-metrics
    port: 10251
    protocol: TCP
EOF
kubectl apply -f manifests/prometheus-kubeSchedulerService.yaml

# similarly, create prometheus-kubeControllerManagerService.yaml
cat >manifests/prometheus-kubeControllerManagerService.yaml<<EOF  
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    k8s-app: kube-controller-manager
spec:
  selector:
    component: kube-controller-manager
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10252
    targetPort: 10252
    protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: kube-system
subsets:
- addresses:
  - ip: 10.0.0.61
  ports:
  - name: http-metrics
    port: 10252
    protocol: TCP
EOF
kubectl apply -f manifests/prometheus-kubeControllerManagerService.yaml
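Once both Services and Endpoints exist, the scheduler and controller-manager targets should flip to "up". This can be checked without the UI through the Prometheus HTTP API (NodePort 39090 as configured above; the python3 one-liner is just one jq-free way to count):

```shell
# count healthy scrape targets via the Prometheus targets API
curl -s --max-time 5 http://10.0.0.61:39090/api/v1/targets \
  | python3 -c 'import json,sys; d=json.load(sys.stdin); print(sum(1 for t in d["data"]["activeTargets"] if t["health"]=="up"), "targets up")'
```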

3. Access Alertmanager

Open the Alertmanager web UI: http://10.0.0.61:39093/

4. Access Grafana

http://10.0.0.61:33000/

1. Set the Grafana timezone to UTC:


2. Browse the built-in dashboards:

Many dashboards ship out of the box; you can also download dashboards from https://grafana.com/grafana/dashboards or build your own.
