K8S in Practice IX (Cluster Monitoring)

I. Introduction to Prometheus Operator

Prometheus Operator is an open-source controller from CoreOS for managing Prometheus on Kubernetes clusters. It simplifies deploying, managing, and running Prometheus and Alertmanager clusters on Kubernetes.

II. Deployment

1. Download the deployment manifests from the official repository

# git clone https://github.com/coreos/kube-prometheus.git

2. Change the image registry addresses

# mkdir prometheus
# cp kube-prometheus/manifests/* prometheus/
# sed -i 's#k8s.gcr.io#gcr.azk8s.cn/google_containers#g' prometheus/*
# sed -i 's#quay.io#quay.azk8s.cn#g' prometheus/*
# cat prometheus/* | grep image
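
To see what the two substitutions do before touching the real manifests, they can be tried on a scratch file first. This is a minimal sketch: the helper name rewrite_registries and the two sample image lines are illustrative assumptions, not lines taken from kube-prometheus. The dots in the patterns are escaped so that they match only a literal dot.

```shell
#!/bin/sh
# rewrite_registries FILE: hypothetical helper that applies the same two
# registry substitutions as the sed commands above to one file, in place.
rewrite_registries() {
    sed -e 's#k8s\.gcr\.io#gcr.azk8s.cn/google_containers#g' \
        -e 's#quay\.io#quay.azk8s.cn#g' \
        "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Try it on a scratch file holding two made-up image lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
        image: k8s.gcr.io/addon-resizer:1.8.4
        image: quay.io/coreos/prometheus-operator:v0.31.1
EOF
rewrite_registries "$sample"
cat "$sample"
```

If the printed lines still contain k8s.gcr.io or quay.io, the pattern did not match and the manifests would be left unchanged as well.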

3. Deploy all resources

# kubectl apply -f prometheus/
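
If the first apply reports that kinds such as ServiceMonitor are not recognized, a likely cause is that the CRDs had not finished registering when the custom resources were submitted, and running the same command again usually succeeds. A minimal sketch of a retry wrapper follows; the helper name apply_until_ok and the RETRY_DELAY variable are assumptions made for illustration.

```shell
#!/bin/sh
# apply_until_ok DIR [TRIES]: hypothetical helper that re-runs
# "kubectl apply -f DIR" until it succeeds or TRIES attempts (default 5)
# have been used. RETRY_DELAY (seconds, default 5) is an assumed knob.
apply_until_ok() {
    dir=$1
    tries=${2:-5}
    n=1
    while [ "$n" -le "$tries" ]; do
        if kubectl apply -f "$dir"; then
            return 0
        fi
        # Wait before the next attempt, but not after the last one.
        if [ "$n" -lt "$tries" ]; then
            sleep "${RETRY_DELAY:-5}"
        fi
        n=$((n + 1))
    done
    return 1
}
# Usage (needs access to the cluster):
#   apply_until_ok prometheus/
```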

4. Check the namespace and CRDs that were created

# kubectl get ns |grep monitoring
monitoring        Active   3m30s
# kubectl get crd
NAME                                    CREATED AT
alertmanagers.monitoring.coreos.com     2019-09-10T09:13:00Z
podmonitors.monitoring.coreos.com       2019-09-10T09:13:00Z
prometheuses.monitoring.coreos.com      2019-09-10T09:13:01Z
prometheusrules.monitoring.coreos.com   2019-09-10T09:13:02Z
servicemonitors.monitoring.coreos.com   2019-09-10T09:13:03Z

5. List all pods and services in the monitoring namespace

# kubectl get pod -n monitoring
NAME                                   READY   STATUS    RESTARTS   AGE
alertmanager-main-0                    2/2     Running   0          23h
alertmanager-main-1                    2/2     Running   0          23h
alertmanager-main-2                    2/2     Running   0          23h
grafana-57bfdd47f8-bhkvv               1/1     Running   0          23h
kube-state-metrics-8cf4797dc-7dg4w     4/4     Running   0          23h
node-exporter-446xd                    2/2     Running   0          23h
node-exporter-8sbsf                    2/2     Running   0          23h
node-exporter-dk7qk                    2/2     Running   0          23h
node-exporter-vdsqg                    2/2     Running   0          23h
node-exporter-w7czt                    2/2     Running   0          23h
node-exporter-wx7vj                    2/2     Running   0          23h
prometheus-adapter-6b9989ccbd-bcl2h    1/1     Running   0          23h
prometheus-k8s-0                       3/3     Running   1          23h
prometheus-k8s-1                       3/3     Running   1          23h
prometheus-operator-7894d75578-rg2gl   1/1     Running   0          23h
# kubectl get svc -n monitoring
NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
alertmanager-main       NodePort    10.97.155.71    <none>        9093:30093/TCP               23h
alertmanager-operated   ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP   23h
grafana                 NodePort    10.110.28.251   <none>        3000:30030/TCP               23h
kube-state-metrics      ClusterIP   None            <none>        8443/TCP,9443/TCP            23h
node-exporter           ClusterIP   None            <none>        9100/TCP                     23h
prometheus-adapter      ClusterIP   10.111.75.114   <none>        443/TCP                      23h
prometheus-k8s          NodePort    10.109.3.70     <none>        9090:30090/TCP               23h
prometheus-operated     ClusterIP   None            <none>        9090/TCP                     23h
prometheus-operator     ClusterIP   None            <none>        8080/TCP                     23h

6. Change the Service type to NodePort to expose the ports

# kubectl edit svc prometheus-k8s -n monitoring
service/prometheus-k8s edited
# kubectl edit svc grafana -n monitoring
service/grafana edited
# kubectl edit svc alertmanager-main -n monitoring
service/alertmanager-main edited
# kubectl get svc -n monitoring | grep NodePort
alertmanager-main       NodePort    10.97.155.71    <none>        9093:30093/TCP               21h
grafana                 NodePort    10.110.28.251   <none>        3000:30030/TCP               21h
prometheus-k8s          NodePort    10.109.3.70     <none>        9090:30090/TCP               21h
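
kubectl edit opens an interactive editor, so the change is not visible in the transcript. As a non-interactive alternative, the same result can probably be reached with kubectl patch. The sketch below assumes a strategic merge patch in which the ports list is merged by its port field; the helper names nodeport_patch and expose_nodeport are hypothetical, and the port numbers are the ones shown in the listing above.

```shell
#!/bin/sh
# nodeport_patch PORT NODEPORT: print a JSON patch body that sets the
# Service type to NodePort and pins NODEPORT for the entry with port PORT.
nodeport_patch() {
    printf '{"spec":{"type":"NodePort","ports":[{"port":%s,"nodePort":%s}]}}\n' "$1" "$2"
}

# expose_nodeport SERVICE PORT NODEPORT: hypothetical helper that applies
# the patch to SERVICE in the monitoring namespace.
expose_nodeport() {
    kubectl patch svc "$1" -n monitoring -p "$(nodeport_patch "$2" "$3")"
}
# Usage (needs access to the cluster):
#   expose_nodeport prometheus-k8s    9090 30090
#   expose_nodeport grafana           3000 30030
#   expose_nodeport alertmanager-main 9093 30093
```

Pinning the node ports keeps the URLs stable across re-deployments; leaving nodePort out of the patch would let Kubernetes pick a free port instead.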

7. Test access
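
Besides opening the three UIs in a browser, access can be checked from the command line. This is a minimal sketch that assumes the node ports from the previous step (30090, 30093, 30030) and uses the health endpoints that Prometheus, Alertmanager, and Grafana expose; the helper names and the example node IP are placeholders.

```shell
#!/bin/sh
# health_urls NODE_IP: print the health-check URL of each exposed UI.
health_urls() {
    printf 'http://%s:30090/-/healthy\n' "$1"    # Prometheus
    printf 'http://%s:30093/-/healthy\n' "$1"    # Alertmanager
    printf 'http://%s:30030/api/health\n' "$1"   # Grafana
}

# check_health NODE_IP: request each URL and report OK or FAIL per line.
check_health() {
    for url in $(health_urls "$1"); do
        if curl -fsS --max-time 5 -o /dev/null "$url"; then
            echo "OK   $url"
        else
            echo "FAIL $url"
        fi
    done
}
# Usage (replace the address with the IP of one of your nodes):
#   check_health 192.168.1.10
```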

III. Configuration

1. Check the Prometheus targets page

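The targets page shows that two system components, kube-controller-manager and kube-scheduler, are not being monitored. This is related to how their ServiceMonitor objects are defined.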

# cat prometheus/prometheus-serviceMonitorKubeScheduler.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s
    port: http-metrics
  jobLabel: k8s-app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      k8s-app: kube-scheduler

selector.matchLabels selects Services in the kube-system namespace that carry the label k8s-app=kube-scheduler, but no such Service exists in the cluster.
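
The missing Service can be confirmed by asking the API server for Services with that label. A minimal sketch, where the helper name list_selected_services is hypothetical and the namespace and label are the ones from the ServiceMonitor above:

```shell
#!/bin/sh
# list_selected_services NAMESPACE LABEL: hypothetical helper that lists the
# Services in NAMESPACE carrying LABEL, i.e. the candidates a ServiceMonitor
# with the same namespaceSelector and matchLabels would select.
list_selected_services() {
    kubectl get svc -n "$1" -l "$2" -o name
}
# Usage (needs access to the cluster):
#   list_selected_services kube-system k8s-app=kube-scheduler
#   list_selected_services kube-system k8s-app=kube-controller-manager
```

If nothing is listed, the ServiceMonitor has no Service to select, which would explain the missing targets.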

2. Create Services for kube-controller-manager and kube-scheduler

# cat cms-svc.yaml 
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    k8s-app: kube-controller-manager
spec:
  selector:
    component: kube-controller-manager
  ports:
  - name: http-metrics
    port: 10252
    targetPort: 10252
    protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    k8s-app: kube-scheduler
spec:
  selector:
    component: kube-scheduler
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251
    protocol: TCP
# kubectl describe pod kube-controller-manager-k8s-master01 -n kube-system
Labels:               component=kube-controller-manager
                      tier=control-plane
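
The transcript shows the manifest and the pod labels but not the command that creates the Services. A minimal sketch of that step, assuming the file name cms-svc.yaml from above; the helper name apply_control_plane_services is hypothetical:

```shell
#!/bin/sh
# apply_control_plane_services FILE: hypothetical helper that creates the
# Services defined in FILE and then lists their Endpoints in kube-system.
apply_control_plane_services() {
    kubectl apply -f "$1" &&
    kubectl get endpoints -n kube-system kube-controller-manager kube-scheduler
}
# Usage (needs access to the cluster):
#   apply_control_plane_services cms-svc.yaml
```

An empty ENDPOINTS column would suggest that the Service selector does not match the component label on the pods.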

3. Check that kube-controller-manager and kube-scheduler are now healthy

4. Access Grafana
