Deploying Kafka Exporter with Prometheus Operator on Kubernetes to Monitor a Kafka Cluster

Before You Begin

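Before following the steps below, make sure that Kubernetes, Prometheus, Prometheus Operator, and a Kafka cluster are already deployed on your servers. Installing these components is well covered elsewhere, so this article does not go into it.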
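
All manifests in this article are created in the `prometheus-exporter` namespace, so that namespace has to exist before they are applied. A minimal sketch, assuming the namespace has not been created yet and needs no extra labels or quotas:

```yaml
---
# Namespace used by the Deployment, Service, ServiceMonitor and
# PrometheusRule in this article. Skip this if it already exists.
apiVersion: v1
kind: Namespace
metadata:
  name: prometheus-exporter
```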

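The rough system architecture of the Prometheus monitoring part is shown in the diagram below. Interested readers can study it on their own; it is not explained in detail here.

[Figure: Prometheus monitoring system architecture]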

1. Deploying the Deployment (workload) and the Service

The following YAML can be used as a reference:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka-exporter
  namespace: prometheus-exporter
  labels:
    app: kafka-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kafka-exporter
  template:
    metadata:
      labels:
        app: kafka-exporter
    spec:
      containers:
      - name: kafka-exporter
        image: danielqsj/kafka-exporter:v1.6.0
        imagePullPolicy: IfNotPresent
        args: ["--kafka.server=kafkaCluster.monitorsoftware:9092"]
        ports:
        - containerPort: 9308
          name: http

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: kafka-exporter
  name: kafka-exporter
  namespace: prometheus-exporter
spec:
  type: ClusterIP
  ports:
  - name: http
    port: 9308
    protocol: TCP
    targetPort: 9308
  selector:
    app: kafka-exporter

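Notes:

1> For the meaning of the metrics exposed by Kafka Exporter, see the project documentation: https://github.com/danielqsj/kafka_exporter

2> Choose the Kafka Exporter image version that fits your needs. The image repository is here: https://hub.docker.com/r/danielqsj/kafka-exporter/tags

3> After a successful deployment, the resources look like this:

(1) Deployment (workload)

(2) Service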
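
The Deployment above passes a single `--kafka.server` argument. As a hedged sketch of how the container arguments might look for a multi-broker cluster, the fragment below repeats `--kafka.server` once per broker and narrows the exported topics and consumer groups with `--topic.filter` and `--group.filter` (both take regular expressions). The broker host names and the `orders` topic prefix are hypothetical placeholders, not values from this article; check the flag list in the Kafka Exporter README for the version you run.

```yaml
# Fragment of spec.template.spec in the kafka-exporter Deployment.
# Host names and filter patterns are illustrative assumptions.
containers:
- name: kafka-exporter
  image: danielqsj/kafka-exporter:v1.6.0
  imagePullPolicy: IfNotPresent
  args:
  - --kafka.server=kafka-0.example.internal:9092   # hypothetical broker
  - --kafka.server=kafka-1.example.internal:9092   # hypothetical broker
  - --kafka.server=kafka-2.example.internal:9092   # hypothetical broker
  - '--topic.filter=^orders\..*'                   # only topics starting with "orders."
  - '--group.filter=.*'                            # all consumer groups (the default)
  ports:
  - containerPort: 9308
    name: http
```

Listing several brokers lets the exporter keep working when one of them is unreachable at startup.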

2. Creating the ServiceMonitor

The YAML configuration is as follows:

---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: kafka-exporter
  name: prometheus-kafka-exporter
  namespace: prometheus-exporter
spec:
  endpoints:
    - honorLabels: true
      interval: 1m
      path: /metrics
      port: http
      scheme: http
      params:
        target:
          - 'kafkaCluster.monitorsoftware:9092'
      relabelings:
        - sourceLabels: [__param_target]
          targetLabel: instance
  namespaceSelector:
    matchNames:
      - prometheus-exporter
  selector:
    matchLabels:
      app: kafka-exporter

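Notes:

1> Prometheus Operator discovers monitoring targets through ServiceMonitor objects and scrapes them. A ServiceMonitor describes how metrics are collected from a Service.

  • Through a ServiceMonitor, Prometheus Operator automatically identifies Services that carry certain labels and scrapes metrics from them.
  • ServiceMonitor objects themselves are also discovered automatically by Prometheus Operator.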
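
Which ServiceMonitor and PrometheusRule objects get picked up is controlled by selectors on the `Prometheus` custom resource. The fragment below is a sketch to compare against your existing object, not a complete spec to apply. The name `k8s` and the namespace `monitoring` are assumptions based on a typical kube-prometheus installation; the rule labels mirror the ones used by the PrometheusRule in this article.

```yaml
# Fragment of an existing Prometheus custom resource (other fields omitted).
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s              # assumed name
  namespace: monitoring  # assumed namespace
spec:
  # {} selects every namespace; if the field is omitted, only the
  # namespace of this Prometheus object is searched.
  serviceMonitorNamespaceSelector: {}
  # {} selects every ServiceMonitor in the selected namespaces.
  serviceMonitorSelector: {}
  ruleNamespaceSelector: {}
  # Only PrometheusRule objects carrying both labels are loaded.
  ruleSelector:
    matchLabels:
      prometheus: k8s
      role: alert-rules
```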

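2> The Prometheus monitoring flow is as follows:

[Figure: Prometheus monitoring flow]

3> After a successful deployment, the resources look like this:

(1) ServiceMonitor

(2) Prometheus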
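
If the exporter does not show up as a target in Prometheus, one possible cause is that the Prometheus service account is not allowed to list Services, Endpoints, and Pods in the `prometheus-exporter` namespace. A minimal sketch of a Role and RoleBinding that grants this, assuming Prometheus runs as the `prometheus-k8s` service account in the `monitoring` namespace (the kube-prometheus defaults; adjust both to your installation):

```yaml
---
# Read-only access to the objects Prometheus needs for service discovery.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-k8s
  namespace: prometheus-exporter
rules:
- apiGroups: [""]
  resources: ["services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-k8s
  namespace: prometheus-exporter
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prometheus-k8s
subjects:
- kind: ServiceAccount
  name: prometheus-k8s    # assumed service account name
  namespace: monitoring   # assumed Prometheus namespace
```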

3. Configuring Prometheus Alerting Rules

PrometheusRule configuration:

---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: kafka-exporter-rules
  namespace: prometheus-exporter
spec:
  groups:
    - name: kafka-exporter
      rules:
        - alert: KafkaConsumerGroupLag
          # Fires when the total lag of a consumer group stays above 1000 messages.
          expr: sum(kafka_consumergroup_lag) by (consumergroup,namespace,instance) > 1000
          for: 1m
          labels:
            severity: critical
          annotations:
            summary: "Kafka consumer group lag (consumergroup {{ $labels.consumergroup }} in instance {{ $labels.instance }})"
            description: "Kafka consumer group lag\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
        - alert: KafkaBrokerCountLow
          # Adjust 3 to the expected number of brokers in your cluster.
          expr: kafka_brokers < 3
          for: 3m
          labels:
            severity: critical
          annotations:
            summary: "Some Kafka cluster brokers have stopped, please handle this as soon as possible! (instance {{ $labels.instance }})"
            description: "{{ $labels.instance }}: the number of Kafka cluster brokers has decreased"

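Notes:

1> PrometheusRule definitions can be based on existing rule templates, available here: https://awesome-prometheus-alerts.grep.to/rules#kafka

2> After a successful deployment, the rules look like this: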
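
Two further rules are often useful alongside the ones above: one that fires when the exporter itself stops being scraped, and one for under-replicated partitions. The sketch below is an assumption-laden example, not part of the original setup. It assumes the scrape job is named `kafka-exporter` (the Service name, which Prometheus Operator uses as the job label when the ServiceMonitor sets no `jobLabel`), and the object name `kafka-exporter-extra-rules`, the alert names, and the severities are hypothetical.

```yaml
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: kafka-exporter-extra-rules   # hypothetical name
  namespace: prometheus-exporter
spec:
  groups:
    - name: kafka-exporter-extra
      rules:
        # No healthy kafka-exporter target: either the scrape fails
        # or the target has disappeared entirely.
        - alert: KafkaExporterDown
          expr: absent(up{job="kafka-exporter"} == 1)
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "No healthy kafka-exporter target is being scraped"
        # At least one partition has fewer in-sync replicas than configured.
        - alert: KafkaUnderReplicatedPartitions
          expr: sum(kafka_topic_partition_under_replicated_partition) by (topic) > 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Topic {{ $labels.topic }} has under-replicated partitions"
```

Without an exporter-level alert, the lag and broker-count rules stay silent whenever the metrics themselves are missing.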

4. Grafana Dashboard

4.1 Grafana dashboards are available at: https://grafana.com/grafana/dashboards

        The dashboard ID recommended by the Kafka Exporter project is 7589.

4.2 The dashboard looks like this:
