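Before we begin

I hesitated over whether to publish this article, because the code in this project cannot be used as-is and has to be modified. (Backend API monitoring is inherently tied to the business, so no off-the-shelf product is likely to monitor every business's APIs.) The article also touches on quite a few topics: Prometheus, Prometheus-Operator, K8S, and Python. If you already have this background and are willing to spend the time, please read on. If you follow the steps and run into problems, leave a comment and I will reply.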

Introduction:

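My company has a project with a separate front end and back end. The front end uses Vue; the back end uses the Java Spring stack and exposes RESTful APIs. To detect back-end failures immediately and to measure back-end API response times, I wrote an exporter with Python and prometheus_client (the Python SDK), connected it to Prometheus, and configured alerts. When the back end fails, we find out right away instead of waiting for users to notice.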
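The core of such an exporter is a probe that calls one API and records two numbers: whether the call succeeded and how long it took. The sketch below is only an illustration using the standard library, not the code from the repository. The function name probe, treating any 2xx status as healthy, and the default header name are assumptions.

```python
import time
import urllib.error
import urllib.request


def probe(url, token, token_header="authKey", timeout=15):
    """Call one API and return (status, seconds).

    status is 1 when the call returns a 2xx response and 0 otherwise,
    which is an assumed definition of "healthy" for this sketch.
    """
    request = urllib.request.Request(url, headers={token_header: token})
    started = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            healthy = 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # HTTP 4xx/5xx, refused connections and timeouts all count as down.
        healthy = False
    return (1 if healthy else 0), time.monotonic() - started
```

An exporter would call a function like this for every monitored API on a fixed interval and publish the two values as gauges.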

The back-end APIs authenticate requests with a token carried in a request header; the token is returned when the user first logs in.
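Because a token usually stays valid for a long time, a monitor does not need to log in before every check. One possible way to reuse it is sketched below: keep the token, verify it before use, and log in again only when verification fails. The class name TokenCache and the two callables are hypothetical; the real login and verify requests depend on your back end.

```python
class TokenCache:
    """Keep one token and log in again only when verification fails."""

    def __init__(self, login, verify):
        self._login = login    # callable() -> token string (hypothetical)
        self._verify = verify  # callable(token) -> bool (hypothetical)
        self._token = None

    def get(self):
        """Return a token that passed verification or was just obtained."""
        if self._token is None or not self._verify(self._token):
            self._token = self._login()
        return self._token
```

Passing the login and verify steps in as callables keeps the caching logic independent of how the HTTP requests are made.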

PS: The front end runs in an nginx container and is already monitored with the http module of blackbox.

Implementation steps:

1. Clone the code locally, or import it with PyCharm

git clone https://gitee.com/kevinliu_CQ/api-monitoring.git

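2. Modify the configuration in the code

a. Edit the app.yaml file under config_dir. The domain names and API paths all need to be changed. My project is used as the example here; see the comments for what each field means.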

config:
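     testset: 'APP API monitoring'  # descriptive only, not used by the code
     timeout: 15 # timeout for API calls
     scrape_interval: 15 # interval between API checks, in seconds
     base_url: 'https://app.×××.com/app' # base URL of the APIs; my project uses microservices, and "app" is this microservice's URL prefix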
token:
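     base_url: 'https://app.××××.com/app' # base URL used to obtain the token
     url: '/login/pwdLogin' # path of the endpoint that returns the token
     method: "POST" # HTTP method of the token request
     body: '' # payload of the POST request that obtains the token
     params: {"mobile": "17320491234","pwd": "your_password"} # query parameters in the request URL; HTTP parameters can go in the URL or in the payload, and my application puts them in the URL
     headers: {} # headers of the POST request
     token_key_in_request: 'authKey' # name of the header the back end reads the token from when authenticating a request; illustrated with a screenshot later
     token_key_in_response: 'token' # name of the token field in the data returned after the user's first login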
verify:
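    base_url: 'https://app.****.com/app' # base URL used to verify the token; tokens usually stay valid for a long time, so there is no need to fetch a new one for every check
    url: '/user/userAddress/list' # path of the endpoint used to verify the token
    method: "GET" # HTTP method used to verify the token
    token_key_in_request: 'authKey' # name of the header that carries the token
    headers: {'Content-Type': 'application/json'} # headers of the HTTP request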
cases: # all API endpoints to monitor
- UserAddress:  # name of the API
    url: '/user/userAddress/list' # API path, relative to base_url in the config section
    method: "GET" # request method
    headers: {'Content-Type': 'application/json'} # HTTP headers
- UserBalance:
    url: '/user/getBalance'
    method: "GET"
    headers: {'Content-Type': 'application/json'}
- UserRecord:
    url: '/user/userRecord/getTypeList'
    method: "GET"
    headers: {'Content-Type': 'application/json'}
- UserBankCard:
    url: '/user/bankCard/payList'
    method: "GET"
    headers: {'Content-Type': 'application/json'}
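To make the relationship between base_url and the entries under cases concrete, here is a small sketch that turns the configuration into a list of requests to send. It assumes the YAML file has already been loaded into a Python dict by a YAML parser; the function name build_targets and the example.com host are placeholders, not part of the project.

```python
def build_targets(config):
    """Return (name, method, full_url) for every entry under `cases`."""
    base_url = config["config"]["base_url"].rstrip("/")
    targets = []
    for case in config["cases"]:
        # Each list item is a one-key mapping: {name: {url, method, headers}}.
        for name, spec in case.items():
            targets.append((name, spec["method"].upper(), base_url + spec["url"]))
    return targets


# Hypothetical data in the same shape as app.yaml after loading.
EXAMPLE = {
    "config": {"base_url": "https://app.example.com/app", "timeout": 15},
    "cases": [
        {"UserAddress": {"url": "/user/userAddress/list", "method": "GET"}},
        {"UserBalance": {"url": "/user/getBalance", "method": "GET"}},
    ],
}
```

Each resulting name is what shows up later as the apiModule label on the metrics.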

b. To monitor the APIs of additional microservices, just add another file, such as my im.yaml, where im is another microservice.
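One file per microservice works naturally if the exporter scans the configuration directory at startup. The sketch below shows one way this could be done; whether the repository's loader works exactly like this is not shown in this article, and the function name discover_configs is an assumption.

```python
from pathlib import Path


def discover_configs(config_dir):
    """Return {service_name: path} for every .yaml file directly under config_dir."""
    paths = sorted(Path(config_dir).glob("*.yaml"))
    # The file name without its extension doubles as the service name,
    # for example app.yaml -> "app" and im.yaml -> "im".
    return {path.stem: path for path in paths}
```

With this layout, adding a new microservice only requires dropping a new file into the directory.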

3. Build the Docker image. I push mine to a private Harbor registry.

docker build -t harbor.***.work/monitor/api-monitoring:latest .
docker push harbor.***.work/monitor/api-monitoring:latest

4. Change the image address in k8s.yaml so that it matches the one used in step 3

5. Deploy the service to k8s

kubectl apply -f k8s.yaml

6. Check the running pod and service

[root@controller ~]# kubectl get po -n monitoring
NAME                                READY   STATUS    RESTARTS   AGE
api-monitoring-5584fb9bbb-l2xb8     1/1     Running   0          32h
[root@controller ~]# kubectl get svc -n monitoring
NAME               TYPE        CLUSTER-IP        EXTERNAL-IP   PORT(S)    AGE
api-monitoring     ClusterIP   192.168.180.128   <none>        80/TCP     3d8h
[root@controller ~]#

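7. Check the data that Prometheus will collect. API and API_ResponseTime are the two metrics that indicate whether an API is available (1 for healthy, 0 for unhealthy) and its response time in seconds.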

[root@controller ~]# curl 192.168.180.128
# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 113122.0
python_gc_objects_collected_total{generation="1"} 10240.0
python_gc_objects_collected_total{generation="2"} 987.0
# HELP python_gc_objects_uncollectable_total Uncollectable object found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 7069.0
python_gc_collections_total{generation="1"} 642.0
python_gc_collections_total{generation="2"} 58.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="6",patchlevel="1",version="3.6.1"} 1.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 9.0904576e+08
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 3.485696e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.5912509406e+09
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 967.65
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 7.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP request_process_sencods Time spent processing request
# TYPE request_process_sencods summary
request_process_sencods_count 15472.0
request_process_sencods_sum 637.3630767939612
# HELP request_process_sencods_created Time spent processing request
# TYPE request_process_sencods_created gauge
request_process_sencods_created 1.5912509420943658e+09
# HELP API API Status
# TYPE API gauge
API{apiModule="IMFriend"} 1.0
API{apiModule="IMQrcode"} 1.0
API{apiModule="UserRecord"} 1.0
API{apiModule="UserBankCard"} 1.0
API{apiModule="UserBalance"} 1.0
API{apiModule="UserAddress"} 1.0
# HELP API_ResponseTime API Response Time
# TYPE API_ResponseTime gauge
API_ResponseTime{apiModule="IMFriend"} 0.0
API_ResponseTime{apiModule="IMQrcode"} 0.0
API_ResponseTime{apiModule="UserRecord"} 0.0
API_ResponseTime{apiModule="UserBankCard"} 0.0
API_ResponseTime{apiModule="UserBalance"} 0.0
API_ResponseTime{apiModule="UserAddress"} 0.0
[root@controller ~]#
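Reading this output by eye gets tedious as the number of APIs grows. As a rough sketch, the two custom metrics can be pulled out of the text with a small parser; the function names below are assumptions, and the pattern only handles the single apiModule label seen in the output above.

```python
import re

# Matches lines such as: API{apiModule="UserBalance"} 1.0
SAMPLE_LINE = re.compile(r'^(API|API_ResponseTime)\{apiModule="([^"]+)"\}\s+(\S+)$')


def parse_api_metrics(text):
    """Return {module: {"API": float, "API_ResponseTime": float}}."""
    result = {}
    for line in text.splitlines():
        match = SAMPLE_LINE.match(line.strip())
        if match:
            metric, module, value = match.groups()
            result.setdefault(module, {})[metric] = float(value)
    return result


def failing_modules(text):
    """Return the sorted names of modules whose API gauge is not 1."""
    parsed = parse_api_metrics(text)
    return sorted(name for name, values in parsed.items() if values.get("API") != 1.0)
```

Feeding the curl output to failing_modules gives a quick list of the APIs that currently report a problem.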

8. Next, configure the monitoring target and the alert rules in Prometheus. Prometheus in my cluster is deployed with Prometheus-Operator, so I use a CRD to configure the alert rules.

kubectl create secret generic prometheus-operator-prometheus-scrape-confg --from-file=additional-scrape-configs.yaml --dry-run -oyaml -nmonitoring > additional.yaml && kubectl apply -f additional.yaml
kubectl apply -f alertrule.yaml
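The contents of alertrule.yaml are not reproduced in this article. As a hedged sketch of what such a rule could look like, the snippet below builds a PrometheusRule object as a Python dict. The alert names, thresholds, durations and the object name are all assumptions; only the metric names API and API_ResponseTime and the apiModule label come from the exporter output above. Depending on how your Prometheus resource selects rules, the object may also need specific metadata labels.

```python
def build_alert_rule(namespace="monitoring", name="api-monitoring-rules", labels=None):
    """Return a PrometheusRule manifest as a dict (illustrative values only)."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # Labels must match the ruleSelector of your Prometheus resource.
            "labels": dict(labels or {}),
        },
        "spec": {
            "groups": [
                {
                    "name": "api-monitoring",
                    "rules": [
                        {
                            "alert": "APIDown",  # hypothetical alert name
                            "expr": "API == 0",
                            "for": "1m",  # assumed duration
                            "labels": {"severity": "critical"},
                            "annotations": {
                                "summary": "API {{ $labels.apiModule }} is failing"
                            },
                        },
                        {
                            "alert": "APISlow",  # hypothetical alert name
                            "expr": "API_ResponseTime > 2",  # assumed threshold in seconds
                            "for": "5m",
                            "labels": {"severity": "warning"},
                            "annotations": {
                                "summary": "API {{ $labels.apiModule }} is slow"
                            },
                        },
                    ],
                }
            ]
        },
    }
```

Serializing the dict with json.dumps produces a JSON manifest, which kubectl apply -f accepts in the same way as YAML.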

9. In Prometheus, confirm that the monitoring target and the alert rules have taken effect

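10. Prometheus sends its alerts to Alertmanager; which notification channel to use is up to you. I configured phone-call alerts through Alibaba Cloud, using another service that I wrote myself.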
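The phone-call service itself is not part of this article. If you write a similar one, a common approach is to receive Alertmanager's webhook notifications, which arrive as a JSON document containing a list of alerts. The sketch below shows only the parsing side under that assumption; the function and class names are hypothetical, and the print call stands in for whatever notification API you use.

```python
import json
from http.server import BaseHTTPRequestHandler


def firing_messages(payload_bytes):
    """Return one short message per firing alert in a webhook payload."""
    payload = json.loads(payload_bytes)
    messages = []
    for alert in payload.get("alerts", []):
        if alert.get("status") != "firing":
            continue  # skip resolved alerts
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        summary = annotations.get("summary", labels.get("apiModule", ""))
        messages.append("%s: %s" % (labels.get("alertname", "unknown"), summary))
    return messages


class WebhookHandler(BaseHTTPRequestHandler):
    """Minimal receiver; replace print with a call to your notification service."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        for message in firing_messages(self.rfile.read(length)):
            print(message)  # placeholder for the real phone-call request
        self.send_response(200)
        self.end_headers()
```

To try it, serve WebhookHandler with http.server.HTTPServer on a port of your choice and point a webhook receiver in the Alertmanager configuration at that address.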

