Integrating Prometheus + Grafana into Spring Boot Microservices for Monitoring and Alerting
Parts of this article are adapted from the original by Richard_Yi: [Integrating Prometheus + Grafana into Spring Boot Microservices for Monitoring and Alerting](https://segmentfault.com/a/1190000021639286)
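1. Adding the Dependency
To integrate a Spring Boot application with Prometheus, add the micrometer-registry-prometheus dependency (alongside spring-boot-starter-actuator, which provides the Actuator endpoints).
<!-- Micrometer Prometheus registry -->
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>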
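Once this dependency is added, Spring Boot automatically configures a PrometheusMeterRegistry and a CollectorRegistry, which collect and export metrics in a format that Prometheus can scrape (the Metrics format described earlier).
All of this data is exposed at the Actuator /prometheus endpoint, which Prometheus can scrape periodically to collect the metrics.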
1.1 The Actuator /prometheus Endpoint
With the micrometer-registry-prometheus dependency in place, visiting http://localhost:8080/actuator/prometheus returns output like the following:
# HELP jvm_buffer_total_capacity_bytes An estimate of the total capacity of the buffers in this pool
# TYPE jvm_buffer_total_capacity_bytes gauge
jvm_buffer_total_capacity_bytes{id="direct",} 90112.0
jvm_buffer_total_capacity_bytes{id="mapped",} 0.0
# HELP tomcat_sessions_expired_sessions_total
# TYPE tomcat_sessions_expired_sessions_total counter
tomcat_sessions_expired_sessions_total 0.0
# HELP jvm_classes_unloaded_classes_total The total number of classes unloaded since the Java virtual machine has started execution
# TYPE jvm_classes_unloaded_classes_total counter
jvm_classes_unloaded_classes_total 1.0
# HELP jvm_buffer_count_buffers An estimate of the number of buffers in the pool
# TYPE jvm_buffer_count_buffers gauge
jvm_buffer_count_buffers{id="direct",} 11.0
jvm_buffer_count_buffers{id="mapped",} 0.0
# HELP system_cpu_usage The "recent cpu usage" for the whole system
# TYPE system_cpu_usage gauge
system_cpu_usage 0.0939447637893599
# HELP jvm_gc_max_data_size_bytes Max size of old generation memory pool
# TYPE jvm_gc_max_data_size_bytes gauge
jvm_gc_max_data_size_bytes 2.841116672E9
# ... (many more lines omitted)
These are all application monitoring metrics, organized in the Metrics format described earlier:
<metric name>{<label name>=<label value>, ...}
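To make the format concrete, here is a minimal Java sketch that splits one sample line into its metric name, labels, and value. The class `MetricLineParser` and its nested `Sample` type are hypothetical names made up for this illustration and are not part of Micrometer or Prometheus. It assumes lines without a trailing timestamp and does not unescape label values, so treat it as an aid for reading the output above rather than a complete parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Minimal sketch: parses one sample line of the Prometheus text format. */
final class MetricLineParser {

    /** Hypothetical holder for a parsed sample (not a library type). */
    static final class Sample {
        final String name;
        final Map<String, String> labels;
        final double value;

        Sample(String name, Map<String, String> labels, double value) {
            this.name = name;
            this.labels = labels;
            this.value = value;
        }

        @Override
        public String toString() {
            return name + labels + " = " + value;
        }
    }

    // <metric name>, an optional {label block}, whitespace, then the value.
    private static final Pattern LINE =
            Pattern.compile("^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\\{(.*)\\})?\\s+(\\S+)$");

    // One label pair: <label name>="<label value>" (escapes are kept as-is).
    private static final Pattern LABEL =
            Pattern.compile("([a-zA-Z_][a-zA-Z0-9_]*)=\"((?:[^\"\\\\]|\\\\.)*)\"");

    static Sample parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a sample line: " + line);
        }
        Map<String, String> labels = new LinkedHashMap<>();
        if (m.group(2) != null) {
            Matcher l = LABEL.matcher(m.group(2));
            while (l.find()) {
                labels.put(l.group(1), l.group(2));
            }
        }
        return new Sample(m.group(1), labels, parseValue(m.group(3)));
    }

    // The text format spells infinities as +Inf / -Inf, which parseDouble rejects.
    private static double parseValue(String raw) {
        switch (raw) {
            case "+Inf":
            case "Inf":
                return Double.POSITIVE_INFINITY;
            case "-Inf":
                return Double.NEGATIVE_INFINITY;
            default:
                return Double.parseDouble(raw);
        }
    }

    public static void main(String[] args) {
        // Two lines taken from the sample output above.
        System.out.println(parse("jvm_buffer_total_capacity_bytes{id=\"direct\",} 90112.0"));
        System.out.println(parse("system_cpu_usage 0.0939447637893599"));
    }
}
```

Lines starting with `#` (the HELP and TYPE comments) are not sample lines, so a caller would skip them before calling `parse`.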
2. Prometheus Configuration
Configure Prometheus to scrape the metrics exposed at /actuator/prometheus.
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']
  # demo job
  - job_name: 'springboot-actuator-prometheus-test' # job name
    metrics_path: '/actuator/prometheus' # path to scrape metrics from
    scrape_interval: 5s # scrape interval for this job
    basic_auth: # Spring Security basic auth
      username: 'actuator'
      password: 'actuator'
    static_configs:
      - targets: ['10.60.45.113:8080'] # instance address; the scheme defaults to http
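Before pointing Prometheus at the application, it can help to confirm that the endpoint and the basic auth credentials work. The Java sketch below builds roughly the same HTTP request that one scrape of the demo job would send, using only the JDK's `java.net.http` client (Java 11+). The class `ScrapeRequestSketch` and its method names are hypothetical, chosen for this illustration; the target, path, and credentials are the ones from the configuration above. It is a manual check under those assumptions, not how Prometheus itself is implemented.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.Base64;

/** Minimal sketch: reproduces one scrape of the demo job by hand. */
final class ScrapeRequestSketch {

    /** Builds the HTTP Basic "Authorization" header value for the given credentials. */
    static String basicAuthHeader(String username, String password) {
        String raw = username + ":" + password;
        return "Basic " + Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    /** Builds (but does not send) a GET request for scheme http + target + metrics_path. */
    static HttpRequest buildScrapeRequest(String target, String metricsPath,
                                          String username, String password) {
        return HttpRequest.newBuilder(URI.create("http://" + target + metricsPath))
                .timeout(Duration.ofSeconds(10)) // mirrors the default scrape_timeout of 10s
                .header("Authorization", basicAuthHeader(username, password))
                .GET()
                .build();
    }

    /** Sends the request and returns the body; the target must be reachable. */
    static String scrapeOnce(HttpRequest request) throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("Unexpected status: " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = buildScrapeRequest(
                "10.60.45.113:8080", "/actuator/prometheus", "actuator", "actuator");
        System.out.println(request.method() + " " + request.uri());
        // Only touch the network when explicitly asked to.
        if (args.length > 0 && args[0].equals("--send")) {
            System.out.println(scrapeOnce(request));
        }
    }
}
```

Running it with `--send` performs the actual request. A 401 response would suggest the credentials do not match the Spring Security configuration, while a 404 would suggest the prometheus endpoint is not exposed by Actuator.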