Service Discovery
File-based service discovery
Existing configuration:
[root@mcw03 ~]# cat /etc/prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules/node_rules.yml"
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']
  - job_name: 'agent1'
    static_configs:
    - targets: ['10.0.0.14:9100','10.0.0.12:9100']
  - job_name: 'promserver'
    static_configs:
    - targets: ['10.0.0.13:9100']
  - job_name: 'server_mariadb'
    static_configs:
    - targets: ['10.0.0.13:9104']
  - job_name: 'docker'
    static_configs:
    - targets: ['10.0.0.12:8080']
    metric_relabel_configs:
    - regex: 'kernelVersion'
      action: labeldrop
[root@mcw03 ~]#
Replace static_configs with file_sd_configs.
refresh_interval controls how often the target files are re-read, so after editing them there is no need to reload Prometheus manually.
Create the directories and modify the configuration to point at the target files.
The stanza highlighted below is wrong: under files you give the file path (glob) directly; no targets key is needed.
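For reference, the intended shape of a file_sd_configs stanza: files takes a plain list of path globs, and refresh_interval sits beside it (a minimal sketch using the job name and paths from this walkthrough):

```yaml
- job_name: 'agent1'
  file_sd_configs:
  - files:
    - targets/nodes/*.json   # one glob pattern per list entry, no "targets" key
    refresh_interval: 5m     # how often the matched files are re-read
```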
[root@mcw03 ~]# ls /etc/prometheus.yml
/etc/prometheus.yml
[root@mcw03 ~]# mkdir -p /etc/targets/{nodes,docker}
[root@mcw03 ~]# vim /etc/prometheus.yml
[root@mcw03 ~]# cat /etc/prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules/node_rules.yml"
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']
  - job_name: 'agent1'
    file_sd_configs:
    - files:
      - targets: targets/nodes/*.json
      refresh_interval: 5m
  - job_name: 'promserver'
    static_configs:
    - targets: ['10.0.0.13:9100']
  - job_name: 'server_mariadb'
    static_configs:
    - targets: ['10.0.0.13:9104']
  - job_name: 'docker'
    file_sd_configs:
    - files:
      - targets: targets/docker/*.json
      refresh_interval: 5m
#    metric_relabel_configs:
#    - regex: 'kernelVersion'
#      action: labeldrop
[root@mcw03 ~]#
Create the target files
[root@mcw03 ~]# touch /etc/targets/nodes/nodes.json
[root@mcw03 ~]# touch /etc/targets/docker/daemons.json
[root@mcw03 ~]#
Move the targets into the JSON files
[root@mcw03 ~]# vim /etc/targets/nodes/nodes.json
[root@mcw03 ~]# vim /etc/targets/docker/daemons.json
[root@mcw03 ~]# cat /etc/targets/nodes/nodes.json
[{
  "targets": [
    "10.0.0.14:9100",
    "10.0.0.12:9100"
  ]
}]
[root@mcw03 ~]# cat /etc/targets/docker/daemons.json
[{
  "targets": [
    "10.0.0.12:8080"
  ]
}]
[root@mcw03 ~]#
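The file format above is easy to consume or sanity-check from a script. A minimal sketch (the function name is my own; the format is the file_sd convention: a JSON list of groups, each with a mandatory "targets" list and an optional "labels" mapping):

```python
import json

def load_targets(path):
    """Collect all scrape targets from a file_sd JSON target file.

    The file holds a list of target groups; each group has a
    mandatory "targets" list and may carry an optional "labels" map.
    """
    with open(path) as f:
        groups = json.load(f)
    targets = []
    for group in groups:
        targets.extend(group["targets"])
    return targets
```

Running it against the nodes.json above would return the two node_exporter addresses.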
The reload reports an error
[root@mcw03 ~]# curl -X POST http://localhost:9090/-/reload
failed to reload config: couldn't load configuration (--config.file="/etc/prometheus.yml"): parsing YAML file /etc/prometheus.yml: yaml: unmarshal errors:
  line 34: cannot unmarshal !!map into string
  line 45: cannot unmarshal !!map into string
[root@mcw03 ~]#
The configuration above was written incorrectly; fix it:
[root@mcw03 ~]# vim /etc/prometheus.yml
[root@mcw03 ~]# cat /etc/prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules/node_rules.yml"
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']
  - job_name: 'agent1'
    file_sd_configs:
    - files:
      - targets/nodes/*.json
      refresh_interval: 5m
  - job_name: 'promserver'
    static_configs:
    - targets: ['10.0.0.13:9100']
  - job_name: 'server_mariadb'
    static_configs:
    - targets: ['10.0.0.13:9104']
  - job_name: 'docker'
    file_sd_configs:
    - files:
      - targets/docker/*.json
      refresh_interval: 5m
#    metric_relabel_configs:
#    - regex: 'kernelVersion'
#      action: labeldrop
[root@mcw03 ~]# curl -X POST http://localhost:9090/-/reload
[root@mcw03 ~]#
At this point the service-discovery page shows the discovered targets:
http://10.0.0.13:9090/service-discovery
Switching the docker target file to YAML format
[root@mcw03 ~]# vim /etc/prometheus.yml
[root@mcw03 ~]# cat /etc/prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules/node_rules.yml"
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']
  - job_name: 'agent1'
    file_sd_configs:
    - files:
      - targets/nodes/*.json
      refresh_interval: 5m
  - job_name: 'promserver'
    static_configs:
    - targets: ['10.0.0.13:9100']
  - job_name: 'server_mariadb'
    static_configs:
    - targets: ['10.0.0.13:9104']
  - job_name: 'docker'
    file_sd_configs:
    - files:
      - targets/docker/*.yml
      refresh_interval: 5m
#    metric_relabel_configs:
#    - regex: 'kernelVersion'
#      action: labeldrop
[root@mcw03 ~]# cp /etc/targets/docker/daemons.json /etc/targets/docker/daemons.yml
[root@mcw03 ~]# vim /etc/targets/docker/daemons.yml
[root@mcw03 ~]# cat /etc/targets/docker/daemons.yml
- targets:
  - "10.0.0.12:8080"
[root@mcw03 ~]#
[root@mcw03 ~]# curl -X POST http://localhost:9090/-/reload
[root@mcw03 ~]#
After the reload everything is normal again.
The labels show where each automatically discovered target came from.
Because the target lists are plain YAML or JSON data, they can be generated and centrally managed by tools such as Salt or a CMDB, turning monitoring configuration into ordinary configuration management.
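As a sketch of that idea, the snippet below renders file_sd JSON target files from an inventory structure; the group names, hosts, and datacenter values are made-up examples of what a CMDB or Salt export might provide:

```python
import json
import os

# Hypothetical inventory, e.g. exported from a CMDB or rendered by Salt.
INVENTORY = {
    "nodes":  {"datacenter": "mcwhome", "hosts": ["10.0.0.14:9100", "10.0.0.12:9100"]},
    "docker": {"datacenter": "mcwhome", "hosts": ["10.0.0.12:8080"]},
}

def write_file_sd(inventory, base_dir):
    """Render each inventory group as a file_sd JSON target file."""
    for name, group in inventory.items():
        group_dir = os.path.join(base_dir, name)
        os.makedirs(group_dir, exist_ok=True)
        # one target group per file: a "targets" list plus shared "labels"
        doc = [{"targets": group["hosts"],
                "labels": {"datacenter": group["datacenter"]}}]
        with open(os.path.join(group_dir, name + ".json"), "w") as f:
            json.dump(doc, f, indent=2)
```

Running write_file_sd(INVENTORY, "/etc/targets") would recreate files equivalent to the nodes.json above; Prometheus then picks up changes on its own at each refresh_interval.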
Adding labels with file-based discovery
Modify the target files
[root@mcw03 ~]# vim /etc/targets/nodes/nodes.json
[root@mcw03 ~]# cat /etc/targets/nodes/nodes.json
[{
  "targets": [
    "10.0.0.14:9100",
    "10.0.0.12:9100"
  ],
  "labels": {
    "datacenter": "mcwhome"
  }
}]
[root@mcw03 ~]# vim /etc/targets/docker/daemons.yml
[root@mcw03 ~]# cat /etc/targets/docker/daemons.yml
- targets:
  - "10.0.0.12:8080"
- labels:
  "datacenter": "mcwymlhome"
[root@mcw03 ~]#
No service restart is needed: the new label appears automatically. However, the label added in the YAML file does not take effect, because in daemons.yml above the labels key is written as a separate list item instead of belonging to the same group as targets.
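In the file_sd format, targets and labels are keys of the same list element. A corrected daemons.yml would presumably look like this (my reconstruction of the intended result):

```yaml
- targets:
  - "10.0.0.12:8080"
  labels:
    datacenter: "mcwymlhome"
```

Note that labels is indented to align with targets, with no leading dash of its own.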
API-based service discovery
DNS-based service discovery