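I started out with the Prometheus Operator, which deploys everything in one step. Deployment really was simple, but it was inconvenient to manage and use day to day, so I stopped using it.
Later I set up Prometheus monitoring directly: I installed Prometheus on CentOS 7 only and used it to monitor process services, MySQL, and Docker containers. Grafana displays the results in a web page and Alertmanager sends the alerts, so all the functionality was more or less in place.
Later still I moved on to monitoring Kubernetes and found that, to monitor a k8s cluster, I had to deploy a Prometheus instance inside that cluster. So I deployed one there, together with node_exporter and kube-state-metrics, which monitor the host information of the cluster nodes and the state of pods and nodes, respectively.
Now to the main topic.
Compared with deploying on k8s, installing Prometheus on CentOS is entry-level. The detailed walkthrough of deploying Prometheus on k8s is in another article of mine (I still find the k8s deployment more convenient to use).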
I. Installing Prometheus on CentOS 7
Download page for the Prometheus components: https://prometheus.io/download/
1.1 Download prometheus-2.13.0.linux-amd64.tar.gz from the web page, or run:
wget -c https://github.com/prometheus/prometheus/releases/download/v2.13.0/prometheus-2.13.0.linux-amd64.tar.gz
1.2 Download node_exporter-0.18.1.linux-amd64.tar.gz from the web page, or run:
wget -c https://github.com/prometheus/node_exporter/releases/download/v0.18.1/node_exporter-0.18.1.linux-amd64.tar.gz
1.3 Download process-exporter-0.5.0.linux-amd64.tar.gz from the web page, or run:
wget -c https://github.com/ncabatoff/process-exporter/releases/download/v0.5.0/process-exporter-0.5.0.linux-amd64.tar.gz
1.4 Download alertmanager-0.19.0.linux-amd64.tar.gz from the web page, or run:
wget https://github.com/prometheus/alertmanager/releases/download/v0.19.0/alertmanager-0.19.0.linux-amd64.tar.gz
1.5 Download grafana-6.4.1.linux-amd64.tar.gz from the web page, or run:
wget https://dl.grafana.com/oss/release/grafana-6.4.1.linux-amd64.tar.gz
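The five downloads listed above can also be scripted. The following is a minimal sketch and not part of the original setup: the function names release_urls and download_all are made up for illustration, and /opt is assumed as the destination only because the extraction step reads the tarballs from there.

```shell
#!/usr/bin/env bash
# Sketch only: release_urls and download_all are illustrative names.

# Print the release URLs used in this article, one per line.
release_urls() {
  cat <<'EOF'
https://github.com/prometheus/prometheus/releases/download/v2.13.0/prometheus-2.13.0.linux-amd64.tar.gz
https://github.com/prometheus/node_exporter/releases/download/v0.18.1/node_exporter-0.18.1.linux-amd64.tar.gz
https://github.com/ncabatoff/process-exporter/releases/download/v0.5.0/process-exporter-0.5.0.linux-amd64.tar.gz
https://github.com/prometheus/alertmanager/releases/download/v0.19.0/alertmanager-0.19.0.linux-amd64.tar.gz
https://dl.grafana.com/oss/release/grafana-6.4.1.linux-amd64.tar.gz
EOF
}

# Download every tarball into the directory given as $1 (assumed default: /opt).
# -c resumes a partial download, -P sets the target directory.
download_all() {
  local dest="${1:-/opt}"
  local url
  for url in $(release_urls); do
    wget -c -P "$dest" "$url" || return 1
  done
}

# Usage (commented out so that sourcing this file downloads nothing):
# download_all /opt
```

Keeping the URL list in one function makes it easy to bump a version in a single place later.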
2. Extract
First create the prometheus, node_exporter, process_exporter, alertmanager, and grafana directories under /opt/ (the tar commands below expect the downloaded tarballs to be in /opt/):
mkdir -p /opt/prometheus
mkdir -p /opt/node_exporter
mkdir -p /opt/process_exporter
mkdir -p /opt/alertmanager
mkdir -p /opt/grafana
tar zxf /opt/prometheus-2.13.0.linux-amd64.tar.gz -C /opt/prometheus --strip-components=1
tar zxf /opt/node_exporter-0.18.1.linux-amd64.tar.gz -C /opt/node_exporter --strip-components=1
tar zxf /opt/process-exporter-0.5.0.linux-amd64.tar.gz -C /opt/process_exporter --strip-components=1
tar zxf /opt/alertmanager-0.19.0.linux-amd64.tar.gz -C /opt/alertmanager --strip-components=1
tar zxf /opt/grafana-6.4.1.linux-amd64.tar.gz -C /opt/grafana --strip-components=1
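Each mkdir/tar pair above follows the same pattern, so it can be wrapped in a small helper. This is a sketch under the same assumptions as the commands above (tarballs in /opt, one top-level directory inside each archive); extract_component is a hypothetical name, not something shipped with any of the tools.

```shell
#!/usr/bin/env bash
# Sketch only: extract_component is an illustrative helper name.

# Create the destination directory and unpack a tarball into it,
# dropping the archive's single top-level directory.
extract_component() {
  local tarball="$1" dest="$2"
  mkdir -p "$dest" && tar zxf "$tarball" -C "$dest" --strip-components=1
}

# Usage with the paths from this article (commented out; needs the tarballs in /opt):
# extract_component /opt/prometheus-2.13.0.linux-amd64.tar.gz /opt/prometheus
# extract_component /opt/node_exporter-0.18.1.linux-amd64.tar.gz /opt/node_exporter
```

The --strip-components=1 option is what lets the binaries land directly in /opt/prometheus rather than in /opt/prometheus/prometheus-2.13.0.linux-amd64.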
3. Configuration
vim /opt/prometheus/prometheus.yml
global:
  scrape_interval: 15s       # value was missing in the original; 15s is assumed here to match evaluation_interval
  evaluation_interval: 15s
alerting:
  alertmanagers:
  - static_configs:
    - targets: ['192.168.1.131:9093']
rule_files:
  - "rules.yml"
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
    - targets: ['192.168.1.131:9090']
  - job_name: 'node_exporter'
    static_configs:
    - targets: ['192.168.1.131:9100']
  - job_name: 'process'
    static_configs:
    - targets: ['192.168.1.131:9256']
vim /opt/prometheus/rules.yml (see more host monitoring rules)
groups:
- name: Process monitoring
  rules:
  - alert: docker_status
    expr: namedprocess_namegroup_num_procs{groupname="map[:dockerd]",job="process"} == 0
    for: 30s
    labels:
      area: A
    annotations:
      summary: "Docker daemon process on {{ $labels.instance }} is down"
- name: Host status alerts
  rules:
  - alert: host_status
    expr: up == 0
    for: 1m
    labels:
      status: critical
    annotations:
      summary: "{{$labels.instance}}: server is down"
      description: "{{$labels.instance}}: server has been unreachable for more than 1 minute"
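Before starting Prometheus it is worth validating both files, since a YAML indentation slip or a malformed expression only shows up at startup otherwise. The sketch below uses promtool, which ships in the same Prometheus tarball; the wrapper name check_prometheus_files is hypothetical, and the /opt/prometheus default is an assumption taken from the paths used in this article.

```shell
#!/usr/bin/env bash
# Sketch only: check_prometheus_files is an illustrative helper name.

# Validate prometheus.yml and rules.yml with the promtool binary from the
# Prometheus directory given as $1 (assumed default: /opt/prometheus).
check_prometheus_files() {
  local home="${1:-/opt/prometheus}"
  if [ ! -x "$home/promtool" ]; then
    echo "promtool not found in $home" >&2
    return 1
  fi
  "$home/promtool" check config "$home/prometheus.yml" &&
    "$home/promtool" check rules "$home/rules.yml"
}

# Usage:
# check_prometheus_files /opt/prometheus
```

If either check fails, the function returns a non-zero status, so it can gate a start script.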
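vim /opt/alertmanager/alertmanager.yml
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.exmail.qq.com:465'   # SMTP server (smarthost) used to send mail
  smtp_from: '[email protected]'              # sender address
  smtp_auth_username: '[email protected]'     # mailbox account name
  smtp_auth_password: 'Caitong12316'          # mailbox password or authorization code
  smtp_require_tls: false
route:
  group_by: ['alertname']   # labels used to group alerts
  group_wait: 30s           # how long to wait before sending the first notification for a new group
  group_interval: 5m        # how long to wait before notifying about new alerts added to an existing group
  repeat_interval: 2h       # how often to resend a notification for alerts that are still firing
  receiver: email
receivers:
- name: 'email'             # receiver name
  email_configs:            # email settings
  - to: '[email protected]'  # address that receives the alerts
    headers: { Subject: "[WARN] Alert email" }   # subject of the alert email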
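The Alertmanager configuration can be checked in the same spirit before the service is started. The sketch below relies on amtool, which is included in the Alertmanager tarball; check_alertmanager_config is a hypothetical wrapper name and /opt/alertmanager is assumed from the paths in this article.

```shell
#!/usr/bin/env bash
# Sketch only: check_alertmanager_config is an illustrative helper name.

# Validate alertmanager.yml with the amtool binary from the Alertmanager
# directory given as $1 (assumed default: /opt/alertmanager).
check_alertmanager_config() {
  local home="${1:-/opt/alertmanager}"
  if [ ! -x "$home/amtool" ]; then
    echo "amtool not found in $home" >&2
    return 1
  fi
  "$home/amtool" check-config "$home/alertmanager.yml"
}

# Usage:
# check_alertmanager_config /opt/alertmanager
```

This only validates the file's syntax and structure; it does not confirm that the SMTP credentials actually work.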
vim /opt/process_exporter/process_conf.yml
process_names:
  - name: "{{.Matches}}"
    cmdline:
    - 'dockerd'
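With name set to "{{.Matches}}" and a cmdline pattern that has no named capture groups, the group name appears to render as map[:dockerd], which is why the alert rule in rules.yml matches on that exact string. To confirm what the exporter really reports, the metric can be pulled out of its /metrics output. The helper below is a sketch: group_proc_count is a hypothetical name, and the sample line in the usage note is illustrative rather than captured output.

```shell
#!/usr/bin/env bash
# Sketch only: group_proc_count is an illustrative helper name.

# Read Prometheus text-format metrics on stdin and print the value of
# namedprocess_namegroup_num_procs for the group name given as $1.
# grep -F treats the brackets in "map[:dockerd]" literally.
group_proc_count() {
  local group="$1"
  grep -F 'namedprocess_namegroup_num_procs{' |
    grep -F "groupname=\"$group\"" |
    awk '{print $NF}'
}

# Usage against a running exporter (commented out; needs process-exporter on port 9256):
# curl -s http://192.168.1.131:9256/metrics | group_proc_count 'map[:dockerd]'
```

If the function prints nothing, the group name in the alert expression does not match what the exporter exposes and the docker_status rule can never fire.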
4. Start commands
# Run node_exporter in the background
nohup /opt/node_exporter/node_exporter > /opt/node_exporter/node_exporter.stdout 2>&1 &
# Run prometheus in the background
nohup /opt/prometheus/prometheus --config.file=/opt/prometheus/prometheus.yml > /opt/prometheus/prometheus.stdout 2>&1 &
# Run process-exporter in the background
nohup /opt/process_exporter/process-exporter -config.path /opt/process_exporter/process_conf.yml > /opt/process_exporter/process-exporter.stdout 2>&1 &
# Run alertmanager in the background
nohup /opt/alertmanager/alertmanager --config.file=/opt/alertmanager/alertmanager.yml > /opt/alertmanager/alertmanager.stdout 2>&1 &
# Run grafana in the background
nohup /opt/grafana/bin/grafana-server -homepath /opt/grafana > /opt/grafana/grafana.stdout 2>&1 &
5. Access:
http://192.168.1.131:9090 (Prometheus)
http://192.168.1.131:3000 (Grafana, default login admin/admin)
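Besides opening the two pages in a browser, a quick command-line check can confirm that every service came up. This is a sketch, assuming the default ports used throughout this article; health_urls and check_all are hypothetical helper names. It uses the /-/healthy endpoints of Prometheus and Alertmanager, Grafana's /api/health, and the /metrics pages of the two exporters.

```shell
#!/usr/bin/env bash
# Sketch only: health_urls and check_all are illustrative helper names.

# Print one health-check URL per service for the host given as $1.
health_urls() {
  local host="$1"
  echo "http://$host:9090/-/healthy"   # Prometheus
  echo "http://$host:9093/-/healthy"   # Alertmanager
  echo "http://$host:9100/metrics"     # node_exporter
  echo "http://$host:9256/metrics"     # process-exporter
  echo "http://$host:3000/api/health"  # Grafana
}

# Request each URL and report OK or FAIL; returns non-zero if any check failed.
check_all() {
  local host="$1" url failed=0
  for url in $(health_urls "$host"); do
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
      failed=1
    fi
  done
  return "$failed"
}

# Usage:
# check_all 192.168.1.131
```

A FAIL line usually points at the matching .stdout log file written by the nohup commands.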