ElastAlert installation, configuration, and verification guide

Installing, configuring, and verifying ElastAlert with Docker

created by fangchangtan | 2020/2/24

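1. ElastAlert use cases

ElastAlert serves as the alerting component for log keywords in an ELK stack. The basic flow: from the logs collected in ELK, it picks up the continuous heartbeat that a program emits and catches error-log keywords such as ERROR, which gives monitoring and alerting on the program's health and stability.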

2. Installing ElastAlert

2.1 Download the git repository

## Clone the repository
git clone https://github.com/bitsensor/elastalert.git
## Change into the directory
cd elastalert

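2.2 Test the ElastAlert Docker installation locally:

Change into the elastalert directory first (this is the officially recommended installation method).

# Start the elastalert container
# Note: with --net="host", Docker discards the -p port mapping; the server listens on host port 3030 directly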
sudo docker run --rm -p 3030:3030 \
    -v `pwd`/config/elastalert.yaml:/opt/elastalert/config.yaml \
    -v `pwd`/config/elastalert-test.yaml:/opt/elastalert/config-test.yaml \
    -v `pwd`/config/config.json:/opt/elastalert-server/config/config.json \
    -v `pwd`/rules:/opt/elastalert/rules \
    -v `pwd`/rule_templates:/opt/elastalert/rule_templates \
    --net="host" \
    --name elastalert-fct2 bitsensor/elastalert:2.0.0

Alternatively, the production installation (recommended):

# Production environment: start elastalert
docker run --rm \
--name fct-elastalert \
--net "host" \
-p 3030:3030 \
-v /data/poc/trial-production/myelastalert/elastalert/config/elastalert.yaml:/opt/elastalert/config.yaml \
-v /data/poc/trial-production/myelastalert/elastalert/config/config.json:/opt/elastalert-server/config/config.json \
-v /data/poc/trial-production/myelastalert/elastalert/rules:/opt/elastalert/rules \
-v /data/poc/trial-production/myelastalert/elastalert/rule_templates:/opt/elastalert/rule_templates \
-v /data/poc/trial-production/myelastalert/elastalert/config/smtp_auth.yaml:/opt/elastalert/config/smtp_auth.yaml \
-v /data/poc/trial-production/myelastalert/elastalert/server_data:/opt/elastalert/server_data \
-v /data/poc/trial-production/myelastalert/elastalert/logs:/opt/logs \
bitsensor/elastalert:2.0.0 
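
The long list of -v mounts is easy to get wrong by hand. As a minimal sketch, the command can be assembled from a single base directory. The helper name build_docker_run and the MOUNTS table are hypothetical; the container paths are the ones used in the command above. The -p flag is left out on purpose, because Docker discards published ports under host networking.

```python
from pathlib import PurePosixPath

# Host-relative path -> container path, copied from the docker run command above.
MOUNTS = {
    "config/elastalert.yaml": "/opt/elastalert/config.yaml",
    "config/config.json": "/opt/elastalert-server/config/config.json",
    "rules": "/opt/elastalert/rules",
    "rule_templates": "/opt/elastalert/rule_templates",
    "config/smtp_auth.yaml": "/opt/elastalert/config/smtp_auth.yaml",
    "server_data": "/opt/elastalert/server_data",
    "logs": "/opt/logs",
}


def build_docker_run(base_dir, name="fct-elastalert",
                     image="bitsensor/elastalert:2.0.0"):
    """Return the docker run argument list for one host base directory.

    Hypothetical helper for illustration; it only builds the list and
    does not start anything.
    """
    argv = ["docker", "run", "--rm", "--name", name, "--net", "host"]
    for host_rel, container_path in MOUNTS.items():
        host_path = PurePosixPath(base_dir) / host_rel
        argv += ["-v", f"{host_path}:{container_path}"]
    argv.append(image)
    return argv
```

The resulting list can be passed to subprocess.run(argv, check=True), or printed and compared against the hand-written command before running it.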

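2.3 Configure ElastAlert

The config.json file mainly sets the address of the ES instance to connect to, the paths of the rules and rule_templates directories, and the name of the ES index to write back to: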

{
  "appName": "elastalert-server",
  "port": 3030,
  "wsport": 3333,
  "elastalertPath": "/opt/elastalert",
  "verbose": false,
  "es_debug": false,
  "debug": false,
  "rulesPath": {
    "relative": true,
    "path": "/rules"
  },
  "templatesPath": {
    "relative": true,
    "path": "/rule_templates"
  },
  "es_host": "172.19.32.106",
  "es_port": 9202,
  "writeback_index": "elastalert_status"
}
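
Before mounting config.json into the container, a quick sanity check can catch a missing key or a JSON syntax error. This is a sketch only: the helper missing_keys is hypothetical, and EXPECTED_KEYS is simply the set of keys this walkthrough fills in, not an official list of required settings.

```python
import json

# Keys this walkthrough sets in config.json (an assumption, not an official schema).
EXPECTED_KEYS = (
    "port", "elastalertPath", "rulesPath", "templatesPath",
    "es_host", "es_port", "writeback_index",
)


def missing_keys(config_text):
    """Parse config.json text and return the expected keys that are absent.

    json.loads raises a ValueError subclass if the text is not valid JSON.
    """
    config = json.loads(config_text)
    return [key for key in EXPECTED_KEYS if key not in config]
```

Calling missing_keys(open("config/config.json").read()) on the file shown above should return an empty list.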

The elastalert.yaml configuration is as follows:

# The elasticsearch hostname for metadata writeback
# Note that every rule can have its own elasticsearch host
es_host: 172.19.32.106

# The elasticsearch port
es_port: 9202

# This is the folder that contains the rule yaml files
# Any .yaml file will be loaded as a rule
rules_folder: rules

# How often ElastAlert will query elasticsearch
# The unit can be anything from weeks to seconds
run_every:
  seconds: 5

# ElastAlert will buffer results from the most recent
# period of time, in case some log sources are not in real time
buffer_time:
  minutes: 1

# Optional URL prefix for elasticsearch
#es_url_prefix: elasticsearch

# Connect with TLS to elasticsearch
#use_ssl: True
use_ssl: False

# Verify TLS certificates
#verify_certs: True
verify_certs: False

# GET request with body is the default option for Elasticsearch.
# If it fails for some reason, you can pass 'GET', 'POST' or 'source'.
# See http://elasticsearch-py.readthedocs.io/en/master/connection.html?highlight=send_get_body_as#transport
# for details
#es_send_get_body_as: GET

# Option basic-auth username and password for elasticsearch
#es_username: someusername
#es_password: somepassword

# The index on es_host which is used for metadata storage
# This can be a unmapped index, but it is recommended that you run
# elastalert-create-index to set a mapping
writeback_index: elastalert_status

# If an alert fails for some reason, ElastAlert will retry
# sending the alert until this time period has elapsed
alert_time_limit:
  days: 2
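
The run_every and buffer_time settings are plain duration mappings. The sketch below gives a simplified picture of how they translate into time spans: each run happens run_every apart and looks back buffer_time from the current time. The function names are hypothetical, and the real scheduler handles more cases (for example, resuming from the end of a previous run), so treat this as an illustration of the units only.

```python
from datetime import datetime, timedelta


def to_timedelta(setting):
    """Convert a mapping such as {"seconds": 5} into a timedelta."""
    return timedelta(**setting)


def query_window(now, buffer_time):
    """Return the (start, end) window scanned by a run that ends at `now`.

    Simplified: assumes the window is exactly buffer_time long.
    """
    return now - to_timedelta(buffer_time), now
```

With the values above, to_timedelta({"seconds": 5}) is a five-second interval between runs, and each run covers the preceding minute.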

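There is also an elastalert-test.yaml file. It is used only when you test rules through the API; it lets you write to a different writeback index while testing different rules against different examples.

The smtp_auth.yaml file, which the rule file references through smtp_auth_file, is configured as follows: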

user: [email protected]
password: sdwtyx234

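Next, configure the alert rule in ElastAlert. The rule scans the specified ES index over the most recent 1 minute, and when the number of log messages matching the query filter reaches num_events (set to 1 in the rule file), it sends an alert email to [email protected].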
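
The counting logic behind a frequency rule can be pictured as a sliding window over event timestamps. The following is a minimal sketch of that idea, not ElastAlert's own code: the function name frequency_matches is hypothetical, and clearing the window after a match is an assumption made so that one burst produces one alert.

```python
from collections import deque
from datetime import datetime, timedelta


def frequency_matches(timestamps, num_events, timeframe):
    """Yield the timestamp of each event that completes num_events within timeframe."""
    window = deque()
    for ts in sorted(timestamps):
        window.append(ts)
        # Drop events that have fallen out of the sliding window.
        while window and ts - window[0] > timeframe:
            window.popleft()
        if len(window) >= num_events:
            yield ts
            window.clear()  # start counting afresh after a match
```

With num_events set to 1, every matching event yields an alert, which is why a single log line containing the keyword is enough to trigger an email in this setup.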

Below is the ElastAlert rule file /rules/tank-rules.yaml.

es_host: 172.19.32.106
es_port: 9202

# The rule name must be unique, otherwise an error is raised. Once defined, it becomes the subject of the alert email.
## (Required)
## Rule name, must be unique
name: fct-test-rule-name


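# Choose one rule type (how the data is checked): any, blacklist, whitelist, change, frequency, spike, flatline, new_term, cardinality
# any: alert on every match;
# blacklist: the value of the compare_key field matches any entry in the blacklist array;
# whitelist: the value of the compare_key field matches none of the entries in the whitelist array;
# change: for the same query_key, the value of the compare_key field changes within timeframe;
# frequency: for the same query_key, num_events filtered events occur within timeframe;
# spike: for the same query_key, the ratio between the data volumes of two consecutive timeframe windows exceeds spike_height. spike_type sets the direction: up, down, or both. threshold_ref sets a lower bound on the previous window's volume and threshold_cur a lower bound on the current window's volume; below those bounds no alert fires;
# flatline: within timeframe, the data volume is below threshold;
# new_term: a value appears in fields that is not among the terms_size (default 50) most common values seen during the preceding terms_window_size (default 30 days);
# cardinality: for the same query_key, the cardinality of cardinality_field within timeframe exceeds max_cardinality or falls below min_cardinality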
## (Required)
## Type of alert.
## the frequency rule type alerts when num_events events occur within timeframe
## frequency is used here; it requires that, for the same query_key, num_events filtered events occur within timeframe
type: frequency

# This is the index as seen in Kibana. Wildcards are supported, as are multiple indices; a bare * also works if you prefer.
## (Required)
## Index to search, wildcard supported
index: fct-logstash*

# If at least one event matches within the last 1 minute, the rule is satisfied and an alert fires
num_events: 1
timeframe:
    minutes: 1


# This is the key part: which keyword in the program's message should trigger an alert. It is an Elasticsearch query string, so operators such as AND and OR are supported.
filter:
- query:
    query_string:
      query: "UNKNOWN"

# The alert_text you define is shown in the email body
alert_text: "Hello, please reply to this email. Fang Changtan"

# Setup report smtp config 
smtp_host: smtp.163.com
smtp_port: 25
smtp_ssl: False

#SMTP auth
from_addr: [email protected]
email_reply_to: [email protected]
smtp_auth_file: /opt/elastalert/config/smtp_auth.yaml

# (Required)
# The alert to use when a match is found
alert:
- "email"

# (required, email specific)
# # a list of email addresses to send alerts to
email:
- "[email protected]"
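
To see what the rule's filter and timeframe amount to on the Elasticsearch side, the search can be pictured as a query_string clause combined with a time range. The sketch below shows only the approximate shape: the function build_search_body is hypothetical, the @timestamp field name is an assumption, and the exact body ElastAlert sends may differ.

```python
def build_search_body(query, start_iso, end_iso, timestamp_field="@timestamp"):
    """Build an Elasticsearch search body for one keyword and one time window.

    Illustrative only; not the exact request ElastAlert generates.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    # The keyword filter from the rule's filter section.
                    {"query_string": {"query": query}},
                    # The time window derived from timeframe.
                    {"range": {timestamp_field: {"gt": start_iso, "lte": end_iso}}},
                ]
            }
        }
    }
```

Serializing the result with json.dumps and running it against the index pattern in Kibana's Dev Tools is a quick way to confirm that the keyword actually matches documents before waiting for an alert.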
                         

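Note: you need to register a 163 mailbox and enable SMTP on it.

Mailbox account: [email protected]

Mailbox password: 221123.com

SMTP password: swtx234

Enabling SMTP allows a third-party client to log in to the mailbox. It is turned on in the 163 mailbox settings.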
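
Before starting the container, the SMTP credentials can be checked on their own with Python's standard smtplib. This is a minimal sketch: check_smtp_login is a hypothetical helper, the smtp_factory parameter exists only so the function can be exercised without a network, and whether the server accepts a login on a plain port 25 connection depends on the mail provider.

```python
import smtplib


def check_smtp_login(host, port, user, password,
                     smtp_factory=smtplib.SMTP, timeout=10):
    """Try to log in to an SMTP server.

    Returns (True, "") on success, otherwise (False, reason).
    """
    try:
        with smtp_factory(host, port, timeout=timeout) as client:
            client.login(user, password)
        return True, ""
    except (smtplib.SMTPException, OSError) as exc:
        # Covers refused or dropped connections as well as rejected credentials.
        return False, f"{type(exc).__name__}: {exc}"
```

For example, check_smtp_login("smtp.163.com", 25, user, password) with the values from smtp_auth.yaml shows whether a failure comes from the connection or from the credentials.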

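2.4 Restart ElastAlert to apply the configuration

Finally, restart ElastAlert so that the new configuration takes effect.

On the local test host (106), ElastAlert is run with the following command: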

docker run --rm \
--name fct-elastalert \
--net "host" \
-p 3030:3030 \
-v /data/poc/trial-production/myelastalert/elastalert/config/elastalert.yaml:/opt/elastalert/config.yaml \
-v /data/poc/trial-production/myelastalert/elastalert/config/config.json:/opt/elastalert-server/config/config.json \
-v /data/poc/trial-production/myelastalert/elastalert/rules:/opt/elastalert/rules \
-v /data/poc/trial-production/myelastalert/elastalert/rule_templates:/opt/elastalert/rule_templates \
-v /data/poc/trial-production/myelastalert/elastalert/config/smtp_auth.yaml:/opt/elastalert/config/smtp_auth.yaml \
-v /data/poc/trial-production/myelastalert/elastalert/server_data:/opt/elastalert/server_data \
-v /data/poc/trial-production/myelastalert/elastalert/logs:/opt/logs \
bitsensor/elastalert:2.0.0 

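3. Verify email delivery (local test)

3.1 Start logstash to send test data

To verify the alerting, start logstash so that it sends test data to ES.

On 172.19.32.67, start logstash locally for the test.

It consumes log data from Kafka, filters it through logstash, and sends it to the fct-logstash_* indices in Elasticsearch: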

docker run \
--rm \
--name fct-alert-logstash \
-p 5047:5044 \
-v /root/fct/logstash-test/logstash_kafka.conf:/logstash/logstash_kafka.conf \
-v /root/fct/logstash-test/logstash.yml:/usr/share/logstash/config/logstash.yml \
registry.marathon.l4lb.thisdcos.directory:5000/logstash:6.6.1 \
logstash -f /logstash/logstash_kafka.conf

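3.2 What success looks like

[Image: screenshot of the result]

If you see the result shown in the screenshot, the email was sent successfully.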

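3.3 Common errors

The following errors can appear in the log of the ElastAlert service after it starts.

3.3.1 Error 1: cannot connect to the 163 mail service.

At run time the log reports that the mail configuration is incorrect; the mail connection must be configured correctly: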

15:43:43.085Z  INFO elastalert-server: Router:  Listening for GET request on /mapping/:index.
15:43:43.085Z  INFO elastalert-server: Router:  Listening for POST request on /search/:index.
15:43:43.090Z  INFO elastalert-server: ProcessController:  Starting ElastAlert
15:43:43.090Z  INFO elastalert-server: ProcessController:  Creating index
15:43:43.980Z  INFO elastalert-server:
    ProcessController:  Elastic Version:6
    Mapping used for string:{'type': 'keyword'}
    Index elastalert_status already exists. Skipping index creation.
    
15:43:43.980Z  INFO elastalert-server: ProcessController:  Index create exited with code 0
15:43:43.981Z  INFO elastalert-server: ProcessController:  Starting elastalert with arguments [none]
15:43:43.991Z  INFO elastalert-server: ProcessController:  Started Elastalert (PID: 50)
15:43:43.992Z  INFO elastalert-server: Server:  Server listening on port 3030
15:43:43.993Z  INFO elastalert-server: Server:  Websocket listening on port 3333
15:43:43.994Z  INFO elastalert-server: Server:  Server started
15:44:04.860Z ERROR elastalert-server:
    ProcessController:  ERROR:root:Error while running alert email: Error connecting to SMTP host: Connection unexpectedly closed
    
15:48:06.886Z ERROR elastalert-server:
    ProcessController:  WARNING:elasticsearch:GET http://172.19.32.106:9202/elastalert_status/elastalert/_search?size=10000 [status:400 request:0.012s]
    
15:48:06.886Z ERROR elastalert-server:
    ProcessController:  ERROR:root:Error fetching aggregated matches: RequestError(400, u'search_phase_execution_exception', u'parse_exception: Encountered " "-" "- "" at line 1, column 13.\nWas expecting one of:\n    <BAREOPER> ...\n    "(" ...\n    "*" ...\n    <QUOTED> ...\n    <TERM> ...\n    <PREFIXTERM> ...\n    <WILDTERM> ...\n    <REGEXPTERM> ...\n    "[" ...\n    "{" ...\n    <NUMBER> ...\n    ')
    
15:48:26.972Z ERROR elastalert-server:
    ProcessController:  ERROR:root:Error while running alert email: Error connecting to SMTP host: Connection unexpectedly closed

This error means the connection to the mail server failed; check that the configuration file is correct.
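
In a long server log the ERROR entries are easy to miss among the INFO lines. As a sketch, they can be pulled out with a small parser that follows the layout of the excerpt above: a header line with time, level, and elastalert-server:, followed by indented continuation lines. The function error_entries and the HEADER pattern are hypothetical and based only on that excerpt.

```python
import re

# Header layout taken from the log excerpt above: "<time>Z  <LEVEL> elastalert-server: ..."
HEADER = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d{3}Z)\s+(\w+)\s+elastalert-server:\s*(.*)$"
)


def error_entries(log_text):
    """Return (time, message) pairs for ERROR entries, joining continuation lines."""
    entries = []
    current = None
    for line in log_text.splitlines():
        match = HEADER.match(line)
        if match:
            time, level, rest = match.groups()
            current = [time, rest] if level == "ERROR" else None
            if current is not None:
                entries.append(current)
        elif current is not None and line.strip():
            # Indented line belonging to the current ERROR entry.
            current[1] = (current[1] + " " + line.strip()).strip()
    return [(time, message) for time, message in entries]
```

Filtering the returned messages for "SMTP" then separates mail problems from Elasticsearch query errors such as the parse_exception shown above.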

3.3.2 Error/warning 2: 163 mail treats the message as illegitimate content and blocks it, so sending the email fails.

  SMTPDataError: (554, 'DT:SPM 163 smtp11,D8CowADn5mq2dFNewkQ5Aw--.52552S3 1582527670,please see http://mail.163.com/help/help_spam_16.htm?ip=58.49.28.162&hostid=smtp11&time=1582527670')
    
    
07:01:11.026Z ERROR elastalert-server:
    ProcessController:  ERROR:root:Uncaught exception running rule fct-Example-rule-name: (554, 'DT:SPM 163 smtp11,D8CowADn5mq2dFNewkQ5Aw--.52552S3 1582527670,please see http://mail.163.com/help/help_spam_16.htm?ip=58.49.28.162&hostid=smtp11&time=1582527670')

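Here, 554 DT:SPM means that the message contains content that is not permitted, or that the system identified it as spam. Check whether a user is sending viruses or spam.

This shows that the alerting program uses the NetEase 163 mailbox to send alerts to a recipient group made up of two mailboxes, [email protected] and [email protected].

Solution:

1. In the 163 webmail home page, go to "Settings" -> "General settings" -> "Anti-spam / Black and white lists". The main pane on the right has a "Whitelist" section (the add-to-whitelist tab); add the address "[email protected]" to the whitelist.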
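
When testing mail delivery from a script, the same rejection surfaces as smtplib.SMTPDataError, whose smtp_code and smtp_error attributes carry the server's reply. The sketch below turns the 554 DT:SPM reply into a readable message; the helper name describe_smtp_data_error and the wording of the returned strings are hypothetical.

```python
import smtplib


def describe_smtp_data_error(exc):
    """Summarize an smtplib.SMTPDataError, recognizing the 554 DT:SPM spam rejection."""
    code = exc.smtp_code
    text = exc.smtp_error
    if isinstance(text, bytes):
        text = text.decode("utf-8", errors="replace")
    if code == 554 and "DT:SPM" in text:
        return "rejected as spam (554 DT:SPM)"
    return f"SMTP data error {code}"
```

Catching SMTPDataError around a test send and printing this summary makes it clear whether the whitelist change took effect.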


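Note: so far this only walks through the basic ELK alerting flow end to end. The various ElastAlert rule types have not been examined in depth, in particular a catalogue of alerting scenarios; that is the next step.

Appendix:

The ElastAlert filter rules are shown below.
[Image: ElastAlert filter rules]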
