ElastAlert installation, configuration, and verification guide

Installing, configuring, and verifying ElastAlert with Docker

created by fangchangtan | 2020/2/24

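1. ElastAlert use cases

ElastAlert serves as the log-keyword alerting component of an ELK stack. The basic flow: from the logs collected by ELK, track the continuous heartbeat a program emits, catch error keywords such as ERROR, and use these signals to monitor and alert on the program's health and stability.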

2. Installing ElastAlert

2.1 Clone the Git repository

## Clone the repository
git clone https://github.com/bitsensor/elastalert.git 
## Change into the repository directory
cd elastalert

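2.2 Test the ElastAlert Docker installation locally:

Run the command from inside the elastalert directory (this is the installation method recommended upstream).

# Start the elastalert container (with --net=host, Docker ignores the -p mapping)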
sudo docker run --rm -p 3030:3030 \
    -v `pwd`/config/elastalert.yaml:/opt/elastalert/config.yaml \
    -v `pwd`/config/elastalert-test.yaml:/opt/elastalert/config-test.yaml \
    -v `pwd`/config/config.json:/opt/elastalert-server/config/config.json \
    -v `pwd`/rules:/opt/elastalert/rules \
    -v `pwd`/rule_templates:/opt/elastalert/rule_templates \
    --net="host" \
    --name elastalert-fct2 bitsensor/elastalert:2.0.0

Alternatively, for a production deployment (recommended):

# Production environment: start elastalert
docker run --rm \
--name fct-elastalert \
--net "host" \
-p 3030:3030 \
-v /data/poc/trial-production/myelastalert/elastalert/config/elastalert.yaml:/opt/elastalert/config.yaml \
-v /data/poc/trial-production/myelastalert/elastalert/config/config.json:/opt/elastalert-server/config/config.json \
-v /data/poc/trial-production/myelastalert/elastalert/rules:/opt/elastalert/rules \
-v /data/poc/trial-production/myelastalert/elastalert/rule_templates:/opt/elastalert/rule_templates \
-v /data/poc/trial-production/myelastalert/elastalert/config/smtp_auth.yaml:/opt/elastalert/config/smtp_auth.yaml \
-v /data/poc/trial-production/myelastalert/elastalert/server_data:/opt/elastalert/server_data \
-v /data/poc/trial-production/myelastalert/elastalert/logs:/opt/logs \
bitsensor/elastalert:2.0.0 

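2.3 Configure ElastAlert

The config.json file mainly sets the Elasticsearch address to connect to, the paths of the rules and rule_templates directories, and the name of the ES index that ElastAlert writes back to: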

{
  "appName": "elastalert-server",
  "port": 3030,
  "wsport": 3333,
  "elastalertPath": "/opt/elastalert",
  "verbose": false,
  "es_debug": false,
  "debug": false,
  "rulesPath": {
    "relative": true,
    "path": "/rules"
  },
  "templatesPath": {
    "relative": true,
    "path": "/rule_templates"
  },
  "es_host": "172.19.32.106",
  "es_port": 9202,
  "writeback_index": "elastalert_status"
}
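
As a quick sanity check on the path settings, the following minimal Python sketch loads a trimmed copy of the config.json above and resolves rulesPath and templatesPath. It assumes that a "relative": true entry is resolved against elastalertPath; that is an interpretation of the settings, not something confirmed here, and resolve_path is a hypothetical helper name.

```python
import json
import posixpath

# Trimmed copy of the config.json shown above.
CONFIG_TEXT = """
{
  "appName": "elastalert-server",
  "port": 3030,
  "wsport": 3333,
  "elastalertPath": "/opt/elastalert",
  "rulesPath": {"relative": true, "path": "/rules"},
  "templatesPath": {"relative": true, "path": "/rule_templates"},
  "es_host": "172.19.32.106",
  "es_port": 9202,
  "writeback_index": "elastalert_status"
}
"""


def resolve_path(config, key):
    """Return the directory for "rulesPath" or "templatesPath".

    Assumption: when "relative" is true, "path" is appended to
    "elastalertPath"; otherwise "path" is used as an absolute path.
    """
    entry = config[key]
    if entry.get("relative"):
        return posixpath.join(config["elastalertPath"], entry["path"].lstrip("/"))
    return entry["path"]


config = json.loads(CONFIG_TEXT)
print(resolve_path(config, "rulesPath"))      # /opt/elastalert/rules
print(resolve_path(config, "templatesPath"))  # /opt/elastalert/rule_templates
```

Under that assumption the resolved directories line up with the container paths used as -v mount targets in the docker run commands.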

The elastalert.yaml configuration is as follows:

# The elasticsearch hostname for metadata writeback
# Note that every rule can have its own elasticsearch host
es_host: 172.19.32.106

# The elasticsearch port
es_port: 9202

# This is the folder that contains the rule yaml files
# Any .yaml file will be loaded as a rule
rules_folder: rules

# How often ElastAlert will query elasticsearch
# The unit can be anything from weeks to seconds
run_every:
  seconds: 5

# ElastAlert will buffer results from the most recent
# period of time, in case some log sources are not in real time
buffer_time:
  minutes: 1

# Optional URL prefix for elasticsearch
#es_url_prefix: elasticsearch

# Connect with TLS to elasticsearch
#use_ssl: True
use_ssl: False

# Verify TLS certificates
#verify_certs: True
verify_certs: False

# GET request with body is the default option for Elasticsearch.
# If it fails for some reason, you can pass 'GET', 'POST' or 'source'.
# See http://elasticsearch-py.readthedocs.io/en/master/connection.html?highlight=send_get_body_as#transport
# for details
#es_send_get_body_as: GET

# Option basic-auth username and password for elasticsearch
#es_username: someusername
#es_password: somepassword

# The index on es_host which is used for metadata storage
# This can be a unmapped index, but it is recommended that you run
# elastalert-create-index to set a mapping
writeback_index: elastalert_status

# If an alert fails for some reason, ElastAlert will retry
# sending the alert until this time period has elapsed
alert_time_limit:
  days: 2
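
To see how the time settings relate to each other, here is a small Python sketch that turns run_every, buffer_time, and alert_time_limit into timedelta objects. The dicts are written by hand in the shape a YAML parser would produce, and the query window is assumed to be simply [now - buffer_time, now]; the helper names are hypothetical.

```python
from datetime import datetime, timedelta

# Values copied from elastalert.yaml above, as the dicts a YAML parser would produce.
run_every = {"seconds": 5}
buffer_time = {"minutes": 1}
alert_time_limit = {"days": 2}


def to_timedelta(setting):
    """Convert a {"unit": amount} mapping into a timedelta."""
    return timedelta(**setting)


def query_window(now, buffer_setting):
    """Return (start, end) of one query, assuming the window is [now - buffer_time, now]."""
    return now - to_timedelta(buffer_setting), now


now = datetime(2020, 2, 24, 15, 44, 0)
start, end = query_window(now, buffer_time)
print(start.isoformat(), "->", end.isoformat())

# How many consecutive runs overlap the same one-minute buffer.
runs_per_window = to_timedelta(buffer_time) // to_timedelta(run_every)
print(runs_per_window)  # 12
```

With these values a query is issued every 5 seconds over the most recent minute, so under the stated assumption a given log line can fall inside about 12 consecutive query windows.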

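There is also an elastalert-test.yaml file. It is used only when rules are tested through the API, and it lets those test runs write to a different writeback index than the running instance.

The smtp_auth.yaml file, which the rule file references through smtp_auth_file, is configured as follows: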

user: [email protected]
password: sdwtyx234

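Next, configure the alert rule. ElastAlert scans the specified ES index over the most recent 1 minute, and when the number of log messages matching the query filter reaches the num_events threshold (set to 1 in the rule below), it sends an alert email directly to [email protected].

Below is the ElastAlert rule file /rules/tank-rules.yaml.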

es_host: 172.19.32.106
es_port: 9202

# The rule name must be unique, otherwise an error is raised; once defined, it is used as the subject of the alert email
## (Required)
## Rule name, must be unique
name: fct-test-rule-name


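# Choose a rule type: any, blacklist, whitelist, change, frequency, spike, flatline, new_term, cardinality
# any: alert on every match;
# blacklist: the value of the compare_key field matches any entry in the blacklist array;
# whitelist: the value of the compare_key field matches none of the entries in the whitelist array;
# change: for the same query_key, the value of the compare_key field changes within timeframe;
# frequency: for the same query_key, num_events filtered events occur within timeframe;
# spike: for the same query_key, the event counts of two consecutive timeframe windows differ by a ratio greater than spike_height. spike_type sets the direction (up, down, or both). threshold_ref sets a minimum event count for the previous window and threshold_cur sets a minimum for the current window; if the count is below the minimum, no alert is triggered;
# flatline: within timeframe, the event count is below threshold;
# new_term: a value appears in fields that is not among the terms_size (default 50) most common terms seen in the preceding terms_window_size (default 30 days);
# cardinality: for the same query_key, the number of distinct cardinality_field values within timeframe exceeds max_cardinality or falls below min_cardinality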
## (Required)
## Type of alert.
## the frequency rule type alerts when num_events events occur with timeframe time
## This rule uses frequency, which needs two settings: for the same query_key, num_events filtered events must occur within timeframe
type: frequency

# This is the same index you see in Kibana. Wildcards are supported, multiple indices are supported, and if that is too much trouble you can simply use *.
## (Required)
## Index to search, wildcard supported
index: fct-logstash*

# The rule matches and triggers an alert as soon as one matching event occurs within the last 1 minute
num_events: 1
timeframe:
    minutes: 1


# This is the key part: which keyword in the program's message should trigger an alert. It is an Elasticsearch query string and supports AND, OR, and so on.
filter:
- query:
    query_string:
      query: "UNKNOWN"

# The alert_text you define is shown in the body of the email
alert_text: "你好,请回复邮件,方昌坦"

# Setup report smtp config 
smtp_host: smtp.163.com
smtp_port: 25
smtp_ssl: False

#SMTP auth
from_addr: [email protected]
email_reply_to: [email protected]
smtp_auth_file: /opt/elastalert/config/smtp_auth.yaml

# (Required)
# # The alert to use when a match is found
alert:
- "email"

# (required, email specific)
# # a list of email addresses to send alerts to
email:
- "[email protected]"
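
To make the frequency semantics concrete, here is a minimal Python sketch of a sliding-window counter. It is a simplified illustration, not ElastAlert's implementation: it ignores query_key and alert suppression, and it clears the window after each match (an assumption made for clarity). The function name frequency_matches is hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta


def frequency_matches(timestamps, num_events, timeframe):
    """Return the timestamps at which num_events events fall within timeframe.

    Simplification: the window is cleared after every match.
    """
    window = deque()
    matches = []
    for ts in sorted(timestamps):
        window.append(ts)
        # Drop events that are older than the timeframe relative to the newest one.
        while window and ts - window[0] > timeframe:
            window.popleft()
        if len(window) >= num_events:
            matches.append(ts)
            window.clear()
    return matches


base = datetime(2020, 2, 24, 15, 0, 0)
events = [base + timedelta(seconds=s) for s in (0, 10, 20, 90, 100)]

# With num_events: 1 (as in the rule above) every matching event triggers.
print(len(frequency_matches(events, 1, timedelta(minutes=1))))  # 5
# With num_events: 3 only the burst of three events within one minute triggers.
print(len(frequency_matches(events, 3, timedelta(minutes=1))))  # 1
```

This shows why num_events: 1 makes the rule behave almost like the any type: a single log line containing the keyword is enough to trigger an alert.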
                         

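Note: this requires a registered 163 mailbox with the SMTP service enabled.

Mailbox account: [email protected]

Mailbox password: 221123.com

SMTP service password: swtx234

Enabling the SMTP service allows third-party clients to log in to the mailbox. It has to be turned on in the 163 mailbox settings.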

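2.4 Restart ElastAlert to apply the configuration

Finally, restart ElastAlert so that the new configuration takes effect.

On the local test host (106), the command to run ElastAlert is: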

docker run --rm \
--name fct-elastalert \
--net "host" \
-p 3030:3030 \
-v /data/poc/trial-production/myelastalert/elastalert/config/elastalert.yaml:/opt/elastalert/config.yaml \
-v /data/poc/trial-production/myelastalert/elastalert/config/config.json:/opt/elastalert-server/config/config.json \
-v /data/poc/trial-production/myelastalert/elastalert/rules:/opt/elastalert/rules \
-v /data/poc/trial-production/myelastalert/elastalert/rule_templates:/opt/elastalert/rule_templates \
-v /data/poc/trial-production/myelastalert/elastalert/config/smtp_auth.yaml:/opt/elastalert/config/smtp_auth.yaml \
-v /data/poc/trial-production/myelastalert/elastalert/server_data:/opt/elastalert/server_data \
-v /data/poc/trial-production/myelastalert/elastalert/logs:/opt/logs \
bitsensor/elastalert:2.0.0 
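
After the restart, one simple way to confirm that the server came up is to check that its ports accept TCP connections. The Python sketch below does only that; it uses the port and wsport values from config.json (3030 and 3333), and the helper name is_port_open and the choice of 127.0.0.1 as the host are assumptions for illustration.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 3030 is the HTTP port and 3333 the websocket port from config.json.
    for port in (3030, 3333):
        state = "open" if is_port_open("127.0.0.1", port) else "closed"
        print(f"port {port}: {state}")
```

An open port only shows that the server process is listening; whether rules load and alerts fire still has to be checked in the container log.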

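3. Verifying email delivery (local test)

3.1 Start Logstash to send test data

To verify that ElastAlert alerts as expected, start Logstash so that it sends test data to ES.

On 172.19.32.67, start Logstash locally for the test:

It receives log data from Kafka, filters it in Logstash, and sends it to the fct-logstash_* indices in Elasticsearch.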

docker run \
--rm \
--name fct-alert-logstash \
-p 5047:5044 \
-v /root/fct/logstash-test/logstash_kafka.conf:/logstash/logstash_kafka.conf \
-v /root/fct/logstash-test/logstash.yml:/usr/share/logstash/config/logstash.yml \
registry.marathon.l4lb.thisdcos.directory:5000/logstash:6.6.1 \
logstash -f /logstash/logstash_kafka.conf
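
If the Kafka and Logstash pipeline is not available, a test event can in principle be written to Elasticsearch directly. The Python sketch below only builds such a request and does not send it. Several things here are assumptions rather than part of the original setup: the index name fct-logstash_test, the _doc type in the URL (suitable for recent 6.x versions), the message field name, and the helper function names.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_test_event(keyword, now=None):
    """Build a log-like document whose message contains the alert keyword."""
    now = now or datetime.now(timezone.utc)
    return {
        "@timestamp": now.strftime("%Y-%m-%dT%H:%M:%S.000Z"),
        "message": f"heartbeat status {keyword}",
    }


def build_index_request(es_url, index, event):
    """Build (but do not send) an HTTP request that indexes one document."""
    return urllib.request.Request(
        url=f"{es_url}/{index}/_doc",
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # "fct-logstash_test" is a hypothetical index name that matches fct-logstash*.
    request = build_index_request(
        "http://172.19.32.106:9202", "fct-logstash_test", build_test_event("UNKNOWN")
    )
    print(request.get_method(), request.full_url)
    # To actually send the event: urllib.request.urlopen(request)
```

The keyword UNKNOWN matches the query_string filter in the rule file, so an event like this should be picked up on the next query, provided the timestamp falls inside the rule's one-minute timeframe.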

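3.2 What success looks like

[Screenshot not preserved in the source]

A result like the one in the screenshot indicates that the email was sent successfully.

3.3 Common errors

The following errors can appear in the log of the ElastAlert service.

3.3.1 Error 1: unable to connect to the 163 mail service.

The log below indicates that the mailbox configuration is incorrect; the mail connection settings need to be corrected: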

15:43:43.085Z  INFO elastalert-server: Router:  Listening for GET request on /mapping/:index.
15:43:43.085Z  INFO elastalert-server: Router:  Listening for POST request on /search/:index.
15:43:43.090Z  INFO elastalert-server: ProcessController:  Starting ElastAlert
15:43:43.090Z  INFO elastalert-server: ProcessController:  Creating index
15:43:43.980Z  INFO elastalert-server:
    ProcessController:  Elastic Version:6
    Mapping used for string:{'type': 'keyword'}
    Index elastalert_status already exists. Skipping index creation.
    
15:43:43.980Z  INFO elastalert-server: ProcessController:  Index create exited with code 0
15:43:43.981Z  INFO elastalert-server: ProcessController:  Starting elastalert with arguments [none]
15:43:43.991Z  INFO elastalert-server: ProcessController:  Started Elastalert (PID: 50)
15:43:43.992Z  INFO elastalert-server: Server:  Server listening on port 3030
15:43:43.993Z  INFO elastalert-server: Server:  Websocket listening on port 3333
15:43:43.994Z  INFO elastalert-server: Server:  Server started
15:44:04.860Z ERROR elastalert-server:
    ProcessController:  ERROR:root:Error while running alert email: Error connecting to SMTP host: Connection unexpectedly closed
    
15:48:06.886Z ERROR elastalert-server:
    ProcessController:  WARNING:elasticsearch:GET http://172.19.32.106:9202/elastalert_status/elastalert/_search?size=10000 [status:400 request:0.012s]
    
15:48:06.886Z ERROR elastalert-server:
    ProcessController:  ERROR:root:Error fetching aggregated matches: RequestError(400, u'search_phase_execution_exception', u'parse_exception: Encountered " "-" "- "" at line 1, column 13.\nWas expecting one of:\n    <BAREOPER> ...\n    "(" ...\n    "*" ...\n    <QUOTED> ...\n    <TERM> ...\n    <PREFIXTERM> ...\n    <WILDTERM> ...\n    <REGEXPTERM> ...\n    "[" ...\n    "{" ...\n    <NUMBER> ...\n    ')
    
15:48:26.972Z ERROR elastalert-server:
    ProcessController:  ERROR:root:Error while running alert email: Error connecting to SMTP host: Connection unexpectedly closed

This error means that the connection to the mail server failed; check that the configuration file is correct.
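
To separate a mailbox problem from an ElastAlert problem, the SMTP login can be tried on its own. The following Python sketch uses the standard smtplib module to connect and log in without sending any mail. The function name check_smtp_login and the environment variable names SMTP_USER and SMTP_PASSWORD are hypothetical; the host and port are the ones from the rule file.

```python
import os
import smtplib


def check_smtp_login(host, port, user, password, use_ssl=False, timeout=10):
    """Try to connect and log in; return None on success or an error description."""
    smtp_class = smtplib.SMTP_SSL if use_ssl else smtplib.SMTP
    try:
        with smtp_class(host, port, timeout=timeout) as server:
            server.login(user, password)
    except (smtplib.SMTPException, OSError) as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


if __name__ == "__main__":
    # SMTP_USER and SMTP_PASSWORD are hypothetical variable names; the values
    # should match the user and password entries in smtp_auth.yaml.
    error = check_smtp_login(
        "smtp.163.com",
        25,
        os.environ.get("SMTP_USER", ""),
        os.environ.get("SMTP_PASSWORD", ""),
    )
    print("login ok" if error is None else error)
```

If this check fails in the same way as ElastAlert does, the cause is likely in the SMTP host, port, SSL setting, or credentials rather than in the rule itself.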

3.3.2 Error 2: 163 mail classified the message as illegitimate content and blocked it, so sending the email failed.

  SMTPDataError: (554, 'DT:SPM 163 smtp11,D8CowADn5mq2dFNewkQ5Aw--.52552S3 1582527670,please see http://mail.163.com/help/help_spam_16.htm?ip=58.49.28.162&hostid=smtp11&time=1582527670')
    
    
07:01:11.026Z ERROR elastalert-server:
    ProcessController:  ERROR:root:Uncaught exception running rule fct-Example-rule-name: (554, 'DT:SPM 163 smtp11,D8CowADn5mq2dFNewkQ5Aw--.52552S3 1582527670,please see http://mail.163.com/help/help_spam_16.htm?ip=58.49.28.162&hostid=smtp11&time=1582527670')

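Here, 554 DT:SPM means that the message content contains information that is not permitted, or that the system classified it as spam. Check whether any user is sending viruses or spam.

In this setup, the alerting program uses a NetEase 163 mailbox to send alerts to a recipient group made up of the two mailboxes [email protected] and [email protected].

Solution:

1. In the 163 webmail home page, go to "Settings" -> "General settings" -> "Anti-spam / blacklist and whitelist". The right-hand pane has a "Whitelist" section (the add-to-whitelist tab); add the address "[email protected]" to the whitelist.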


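Note: so far this only walks through the ELK alerting flow end to end. The individual ElastAlert rule types have not been explored in depth, in particular a catalog of the different alerting scenarios; that is the next thing to study.

Appendix:

The ElastAlert filter rules are shown in the following figure.
[Image not preserved in the source]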
