The original architecture
Having filebeat write straight into Elasticsearch puts too much pressure on ES.
Architecture with Redis added
Redis cannot feed ES directly, so logstash sits in between to convert the data:
filebeat writes the events into Redis,
and logstash pulls them out of Redis (the buffer) and ships them to ES.
Redis thus acts as a cache/queue that relieves the pressure on ES.
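The role Redis plays here is a plain FIFO list: filebeat appends each event to the tail (RPUSH) and logstash consumes from the head (LPOP), so events simply queue up whenever logstash falls behind. A toy file-based sketch of that semantics, with made-up sample events and no real Redis required:

```shell
# Toy stand-in for the Redis list buffer (no Redis needed).
queue=$(mktemp)

# Producer side (filebeat): append each event to the tail, like RPUSH.
printf '%s\n' '{"remote_addr":"10.0.0.1"}' '{"remote_addr":"10.0.0.2"}' >> "$queue"

# Consumer side (logstash): take the oldest event from the head, like LPOP.
first=$(head -n 1 "$queue")
sed -i '1d' "$queue"
echo "consumed: $first"

# Whatever the consumer has not taken yet simply waits in the buffer.
remaining=$(wc -l < "$queue")
echo "still buffered: $remaining event(s)"
rm -f "$queue"
```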
1. Install Redis
yum install redis
sed -i 's#^bind 127.0.0.1#bind 127.0.0.1 10.0.0.51#' /etc/redis.conf
systemctl start redis
netstat -lntup|grep redis
redis-cli -h 10.0.0.51
2. Stop the Docker containers
docker stop $(docker ps -q)
3. Stop filebeat
systemctl stop filebeat
4. Delete the old ES indices
5. Confirm the nginx log is in JSON format
grep "access_log" nginx.conf
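If the grep shows a plain (combined) access_log, the log has to be switched to JSON first. A log_format along the following lines is one way to do it; the exact field set below is illustrative, but request_time and upstream_time should be present since the logstash filter later converts them:

```
# In the http{} block of /etc/nginx/nginx.conf (illustrative field set):
log_format json '{"time_local":"$time_local",'
                '"remote_addr":"$remote_addr",'
                '"request":"$request",'
                '"status":"$status",'
                '"body_bytes_sent":"$body_bytes_sent",'
                '"request_time":"$request_time",'
                '"upstream_time":"$upstream_response_time"}';
access_log /var/log/nginx/access.log json;
```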
6. Update the filebeat config file
cat >/etc/filebeat/filebeat.yml <<'EOF'
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/access.log
  json.keys_under_root: true
  json.overwrite_keys: true
  tags: ["access"]

- type: log
  enabled: true
  paths:
    - /var/log/nginx/error.log
  tags: ["error"]

output.redis:
  hosts: ["10.0.0.51"]
  keys:
    - key: "nginx_access"
      when.contains:
        tags: "access"
    - key: "nginx_error"
      when.contains:
        tags: "error"

setup.template.name: "nginx"
setup.template.pattern: "nginx_*"
setup.template.enabled: false
setup.template.overwrite: true
EOF
7. Restart filebeat and nginx
systemctl restart nginx
systemctl restart filebeat
8. Generate test data
curl 127.0.0.1/haha
9. Verify
redis-cli -h 10.0.0.51
keys *
TYPE nginx_access
LLEN nginx_access
LRANGE nginx_access 0 -1
Confirm the list entries are in JSON format
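filebeat always wraps events in a JSON envelope when writing to Redis; what you are really checking is that the nginx fields (remote_addr, status, request_time, ...) show up as top-level keys rather than inside a single message string. One way to check is to pull an element and run it through a JSON parser; here a canned sample line stands in for the LRANGE output (in practice, pipe in `redis-cli -h 10.0.0.51 LRANGE nginx_access 0 0`):

```shell
# A sample event standing in for one element of the Redis list.
sample='{"remote_addr":"127.0.0.1","status":"200","request_time":"0.005"}'

# python3's json.tool exits non-zero on invalid JSON, so "valid json"
# is printed only if the event really parses as JSON.
if echo "$sample" | python3 -m json.tool >/dev/null 2>&1; then
  result="valid json"
else
  result="NOT json"
fi
echo "$result"
```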
10. Install logstash (it needs Java, hence the JDK rpm)
rpm -ivh jdk-8u102-linux-x64.rpm
rpm -ivh logstash-6.6.0.rpm
11. Configure logstash
cat >/etc/logstash/conf.d/redis.conf <<'EOF'
input {
  redis {
    host => "10.0.0.51"
    port => "6379"
    db => "0"
    key => "nginx_access"
    data_type => "list"
  }
  redis {
    host => "10.0.0.51"
    port => "6379"
    db => "0"
    key => "nginx_error"
    data_type => "list"
  }
}
filter {
  mutate {
    convert => ["upstream_time", "float"]
    convert => ["request_time", "float"]
  }
}
output {
  stdout {}
  if "access" in [tags] {
    elasticsearch {
      hosts => "http://10.0.0.51:9200"
      manage_template => false
      index => "nginx_access-%{+yyyy.MM}"
    }
  }
  if "error" in [tags] {
    elasticsearch {
      hosts => "http://10.0.0.51:9200"
      manage_template => false
      index => "nginx_error-%{+yyyy.MM}"
    }
  }
}
EOF
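The %{+yyyy.MM} suffix in the index names is a date pattern that logstash expands per event from @timestamp, so events roll into one index per month. For the current moment it expands roughly like:

```shell
# %{+yyyy.MM} formats the event timestamp as year.month; for "now":
suffix=$(date +%Y.%m)
index="nginx_access-$suffix"
echo "$index"
```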
12. Start logstash in the foreground to test
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/redis.conf
13. Verify
Check that logstash's stdout output shows the events parsed as JSON.
Check in es-head that the indices were created.
Check that the lists in Redis are shrinking (being consumed).
14. Run logstash in the background
Press Ctrl+C to stop the foreground process, then:
systemctl start logstash
Listen for the fans: when they spin up, logstash is starting. (More reliably, check that port 9600 is listening.)
Redis inspection commands
redis-cli -h 10.0.0.51
keys *
type nginx_access
llen nginx_access
lrange nginx_access 0 -1
Optimizing the architecture
Suppose you now want to collect one more log file.
With the setup above, that means changing 4 places.
Redis optimization:
Before: adding a log means changing 4 places (filebeat input, filebeat output keys, logstash input, logstash output).
After the optimization: adding a log means changing only 2 places.
Principle: events are classified only at the point where logstash writes to ES. Upstream, filebeat merely tags them, and Redis does not need a separate key per log type (access and error travel together in one list).
Ports used by each component:
filebeat: no listening port (it tails the logs and writes into redis)
redis: 6379
logstash: 9600 (monitoring API)
elasticsearch: 9200 and 9300
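A quick way to confirm the chain is up is to check each listening port on the node hosting the services (ss is assumed to be available; substitute netstat -lntup if not):

```shell
# Check which of the pipeline's ports are listening on this host
# (redis 6379, logstash 9600, elasticsearch 9200/9300).
report=$(for p in 6379 9600 9200 9300; do
  if ss -lnt 2>/dev/null | grep -q ":$p "; then
    echo "port $p: listening"
  else
    echo "port $p: not listening"
  fi
done)
echo "$report"
```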
filebeat cannot connect to a Redis Cluster or to Sentinel, so in the diagram below each redis node works standalone.
Do multiple logstash instances read the same redis data twice? No: list pops are atomic, so several logstash instances can all consume from the same list in parallel, each getting different events.
Scaling out to multiple redis and logstash nodes
1. Add a second redis node:
cat >/etc/filebeat/filebeat.yml <<'EOF'
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/access.log
  json.keys_under_root: true
  json.overwrite_keys: true
  tags: ["access"]

- type: log
  enabled: true
  paths:
    - /var/log/nginx/error.log
  tags: ["error"]

output.redis:
  hosts: ["10.0.0.51:6379","10.0.0.52:6379"]
  key: "nginx_log"

setup.template.name: "nginx"
setup.template.pattern: "nginx_*"
setup.template.enabled: false
setup.template.overwrite: true
EOF
2. Have logstash read from both redis nodes:
cat >/etc/logstash/conf.d/redis.conf <<'EOF'
input {
  redis {
    host => "10.0.0.51"
    port => "6379"
    db => "0"
    key => "nginx_log"
    data_type => "list"
  }
  redis {
    host => "10.0.0.52"
    port => "6379"
    db => "0"
    key => "nginx_log"
    data_type => "list"
  }
}
filter {
  mutate {
    convert => ["upstream_time", "float"]
    convert => ["request_time", "float"]
  }
}
output {
  stdout {}
  if "access" in [tags] {
    elasticsearch {
      hosts => "http://10.0.0.51:9200"
      manage_template => false
      index => "nginx_access-%{+yyyy.MM}"
    }
  }
  if "error" in [tags] {
    elasticsearch {
      hosts => "http://10.0.0.51:9200"
      manage_template => false
      index => "nginx_error-%{+yyyy.MM}"
    }
  }
}
EOF
In this architecture, logstash polls the redis nodes in round-robin fashion.
Message queues:
kafka is used together with zookeeper.
Schematic:
This architecture scales larger and performs better than the Redis one; roughly, the Redis setup is adequate up to about 100 GB of logs per day.
Producers put messages in, consumers take messages out.
Introducing kafka and zookeeper
0. Distribute SSH keys and hosts entries
cat >/etc/hosts<<EOF
10.0.0.51 db01
10.0.0.52 db02
10.0.0.53 db03
EOF
ssh-keygen
ssh-copy-id 10.0.0.52
ssh-copy-id 10.0.0.53
1. Install ZooKeeper
### on db01
cd /data/soft
tar zxf zookeeper-3.4.11.tar.gz -C /opt/
ln -s /opt/zookeeper-3.4.11/ /opt/zookeeper
mkdir -p /data/zookeeper
cat >/opt/zookeeper/conf/zoo.cfg<<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper
clientPort=2181
server.1=10.0.0.51:2888:3888
server.2=10.0.0.52:2888:3888
server.3=10.0.0.53:2888:3888
EOF
echo "1" > /data/zookeeper/myid
cat /data/zookeeper/myid
rsync -avz /opt/zookeeper* 10.0.0.52:/opt/
rsync -avz /opt/zookeeper* 10.0.0.53:/opt/
### on db02
mkdir -p /data/zookeeper
echo "2" > /data/zookeeper/myid
cat /data/zookeeper/myid
### on db03
mkdir -p /data/zookeeper
echo "3" > /data/zookeeper/myid
cat /data/zookeeper/myid
2. Start zookeeper (run on all three nodes)
/opt/zookeeper/bin/zkServer.sh start
3. Check that startup succeeded
/opt/zookeeper/bin/zkServer.sh status
If the cluster is healthy, the reported Mode should be
follower on 2 nodes
leader on 1 node
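The 2-follower/1-leader split works because ZooKeeper needs a strict majority of the ensemble to elect a leader and serve requests. The arithmetic for common ensemble sizes:

```shell
# ZooKeeper quorum: a majority (n/2 + 1) of the n servers must be up.
summary=$(for n in 3 5; do
  echo "$n servers: quorum=$(( n/2 + 1 )), tolerates $(( (n-1)/2 )) failure(s)"
done)
echo "$summary"
# 3 servers: quorum=2, tolerates 1 failure(s)
# 5 servers: quorum=3, tolerates 2 failure(s)
```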
4. Test that the zookeeper cluster communicates properly
On one node, create a test znode:
/opt/zookeeper/bin/zkCli.sh -server 10.0.0.51:2181
create /test "hello"
On another node, check that it is visible:
/opt/zookeeper/bin/zkCli.sh -server 10.0.0.52:2181
get /test
5. Install kafka
### on db01
cd /data/soft/
tar zxf kafka_2.11-1.0.0.tgz -C /opt/
ln -s /opt/kafka_2.11-1.0.0/ /opt/kafka
mkdir /opt/kafka/logs
cat >/opt/kafka/config/server.properties<<EOF
broker.id=1
listeners=PLAINTEXT://10.0.0.51:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/opt/kafka/logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=24
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=10.0.0.51:2181,10.0.0.52:2181,10.0.0.53:2181
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0
EOF
rsync -avz /opt/kafka* 10.0.0.52:/opt/
rsync -avz /opt/kafka* 10.0.0.53:/opt/
### on db02
sed -i "s#10.0.0.51:9092#10.0.0.52:9092#g" /opt/kafka/config/server.properties
sed -i "s#broker.id=1#broker.id=2#g" /opt/kafka/config/server.properties
### on db03
sed -i "s#10.0.0.51:9092#10.0.0.53:9092#g" /opt/kafka/config/server.properties
sed -i "s#broker.id=1#broker.id=3#g" /opt/kafka/config/server.properties
6. Start kafka in the foreground first, to test
/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
7. Check that it is running
jps
8. Test sending messages through kafka
Create a topic:
/opt/kafka/bin/kafka-topics.sh --create --zookeeper 10.0.0.51:2181,10.0.0.52:2181,10.0.0.53:2181 --partitions 3 --replication-factor 3 --topic messagetest
List all topics:
/opt/kafka/bin/kafka-topics.sh --list --zookeeper 10.0.0.51:2181,10.0.0.52:2181,10.0.0.53:2181
Send test messages:
/opt/kafka/bin/kafka-console-producer.sh --broker-list 10.0.0.51:9092,10.0.0.52:9092,10.0.0.53:9092 --topic messagetest
On another node, test receiving:
/opt/kafka/bin/kafka-console-consumer.sh --zookeeper 10.0.0.51:2181,10.0.0.52:2181,10.0.0.53:2181 --topic messagetest --from-beginning
9. Once the test succeeds, run kafka in the background
Press Ctrl+C to stop the foreground kafka, then start it as a daemon:
/opt/kafka/bin/kafka-server-start.sh -daemon /opt/kafka/config/server.properties
10. Configure filebeat
cat >/etc/filebeat/filebeat.yml <<'EOF'
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/access.log
  json.keys_under_root: true
  json.overwrite_keys: true
  tags: ["access"]

- type: log
  enabled: true
  paths:
    - /var/log/nginx/error.log
  tags: ["error"]

output.kafka:
  hosts: ["10.0.0.51:9092", "10.0.0.52:9092", "10.0.0.53:9092"]
  topic: 'filebeat'

setup.template.name: "nginx"
setup.template.pattern: "nginx_*"
setup.template.enabled: false
setup.template.overwrite: true
EOF
Restart filebeat:
systemctl restart filebeat
11. Send a request and check whether kafka received the logs
curl 10.0.0.51
/opt/kafka/bin/kafka-topics.sh --list --zookeeper 10.0.0.51:2181,10.0.0.52:2181,10.0.0.53:2181
/opt/kafka/bin/kafka-console-consumer.sh --zookeeper 10.0.0.51:2181,10.0.0.52:2181,10.0.0.53:2181 --topic filebeat --from-beginning
12. logstash config file
cat > /etc/logstash/conf.d/kafka.conf <<'EOF'
input {
  kafka {
    bootstrap_servers => "10.0.0.51:9092,10.0.0.52:9092,10.0.0.53:9092"
    topics => ["filebeat"]
    group_id => "logstash"
    codec => "json"
  }
}
filter {
  mutate {
    convert => ["upstream_time", "float"]
    convert => ["request_time", "float"]
  }
}
output {
  stdout {}
  if "access" in [tags] {
    elasticsearch {
      hosts => "http://10.0.0.51:9200"
      manage_template => false
      index => "nginx_access-%{+yyyy.MM}"
    }
  }
  if "error" in [tags] {
    elasticsearch {
      hosts => "http://10.0.0.51:9200"
      manage_template => false
      index => "nginx_error-%{+yyyy.MM}"
    }
  }
}
EOF
13. Start logstash in the foreground to test
First clear out the ES indices generated earlier.
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/kafka.conf
Generate an access log entry:
curl 127.0.0.1
14. Kafka lab summary
1. Prerequisites
- kafka and zookeeper are both Java-based, so a Java runtime is required
- both are resource-hungry, so make sure there is enough memory
2. Notes on installing zookeeper
- each machine's myid must be unique, and must match the server.N ids in the config file
- start it and check the roles: one leader, the rest followers
- test writing and reading a znode across nodes
3. Notes on installing kafka
- kafka depends on zookeeper: if zookeeper is unhealthy, kafka cannot work
- the kafka config must list all the zookeeper IPs
- the kafka config must use the node's own IP in listeners
- the broker.id in the kafka config follows the node's zookeeper myid
- kafka has only started successfully once its log prints "started"
4. Testing zookeeper and kafka
- send a message from one node
- the other nodes should receive it in real time
5. Configuring filebeat
- the output must list all the kafka IPs
6. Configuring logstash
- the input must list all the kafka IPs; don't forget the []
- start in the foreground first, and only daemonize after the test succeeds
7. Destruction test result
- log collection kept working as long as at least 1 zookeeper and 1 kafka node remained up (strictly, a 3-node zookeeper ensemble needs 2 members for quorum, so don't count on a single surviving zookeeper)