03 - Elastic logging stack - filebeat-kafka-logstash-elasticsearch-kibana 6.8.0 setup guide
1. Introduction
The previous post used Redis as an intermediate cache and message queue to smooth out spikes in the log stream. At low log volume this worked fine, but as volume grew and the Redis queue kept backing up, problems appeared: Redis is an in-memory database, so when logs are not consumed promptly its memory usage climbs until the process OOMs and the system goes down. The rate at which Logstash consumes logs is also a factor. In the end I decided to replace the single Redis node with a three-node Kafka cluster, and also adjusted Elasticsearch's startup parameters. This post only covers the switch to Kafka and the problems encountered along the way; for the rest of the configuration, see the earlier posts.
2. Preparation
Nodes:
192.168.72.56
192.168.72.57
192.168.72.58
2.1 Software versions
All Elastic components are installed from the 6.8.0 RPM packages.
zookeeper: 3.4.14, download link
kafka: 2.11-2.4.0, download link
OS version:
CentOS Linux release 7.7.1908 (Core)
2.2 Log flow
filebeat --> kafka cluster --> logstash --> elasticsearch cluster --> kibana
3. Configure the ZooKeeper cluster
We use a ZooKeeper cluster external to Kafka, although the Kafka tarball also ships with a bundled ZooKeeper. Reference: https://www.cnblogs.com/longBlogs/p/10340251.html
The configuration is as follows:
wget https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
tar -xvf zookeeper-3.4.14.tar.gz -C /usr/local
cd /usr/local
ln -sv zookeeper-3.4.14 zookeeper
cd zookeeper/conf
cp zoo_sample.cfg zoo.cfg
mkdir -pv /usr/local/zookeeper/{data,logs}
Edit the configuration file zoo.cfg (identical on all three nodes):
# data directory and transaction log directory
dataDir=/usr/local/zookeeper/data
dataLogDir=/usr/local/zookeeper/logs
clientPort=2181
server.1=192.168.72.56:2888:3888
server.2=192.168.72.57:2888:3888
server.3=192.168.72.58:2888:3888
# In each server.N entry, the first port (default 2888) is used by followers to talk to the leader; the second (default 3888) is used for leader election, both when the cluster first starts and when the current leader dies
Configure each node's id:
echo "1" > /usr/local/zookeeper/data/myid # on server 1; must match the server.1 number above, and differ on each node
echo "2" > /usr/local/zookeeper/data/myid # on server 2; must match the server.2 number above
echo "3" > /usr/local/zookeeper/data/myid # on server 3; must match the server.3 number above
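The three echo commands above can also be collapsed into one snippet that derives the id from the node's IP, so the same script runs unchanged on every node. A minimal sketch; the IP-to-id mapping simply mirrors the server.N entries in zoo.cfg, and the function name is my own:

```shell
# Map a node IP to its ZooKeeper id, mirroring server.1/2/3 in zoo.cfg above.
zk_myid_for_ip() {
  case "$1" in
    192.168.72.56) echo 1 ;;
    192.168.72.57) echo 2 ;;
    192.168.72.58) echo 3 ;;
    *) return 1 ;;  # unknown node: refuse rather than write a wrong id
  esac
}

# On a real node, write the id into the data directory, e.g.:
#   zk_myid_for_ip "$(hostname -I | awk '{print $1}')" > /usr/local/zookeeper/data/myid
zk_myid_for_ip 192.168.72.57  # → 2
```

Each node must end up with a distinct myid that matches its own server.N line, otherwise it cannot join the quorum correctly.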
Start and stop ZooKeeper:
# start
/usr/local/zookeeper/bin/zkServer.sh start
# stop
/usr/local/zookeeper/bin/zkServer.sh stop
# check status
/usr/local/zookeeper/bin/zkServer.sh status
Configure a ZooKeeper systemd service:
cd /usr/lib/systemd/system
# vim zookeeper.service
=========================================
[Unit]
Description=zookeeper server daemon
After=network.target
[Service]
Type=forking
ExecStart=/usr/local/zookeeper/bin/zkServer.sh start
ExecReload=/bin/sh -c '/usr/local/zookeeper/bin/zkServer.sh stop && sleep 2 && /usr/local/zookeeper/bin/zkServer.sh start'
ExecStop=/usr/local/zookeeper/bin/zkServer.sh stop
Restart=always
[Install]
WantedBy=multi-user.target
=======================================================
# systemctl start zookeeper
# systemctl enable zookeeper
4. Configure the Kafka cluster
Download and install:
wget http://mirrors.tuna.tsinghua.edu.cn/apache/kafka/2.4.0/kafka_2.11-2.4.0.tgz
tar -xvf kafka_2.11-2.4.0.tgz -C /usr/local
cd /usr/local
ln -sv kafka_2.11-2.4.0 kafka
cd kafka/config
Edit the configuration:
# vim server.properties
broker.id=1 # unique positive id of this broker within the cluster; must differ on each of the three nodes
host.name=192.168.72.56 # added; this node's IP (host.name is deprecated in Kafka 2.x; listeners=PLAINTEXT://192.168.72.56:9092 is the preferred form)
num.network.threads=3 # threads handling network requests
num.io.threads=8 # threads doing disk I/O
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/var/log/kafka # directory for Kafka's log segments (the message data itself, not service logs)
num.partitions=3 # default number of partitions per topic; more partitions allow more parallelism
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168 # maximum time to keep a segment file (hours); older segments are deleted, i.e. data older than 7 days is cleaned up
log.segment.bytes=1073741824 # size of each log segment file in bytes, 1 GB by default
log.retention.check.interval.ms=300000
log.cleaner.enable=true # enable log cleanup
zookeeper.connect=192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181 # ZooKeeper cluster addresses; one or more, comma-separated
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0
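Of the settings above, only broker.id and the bind address differ between the three nodes; everything else in server.properties is identical across the cluster. A throwaway loop to print the per-node pairs for cross-checking (ids and IPs as assumed in this article):

```shell
# Print the per-broker settings that must differ on each node.
i=1
for ip in 192.168.72.56 192.168.72.57 192.168.72.58; do
  echo "broker.id=$i  host.name=$ip"
  i=$((i+1))
done
```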
A Kafka broker defaults to a 1 GB heap. To change it, edit kafka-server-start.sh:
Find the KAFKA_HEAP_OPTS setting and modify it, for example:
export KAFKA_HEAP_OPTS="-Xmx2G -Xms2G"
Start Kafka:
cd /usr/local/kafka
./bin/kafka-server-start.sh -daemon ./config/server.properties
Set it to start on boot:
# cd /usr/lib/systemd/system
# vim kafka.service
=========================================
[Unit]
Description=kafka server daemon
After=network.target zookeeper.service
[Service]
Type=forking
ExecStart=/usr/local/kafka/bin/kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties
ExecReload=/bin/sh -c '/usr/local/kafka/bin/kafka-server-stop.sh && sleep 2 && /usr/local/kafka/bin/kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties'
ExecStop=/usr/local/kafka/bin/kafka-server-stop.sh
Restart=always
[Install]
WantedBy=multi-user.target
=======================================================
# systemctl start kafka
# systemctl enable kafka
Create a topic
Create one with 3 partitions and 3 replicas:
cd /usr/local/kafka
./bin/kafka-topics.sh --create --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181 --replication-factor 3 --partitions 3 --topic java
Common commands
1) Stop kafka
./bin/kafka-server-stop.sh
2) Create a topic
./bin/kafka-topics.sh --create --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181 --replication-factor 1 --partitions 1 --topic topic_name
Increase the partition count:
./bin/kafka-topics.sh --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181 --alter --topic java --partitions 40
3) List topics
./bin/kafka-topics.sh --list --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181
4) Describe a topic
./bin/kafka-topics.sh --describe --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181 --topic topic_name
5) Produce messages from the console
./bin/kafka-console-producer.sh --broker-list 192.168.72.56:9092 --topic topic_name
6) Consume messages from the console
./bin/kafka-console-consumer.sh --bootstrap-server 192.168.72.56:9092,192.168.72.57:9092,192.168.72.58:9092 --topic topic_name
7) Delete a topic
./bin/kafka-topics.sh --delete --topic topic_name --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181
8) Describe __consumer_offsets per partition (shows which consumer hosts are connected)
./bin/kafka-topics.sh --describe --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181 --topic __consumer_offsets
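Once Logstash (section 6) is consuming with group_id "java", its consumer lag can be checked with kafka-consumer-groups.sh. The snippet below only assembles and prints the command so the pieces are visible; the KAFKA_HOME default is an assumption, and on a real broker node you would run the printed command directly:

```shell
# Build the consumer-lag query for the "java" consumer group (the group_id
# used by the logstash kafka input). KAFKA_HOME is assumed; adjust as needed.
KAFKA_HOME="${KAFKA_HOME:-/usr/local/kafka}"
BOOTSTRAP="192.168.72.56:9092,192.168.72.57:9092,192.168.72.58:9092"
CMD="$KAFKA_HOME/bin/kafka-consumer-groups.sh --bootstrap-server $BOOTSTRAP --describe --group java"
echo "$CMD"   # remove this echo and execute the command on a broker node
```

The LAG column in its output is the number of messages produced but not yet consumed; a steadily growing lag means Logstash cannot keep up, which is exactly the backlog problem that motivated moving off Redis.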
5. Configure the filebeat output
For details see https://www.elastic.co/guide/en/beats/filebeat/current/kafka-output.html
output.kafka:
  enabled: true
  hosts: ["192.168.72.56:9092","192.168.72.57:9092","192.168.72.58:9092"]
  topic: java
  required_acks: 1
  compression: gzip
  max_message_bytes: 500000000 # maximum size in bytes of a single message; larger events are dropped
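If you raise filebeat's message size limit, the limits have to agree end to end, or the broker will reject what filebeat sends. A sketch of the knobs involved, with illustrative 10 MB values (the broker keys are standard Kafka settings; the exact values here are assumptions, not from the original setup):

```properties
# filebeat.yml (producer side):
#   output.kafka.max_message_bytes: 10000000
# kafka server.properties (broker side) -- must be at least as large:
message.max.bytes=10000000
# replicas fetch whole messages too, so this must not be smaller:
replica.fetch.max.bytes=10485760
```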
Restart filebeat:
systemctl restart filebeat
6. Configure the Logstash input
First install the kafka input plugin:
/usr/share/logstash/bin/logstash-plugin install logstash-input-kafka
Add a pipeline configuration file:
vim /etc/logstash/conf.d/kafka.conf
======================
input {
    kafka {
        bootstrap_servers => "192.168.72.56:9092,192.168.72.57:9092,192.168.72.58:9092"
        group_id => "java"
        auto_offset_reset => "latest"
        consumer_threads => 5
        decorate_events => false
        topics => ["java"]
        codec => json
    }
}
output {
    elasticsearch {
        hosts => ["192.168.72.56:9200","192.168.72.57:9200","192.168.72.58:9200"]
        user => "elastic"
        password => "changeme"
        index => "logs-other-%{+YYYY.MM.dd}"
        http_compression => true
    }
}
After adding it, test the configuration file first:
/usr/share/logstash/bin/logstash -t -f /etc/logstash/conf.d/kafka.conf
If the test passes, restart Logstash.
7. Problems you may run into
filebeat errors:
- WARN producer/broker/0 maximum request accumulated, waiting for space
Reference: https://linux.xiao5tech.com/bigdata/elk/elk_2.2.1_error_filebeat_kafka_waiting_for_space.html
Cause: the max_message_bytes buffer was configured too small.
- dropping too large message of size
Reference: https://www.cnblogs.com/zhaosc-haha/p/12133699.html
Cause: a message exceeded the configured size limit. Lower the log scan frequency, or check whether the application is logging abnormally and producing unnecessary output; oversized messages hurt Kafka's performance badly.
Suggested value: 10000000 (10 MB)
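To track down the oversized events behind the "dropping too large message" warning, it helps to scan the source log for lines above the byte budget. A self-contained demo with a generated sample file; on a real node, point awk at the actual application log and set max to your configured max_message_bytes:

```shell
# Generate a sample log with one 2048-byte line, then flag lines over a
# 1 KB budget. Replace /tmp/sample.log with the real application log.
printf 'short line\n%s\n' "$(head -c 2048 /dev/zero | tr '\0' 'x')" > /tmp/sample.log
awk -v max=1024 'length($0) > max { print "line " NR ": " length($0) " bytes" }' /tmp/sample.log
# → line 2: 2048 bytes
```

Any line this prints is a candidate for the runaway output mentioned above, such as a stack trace or debug dump concatenated onto a single line.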