This log collection system gathers logs from the production environment of a Spring Cloud-based distributed system. To make the data easier to aggregate and analyze, the services emit their logs in JSON format; how to generate logs in this format will be covered in a separate article.
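For orientation, a single log event looks roughly like the line below. The field names (service, level, trace, span, and so on) are the ones referenced by the Filebeat configuration later in this article; the values themselves are hypothetical.

{"service":"keda-gateway","local_ip":"172.19.174.184","pid":"9820","level":"ERROR","timestamp":"2020-01-15 10:23:45.123","thread":"http-nio-9820-exec-1","class":"com.keda.gateway.filter.AuthFilter","method":"filter","line":"88","trace":"5f2a1b3c4d5e6f70","span":"a1b2c3d4e5f60718","parent":"","exportable":"false","message":"token expired"}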
Production architecture diagram:
I. Setting up Filebeat
1. I did not install Filebeat with Docker here; instead, I extracted the tar package directly onto each server.
Download Filebeat from the official site:
https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.4.2-linux-x86_64.tar.gz
Upload the downloaded package to the /opt directory on each Linux server, then extract it (see the sketch below).
Official documentation: the Filebeat reference on the Elastic website (it is in English; Google Translate can help if needed).
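A minimal sketch of the download-and-extract step, run on each server. The rename at the end is my assumption, so that the directory matches the /opt/filebeat-7.4.2 path used by the scripts later in this article:

cd /opt
curl -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.4.2-linux-x86_64.tar.gz
tar -xzf filebeat-7.4.2-linux-x86_64.tar.gz
# rename so the directory matches the /opt/filebeat-7.4.2 path assumed below
mv filebeat-7.4.2-linux-x86_64 filebeat-7.4.2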
2. Here is my configuration file:
[root@slave1 filebeat-7.4.2]# cat kedafilebeat.yml
###################### Filebeat Configuration Example #########################
# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html
# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.
#=========================== Filebeat inputs =============================
filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

# The collected data are plain log files, so the type is log
- type: log
  # true: enables this input
  enabled: true
  # Paths of the log files to collect
  paths:
    - /data/springCloud/eureka/keda-eureka/*/logs/error/*.json
  # File encoding
  encoding: utf-8
  # Only pick up files modified within the last 12 hours; anything older is ignored
  ignore_older: 12h
  # By default, the custom fields defined below are grouped under a "fields"
  # sub-dictionary in the output document. If set to true, they are copied to
  # the top level instead. Default is false.
  fields_under_root: false
  # fields adds custom keys and values; here it tags which Kafka topic this input's events go to
  fields:
    log_topic: eureka_topic

- type: log
  enabled: true
  paths:
    - /data/springCloud/gateway/keda-gateway/*/logs/error/*.json
    - /data/springCloud/gateway/keda-gateway/*/logs/info/*.json
    - /data/springCloud/gateway/keda-gateway/*/logs/ware/*.json
    - /data/springCloud/gateway/keda-gateway/*/logs/trace/*.json
  encoding: utf-8
  ignore_older: 12h
  fields_under_root: false
  fields:
    log_topic: gateway_topic

- type: log
  enabled: true
  paths:
    - /data/springCloud/gateway/keda-gateway/*/logs/debug/*.json
  include_lines: [".*org.apache.ibatis.logging.jdbc.BaseJdbcLogger.*"]
  encoding: utf-8
  ignore_older: 12h
  fields_under_root: false
  fields:
    log_topic: gateway_topic

# /data/springCloud/project/keda6-information-main/keda-information-main/172.19.174.184-9820/logs/info
- type: log
  enabled: true
  paths:
    #info-20200115
    - /data/springCloud/project/keda6-information-main/*/*/logs/*/*.json
    # - /data/springCloud/eureka/keda-eureka/47.103.37.44-9800/logs/info/*.json
  # Only collect lines that match the following regular expression
  include_lines: ["{\"service\":\".*"]
  encoding: utf-8
  ignore_older: 12h
  fields_under_root: false
  fields:
    log_topic: keda-information-main_topic

- type: log
  enabled: true
  paths:
    - /data/springCloud/project/*/*/*/logs/*/*.json
  include_lines: ["{\"service\":\".*"]
  encoding: utf-8
  ignore_older: 12h
  fields_under_root: false
  fields:
    log_topic: keda-project_topic

- type: log
  enabled: true
  paths:
    - /usr/docker/software/nginx/logs/*kedaqianbao.log
  include_lines: ["{\"remote_addr\":.*"]
  encoding: utf-8
  # json.keys_under_root: true
  # json.add_error_key: true
  # json.message_key: message
  ignore_older: 1h
  fields_under_root: false
  fields:
    log_topic: keda-nginx_topic
  # filebeat-timestamp: %{[@timestamp]}
# Exclude lines. A list of regular expressions to match. It drops the lines that are
# matching any regular expression from the list.
#exclude_lines: ['^DBG']
# Include lines. A list of regular expressions to match. It exports the lines that are
# matching any regular expression from the list.
#include_lines: ['^ERR', '^WARN']
# Exclude files. A list of regular expressions to match. Filebeat drops the files that
# are matching any regular expression from the list. By default, no files are dropped.
#exclude_files: ['.gz$']
# Optional additional fields. These fields can be freely picked
# to add additional information to the crawled log files for filtering
#fields:
# level: debug
# review: 1
### Multiline options
# Multiline can be used for log messages spanning multiple lines. This is common
# for Java Stack Traces or C-Line Continuation
# The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
#multiline.pattern: ^\[
# Defines if the pattern set under pattern should be negated or not. Default is false.
#multiline.negate: false
# Match can be set to "after" or "before". It is used to define if lines should be append to a pattern
# that was (not) matched before or after or as long as a pattern is not matched based on negate.
# Note: After is the equivalent to previous and before is the equivalent to to next in Logstash
#multiline.match: after
#============================= Filebeat modules ===============================
filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml
  # Set to true to enable config reloading
  reload.enabled: false
  # Period on which files under path should be checked for changes
  #reload.period: 10s
#==================== Elasticsearch template setting ==========================
setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false
#================================ General =====================================
# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:
# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]
# Optional fields that you can specify to add additional information to the
# output.
#fields:
# env: staging
#============================== Dashboards =====================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false
# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:
#============================== Kibana =====================================
# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:
  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify an additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:
#============================= Elastic Cloud ==================================
# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).
# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:
# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:
#================================ Outputs =====================================
# Configure what output to use when sending the data collected by the beat.
#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
# Array of hosts to connect to.
# hosts: ["localhost:9200"]
# Optional protocol and basic auth credentials.
#protocol: "https"
#username: "elastic"
#password: "changeme"
#----------------------------- Logstash output --------------------------------
#output.logstash:
# The Logstash hosts
#hosts: ["localhost:5044"]
# Optional SSL. By default is off.
# List of root certificates for HTTPS server verifications
#ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
# Certificate for SSL client authentication
#ssl.certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#ssl.key: "/etc/pki/client/cert.key"
# Output to Kafka
#----------------------------- kafka output --------------------------------
output.kafka:
  enabled: true
  # initial brokers for reading cluster metadata
  # the Kafka cluster
  hosts: ["101.132.34.170:19092", "101.133.132.124:19092", "47.103.37.44:19092"]

  # message topic selection + partitioning
  # topic references the custom log_topic field defined on each input above
  topic: '%{[fields.log_topic]}'
  version: 2.0.0
  partition.round_robin:
    reachable_only: false
  # codec.format:
  #   string: '{"filebeat-timestamp":"%{[@timestamp]}"}'
  #   string: '{"agent":%{[agent]},"log":%{[log]},"fields":%{[fields]},"filebeat-timestamp":"%{[@timestamp]}","class":"%{[class]}","method":"%{[method]}","line":"%{[line]}","message":"%{[message]}","time":"%{[timestamp]}","level":"%{[level]}","service":"%{[service]}","local_ip":"%{[local_ip]}","pid":"%{[pid]}","thread":"%{[thread]}","span":"%{[span]}","trace":"%{[trace]}","parent":"%{[parent]}","exportable":"%{[exportable]}"}'
  required_acks: 1
  compression: gzip
  compression_level: 4
  max_message_bytes: 1000000
#----------------------------- console output --------------------------------
#output.console:
# pretty: true
# enable: true
# codec.format:
# string: '{"agent":%{[agent]},"log":%{[log]},"fields":%{[fields]},"filebeat-timestamp":"%{[@timestamp]}","class":"%{[class]}","method":"%{[method]}","line":"%{[line]}","message":"%{[message]}","time":"%{[timestamp]}","level":"%{[level]}","service":"%{[service]}","local_ip":"%{[local_ip]}","pid":"%{[pid]}","thread":"%{[thread]}","span":"%{[span]}","trace":"%{[trace]}","parent":"%{[parent]}","exportable":"%{[exportable]}"}'
#================================ Processors =====================================
# Configure processors to enhance or manipulate events generated by the beat.
# Global filtering and processing configuration
processors:
  - if:
      regexp:
        message: '{\"service\":\"keda*'
    then:
      - decode_json_fields:
          fields: ["message"]
          target: ""
          max_depth: 1
          overwrite_keys: true
  - if:
      regexp:
        message: '{\"remote_addr\":.*'
    then:
      - decode_json_fields:
          fields: ["message"]
          target: ""
          max_depth: 1
          overwrite_keys: true
  # - decode_json_fields:
  #     fields: ["message"]
  #     max_depth: 3
  # Drop events whose trace and span fields are empty (the regex matches an empty string)
  - if:
      regexp:
        message: '{\"service\":\"keda*'
    then:
      - drop_event:
          when:
            regexp:
              trace: '^[A-Za-z0-9]{0}$'
              span: '^[A-Za-z0-9]{0}$'
  # - rename:
  #     fields:
  #       - from: "timestamp"
  #         to: "time"
  #     ignore_missing: false
  #     fail_on_error: true
  - if:
      regexp:
        message: '{\"service\":\"keda*'
    then:
      - drop_fields:
          fields: ["host", "ecs", "@version"]
          ignore_missing: false
  - if:
      regexp:
        message: '{\"remote_addr\":.*'
    then:
      - drop_fields:
          fields: ["message"]
          ignore_missing: false
  # has_fields: ['traceee']
  # message: '\"trace\":\"\",\"span\":\"\",'
#================================ Logging =====================================
# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug
# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]
#============================== X-Pack Monitoring ===============================
# filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch. The
# reporting is disabled by default.
# Set to true to enable the monitoring reporter.
#monitoring.enabled: false
# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:
# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:
#================================= Migration ==================================
# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true
[root@slave1 filebeat-7.4.2]#
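Before wiring up the start/stop scripts, it is worth validating the file. Filebeat ships with test subcommands; a quick sanity check looks like this (paths as above):

cd /opt/filebeat-7.4.2
# check the YAML for syntax and option errors
./filebeat test config -c kedafilebeat.yml
# check connectivity to the configured Kafka brokers
./filebeat test output -c kedafilebeat.yml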
3. The start/stop scripts:
[root@master sh]# cat start.sh
#!/bin/bash
sh stop.sh
nohup sudo /opt/filebeat-7.4.2/filebeat -e -c /opt/filebeat-7.4.2/kedafilebeat.yml &
[root@master sh]# cat stop.sh
#!/bin/bash
PID=$(ps -ef | grep /opt/filebeat-7.4.2/filebeat | grep -v grep | awk '{ print $2 }')
if [ -z "$PID" ]
then
echo filebeat is already stopped
else
echo kill $PID
kill $PID
fi
[root@master sh]#
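To confirm events are actually flowing after a start, you can tail one of the topics with the standard Kafka console consumer on any broker. The /opt/kafka path is an assumption; adjust it to wherever Kafka is installed in your environment.

# tail the eureka topic from one of the brokers listed in the config
/opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server 101.132.34.170:19092 --topic eureka_topic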
4. All three servers are configured the same way.
5. With that, the Filebeat setup is complete.