Building a cross-host log collection system on Alibaba Cloud ECS with Docker: Filebeat + Kafka cluster + ZooKeeper cluster + Logstash + Elasticsearch cluster + Kibana [Part 1]

This log collection system gathers logs from a production environment built on a Spring Cloud distributed system. To make statistics and analysis easier, I changed the log output format to JSON; how the logs are generated in that format will be covered in a separate article.

Production architecture diagram:

 

I. Set up Filebeat first

1. Here I did not install Filebeat with Docker; instead I deployed it as a tar package, extracted onto each server.
Download Filebeat from the official site:
https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.4.2-linux-x86_64.tar.gz

Upload the downloaded package to the /opt directory on every Linux server and extract it; that is all the installation takes.
Official documentation: the Filebeat reference on the Elastic website (you can read it with Google Translate if needed).
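For reference, a minimal download-and-extract sequence looks roughly like this; the final rename is only an assumption so that the directory matches the /opt/filebeat-7.4.2 path used in the config and scripts below:

cd /opt
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.4.2-linux-x86_64.tar.gz
tar -zxvf filebeat-7.4.2-linux-x86_64.tar.gz
# the tarball extracts to filebeat-7.4.2-linux-x86_64; rename it to the shorter path used later
mv filebeat-7.4.2-linux-x86_64 filebeat-7.4.2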




2. Here is my configuration file:

[root@slave1 filebeat-7.4.2]# cat kedafilebeat.yml 
###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

#=========================== Filebeat inputs =============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
# The collected data are log files, so type is log
- type: log 
  # true: enable this input
  enabled: true
  # Paths of the log files to collect
  paths:
    - /data/springCloud/eureka/keda-eureka/*/logs/error/*.json
  # File encoding
  encoding: utf-8
  # Only collect files modified within the last 12 hours; older files are ignored
  ignore_older: 12h
  # By default, the custom fields below are grouped under a "fields" sub-dictionary in the output document.
  # If set to true, they are copied to the top level of the output document instead. Default is false.
  fields_under_root: false
  # fields lets you add custom keys and values; here it marks which Kafka topic the event is sent to
  fields:
    log_topic: eureka_topic

- type: log
  enabled: true
  paths:
    - /data/springCloud/gateway/keda-gateway/*/logs/error/*.json
    - /data/springCloud/gateway/keda-gateway/*/logs/info/*.json
    - /data/springCloud/gateway/keda-gateway/*/logs/ware/*.json
    - /data/springCloud/gateway/keda-gateway/*/logs/trace/*.json
  encoding: utf-8
  ignore_older: 12h
  fields_under_root: false
  fields:
    log_topic: gateway_topic

- type: log
  enabled: true
  paths:
    - /data/springCloud/gateway/keda-gateway/*/logs/debug/*.json
  include_lines: [".*org.apache.ibatis.logging.jdbc.BaseJdbcLogger.*"]
  encoding: utf-8
  ignore_older: 12h
  fields_under_root: false
  fields:
    log_topic: gateway_topic

# /data/springCloud/project/keda6-information-main/keda-information-main/172.19.174.184-9820/logs/info

- type: log
  enabled: true
  paths:
    #info-20200115
    - /data/springCloud/project/keda6-information-main/*/*/logs/*/*.json
   #  - /data/springCloud/eureka/keda-eureka/47.103.37.44-9800/logs/info/*.json
  # Only collect lines matching the following regular expression
  include_lines: ["{\"service\":\".*"]
  encoding: utf-8
  ignore_older: 12h
  fields_under_root: false
  fields:
    log_topic: keda-information-main_topic

- type: log
  enabled: true
  paths:
    - /data/springCloud/project/*/*/*/logs/*/*.json
  include_lines: ["{\"service\":\".*"]
  encoding: utf-8
  ignore_older: 12h
  fields_under_root: false
  fields:
    log_topic: keda-project_topic

- type: log
  enabled: true
  paths:
    - /usr/docker/software/nginx/logs/*kedaqianbao.log
  include_lines: ["{\"remote_addr\":.*"]
  encoding: utf-8
#  json.keys_under_root: true
#  #  json.add_error_key: true
#  #  json.message_key: message
  ignore_older: 1h
  fields_under_root: false
  fields:
    log_topic: keda-nginx_topic


  #  filebeat-timestamp: %{[@timestamp]}
  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']
  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1
  ### Multiline options

  # Multiline can be used for log messages spanning multiple lines. This is common
  # for Java Stack Traces or C-Line Continuation

  # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
  #multiline.pattern: ^\[

  # Defines if the pattern set under pattern should be negated or not. Default is false.
  #multiline.negate: false

  # Match can be set to "after" or "before". It is used to define if lines should be append to a pattern
  # that was (not) matched before or after or as long as a pattern is not matched based on negate.
  # Note: After is the equivalent to previous and before is the equivalent to to next in Logstash
  #multiline.match: after


#============================= Filebeat modules ===============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

#==================== Elasticsearch template setting ==========================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false

#================================ General =====================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging


#============================== Dashboards =====================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

#============================== Kibana =====================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify and additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

#============================= Elastic Cloud ==================================

# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).

# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:

#================================ Outputs =====================================

# Configure what output to use when sending the data collected by the beat.

#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
#  hosts: ["localhost:9200"]

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------
#output.logstash:
  # The Logstash hosts
  #hosts: ["localhost:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"


# Output to Kafka
#----------------------------- kafka output --------------------------------
output.kafka:
  enabled: true
  # initial brokers for reading cluster metadata
  # Kafka cluster
  hosts: ["101.132.34.170:19092", "101.133.132.124:19092","47.103.37.44:19092"]
  #
  # message topic selection + partitioning
  # the topic is taken from the custom field defined in the inputs above
  topic: '%{[fields.log_topic]}'
  version: 2.0.0
  partition.round_robin:
    reachable_only: false
#  codec.format:
     #  string: '{"filebeat-timestamp":"%{[@timestamp]}"}'
#      string: '{"agent":%{[agent]},"log":%{[log]},"fields":%{[fields]},"filebeat-timestamp":"%{[@timestamp]}","class":"%{[class]}","method":"%{[method]}","line":"%{[line]}","message":"%{[message]}","time":"%{[timestamp]}","level":"%{[level]}","service":"%{[service]}","local_ip":"%{[local_ip]}","pid":"%{[pid]}","thread":"%{[thread]}","span":"%{[span]}","trace":"%{[trace]}","parent":"%{[parent]}","exportable":"%{[exportable]}"}'
  required_acks: 1
  compression: gzip
  compression_level: 4 
  max_message_bytes: 1000000

#----------------------------- console output --------------------------------
#output.console:
#  pretty: true
#  enable: true
#  codec.format:
#    string: '{"agent":%{[agent]},"log":%{[log]},"fields":%{[fields]},"filebeat-timestamp":"%{[@timestamp]}","class":"%{[class]}","method":"%{[method]}","line":"%{[line]}","message":"%{[message]}","time":"%{[timestamp]}","level":"%{[level]}","service":"%{[service]}","local_ip":"%{[local_ip]}","pid":"%{[pid]}","thread":"%{[thread]}","span":"%{[span]}","trace":"%{[trace]}","parent":"%{[parent]}","exportable":"%{[exportable]}"}'

#================================ Processors =====================================

# Configure processors to enhance or manipulate events generated by the beat.

# Global processor (filtering) configuration
processors:
- if:
   regexp:
     message: '{\"service\":\"keda*'
  then:
  - decode_json_fields:
      fields: ["message"]
      target: ""
      max_depth: 1
      overwrite_keys: true
- if:
   regexp:
     message: '{\"remote_addr\":.*'
  then:
  - decode_json_fields:
      fields: ["message"]
      target: ""
      max_depth: 1
      overwrite_keys: true

#  - decode_json_fields:
#      fields: ["message"]
#      max_depth: 3
  # Drop events whose trace and span fields are both empty
- if:
   regexp:
     message: '{\"service\":\"keda*'
  then:
  - drop_event:
      when:
        regexp:
           trace: '^[A-Za-z0-9]{0}$'
           span: '^[A-Za-z0-9]{0}$'
#  - rename:
#      fields:
#        - from: "timestamp"
#          to: "time"
#      ignore_missing: false
#      fail_on_error: true
- if:
   regexp:
     message: '{\"service\":\"keda*'
  then:
  - drop_fields:
      fields: ["host", "ecs","@version"]
      ignore_missing: false
- if:
   regexp:
     message: '{\"remote_addr\":.*'
  then:
  - drop_fields:
     fields: ["message"]
     ignore_missing: false

# has_fields: ['traceee'] 
        #  message: '\"trace\":\"\",\"span\":\"\",'

#================================ Logging =====================================

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]

#============================== X-Pack Monitoring ===============================
# filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster.  This requires xpack monitoring to be enabled in Elasticsearch.  The
# reporting is disabled by default.

# Set to true to enable the monitoring reporter.
#monitoring.enabled: false

# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:

# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:

#================================= Migration ==================================

# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true
[root@slave1 filebeat-7.4.2]# 
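For context, the include_lines patterns and the decode_json_fields processors above assume that each application writes its log entries as single-line JSON objects. A purely hypothetical example line (field names follow the commented-out codec.format string above; all values are made up) that would be matched and decoded into top-level fields looks like this:

{"service":"keda-gateway","local_ip":"172.19.174.184","pid":"12345","level":"ERROR","time":"2020-01-15 10:23:45.123","class":"com.keda.gateway.filter.AuthFilter","method":"filter","line":"88","thread":"reactor-http-nio-2","trace":"a1b2c3d4e5f6a7b8","span":"c3d4e5f6a7b8a1b2","parent":"","exportable":"true","message":"token expired"}

Because trace and span are non-empty here, the drop_event processor would keep this event; events where both fields are empty strings are dropped.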

3. The start/stop scripts

 

[root@master sh]# cat start.sh 
#!/bin/bash

sh stop.sh

# -e writes Filebeat's own log output to stderr, -c points at the custom config file above
nohup sudo /opt/filebeat-7.4.2/filebeat -e -c /opt/filebeat-7.4.2/kedafilebeat.yml  &


[root@master sh]# cat stop.sh 
#!/bin/bash

PID=$(ps -ef | grep /opt/filebeat-7.4.2/filebeat | grep -v grep | awk '{ print $2 }')
if [ -z "$PID" ]
then
    echo filebeat is already stopped
else
    echo kill $PID
    kill $PID
fi
[root@master sh]# 
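Before relying on the start script, it helps to sanity-check the configuration and confirm that events actually reach Kafka. A rough check, assuming the paths above (the kafka-console-consumer.sh location depends on how your Kafka brokers are installed, and the topic is one of those defined in the fields above):

# validate the configuration file and inputs
/opt/filebeat-7.4.2/filebeat test config -c /opt/filebeat-7.4.2/kedafilebeat.yml

# on one of the Kafka brokers, consume a few messages to confirm events are arriving
kafka-console-consumer.sh --bootstrap-server 101.132.34.170:19092 --topic gateway_topic --from-beginning --max-messages 5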


4. All three servers are configured the same way.

5. At this point, the Filebeat setup is complete.

 

 
