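Preface

A complete offline big-data processing system is built around a core analysis stack of HDFS, MapReduce, and Hive, but it also depends on auxiliary systems for data collection, result export, and job scheduling. The Hadoop ecosystem provides convenient open-source frameworks for each of these, as shown in the figure: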

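Log collection framework: Flume

1 Introduction to Flume

1.1 Overview

Flume is a distributed, reliable, and highly available system for collecting, aggregating, and moving large volumes of log data.
Flume can collect from many kinds of sources, including files, directories, socket packets, and Kafka, and can deliver (sink) the collected data to external storage systems such as HDFS, HBase, Hive, and Kafka.
Ordinary collection requirements can be met with a simple Flume configuration.
Flume can also be extended with custom components for special scenarios, so it fits most day-to-day data collection needs.

Viewed from a distance, Flume plays a role similar to a barcode scanner or a vacuum-cleaner nozzle in everyday life: it is the intake end that pulls data into the system.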

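1.2 How it works

1. The central role in a Flume deployment is the agent. A Flume collection system is formed by connecting agents to one another.

2. Each agent acts as a data courier and contains three components:

Source: the collection component, which connects to the data source and reads data from it
Channel: the transport component, which buffers data and passes it from the source to the sink
Sink: the delivery component, which passes data to the next agent or to the final storage system

Data moves from source to channel to sink in the form of events. An event is the basic unit of the data flow.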

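1.3 Flume collection topologies

1. Simple topology
A single agent collects the data

2. Complex topology
Multiple agents chained in series

Flume installation and deployment
Installing Flume is simple: unpack the archive. A working Hadoop environment is a prerequisite.
Upload the installation package to the node where the data source lives.
Here we install on the third machine (node03).
Upload the archive and unpack it: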

tar -zxvf flume-ng-1.6.0-cdh5.14.0.tar.gz -C /export/servers/
cd  /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/conf
cp  flume-env.sh.template flume-env.sh
vim flume-env.sh

export JAVA_HOME=/export/servers/jdk1.8.0_141

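2 Flume in practice

2.1 Receiving telnet data with Flume

Case: use the telnet command to send some data to a machine over the network, and have Flume collect the data arriving on that network port.

Step 1: write the configuration file

Describe the collection plan in a configuration file according to the requirements (the file name is arbitrary).

Create a new configuration file (the collection plan) in Flume's conf directory: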

vim   /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/conf/netcat-logger.conf

# Name the components of this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe and configure the source: r1
a1.sources.r1.type = netcat
a1.sources.r1.bind = 192.168.52.120
a1.sources.r1.port = 44444

# Describe and configure the sink: k1
a1.sinks.k1.type = logger

# Describe and configure the channel; an in-memory buffer is used here
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Wire the source, channel, and sink together
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

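Channel parameters:
capacity: the maximum number of events the channel can hold
transactionCapacity: the maximum number of events the channel takes from the source, or hands to the sink, in a single transaction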
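
A memory channel loses whatever it is buffering if the agent process dies. Where that matters, a file channel can be substituted. The fragment below is a minimal sketch rather than a tuned setup; the two directories are hypothetical paths that would need to exist and be writable by the agent.

```properties
# Sketch: durable, disk-backed channel in place of the memory channel
a1.channels.c1.type = file
# Hypothetical paths (not from the original); pick directories on a local disk
a1.channels.c1.checkpointDir = /export/flume/checkpoint
a1.channels.c1.dataDirs = /export/flume/data
# Same limits as the memory channel example
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
```

The trade-off is throughput: every event is written to disk before it is acknowledged, so a file channel is slower than a memory channel but survives an agent restart.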

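Step 2: start the agent with the configuration file

Start the Flume agent on the appropriate node, pointing it at the collection plan.

This simple example also serves as a check that the environment works.
Start the agent to begin collecting data:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin
bin/flume-ng agent -c conf -f conf/netcat-logger.conf -n a1  -Dflume.root.logger=INFO,console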

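-c conf: the directory holding Flume's own configuration files
-f conf/netcat-logger.conf: the collection plan we wrote
-n a1: the name of this agent, which must match the name used in the configuration file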

Step 3: install telnet and test

Install the telnet client on node02 to simulate sending data:

yum -y install telnet
telnet  node03  44444   # send test data with telnet

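2.2 Collection cases

1. Collecting a directory into HDFS

Requirements analysis
Architecture diagram:

Requirement: new files keep appearing in a particular directory on a server, and each new file must be collected into HDFS as soon as it appears.
Based on the requirement, first decide on the three key elements:

Source: monitor a directory, using spooldir
spooldir characteristics:
1. It watches a directory and collects the contents of any new file that appears in it
2. Once a file has been fully collected, the agent automatically appends the suffix .COMPLETED to its name
3. A file name must never be reused inside the watched directory
Sink: the HDFS file system, using the hdfs sink
Channel: either a file channel or a memory channel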
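
The spooldir behaviour described above can be adjusted through a few source properties. The sketch below spells out the default completion suffix and adds an ignore pattern. The convention of writing in-progress files with a .tmp extension is an assumption made for this example, not something the original setup uses.

```properties
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /export/dir
# Suffix appended to fully collected files (this is the default value)
a1.sources.r1.fileSuffix = .COMPLETED
# Assumed convention: skip files that still carry a .tmp extension.
# The backslash is doubled because this is a Java properties file.
a1.sources.r1.ignorePattern = ^.*\\.tmp$
```

With a pattern like this, a producer can write to name.tmp and rename the file once it is complete, so the source never sees a file that is still changing.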

Writing the Flume configuration file
Create the watched directory, then write the configuration file:

cd  /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/conf
mkdir -p /export/dir
vim spooldir.conf

# Name the components on this agent
a1.sources=r1
a1.channels=c1
a1.sinks=k1
# Describe/configure the source
## Note: never drop a file into the watched directory under a name that has already been used
a1.sources.r1.type=spooldir
a1.sources.r1.spoolDir=/export/dir
a1.sources.r1.fileHeader = true
# Describe the sink
a1.sinks.k1.type=hdfs
a1.sinks.k1.hdfs.path=hdfs://node01:8020/spooldir/
# Describe the channel
a1.channels.c1.type=memory
a1.channels.c1.capacity=1000
a1.channels.c1.transactionCapacity=100
# Bind the source and sink to the channel
a1.sources.r1.channels=c1
a1.sinks.k1.channel=c1
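
With its default settings the hdfs sink writes SequenceFiles and rolls to a new file every 30 seconds, every 1024 bytes, or every 10 events, whichever comes first, so a configuration like this one tends to produce many small files. The fragment below is a sketch of how the roll behaviour might be relaxed; the specific values are illustrative assumptions, not recommendations from the original.

```properties
# Write plain text instead of the default SequenceFile format
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text
# Assumed values: roll every 10 minutes or at 128 MB, whichever comes first
a1.sinks.k1.hdfs.rollInterval = 600
a1.sinks.k1.hdfs.rollSize = 134217728
# 0 disables rolling by event count
a1.sinks.k1.hdfs.rollCount = 0
```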

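Start Flume

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin
bin/flume-ng agent -c ./conf -f ./conf/spooldir.conf -n a1 -Dflume.root.logger=INFO,console

Upload files to the watched directory
Copy different files into the directory below, making sure no file name is repeated: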

cd /export/dir

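2. Collecting a file into HDFS

Requirements analysis:
Requirement: a business system writes logs with log4j, for example, and the log keeps growing. Data appended to the log file must be collected into HDFS in real time.

Based on the requirement, first decide on the three key elements:

Source: monitor updates to a file, using exec with 'tail -F file'
Sink: the HDFS file system, using the hdfs sink
Channel between source and sink: either a file channel or a memory channel

Define the Flume configuration file
Write the configuration file on node03: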

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/conf
vim tail-file.conf

Configuration file contents:

a1.sources=r1
a1.channels=c1
a1.sinks=k1
# Describe/configure tail -F source1
a1.sources.r1.type=exec
a1.sources.r1.command =tail -F /export/taillogs/access_log
 
# Describe sink1
a1.sinks.k1.type=hdfs
a1.sinks.k1.hdfs.path=hdfs://node01:8020/spooldir/

# Use a channel which buffers events in memory
a1.channels.c1.type=memory
a1.channels.c1.capacity=1000
a1.channels.c1.transactionCapacity=100

# Bind the source and sink to the channel
a1.sources.r1.channels=c1
a1.sinks.k1.channel=c1
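
The exec source gives no delivery guarantee: if the tail process exits, collection simply stops. As a sketch, the source's restart options can be set so that the command is relaunched. The throttle value shown is the documented default and is included only to make it visible.

```properties
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /export/taillogs/access_log
# Relaunch the command if it dies
a1.sources.r1.restart = true
# Wait this many milliseconds before relaunching
a1.sources.r1.restartThrottle = 10000
# Send the command's stderr to the agent log to help diagnose failures
a1.sources.r1.logStdErr = true
```

Lines appended while the command is down may still be missed, so this reduces the window of loss rather than closing it.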

Start Flume

cd  /export/servers/apache-flume-1.6.0-cdh5.14.0-bin

bin/flume-ng agent -c conf -f conf/tail-file.conf -n a1  -Dflume.root.logger=INFO,console

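Write a shell script that appends to the file periodically

mkdir -p /export/shells/
cd  /export/shells/
vim tail-file.sh

#!/bin/bash
# Append the current date to the log file twice per second
while true
do
  date >> /export/taillogs/access_log
  sleep 0.5
done

Create the log directory (the same path the exec source tails)

mkdir -p /export/taillogs

Start the script

sh /export/shells/tail-file.sh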

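3. Chaining two agents

Requirements analysis:
The first agent collects data from a file and sends it over the network to the second agent. The second agent receives that data and saves it to HDFS.

Step 1: install Flume on node02
Copy the unpacked Flume directory from node03 to node02 (run on node03):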

cd  /export/servers
scp -r apache-flume-1.6.0-cdh5.14.0-bin/ node02:$PWD

Step 2: write the Flume configuration file on node02
Configure Flume on node02:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/conf
vim tail-avro-avro-logger.conf

##################
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /export/taillogs/access_log

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

## The avro sink acts as the data sender
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = 192.168.52.120
a1.sinks.k1.port = 4141

#Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Step 3: copy the data-generating script to node02
The script and the data directory on node03 can be copied to node02 directly. Run the following commands on node03:

cd  /export
scp -r shells/ taillogs/ node02:$PWD

Step 4: write the Flume configuration file on node03
Create the configuration file on node03:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/conf
vim avro-hdfs.conf

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

## The avro source acts as the receiving service
a1.sources.r1.type = avro
a1.sources.r1.bind = 192.168.52.120
a1.sources.r1.port = 4141

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://node01:8020/avro

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Step 5: start in order
Start the Flume process on node03 first, so that the avro source is listening before the sender connects:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin
bin/flume-ng agent -c conf -f conf/avro-hdfs.conf -n a1  -Dflume.root.logger=INFO,console

Start the Flume process on node02:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/
bin/flume-ng agent -c conf -f conf/tail-avro-avro-logger.conf -n a1  -Dflume.root.logger=INFO,console

Start the shell script on node02 to generate data:

mkdir -p /export/taillogs/

cd  /export/shells
sh tail-file.sh

3 More source and sink components

Flume supports many source and sink types. For details, see the official user guide:
http://archive.cloudera.com/cdh5/cdh/5/flume-ng-1.6.0-cdh5.14.0/FlumeUserGuide.html

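4 High-availability Flume NG configuration: failover

With a single-node Flume NG setup complete, we now build a highly available Flume NG cluster. The architecture is shown below:

As the figure shows, Flume can deliver to several kinds of storage. Only HDFS and Kafka are listed here (for example, keeping the most recent week of logs and feeding a real-time log stream to a Storm system).

4.1 Role assignment

The Flume agent and collectors are distributed as shown in the table below:

Name	Host	Role
Agent1	node01	Web Server
Collector1	node02	AgentMstr1
Collector2	node03	AgentMstr2

As shown in the figure, data from Agent1 flows to both Collector1 and Collector2. Flume NG provides a built-in failover mechanism that switches over and recovers automatically. In the figure above, three log-producing servers are located in different data centers, and all of their logs must be collected into a single cluster for storage. Next we write the configuration for the Flume NG cluster.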

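4.2 Install and configure Flume on node01 and copy the script files

Copy the Flume installation directory and the two directories used for file generation from node03 to node01.

Run the following commands on node03: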

cd /export/servers
scp -r apache-flume-1.6.0-cdh5.14.0-bin/ node01:$PWD
cd /export
scp -r shells/ taillogs/ node01:$PWD

Write the agent configuration file on node01:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/conf
vim agent.conf

#agent1 name
agent1.channels = c1
agent1.sources = r1
agent1.sinks = k1 k2
#
##set group
agent1.sinkgroups = g1
##set sink group
agent1.sinkgroups.g1.sinks = k1 k2

#
agent1.sources.r1.type = exec
agent1.sources.r1.command = tail -F /export/taillogs/access_log

#
##set channel
agent1.channels.c1.type = memory
agent1.channels.c1.capacity = 1000
agent1.channels.c1.transactionCapacity = 100
## set sink1
agent1.sinks.k1.type = avro
agent1.sinks.k1.hostname = node02
agent1.sinks.k1.port = 52020
#
## set sink2
agent1.sinks.k2.type = avro
agent1.sinks.k2.hostname = node03
agent1.sinks.k2.port = 52020
#
##set failover
agent1.sinkgroups.g1.processor.type = failover
agent1.sinkgroups.g1.processor.priority.k1 = 2
agent1.sinkgroups.g1.processor.priority.k2 = 1
agent1.sinkgroups.g1.processor.maxpenalty = 10000
#
agent1.sources.r1.channels = c1
agent1.sinks.k1.channel = c1
agent1.sinks.k2.channel = c1

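maxpenalty: the upper limit, in milliseconds, on the back-off period applied to a sink that has failed. The sink with the higher priority value (k1 here) is used first, and k2 takes over only while k1 is unavailable.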

4.3 Configure the Flume collectors on node02 and node03

Edit the configuration file on node02:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/conf
vim collector.conf

#set Agent name
a1.sources = r1
a1.channels = c1
a1.sinks = k1

## avro source: receives events sent by the agent on node01
a1.sources.r1.type = avro
a1.sources.r1.bind = node02
a1.sources.r1.port = 52020

##set channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

#
##set sink to hdfs
a1.sinks.k1.type=hdfs
a1.sinks.k1.hdfs.path= hdfs://node01:8020/flume/failover/


a1.sources.r1.channels=c1
a1.sinks.k1.channel=c1

Edit the configuration file on node03:

cd  /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/conf
vim collector.conf

#set Agent name
a1.sources = r1
a1.channels = c1
a1.sinks = k1

## avro source: receives events sent by the agent on node01
a1.sources.r1.type = avro
a1.sources.r1.bind = node03
a1.sources.r1.port = 52020

##set channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

#
##set sink to hdfs
a1.sinks.k1.type=hdfs
a1.sinks.k1.hdfs.path= hdfs://node01:8020/flume/failover/


a1.sources.r1.channels=c1
a1.sinks.k1.channel=c1

4.4 Start-up order

Start Flume on node03:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin
bin/flume-ng agent -n a1 -c conf -f conf/collector.conf -Dflume.root.logger=DEBUG,console

Start Flume on node02:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin
bin/flume-ng agent -n a1 -c conf -f conf/collector.conf -Dflume.root.logger=DEBUG,console

Start Flume on node01:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin
bin/flume-ng agent -n agent1 -c conf -f conf/agent.conf -Dflume.root.logger=DEBUG,console

Start the file-generating script on node01:

cd  /export/shells
sh tail-file.sh

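4.5 Failover test

Now we test the high availability (failover) of the Flume NG cluster. The scenario: we upload files on the Agent1 node. Because Collector1 is configured with a higher priority than Collector2, Collector1 collects the data first and uploads it to the storage system. We then kill Collector1, and Collector2 takes over log collection and upload. Finally we restart the Flume service on Collector1 manually and upload files on Agent1 again, and Collector1 resumes its priority role. The screenshots are as follows:

Collector1 uploads first

Preview of the log content uploaded to the HDFS cluster

Collector1 goes down and Collector2 takes over the upload

After Collector1 is restarted, it regains upload priority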

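5. Flume load balancing (load balancer)

Load balancing addresses the situation where a single machine (or a single process) cannot handle all requests. The Load balancing Sink Processor provides this capability. In the figure below, Agent1 is a routing node that distributes the events buffered in its channel across several sinks, and each sink is connected to a separate agent. A sample configuration follows:

Here we use three machines to simulate Flume load balancing.
The three machines are assigned as follows:
node01: collects data and sends it to node02 and node03
node02: receives part of node01's data
node03: receives part of node01's data

Step 1: write the Flume configuration for node01

node01 configuration: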

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/conf
vim load_banlancer_client.conf

#agent name
a1.channels = c1
a1.sources = r1
a1.sinks = k1 k2

#set group
a1.sinkgroups = g1
#set sink group
a1.sinkgroups.g1.sinks = k1 k2

#set sources
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /export/taillogs/access_log

#set channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# set sink1
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = node02
a1.sinks.k1.port = 52021

# set sink2
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = node03
a1.sinks.k2.port = 52021


#set load balancing
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.backoff = true
a1.sinkgroups.g1.processor.selector = round_robin
a1.sinkgroups.g1.processor.selector.maxTimeOut=10000


a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c1
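
The round_robin selector used above cycles through the sinks in order. The load-balancing sink processor also accepts a random selector. The sketch below shows only the lines that would change and assumes the rest of the node01 configuration stays as written.

```properties
a1.sinkgroups.g1.processor.type = load_balance
# Temporarily skip a sink that has failed instead of retrying it immediately
a1.sinkgroups.g1.processor.backoff = true
# Choose a sink at random rather than cycling k1, k2, k1, ...
a1.sinkgroups.g1.processor.selector = random
```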

Step 2: write the Flume configuration for node02

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/conf
vim load_banlancer_server.conf

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.bind = node02
a1.sources.r1.port = 52021

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Describe the sink
a1.sinks.k1.type = logger

# Bind the source and sink to the channel 
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Step 3: write the Flume configuration for node03

node03 configuration:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/conf
vim load_banlancer_server.conf

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.bind = node03
a1.sources.r1.port = 52021

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Describe the sink
a1.sinks.k1.type = logger

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Step 4: start the Flume services

Start the Flume service on node03:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin
bin/flume-ng agent -n a1 -c conf -f conf/load_banlancer_server.conf -Dflume.root.logger=DEBUG,console

Start the Flume service on node02:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin
bin/flume-ng agent -n a1 -c conf -f conf/load_banlancer_server.conf -Dflume.root.logger=DEBUG,console

Start the Flume service on node01:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin
bin/flume-ng agent -n a1 -c conf -f conf/load_banlancer_client.conf -Dflume.root.logger=DEBUG,console

Step 5: run the script on node01 to generate data

cd /export/shells
sh tail-file.sh

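6. Flume interceptor case study 1

1. Scenario

Two log servers, A and B, produce logs in real time. The main log types are access.log, nginx.log, and web.log.
The requirement:
Collect access.log, nginx.log, and web.log from machines A and B onto machine C, and from there write everything into HDFS.
The required directory layout in HDFS is: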

/source/logs/access/20180101/**
/source/logs/nginx/20180101/**
/source/logs/web/20180101/**

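2. Scenario analysis

3. Data flow analysis

4. Implementation

Server A has IP 192.168.52.100
Server B has IP 192.168.52.110
Server C has IP 192.168.52.120

Writing the collector-side configuration

Write the Flume configuration file on node01 and node02: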

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/conf
vim exec_source_avro_sink.conf

# Name the components on this agent
a1.sources = r1 r2 r3
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /export/taillogs/access.log
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = static
## The static interceptor inserts a user-defined key-value pair into the header of each collected event
a1.sources.r1.interceptors.i1.key = type
a1.sources.r1.interceptors.i1.value = access

a1.sources.r2.type = exec
a1.sources.r2.command = tail -F /export/taillogs/nginx.log
a1.sources.r2.interceptors = i2
a1.sources.r2.interceptors.i2.type = static
a1.sources.r2.interceptors.i2.key = type
a1.sources.r2.interceptors.i2.value = nginx

a1.sources.r3.type = exec
a1.sources.r3.command = tail -F /export/taillogs/web.log
a1.sources.r3.interceptors = i3
a1.sources.r3.interceptors.i3.type = static
a1.sources.r3.interceptors.i3.key = type
a1.sources.r3.interceptors.i3.value = web

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 20000
a1.channels.c1.transactionCapacity = 10000

# Describe the sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = node03
a1.sinks.k1.port = 41414

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sources.r2.channels = c1
a1.sources.r3.channels = c1
a1.sinks.k1.channel = c1

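Note: the lines below attach a static interceptor to source r1. The header it sets (type = access) is what the hdfs sink on node03 reads through %{type} in hdfs.path.

a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = static
## The static interceptor inserts a user-defined key-value pair into the header of each collected event
a1.sources.r1.interceptors.i1.key = type
a1.sources.r1.interceptors.i1.value = access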
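
A source can carry more than one interceptor, and they run in the order listed. As a sketch, a host interceptor could be added next to the static one so that events from servers A and B remain distinguishable after they are merged on server C. The interceptor name i4 and the decision to record the host are assumptions made for this example, not part of the original setup.

```properties
# Two interceptors on one source, applied in the listed order
a1.sources.r1.interceptors = i1 i4
a1.sources.r1.interceptors.i1.type = static
a1.sources.r1.interceptors.i1.key = type
a1.sources.r1.interceptors.i1.value = access
# Hypothetical addition: store the sending machine's hostname in a header named "host"
a1.sources.r1.interceptors.i4.type = host
a1.sources.r1.interceptors.i4.useIP = false
a1.sources.r1.interceptors.i4.hostHeader = host
```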

Server-side configuration

Write the Flume configuration file on node03:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/conf
vim avro_source_hdfs_sink.conf

a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Define the source
a1.sources.r1.type = avro
a1.sources.r1.bind = 192.168.52.120
a1.sources.r1.port = 41414

# Add the timestamp interceptor
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = org.apache.flume.interceptor.TimestampInterceptor$Builder


# Define the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 20000
a1.channels.c1.transactionCapacity = 10000

# Define the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path=hdfs://192.168.52.100:8020/source/logs/%{type}/%Y%m%d

# Wire together the source, channel, and sink
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
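
The %Y%m%d escapes in hdfs.path need a timestamp for each event, which is why the timestamp interceptor is attached to the source in this configuration. As a sketch of an alternative, the hdfs sink can be told to use the local clock of the node03 agent instead. Whether that is acceptable depends on how much delivery delay there is between the collecting machines and node03, which the original does not discuss.

```properties
# Alternative to the timestamp interceptor: resolve %Y%m%d from the sink's local clock
a1.sinks.k1.hdfs.useLocalTimeStamp = true
# Assumption: plain-text output is wanted rather than the default SequenceFile
a1.sinks.k1.hdfs.fileType = DataStream
```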

Log-generating script on the collector side

Write a shell script on node01 and node02 to simulate log generation:

cd /export/shells
vim server.sh

#!/bin/bash
# Append the current date to each of the three logs twice per second
while true
do
  date >> /export/taillogs/access.log
  date >> /export/taillogs/web.log
  date >> /export/taillogs/nginx.log
  sleep 0.5
done

Start the services in order

Start Flume on node03 to receive the data:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin

bin/flume-ng agent -c conf -f conf/avro_source_hdfs_sink.conf -n a1 -Dflume.root.logger=DEBUG,console

Start Flume on node01 and node02 to monitor the logs:

cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin

bin/flume-ng agent -c conf -f conf/exec_source_avro_sink.conf -n a1 -Dflume.root.logger=DEBUG,console

Start the log-generating script on node01 and node02:

cd /export/shells
sh server.sh
