Flume: From Getting Started to Giving Up (Part 2)

1 Flume Transactions

(Figure: Flume transaction model)
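The put/take transaction semantics in the figure can be modeled with a small self-contained toy (an illustrative simplification only, not Flume's actual Channel code): a source stages events in a put list, and they become visible in the channel only on commit; a sink stages taken events in a take list, and a rollback returns them to the head of the channel.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a Flume channel's put/take transactions (single-threaded sketch).
class ToyChannel {
    private final Deque<String> queue = new ArrayDeque<>();     // committed events
    private final Deque<String> putList = new ArrayDeque<>();   // staged puts
    private final Deque<String> takeList = new ArrayDeque<>();  // staged takes

    void doPut(String event) { putList.addLast(event); }        // staged, not yet visible

    String doTake() {
        String e = queue.pollFirst();
        if (e != null) takeList.addLast(e);                     // remembered for rollback
        return e;
    }

    void commitPut() { queue.addAll(putList); putList.clear(); }
    void rollbackPut() { putList.clear(); }
    void commitTake() { takeList.clear(); }
    void rollbackTake() {                                       // return events to the channel head
        while (!takeList.isEmpty()) queue.addFirst(takeList.pollLast());
    }
    int size() { return queue.size(); }
}
```

Flume's real channels expose the same staged-commit semantics through the `Transaction` interface (begin, put/take, commit or rollback, close).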

2 Flume Agent Internals

(Figure: Flume Agent internal event-processing flow)

2.1 ChannelSelector

The job of the ChannelSelector is to decide which Channel an Event will be sent to. There are two types: Replicating and Multiplexing. The ReplicatingSelector sends each Event to every configured Channel, while the Multiplexing selector routes different Events to different Channels according to the configured rules.

2.1.1 Replicating Channel Selector

| Property Name | Default | Description |
| --- | --- | --- |
| selector.type | replicating | Defaults to replicating; may be `replicating` or `multiplexing` |
| selector.optional | – | Set of channels to mark as optional |

Example:

a1.sources = r1
a1.channels = c1 c2 c3
a1.sources.r1.channels = c1 c2 c3
a1.sources.r1.selector.type = replicating
a1.sources.r1.selector.optional = c3

c3 is an optional channel. A failure to write to c3 is simply ignored. Since c1 and c2 are not marked optional, a failure to write to either of those channels will cause the transaction to fail.

2.1.2 Multiplexing Channel Selector

| Property Name | Default | Description |
| --- | --- | --- |
| selector.type | replicating | Defaults to replicating; may be `replicating` or `multiplexing` |
| selector.header | flume.selector.header | Header key to inspect |
| selector.default | – | Default set of channels |
| selector.mapping.* | – | Mapping from header values to sets of channels |

The multiplexing selector has a further set of properties for bifurcating the flow. This requires specifying a mapping from an Event attribute to a set of channels. The selector checks the configured attribute in each Event's headers. If it matches the specified value, the Event is sent to all channels mapped to that value. If there is no match, the Event is sent to the set of channels configured as the default.

Example:

agent_foo.sources = avro-AppSrv-source1
agent_foo.sources.avro-AppSrv-source1.selector.type = multiplexing
agent_foo.sources.avro-AppSrv-source1.selector.header = State
agent_foo.sources.avro-AppSrv-source1.selector.mapping.CA = mem-channel-1
agent_foo.sources.avro-AppSrv-source1.selector.mapping.AZ = file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.mapping.NY = mem-channel-1 file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.default = mem-channel-1

The selector inspects the State attribute of each Event's headers: if the value is CA, the Event is sent to mem-channel-1; if AZ, to file-channel-2; if NY, to both. If the State header is unset, or matches none of the three values, the Event goes to the channel designated as the default, mem-channel-1.

The selector also supports optional channels. To specify optional channels for a header, use the `optional` config parameter in the following way:

# channel selector configuration
agent_foo.sources.avro-AppSrv-source1.selector.type = multiplexing
agent_foo.sources.avro-AppSrv-source1.selector.header = State
agent_foo.sources.avro-AppSrv-source1.selector.mapping.CA = mem-channel-1
agent_foo.sources.avro-AppSrv-source1.selector.mapping.AZ = file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.mapping.NY = mem-channel-1 file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.optional.CA = mem-channel-1 file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.default = mem-channel-1
  • The selector first attempts to write to the required channels. If any of those channels fails to consume the Event, the transaction fails and is retried on all required channels. Once every required channel has consumed the Event, the selector attempts to write to the optional channels; a failure by any optional channel to consume the Event is simply ignored and not retried.
  • If, for a given header, an optional channel overlaps with a required channel, the channel is treated as required, and a failure in that channel causes the entire set of required channels to be retried.
  • If a header has no required channels mapped, the Event is written to the default channels, and a best-effort attempt is made to write it to that header's optional channels.

2.2 Sink Processor

There are three kinds of SinkProcessor: DefaultSinkProcessor, LoadBalancingSinkProcessor and FailoverSinkProcessor. DefaultSinkProcessor drives a single Sink, while LoadBalancingSinkProcessor and FailoverSinkProcessor operate on a Sink Group: LoadBalancingSinkProcessor provides load balancing across the group's sinks, and FailoverSinkProcessor provides failover.

2.2.1 Default Sink Processor

The Default Sink Processor accepts only a single sink. The user is not required to create a processor (sink group) for a single sink; instead, the plain source-channel-sink pattern shown earlier can be used.

2.2.2 Failover Sink Processor

The Failover Sink Processor maintains a prioritized list of sinks, guaranteeing that as long as one sink is available, events will be processed (delivered).
The failover mechanism works by relegating failed sinks to a cool-down pool; once a sink successfully sends an event, it is restored to the live pool. Priorities are set via processor.priority.*: the larger the value, the higher the priority. If a sink fails while sending an event, the sink with the next-highest priority tries next. If no priorities are specified, they are determined by the order in which the sinks are configured.

| Property Name | Default | Description |
| --- | --- | --- |
| sinks | – | Space-separated list of sinks participating in the group |
| processor.type | default | One of `default`, `load_balance`, `failover` |
| processor.priority.&lt;sinkName&gt; | – | Priority; larger values take precedence; must be unique |
| processor.maxpenalty | 30000 | Maximum back-off (cool-down) period for a failed sink, in milliseconds |

Example:

#Name a sink group g1
a1.sinkgroups = g1
#List of sinks in group g1
a1.sinkgroups.g1.sinks = k1 k2
#Set the Sink Processor type to failover
a1.sinkgroups.g1.processor.type = failover
#Set priorities; values must be unique
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
#Maximum back-off period for a failed sink
a1.sinkgroups.g1.processor.maxpenalty = 10000

2.2.3 Load Balancing Sink Processor

| Property Name | Default | Description |
| --- | --- | --- |
| processor.sinks | – | Space-separated list of sinks participating in the group |
| processor.type | default | One of `default`, `load_balance`, `failover` |
| processor.backoff | false | Whether failed sinks should be backed off exponentially |
| processor.selector | round_robin | Selection mechanism: `round_robin`, `random`, or the FQCN of a custom class inheriting AbstractSinkSelector |
| processor.selector.maxTimeOut | 30000 | Maximum back-off period for a failed sink, in milliseconds; used together with backoff |

Example:

#Name a sink group g1
a1.sinkgroups = g1
#List of sinks in group g1
a1.sinkgroups.g1.sinks = k1 k2
#Set the Sink Processor type to load_balance
a1.sinkgroups.g1.processor.type = load_balance
#Back off failed sinks exponentially
a1.sinkgroups.g1.processor.backoff = true
#Selector type
a1.sinkgroups.g1.processor.selector = random

3 Flume Topologies

3.1 Single flow (one-agent flow)

(Figure: one-agent flow)

3.2 Multi-agent flow

(Figure: multi-agent flow)
For data to flow across multiple agents or hops, the previous agent's sink and the current hop's source must both be of type avro, with the sink pointing at the source's hostname (or IP address) and port. Chaining several Agents in sequence is the simplest arrangement, but the number of chained Agents should generally be kept under control: the data path grows longer with each hop, and without failover, a fault anywhere disrupts collection across the entire Flow.
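A minimal sketch of such a hop (hostnames and ports are placeholders; each fragment lives in its own agent's configuration file):

```properties
# agent1 (upstream): its avro sink points at the downstream host and port
agent1.sinks.k1.type = avro
agent1.sinks.k1.hostname = collector-host
agent1.sinks.k1.port = 4141

# agent2 (downstream): its avro source listens on that same port
agent2.sources.r1.type = avro
agent2.sources.r1.bind = 0.0.0.0
agent2.sources.r1.port = 4141
```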

3.3 Consolidation

(Figure: consolidation topology)
This topology has many applications; collecting user-behavior logs from a website is a typical one. For availability the site runs as a load-balanced cluster, and every node produces behavior logs, so each node gets its own Agent to collect its logs separately; the Agents then converge the data into a single storage system, such as HDFS.

3.4 Multiplexing the flow

(Figure: multiplexing-the-flow topology)
Multiple kinds of logs can be mixed together into one agent, which can then separate the mixed stream inside the agent and give each log type its own delivery channel. Alternatively, the flow can be configured as a replicating flow, where every channel receives all events.

3.5 Load balancing (Load Balance)

(Figure: load-balancing topology)
Flume supports grouping multiple sinks into one logical sink group; combined with the appropriate SinkProcessor, a sink group provides load balancing or failover.

4 Examples

4.1 Consolidation

  1. Create flume1-logger-flume.conf: configure a Source that monitors the hive.log file and a Sink that sends the data to the next Flume tier:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /tmp/root/hive.log
a1.sources.r1.shell = /bin/bash -c

# Describe the sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = 127.0.0.1
a1.sinks.k1.port = 4141

# Describe the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

  2. Create flume2-netcat-flume.conf: configure a Source that monitors the data stream on port 44444 and a Sink that forwards the data to the next Flume tier:
# Name the components on this agent
a2.sources = r1
a2.sinks = k1
a2.channels = c1

# Describe/configure the source
a2.sources.r1.type = netcat
a2.sources.r1.bind = 127.0.0.1
a2.sources.r1.port = 44444

# Describe the sink
a2.sinks.k1.type = avro
a2.sinks.k1.hostname = 127.0.0.1
a2.sinks.k1.port = 4141

# Use a channel which buffers events in memory
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a2.sources.r1.channels = c1
a2.sinks.k1.channel = c1

  3. Create flume3-flume-logger.conf: configure a source that receives the data streams sent by flume1 and flume2; the merged stream is sinked to the console:
# Name the components on this agent
a3.sources = r1
a3.sinks = k1
a3.channels = c1

# Describe/configure the source
a3.sources.r1.type = avro
a3.sources.r1.bind = 127.0.0.1
a3.sources.r1.port = 4141

# Describe the sink
a3.sinks.k1.type = logger

# Describe the channel
a3.channels.c1.type = memory
a3.channels.c1.capacity = 1000
a3.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a3.sources.r1.channels = c1
a3.sinks.k1.channel = c1

  4. Run the configurations, starting each one in turn: flume3-flume-logger.conf, flume2-netcat-flume.conf, flume1-logger-flume.conf
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent -n a3 -c conf/ -f job/flume3-flume-logger.conf -Dflume.root.logger=INFO,console
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent -n a2 -c conf/ -f job/flume2-netcat-flume.conf -Dflume.root.logger=INFO,console
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent -n a1 -c conf/ -f job/flume1-logger-flume.conf -Dflume.root.logger=INFO,console

4.2 Multiplexing the flow

  1. Configure the first agent; create the configuration file multi-one.conf
# Name the components on this agent
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2
# Replicate the data flow to all channels
a1.sources.r1.selector.type = replicating

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /tmp/root/hive.log
a1.sources.r1.shell = /bin/bash -c

# Describe the sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = 127.0.0.1
a1.sinks.k1.port = 4141

a1.sinks.k2.type = hdfs
a1.sinks.k2.hdfs.path = hdfs://127.0.0.1:9000/flume/multi/%Y%m%d/%H
#Prefix for uploaded files
a1.sinks.k2.hdfs.filePrefix = dfs-
#Whether to round down the timestamp (rolling directories by time)
a1.sinks.k2.hdfs.round = true
#Number of time units per new directory
a1.sinks.k2.hdfs.roundValue = 1
#Unit used for the rounding value
a1.sinks.k2.hdfs.roundUnit = hour
#Use the local timestamp instead of one from the event header
a1.sinks.k2.hdfs.useLocalTimeStamp = true
#Number of events to accumulate before flushing to HDFS
a1.sinks.k2.hdfs.batchSize = 100
#File type; compression is supported
a1.sinks.k2.hdfs.fileType = DataStream
#Seconds before rolling a new file
a1.sinks.k2.hdfs.rollInterval = 600
#File size (bytes) that triggers a roll
a1.sinks.k2.hdfs.rollSize = 134217700
#Number of events that triggers a roll; 0 disables event-count rolling
a1.sinks.k2.hdfs.rollCount = 0
#Minimum block replication
a1.sinks.k2.hdfs.minBlockReplicas = 1

# Describe the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1 c2
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2

Note: the local output directory must already exist; if it does not, Flume will not create it.

  2. Configure the second agent; create the configuration file multi-two.conf
# Name the components on this agent
a2.sources = r1
a2.sinks = k1
a2.channels = c1

# Describe/configure the source
a2.sources.r1.type = avro
a2.sources.r1.bind = 127.0.0.1
a2.sources.r1.port = 4141

# Describe the sink
a2.sinks.k1.type = file_roll
a2.sinks.k1.sink.directory = /opt/module/datas/flume3

# Describe the channel
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a2.sources.r1.channels = c1
a2.sinks.k1.channel = c1
  3. Start Flume
#Start the avro server side (a2) first, then the avro client side (a1)
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent -n a2 -c conf/ -f job/multi-two.conf -Dflume.root.logger=INFO,console
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent -n a1 -c conf/ -f job/multi-one.conf -Dflume.root.logger=INFO,console

4.3 Load Balancing and Failover
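This setup can be sketched by reusing the sink-group processors from section 2.2 (sink names k1/k2 and the penalty value here are illustrative; the sinks themselves must be defined elsewhere in the agent's configuration):

```properties
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
# failover: the highest-priority available sink receives all events
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
a1.sinkgroups.g1.processor.maxpenalty = 10000
# for load balancing instead, replace the processor lines with:
# a1.sinkgroups.g1.processor.type = load_balance
# a1.sinkgroups.g1.processor.backoff = true
# a1.sinkgroups.g1.processor.selector = round_robin
```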

5 Custom Components

Official documentation for custom components: http://flume.apache.org/releases/content/1.9.0/FlumeDeveloperGuide.html

5.1 Custom Interceptor with Multiplexing

In real deployments, one server may produce many types of logs, and different types may need to be sent to different analytics systems. This is where the Multiplexing topology comes in. Multiplexing works by routing events to different Channels based on the value of some key in each event's Header, so we implement a custom Interceptor that assigns different values to that Header key depending on the event type.

Example:

  1. Create a Maven project and add the following dependency.
    <dependency>
    	<groupId>org.apache.flume</groupId>
    	<artifactId>flume-ng-core</artifactId>
    	<version>1.9.0</version>
    </dependency>
    
  2. A custom Interceptor implements the org.apache.flume.interceptor.Interceptor interface.
    package com.yutao.flume.interceptor;
    
    import org.apache.flume.Context;
    import org.apache.flume.Event;
    import org.apache.flume.interceptor.Interceptor;
    
    import java.util.List;
    
    /**
     * @author yutyi
     * @date 2019/10/25
     */
    public class CustomInterceptor implements Interceptor {
        @Override
        public void initialize() {
    
        }
    
        @Override
        public Event intercept(Event event) {
            byte[] body = event.getBody();
            // Tag by the first byte: lowercase letters vs. digits (bounds inclusive)
            if (body.length > 0 && body[0] >= 'a' && body[0] <= 'z') {
                event.getHeaders().put("type", "letter");
            } else if (body.length > 0 && body[0] >= '0' && body[0] <= '9') {
                event.getHeaders().put("type", "number");
            }
            return event;
        }
    
        @Override
        public List<Event> intercept(List<Event> events) {
            for (Event event : events) {
                intercept(event);
            }
            return events;
        }
    
        @Override
        public void close() {
    
        }
        
        public static class Builder implements Interceptor.Builder {
            @Override
            public Interceptor build() {
                return new CustomInterceptor();
            }
            @Override
            public void configure(Context context) {
            }
        }
    }
    
  3. Package the code as a jar and place it in Flume's lib directory
  4. Edit the Flume configuration files
    Configure 1 netcat source and 1 sink group (2 avro sinks), together with the corresponding ChannelSelector and interceptor.
    Edit flume-multi-avro.conf:
    # Name the components on this agent
    a1.sources = r1
    a1.sinks = k1 k2
    a1.channels = c1 c2
    
    # Describe/configure the source
    a1.sources.r1.type = netcat
    a1.sources.r1.bind = 127.0.0.1
    a1.sources.r1.port = 44444
    a1.sources.r1.interceptors = i1
    a1.sources.r1.interceptors.i1.type = com.yutao.flume.interceptor.CustomInterceptor$Builder
    a1.sources.r1.selector.type = multiplexing
    a1.sources.r1.selector.header = type
    a1.sources.r1.selector.mapping.letter = c1
    a1.sources.r1.selector.mapping.number = c2
    
    # Describe the sink
    a1.sinks.k1.type = avro
    a1.sinks.k1.hostname = 127.0.0.1
    a1.sinks.k1.port = 4141
    a1.sinks.k2.type=avro
    a1.sinks.k2.hostname = 127.0.0.1
    a1.sinks.k2.port = 4242
    
    # Use a channel which buffers events in memory
    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 1000
    a1.channels.c1.transactionCapacity = 100
    # Use a channel which buffers events in memory
    a1.channels.c2.type = memory
    a1.channels.c2.capacity = 1000
    a1.channels.c2.transactionCapacity = 100
    
    # Bind the source and sink to the channel
    a1.sources.r1.channels = c1 c2
    a1.sinks.k1.channel = c1
    a1.sinks.k2.channel = c2
    
    Edit flume-avro-logger.conf:
    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1
    a1.sources.r1.type = avro
    a1.sources.r1.bind = 127.0.0.1
    a1.sources.r1.port = 4141
    a1.sinks.k1.type = logger
    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 1000
    a1.channels.c1.transactionCapacity = 100
    a1.sinks.k1.channel = c1
    a1.sources.r1.channels = c1
    
    Edit flume-avro-logger1.conf:
    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1
    a1.sources.r1.type = avro
    a1.sources.r1.bind = 127.0.0.1
    a1.sources.r1.port = 4242
    a1.sinks.k1.type = logger
    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 1000
    a1.channels.c1.transactionCapacity = 100
    a1.sinks.k1.channel = c1
    a1.sources.r1.channels = c1
    
  5. Start Flume
    #Start the following processes in order
    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent -n a1 -c conf/ -f job/flume-avro-logger.conf -Dflume.root.logger=INFO,console
    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent -n a1 -c conf/ -f job/flume-avro-logger1.conf -Dflume.root.logger=INFO,console
    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent -n a1 -c conf/ -f job/flume-multi-avro.conf -Dflume.root.logger=INFO,console
    
  6. Test
    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# nc -v 127.0.0.1 44444
    hello
    1111
    

5.2 Custom Source
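A minimal sketch of a custom pollable source, following the pattern in the Developer Guide linked above (assumes the flume-ng-core dependency from section 5.1; the package name, the `prefix` property and the generated event body are illustrative):

```java
package com.yutao.flume.source;

import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.PollableSource;
import org.apache.flume.conf.Configurable;
import org.apache.flume.event.EventBuilder;
import org.apache.flume.source.AbstractSource;

public class MySource extends AbstractSource implements Configurable, PollableSource {

    private String prefix;

    // Read properties set in the agent's .conf file, e.g. a1.sources.r1.prefix = log-
    @Override
    public void configure(Context context) {
        prefix = context.getString("prefix", "log-");
    }

    // Called repeatedly by the framework: build events and hand them to the channel processor
    @Override
    public Status process() throws EventDeliveryException {
        try {
            Event event = EventBuilder.withBody((prefix + System.currentTimeMillis()).getBytes());
            getChannelProcessor().processEvent(event);
            Thread.sleep(1000);
            return Status.READY;
        } catch (Exception e) {
            return Status.BACKOFF;  // tells the runner to back off before the next poll
        }
    }

    @Override
    public long getBackOffSleepIncrement() { return 1000; }

    @Override
    public long getMaxBackOffSleepInterval() { return 5000; }
}
```

As with the interceptor, the class is packaged into a jar under lib/ and referenced by its FQCN, e.g. a1.sources.r1.type = com.yutao.flume.source.MySource.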

5.3 Custom Sink
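A minimal sketch of a custom sink, again per the Developer Guide pattern (assumes the flume-ng-core dependency; the package name is illustrative). It also shows the take transaction from section 1 in action: take events inside a transaction, commit on success, roll back on failure.

```java
package com.yutao.flume.sink;

import org.apache.flume.Channel;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
import org.apache.flume.conf.Configurable;
import org.apache.flume.sink.AbstractSink;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class MySink extends AbstractSink implements Configurable {

    private static final Logger LOG = LoggerFactory.getLogger(MySink.class);

    @Override
    public void configure(Context context) {
        // read sink properties from the agent's .conf file here if needed
    }

    @Override
    public Status process() throws EventDeliveryException {
        Channel channel = getChannel();
        Transaction tx = channel.getTransaction();
        tx.begin();
        try {
            Event event = channel.take();               // may be null if the channel is empty
            if (event != null) {
                LOG.info(new String(event.getBody()));  // "deliver" by logging the body
            }
            tx.commit();
            return event == null ? Status.BACKOFF : Status.READY;
        } catch (Throwable t) {
            tx.rollback();                              // leave the event in the channel for retry
            throw new EventDeliveryException(t);
        } finally {
            tx.close();
        }
    }
}
```

Referenced in the agent configuration as a1.sinks.k1.type = com.yutao.flume.sink.MySink.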

6 Flume Data-Flow Monitoring

6.1 Installing and Deploying Ganglia

  1. Install the httpd service and php

    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo yum -y install httpd php
    
  2. Install other dependencies

    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo yum -y install rrdtool perl-rrdtool rrdtool-devel apr-devel
    
  3. Install ganglia

    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo yum -y install ganglia-gmetad
    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo yum -y install ganglia-web
    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo yum -y install ganglia-gmond
    

    Ganglia consists of three parts: gmond, gmetad and gweb.

    • gmond (Ganglia Monitoring Daemon) is a lightweight service installed on every node whose metrics are to be collected. With gmond you can easily gather many system metrics, such as CPU, memory, disk, network and active-process data.
    • gmetad (Ganglia Meta Daemon) is the service that aggregates all of this information and stores it on disk in RRD format.
    • gweb (Ganglia Web) is Ganglia's visualization tool: a PHP front end that displays the data stored by gmetad in a browser, charting the various metrics collected from the running cluster.
  4. Edit the configuration file /etc/httpd/conf.d/ganglia.conf

    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# vim /etc/httpd/conf.d/ganglia.conf
    # Ganglia monitoring system php web frontend
    Alias /ganglia /usr/share/ganglia
    <Location /ganglia>
    Order deny,allow
    #Deny from all
    Allow from all
    # Allow from 127.0.0.1
    # Allow from ::1
    # Allow from .example.com
    </Location>
    
  5. Edit the configuration file /etc/ganglia/gmetad.conf

    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# vim /etc/ganglia/gmetad.conf
    data_source "hadoop01" 192.168.88.130
    
  6. Edit the configuration file /etc/ganglia/gmond.conf

    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# vim /etc/ganglia/gmond.conf
    cluster {
      name = "hadoop01"
      owner = "unspecified"
      latlong = "unspecified"
      url = "unspecified"
    }
    udp_send_channel {
      #bind_hostname = yes # Highly recommended, soon to be default.
                           # This option tells gmond to use a source address
                           # that resolves to the machine's hostname. Without
                           # this, the metrics may appear to come from any
                           # interface and the DNS names associated with
                           # those IPs will be used to create the RRDs.
      # mcast_join = 239.2.11.71
      host = 192.168.88.130
      port = 8649
      ttl = 1
    }
    udp_recv_channel {
      # mcast_join = 239.2.11.71
      port = 8649
      bind = 192.168.88.130
      retry_bind = true
      # Size of the UDP buffer. If you are handling lots of metrics you really
      # should bump it up to e.g. 10MB or even higher.
      # buffer = 10485760
    }
    
  7. Edit the configuration file /etc/selinux/config

    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# vim /etc/selinux/config
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    # enforcing - SELinux security policy is enforced.
    # permissive - SELinux prints warnings instead of enforcing.
    # disabled - No SELinux policy is loaded.
    SELINUX=disabled
    # SELINUXTYPE= can take one of these two values:
    # targeted - Targeted processes are protected,
    # mls - Multi Level Security protection.
    SELINUXTYPE=targeted
    

    Note: disabling SELinux this way only takes effect after a reboot. To make it take effect immediately without rebooting:

    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo setenforce 0
    
  8. Start ganglia

    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo service httpd start
    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo service gmetad start
    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo service gmond start
    
  9. Browse the ganglia page
    http://192.168.9.102/ganglia
    Note: if you still get a permission error after the steps above, change the permissions on the /var/lib/ganglia directory:

    [root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo chmod -R 777 /var/lib/ganglia
    

6.2 Running Flume with Monitoring

  1. Edit the flume-env.sh configuration under /opt/module/flume/conf:
JAVA_OPTS="-Dflume.monitoring.type=ganglia
-Dflume.monitoring.hosts=192.168.9.102:8649
-Xms100m
-Xmx200m"
  2. Start a Flume job
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent --conf conf/ --name a1 --conf-file job/flume-netcat-logger.conf -Dflume.root.logger=INFO,console -Dflume.monitoring.type=ganglia -Dflume.monitoring.hosts=192.168.88.130:8649
  3. Send data and watch the ganglia monitoring charts
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# nc localhost 44444
| Field | Meaning |
| --- | --- |
| EventPutAttemptCount | Total number of events the source attempted to put into the channel |
| EventPutSuccessCount | Total number of events successfully written to the channel and committed |
| EventTakeAttemptCount | Total number of times the sink attempted to take events from the channel |
| EventTakeSuccessCount | Total number of events successfully taken by the sink |
| StartTime | Time the channel started (epoch milliseconds) |
| StopTime | Time the channel stopped (epoch milliseconds) |
| ChannelSize | Current number of events in the channel |
| ChannelFillPercentage | Percentage of the channel currently in use |
| ChannelCapacity | Capacity of the channel |