1 Flume Transactions
2 Internal Workings of the Flume Agent
2.1 ChannelSelector
The ChannelSelector decides which Channel(s) an Event will be sent to. There are two types: Replicating and Multiplexing. A ReplicatingSelector sends every Event to all configured Channels, while a Multiplexing selector routes different Events to different Channels according to configurable rules.
2.1.1 Replicating Channel Selector
Property | Default | Description |
---|---|---|
selector.type | replicating | Either replicating or multiplexing; defaults to replicating |
selector.optional | - | Space-separated list of channels to be treated as optional |
Example:
a1.sources = r1
a1.channels = c1 c2 c3
a1.sources.r1.channels = c1 c2 c3
a1.sources.r1.selector.type = replicating
a1.sources.r1.selector.optional = c3
c3 is an optional channel: a failure to write to c3 is simply ignored. Because c1 and c2 are not marked optional, a failure to write to either of them causes the transaction to fail.
2.1.2 Multiplexing Channel Selector
Property | Default | Description |
---|---|---|
selector.type | replicating | Either replicating or multiplexing; defaults to replicating |
selector.header | flume.selector.header | The event header key whose value is matched against the mappings |
selector.default | - | Channels to use when no mapping matches |
selector.mapping.* | - | Channels for events whose header value equals * |
A multiplexing selector has an additional set of properties that fork the flow. It requires a mapping from values of an event header to sets of channels. The selector inspects the configured header on each Event; if its value matches one of the mappings, the event is sent to all channels mapped to that value. If there is no match, the event is sent to the set of channels configured as the default.
Example:
agent_foo.sources = avro-AppSrv-source1
agent_foo.sources.avro-AppSrv-source1.selector.type = multiplexing
agent_foo.sources.avro-AppSrv-source1.selector.header = State
agent_foo.sources.avro-AppSrv-source1.selector.mapping.CA = mem-channel-1
agent_foo.sources.avro-AppSrv-source1.selector.mapping.AZ = file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.mapping.NY = mem-channel-1 file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.default = mem-channel-1
The selector inspects the State property of the Event Header: if the value is CA, the event is sent to mem-channel-1; if AZ, to file-channel-2; if NY, to both. If the State header is unset or matches none of these three values, the event goes to mem-channel-1, which is designated as the default.
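This routing rule can be sketched in a few lines of Python. This is an illustrative simulation only; `select_channels`, `MAPPING`, and `DEFAULT` are hypothetical names, not Flume APIs:

```python
# Hypothetical simulation of multiplexing channel selection (not Flume code).
MAPPING = {
    "CA": ["mem-channel-1"],
    "AZ": ["file-channel-2"],
    "NY": ["mem-channel-1", "file-channel-2"],
}
DEFAULT = ["mem-channel-1"]

def select_channels(headers):
    """Return the channels an event goes to, keyed on its State header."""
    return MAPPING.get(headers.get("State"), DEFAULT)

print(select_channels({"State": "NY"}))   # ['mem-channel-1', 'file-channel-2']
print(select_channels({"State": "TX"}))   # no match -> ['mem-channel-1']
```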
The selector also supports optional channels. To mark channels as optional for a given header value, use the "optional" config parameter as follows:
# channel selector configuration
agent_foo.sources.avro-AppSrv-source1.selector.type = multiplexing
agent_foo.sources.avro-AppSrv-source1.selector.header = State
agent_foo.sources.avro-AppSrv-source1.selector.mapping.CA = mem-channel-1
agent_foo.sources.avro-AppSrv-source1.selector.mapping.AZ = file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.mapping.NY = mem-channel-1 file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.optional.CA = mem-channel-1 file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.default = mem-channel-1
- The selector first attempts to write to the required channels. If any one of them fails to consume the event, the whole transaction fails and is retried on all required channels. Once every required channel has consumed the event, the selector attempts to write to the optional channels; a failure on any optional channel is simply ignored and never retried.
- If a channel appears both as an optional channel and as a required channel for a given header value, it is treated as required, and a failure on it causes the entire set of required channels to be retried.
- If no required channels are specified for a header value, the event is written to the default channels, and writes to the optional channels for that value are attempted on a best-effort basis.
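The required/optional semantics above can be sketched as follows. This is a minimal illustration: `deliver` and the toy `Channel` class are hypothetical, and the real selector retries the whole required set inside a channel transaction rather than raising directly.

```python
# Hypothetical sketch of required vs. optional channel delivery (not Flume code).
class ChannelFull(Exception):
    pass

class Channel:
    def __init__(self, name, fail=False):
        self.name, self.fail, self.events = name, fail, []

    def put(self, event):
        if self.fail:
            raise ChannelFull(self.name)
        self.events.append(event)

def deliver(event, required, optional):
    # A failure on any required channel fails the whole transaction;
    # the caller would roll back and retry ALL required channels together.
    for ch in required:
        ch.put(event)
    # Optional channels are best-effort: failures are ignored, never retried.
    for ch in optional:
        try:
            ch.put(event)
        except ChannelFull:
            pass

c1, c2, c3 = Channel("c1"), Channel("c2"), Channel("c3", fail=True)
deliver("e1", required=[c1, c2], optional=[c3])  # c3's failure is swallowed
```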
2.2 Sink Processor
There are three types of SinkProcessor: DefaultSinkProcessor, LoadBalancingSinkProcessor, and FailoverSinkProcessor. DefaultSinkProcessor works with a single Sink, while LoadBalancingSinkProcessor and FailoverSinkProcessor work with a Sink Group: LoadBalancingSinkProcessor provides load balancing across the sinks in the group, and FailoverSinkProcessor provides failover.
2.2.1 Default Sink Processor
The Default Sink Processor accepts only a single sink. You do not need to create a processor (sink group) for a single sink; just follow the usual source-channel-sink pattern.
2.2.2 Failover Sink Processor
The Failover Sink Processor maintains a prioritized list of sinks and guarantees that as long as one sink is available, events will be processed (delivered).
The failover mechanism works by relegating failed sinks to a cool-down pool; once a sink successfully sends an event again, it is restored to the live pool. Priorities are set via processor.priority.<sinkName>, where a larger value means higher priority. If a sink fails to send an event, the sink with the next-highest priority retries. If no priorities are specified, they are determined by the order in which the sinks appear in the configuration.
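The selection logic can be sketched like this. It is a simplified illustration: `pick_sink` and `backoff_until` are hypothetical names, and the real processor also grows each failed sink's back-off window up to processor.maxpenalty.

```python
# Hypothetical sketch of failover sink selection (not Flume code).
def pick_sink(priorities, backoff_until, now):
    """Pick the live (non-cooling) sink with the highest priority.

    priorities: {sink_name: priority}; backoff_until: {sink_name: time}.
    """
    live = [s for s in priorities if backoff_until.get(s, 0) <= now]
    if not live:
        raise RuntimeError("no sink available")
    return max(live, key=lambda s: priorities[s])

priorities = {"k1": 5, "k2": 10}
print(pick_sink(priorities, {}, now=0))            # k2 (higher priority)
print(pick_sink(priorities, {"k2": 100}, now=0))   # k1 (k2 is cooling down)
```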
Property | Default | Description |
---|---|---|
sinks | - | Space-separated list of sinks participating in the group |
processor.type | default | One of default, load_balance, or failover |
processor.priority.<sinkName> | - | Priority; larger values take precedence; each value must be unique |
processor.maxpenalty | 30000 | Maximum back-off (cool-down) period for a failed sink, in milliseconds |
Example:
# Define a sink group named g1
a1.sinkgroups = g1
# Sinks in group g1
a1.sinkgroups.g1.sinks = k1 k2
# Set the Sink Processor type to failover
a1.sinkgroups.g1.processor.type = failover
# Set priorities; each value must be unique
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
# Maximum back-off period (ms) for a failed sink
a1.sinkgroups.g1.processor.maxpenalty = 10000
2.2.3 Load Balancing Sink Processor
Property | Default | Description |
---|---|---|
processor.sinks | - | Space-separated list of sinks participating in the group |
processor.type | default | One of default, load_balance, or failover |
processor.backoff | false | Whether failed sinks should back off exponentially |
processor.selector | round_robin | Selection mechanism: round_robin, random, or the FQCN of a custom class inheriting from AbstractSinkSelector |
processor.selector.maxTimeOut | 30000 | Maximum back-off period for a failed sink, in milliseconds; used together with backoff |
Example:
# Define a sink group named g1
a1.sinkgroups = g1
# Sinks in group g1
a1.sinkgroups.g1.sinks = k1 k2
# Set the Sink Processor type to load_balance
a1.sinkgroups.g1.processor.type = load_balance
# Whether failed sinks back off exponentially
a1.sinkgroups.g1.processor.backoff = true
# Selector type
a1.sinkgroups.g1.processor.selector = random
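A round_robin selector simply cycles through the sinks in the group. In Python terms (an illustration only, not Flume's implementation):

```python
import itertools

# Hypothetical sketch of round_robin sink selection (not Flume code).
def round_robin(sinks):
    """Yield sinks in rotation, like processor.selector = round_robin."""
    return itertools.cycle(sinks)

picker = round_robin(["k1", "k2"])
print([next(picker) for _ in range(4)])  # ['k1', 'k2', 'k1', 'k2']
```

With processor.selector = random, the next sink is instead drawn at random from the group.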
3 Flume Topologies
3.1 Single flow (one-agent flow)
3.2 Multi-agent flow
For data to flow across multiple agents (hops), the sink of the previous agent and the source of the current hop must both be of type avro, with the sink pointing at the source's hostname (or IP address) and port. Multiple Agents can be chained in sequence; this is the simplest setup. In general, the number of Agents chained this way should be kept small: the longer the path the data travels, the more a single failure (absent failover) disrupts collection across the entire Flow.
3.3 Consolidation
This pattern is very common. For example, to collect user-behavior logs from a website running as a load-balanced cluster for availability, where every node produces logs, you can configure one Agent per node to collect that node's logs and have all the Agents funnel the data into a single storage system such as HDFS.
3.4 Multiplexing the flow
Multiple kinds of logs can flow mixed together into one agent, which then splits the mixed stream inside the agent and gives each kind of log its own delivery channel. Alternatively, the flow can be configured as replicating, so that every channel receives the full stream.
3.5 Load balancing (Load Balance)
Flume supports grouping multiple sinks into one logical sink group; combined with the appropriate SinkProcessor, a sink group provides load balancing or failover.
4 Examples
4.1 Consolidation
- Create flume1-logger-flume.conf, with a source that monitors the hive.log file and a sink that forwards the data to the next-tier Flume:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /tmp/root/hive.log
a1.sources.r1.shell = /bin/bash -c
# Describe the sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = 127.0.0.1
a1.sinks.k1.port = 4141
# Describe the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
- Create flume2-netcat-flume.conf, with a source that listens on port 44444 and a sink that forwards the data to the next-tier Flume:
# Name the components on this agent
a2.sources = r1
a2.sinks = k1
a2.channels = c1
# Describe/configure the source
a2.sources.r1.type = netcat
a2.sources.r1.bind = 127.0.0.1
a2.sources.r1.port = 44444
# Describe the sink
a2.sinks.k1.type = avro
a2.sinks.k1.hostname = 127.0.0.1
a2.sinks.k1.port = 4141
# Use a channel which buffers events in memory
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a2.sources.r1.channels = c1
a2.sinks.k1.channel = c1
- Create flume3-flume-logger.conf, with a source that receives the streams sent by flume1 and flume2 and a sink that logs the consolidated stream to the console:
# Name the components on this agent
a3.sources = r1
a3.sinks = k1
a3.channels = c1
# Describe/configure the source
a3.sources.r1.type = avro
a3.sources.r1.bind = 127.0.0.1
a3.sources.r1.port = 4141
# Describe the sink
a3.sinks.k1.type = logger
# Describe the channel
a3.channels.c1.type = memory
a3.channels.c1.capacity = 1000
a3.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a3.sources.r1.channels = c1
a3.sinks.k1.channel = c1
- Start the agents with their respective configuration files, in this order: flume3-flume-logger.conf, then flume2-netcat-flume.conf, then flume1-logger-flume.conf.
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent -n a3 -c conf/ -f job/flume3-flume-logger.conf -Dflume.root.logger=INFO,console
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent -n a2 -c conf/ -f job/flume2-netcat-flume.conf -Dflume.root.logger=INFO,console
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent -n a1 -c conf/ -f job/flume1-logger-flume.conf -Dflume.root.logger=INFO,console
4.2 Multiplexing the flow
- Configure the first agent in multi-one.conf:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2
# Replicate the data flow to all channels
a1.sources.r1.selector.type = replicating
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /tmp/root/hive.log
a1.sources.r1.shell = /bin/bash -c
# Describe the sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = 127.0.0.1
a1.sinks.k1.port = 4141
a1.sinks.k2.type = hdfs
a1.sinks.k2.hdfs.path = hdfs://127.0.0.1:9000/flume/multi/%Y%m%d/%H
# Prefix for uploaded files
a1.sinks.k2.hdfs.filePrefix = dfs-
# Whether to round down the timestamp used in the directory path
a1.sinks.k2.hdfs.round = true
# Number of time units per new directory
a1.sinks.k2.hdfs.roundValue = 1
# The time unit to round down to
a1.sinks.k2.hdfs.roundUnit = hour
# Whether to use the local timestamp
a1.sinks.k2.hdfs.useLocalTimeStamp = true
# Number of events to accumulate before one flush to HDFS
a1.sinks.k2.hdfs.batchSize = 100
# File type; compression is supported
a1.sinks.k2.hdfs.fileType = DataStream
# Seconds before rolling a new file
a1.sinks.k2.hdfs.rollInterval = 600
# File size (bytes) that triggers a roll
a1.sinks.k2.hdfs.rollSize = 134217700
# Number of events that triggers a roll; 0 means never roll based on event count
a1.sinks.k2.hdfs.rollCount = 0
# Minimum number of block replicas
a1.sinks.k2.hdfs.minBlockReplicas = 1
# Describe the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1 c2
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2
The local output directory must already exist; if it does not, Flume will not create it.
- Configure the second agent in multi-two.conf:
# Name the components on this agent
a2.sources = r1
a2.sinks = k1
a2.channels = c1
# Describe/configure the source
a2.sources.r1.type = avro
a2.sources.r1.bind = 127.0.0.1
a2.sources.r1.port = 4141
# Describe the sink
a2.sinks.k1.type = file_roll
a2.sinks.k1.sink.directory = /opt/module/datas/flume3
# Describe the channel
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a2.sources.r1.channels = c1
a2.sinks.k1.channel = c1
- Start Flume
# Start the avro server side (a2) first, then the avro client side (a1)
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent -n a2 -c conf/ -f job/multi-two.conf -Dflume.root.logger=INFO,console
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent -n a1 -c conf/ -f job/multi-one.conf -Dflume.root.logger=INFO,console
4.3 Load Balancing and Failover
5 Custom Components
Official documentation for custom components: http://flume.apache.org/releases/content/1.9.0/FlumeDeveloperGuide.html
5.1 Custom Interceptor with Multiplexing
In real deployments, one server may produce many kinds of logs, and different kinds may need to be sent to different analysis systems. This is where the Multiplexing topology comes in: it routes events to different Channels based on the value of a key in each event's Header. To make that work, we define a custom Interceptor that assigns a different value to that Header key depending on the type of each event.
Example:
- Create a Maven project and add the following dependency.
<dependency>
    <groupId>org.apache.flume</groupId>
    <artifactId>flume-ng-core</artifactId>
    <version>1.9.0</version>
</dependency>
- A custom Interceptor must implement the org.apache.flume.interceptor.Interceptor interface (note the inclusive range checks; the original code excluded 'a', 'z', '0', and '9'):
package com.yutao.flume.interceptor;

import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

import java.util.List;

/**
 * @author yutyi
 * @date 2019/10/25
 */
public class CustomInterceptor implements Interceptor {

    @Override
    public void initialize() {
    }

    @Override
    public Event intercept(Event event) {
        byte[] body = event.getBody();
        // Tag events by the first byte of the body: letters vs. digits.
        if (body[0] >= 'a' && body[0] <= 'z') {
            event.getHeaders().put("type", "letter");
        } else if (body[0] >= '0' && body[0] <= '9') {
            event.getHeaders().put("type", "number");
        }
        return event;
    }

    @Override
    public List<Event> intercept(List<Event> events) {
        for (Event event : events) {
            intercept(event);
        }
        return events;
    }

    @Override
    public void close() {
    }

    public static class Builder implements Interceptor.Builder {
        @Override
        public Interceptor build() {
            return new CustomInterceptor();
        }

        @Override
        public void configure(Context context) {
        }
    }
}
- Package the Java code into a jar and place it in Flume's lib directory.
- Edit the Flume configuration files.
Configure 1 netcat source feeding 2 channels and 2 avro sinks, with the corresponding ChannelSelector (multiplexing) and interceptor.
Edit flume-multi-avro.conf:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = 127.0.0.1
a1.sources.r1.port = 44444
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = com.yutao.flume.interceptor.CustomInterceptor$Builder
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = type
a1.sources.r1.selector.mapping.letter = c1
a1.sources.r1.selector.mapping.number = c2
# Describe the sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = 127.0.0.1
a1.sinks.k1.port = 4141
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = 127.0.0.1
a1.sinks.k2.port = 4242
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Use a channel which buffers events in memory
a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1 c2
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2
Edit flume-avro-logger.conf (receives on port 4141):
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = avro
a1.sources.r1.bind = 127.0.0.1
a1.sources.r1.port = 4141
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.sinks.k1.channel = c1
a1.sources.r1.channels = c1
Edit flume-avro-logger1.conf (receives on port 4242):
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = avro
a1.sources.r1.bind = 127.0.0.1
a1.sources.r1.port = 4242
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.sinks.k1.channel = c1
a1.sources.r1.channels = c1
- Start Flume
# Start the processes in this order
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent -n a1 -c conf/ -f job/flume-avro-logger.conf -Dflume.root.logger=INFO,console
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent -n a1 -c conf/ -f job/flume-avro-logger1.conf -Dflume.root.logger=INFO,console
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent -n a1 -c conf/ -f job/flume-multi-avro.conf -Dflume.root.logger=INFO,console
- Test
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# nc -v 127.0.0.1 44444
hello
1111
5.2 Custom Source
5.3 Custom Sink
6 Monitoring Flume Data Flows
6.1 Installing and Deploying Ganglia
- Install the httpd service and php
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo yum -y install httpd php
- Install the other dependencies
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo yum -y install rrdtool perl-rrdtool rrdtool-devel apr-devel
- Install ganglia
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo yum -y install ganglia-gmetad
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo yum -y install ganglia-web
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo yum -y install ganglia-gmond
Ganglia consists of three parts: gmond, gmetad, and gweb.
- gmond (Ganglia Monitoring Daemon) is a lightweight service installed on every host whose metrics you want to collect. With gmond you can easily gather many system metrics: CPU, memory, disk, network, active processes, and so on.
- gmetad (Ganglia Meta Daemon) is the service that aggregates all the metrics and stores them on disk in RRD format.
- gweb (Ganglia Web) is Ganglia's visualization tool: a PHP front end that renders the data stored by gmetad in a browser, charting the many metrics collected across the running cluster.
- Edit the configuration file /etc/httpd/conf.d/ganglia.conf
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# vim /etc/httpd/conf.d/ganglia.conf
# Ganglia monitoring system php web frontend
Alias /ganglia /usr/share/ganglia
<Location /ganglia>
    Order deny,allow
    #Deny from all
    Allow from all
    # Allow from 127.0.0.1
    # Allow from ::1
    # Allow from .example.com
</Location>
- Edit the configuration file /etc/ganglia/gmetad.conf
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# vim /etc/ganglia/gmetad.conf
data_source "hadoop01" 192.168.88.130
- Edit the configuration file /etc/ganglia/gmond.conf
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# vim /etc/ganglia/gmond.conf
cluster {
  name = "hadoop01"
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}
udp_send_channel {
  #bind_hostname = yes # Highly recommended, soon to be default.
  # This option tells gmond to use a source address
  # that resolves to the machine's hostname. Without
  # this, the metrics may appear to come from any
  # interface and the DNS names associated with
  # those IPs will be used to create the RRDs.
  # mcast_join = 239.2.11.71
  host = 192.168.88.130
  port = 8649
  ttl = 1
}
udp_recv_channel {
  # mcast_join = 239.2.11.71
  port = 8649
  bind = 192.168.88.130
  retry_bind = true
  # Size of the UDP buffer. If you are handling lots of metrics you really
  # should bump it up to e.g. 10MB or even higher.
  # buffer = 10485760
}
- Edit the configuration file /etc/selinux/config
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# vim /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
Note: disabling SELinux only takes full effect after a reboot. To apply it temporarily without rebooting:
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo setenforce 0
- Start ganglia
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo service httpd start
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo service gmetad start
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo service gmond start
- Open the ganglia page in a browser
http://192.168.9.102/ganglia
Note: if you still get a permission-denied error after completing the steps above, change the permissions on the /var/lib/ganglia directory:
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# sudo chmod -R 777 /var/lib/ganglia
6.2 Testing Flume Monitoring
- Edit the flume-env.sh configuration under /opt/module/flume/conf:
JAVA_OPTS="-Dflume.monitoring.type=ganglia
-Dflume.monitoring.hosts=192.168.9.102:8649
-Xms100m
-Xmx200m"
- Start a Flume job
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# bin/flume-ng agent --conf conf/ --name a1 --conf-file job/flume-netcat-logger.conf -Dflume.root.logger=INFO,console -Dflume.monitoring.type=ganglia -Dflume.monitoring.hosts=192.168.88.130:8649
- Send data and watch the ganglia monitoring charts
[root@iZnq8v4wpstsagZ apache-flume-1.9.0-bin]# nc localhost 44444
Field | Meaning |
---|---|
EventPutAttemptCount | Total number of events the source attempted to put into the channel |
EventPutSuccessCount | Total number of events successfully written to the channel and committed |
EventTakeAttemptCount | Total number of times the sink attempted to take events from the channel |
EventTakeSuccessCount | Total number of events the sink successfully took |
StartTime | Time the channel started, in milliseconds |
StopTime | Time the channel stopped, in milliseconds |
ChannelSize | Current number of events in the channel |
ChannelFillPercentage | Percentage of the channel's capacity in use |
ChannelCapacity | Capacity of the channel |