Preface
- OS: CentOS 7
- Java version: 1.8.0_221
- Flume version: 1.8.0
- HDFS version: 2.7.7
- Flume agent configuration: Netcat TCP Source, Memory Channel, HDFS Sink
Steps
a) Copy the Hadoop-related jar packages into flume/lib/
Locate the following jars under the hadoop-2.7.7/share/ directory and copy them into flume/lib/. Flume adds this directory to the classpath at startup.
commons-configuration-1.6.jar
commons-io-2.4.jar
hadoop-auth-2.7.7.jar
hadoop-common-2.7.7.jar
hadoop-hdfs-2.7.7.jar
htrace-core-3.1.0-incubating.jar
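The copy step above can be sketched as a small helper. This is an illustrative script, not part of Flume; `copy_hadoop_jars` and its path arguments are hypothetical names, and the jars are searched recursively because they sit in different subdirectories of share/.

```python
import shutil
from pathlib import Path

# Jars that Flume's HDFS sink needs on the classpath (from the list above).
REQUIRED_JARS = [
    "commons-configuration-1.6.jar",
    "commons-io-2.4.jar",
    "hadoop-auth-2.7.7.jar",
    "hadoop-common-2.7.7.jar",
    "hadoop-hdfs-2.7.7.jar",
    "htrace-core-3.1.0-incubating.jar",
]

def copy_hadoop_jars(hadoop_home: str, flume_home: str) -> list:
    """Find each required jar under <hadoop_home>/share and copy it into
    <flume_home>/lib; return the names that were actually copied."""
    share = Path(hadoop_home) / "share"
    lib = Path(flume_home) / "lib"
    copied = []
    for name in REQUIRED_JARS:
        # Search recursively: the jars live in different subdirectories of share/.
        match = next(share.rglob(name), None)
        if match is not None:
            shutil.copy2(match, lib / name)
            copied.append(name)
    return copied
```

Usage would look like `copy_hadoop_jars("/opt/hadoop-2.7.7", "/opt/flume-1.8.0")` (install paths are assumptions; adjust to your machine).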
b) Write a properties file for your scenario (here job/netcat-memory-hdfs.properties)
# Name the agent's components
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Netcat TCP source
a1.sources.r1.type = netcat
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 44444

# HDFS sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/logs/%Y-%m-%d/%H-%M-%S
a1.sinks.k1.hdfs.filePrefix = logs_%Y-%m-%d
# Roll a new file every 10 s or ~128 MB, whichever comes first
# (rollCount = 0 disables count-based rolling)
a1.sinks.k1.hdfs.rollInterval = 10
a1.sinks.k1.hdfs.rollSize = 134217700
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.batchSize = 100
a1.sinks.k1.hdfs.fileType = DataStream
# Round the timestamp escapes in the path down to the nearest 30 s
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 30
a1.sinks.k1.hdfs.roundUnit = second
a1.sinks.k1.hdfs.useLocalTimeStamp = true

# Memory channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
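To see what the round/roundValue/roundUnit settings do to the %Y-%m-%d/%H-%M-%S escapes in hdfs.path, here is a small sketch (plain Python, not Flume code, assuming the rounding is a round-down to the nearest 30 seconds as configured above):

```python
from datetime import datetime

def hdfs_path(ts: datetime, round_value: int = 30) -> str:
    """Mimic the sink's path substitution: round the event timestamp DOWN to
    the nearest round_value seconds, then fill in the escape sequences."""
    rounded = ts.replace(second=ts.second - ts.second % round_value, microsecond=0)
    return rounded.strftime("/flume/logs/%Y-%m-%d/%H-%M-%S")

# Events at 10:15:07 and 10:15:29 land in the same directory:
print(hdfs_path(datetime(2020, 5, 1, 10, 15, 7)))   # /flume/logs/2020-05-01/10-15-00
print(hdfs_path(datetime(2020, 5, 1, 10, 15, 29)))  # /flume/logs/2020-05-01/10-15-00
```

This is why, with roundUnit = second and roundValue = 30, a new target directory appears at most every 30 seconds even though the path pattern has second-level resolution.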
c) Start the agent with this configuration file
Before starting, make sure the HDFS cluster is running:
[tomandersen@hadoop101 flume-1.8.0]$ call-cluster.sh jps
----------hadoop103----------
18272 Jps
17794 DataNode
17987 NodeManager
18105 JobHistoryServer
17868 SecondaryNameNode
----------hadoop102----------
17826 DataNode
18457 Jps
17950 ResourceManager
18079 NodeManager
----------hadoop101----------
10321 DataNode
10785 Jps
10619 NodeManager
10205 NameNode
----------execute "jps" in cluster takes 6 seconds----------
[tomandersen@hadoop101 flume-1.8.0]$
From the Flume installation directory, start the agent via the bin/flume-ng script (append -Dflume.root.logger=INFO,console if you want the agent log printed to the console while debugging):
./bin/flume-ng agent -n a1 -c conf/ -f job/netcat-memory-hdfs.properties
d) Send test data and check that it lands in HDFS
Send test data:
[tomandersen@hadoop101 ~]$ echo Hello World! | nc localhost 44444
OK
[tomandersen@hadoop101 ~]$
[tomandersen@hadoop101 ~]$
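The nc command above speaks a trivial line-based protocol. As an illustration only (not Flume code, and `send_event` is a hypothetical helper), this sketch shows what the netcat source expects: a newline-terminated event, answered with "OK" when event acknowledgement is on, as seen in the transcript above.

```python
import socket

def send_event(host: str, port: int, line: str) -> str:
    """Send one newline-terminated event to the netcat source and
    return the source's reply (expected to be "OK")."""
    with socket.create_connection((host, port)) as s:
        s.sendall((line + "\n").encode())
        return s.recv(16).decode().strip()

# e.g. send_event("localhost", 44444, "Hello World!") against the running agent
```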
Open the NameNode Web UI to browse the HDFS files.
Download the file and check its contents (or list and read it from the command line with `hdfs dfs -ls -R /flume/logs` and `hdfs dfs -cat <file>`).
End~