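Purpose: log collection and analysis
Flow:
1. The application writes log files to local disk.
2. Flume monitors the files and ships the log entries to Kafka.
3. Spark Structured Streaming subscribes to the Kafka topic, analyzes the stream, and writes the results to a DB.
4. A web page queries the DB and displays the results.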
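Step 3 of the flow can be sketched in PySpark as follows. This is a minimal sketch, not the original listener program: the broker address and topic name are taken from the Flume Kafka sink settings in this document, while the record schema, the per-user count, the application name, and the console sink are assumptions made for illustration. The DB write is left out because the target database is not specified here.

```python
# Sketch only. Broker and topic match the Flume Kafka sink config in this
# document; everything else (schema, aggregation, console sink) is assumed.
KAFKA_SERVERS = "127.0.0.1:9092"
KAFKA_TOPIC = "test"


def start_query():
    """Start a streaming query that counts log records per userName."""
    # Imported here so this file can be loaded without a Spark installation.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import (IntegerType, StringType, StructField,
                                   StructType)

    # Assumed record shape: {"userId": "...", "userName": "...", "userAge": 43}
    schema = StructType([
        StructField("userId", StringType()),
        StructField("userName", StringType()),
        StructField("userAge", IntegerType()),
    ])

    spark = SparkSession.builder.appName("LogAnalysis").getOrCreate()

    # Each Kafka record carries one log line in its binary `value` column.
    raw = (spark.readStream.format("kafka")
           .option("kafka.bootstrap.servers", KAFKA_SERVERS)
           .option("subscribe", KAFKA_TOPIC)
           .load())

    # Decode the bytes as a string and parse the JSON into typed columns.
    users = (raw.select(from_json(col("value").cast("string"), schema)
                        .alias("user"))
             .select("user.*"))

    # Hypothetical analysis: running count of records per user name.
    counts = users.groupBy("userName").count()

    # Console sink for testing; a DB sink would replace this in the real flow.
    return counts.writeStream.outputMode("complete").format("console").start()
```

Calling `start_query().awaitTermination()` keeps the job running. Submitting it with `spark-submit` typically also needs the Kafka connector on the classpath, for example `--packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.3`, assuming the Scala 2.11 build of Spark 2.4.3.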
Environment setup:
1. Flume (apache-flume-1.9.0-bin)
(1) Download the archive and extract it.
(2) Edit the configuration file (this setup uses a spooldir source, a memory channel, and a Kafka sink).
# define agent
testAgent.sources = testSource
testAgent.channels = testChannel
testAgent.sinks = testSink
# define source
testAgent.sources.testSource.type = spooldir
testAgent.sources.testSource.spoolDir = /bigData/flumeTest
testAgent.sources.testSource.fileHeader = true
#testAgent.sources.testSource.type = TAILDIR
#testAgent.sources.testSource.positionFile = /bigData/flumeTest/taildir_position.json
#testAgent.sources.testSource.filegroups = f1
#testAgent.sources.testSource.filegroups.f1 = /bigData/flumeTest/hello.txt
#testAgent.sources.testSource.headers.f1.headerKey1 = value1
#testAgent.sources.testSource.fileHeader = true
#testAgent.sources.testSource.maxBatchCount = 1000
# define sink
#testAgent.sinks.testSink.type = logger
#testAgent.sinks.testSink.type = file_roll
#testAgent.sinks.testSink.sink.directory = /bigData/sinkTest
testAgent.sinks.testSink.type = org.apache.flume.sink.kafka.KafkaSink
testAgent.sinks.testSink.kafka.topic = test
testAgent.sinks.testSink.kafka.bootstrap.servers = 127.0.0.1:9092
testAgent.sinks.testSink.flumeBatchSize = 20
testAgent.sinks.testSink.kafka.producer.acks = 1
testAgent.sinks.testSink.kafka.producer.linger.ms = 1
testAgent.sinks.testSink.kafka.producer.compression.type = snappy
# define channel
testAgent.channels.testChannel.type = memory
testAgent.channels.testChannel.capacity = 1000
testAgent.channels.testChannel.transactionCapacity = 100
# bind source and sink to the channel
testAgent.sources.testSource.channels = testChannel
testAgent.sinks.testSink.channel = testChannel
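A frequent slip in the binding section is mixing up `channels` (plural, on a source) and `channel` (singular, on a sink). The helper below is a small sketch for checking that before starting the agent; `parse_properties` and `check_bindings` are hypothetical names, and the parser handles only the simple `key = value` lines used in this configuration.

```python
def parse_properties(text):
    """Parse `key = value` lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_bindings(props, agent):
    """Return a list of problems found in the source/sink channel bindings."""
    channels = set(props.get(f"{agent}.channels", "").split())
    problems = []
    # A source may feed several channels, so its key is the plural `channels`.
    for source in props.get(f"{agent}.sources", "").split():
        bound = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not bound:
            problems.append(f"source {source} has no channels")
        for ch in bound:
            if ch not in channels:
                problems.append(f"source {source} -> unknown channel {ch}")
    # A sink drains exactly one channel, so its key is the singular `channel`.
    for sink in props.get(f"{agent}.sinks", "").split():
        ch = props.get(f"{agent}.sinks.{sink}.channel")
        if ch is None:
            problems.append(f"sink {sink} has no channel")
        elif ch not in channels:
            problems.append(f"sink {sink} -> unknown channel {ch}")
    return problems
```

For the configuration above, `check_bindings(parse_properties(text), "testAgent")` should return an empty list when `text` holds the file contents.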
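2. ZooKeeper (zookeeper-3.4.5) and Kafka (kafka_2.12-2.2.0) installation
(1) Download the archives and configure the environment variables.
3. Hadoop (hadoop-2.7.7) and Spark (spark-2.4.3-bin-hadoop2.7) installation
(1) Download the archives and configure the environment variables.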
Test procedure:
1. Start ZooKeeper
zkserver
2. Start Kafka
.\bin\windows\kafka-server-start.bat .\config\server.properties
3. Start the Spark listener program
4. Start Flume
bin\flume-ng.cmd agent -n testAgent -c conf -f conf\flume-conf.properties.template -property "flume.root.logger=INFO,console"
5. Create a file in the directory Flume monitors
echo {"userId":"0003","userName":"testUser","userAge":43} >> test.json
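Flume's spooling directory source expects each file to be complete and unchanged once it appears in the monitored directory, so appending to a file in place can cause trouble on repeated runs. The sketch below shows one way to generate test files more safely: write the records to a staging file next to the spool directory, then move the finished file in with a single rename. `drop_log_file` is a hypothetical helper, and the record fields simply mirror the sample line above.

```python
import json
import os
import tempfile


def drop_log_file(records, spool_dir, name):
    """Write records as JSON lines, then move the finished file into spool_dir."""
    os.makedirs(spool_dir, exist_ok=True)
    target = os.path.join(spool_dir, name)
    # The source also expects unique file names, so refuse to overwrite.
    if os.path.exists(target):
        raise FileExistsError(target)

    # Stage in the parent directory (same volume) so the final move is a rename.
    parent = os.path.dirname(os.path.abspath(spool_dir))
    fd, tmp_path = tempfile.mkstemp(suffix=".tmp", dir=parent)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")

    # Flume only ever sees the completed file.
    os.replace(tmp_path, target)
    return target
```

For example, `drop_log_file([{"userId": "0003", "userName": "testUser", "userAge": 43}], "/bigData/flumeTest", "test.json")` produces the same record as the `echo` command, using the spool directory from the Flume configuration.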