Installation and Configuration
1. Extract the downloaded Flume package into the /usr/local/flume directory.
2. Edit the flume-env.sh configuration file; the main change is setting the JAVA_HOME variable.
3. Verify the installation: flume-ng version
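Step 2 usually amounts to a single line in conf/flume-env.sh (copied from flume-env.sh.template); the JDK path below is an example and should match your own installation:

```
export JAVA_HOME=/usr/local/jdk1.7.0_79
```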
Common Flume Log Collection Examples
Case 1: Avro
The avro-client tool can send a given file to Flume; the Avro source receives the events via the Avro RPC mechanism.
a) Create the agent configuration file
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.channels = c1
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
b) Start flume agent a1
/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/avro.conf -n a1 -Dflume.root.logger=INFO,console
c) Create the file to send
echo "hello world" > /usr/local/flume/flume150/logs/log.00
d) Send the file with avro-client
/usr/local/flume/flume150/bin/flume-ng avro-client -c . -H itcast01 -p 4141 -F /usr/local/flume/flume150/logs/log.00
Flume console output:
Case 2: Spool
The spooling directory source watches the configured directory for new files and reads the data out of each one. Two caveats:
1) Files must not be opened and edited after they have been copied into the spool directory.
2) The spool directory must not contain subdirectories.
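A common way to satisfy rule 1) is to write the file somewhere else first and then mv it into the spool directory: a rename within the same filesystem is atomic, so Flume never sees a half-written file. A minimal sketch, using temporary directories as stand-ins for the real paths:

```shell
# Stand-ins for the real spool dir (/usr/local/flume/flume150/logs)
# and a staging dir on the same filesystem.
SPOOL_DIR=$(mktemp -d)
STAGE_DIR=$(mktemp -d)

# Write the file completely in the staging dir first...
echo "spool test line" > "$STAGE_DIR/app.log"

# ...then move it into the spool dir in one atomic rename.
mv "$STAGE_DIR/app.log" "$SPOOL_DIR/app.log"
```

Once Flume has read a file it renames it with a .COMPLETED suffix by default, which is another reason not to touch files after they land in the directory.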
a) Create the agent configuration file
# vi /usr/local/flume/flume150/conf/spool.conf
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.channels = c1
a1.sources.r1.spoolDir = /usr/local/flume/flume150/logs
a1.sources.r1.fileHeader = true
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
b) Start flume agent a1
/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/spool.conf -n a1 -Dflume.root.logger=INFO,console
c) Add files to the /usr/local/flume/flume150/logs directory and watch the Flume console output
Case 3: Exec
The exec source runs a given command and uses its output as the event source. When using the tail command, make sure the file accumulates enough content, or nothing will show up on the console.
a) Create the agent configuration file
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -F /usr/local/flume/flume150/logs/log_exec_tail
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
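Note that the exec source offers no delivery guarantees: if the tail process dies, events are silently lost. The exec-source options documented in the Flume User Guide can at least restart the command automatically; the values below are illustrative:

```properties
# restart the command if it exits
a1.sources.r1.restart = true
# wait 10 s (in ms) before restarting it
a1.sources.r1.restartThrottle = 10000
# route the command's stderr through the agent's logger
a1.sources.r1.logStdErr = true
```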
b) Start flume agent a1
/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/exec_tail.conf -n a1 -Dflume.root.logger=INFO,console
c) Generate enough content in the file
for i in {1..100};do echo "exec tail$i" >> /usr/local/flume/flume150/logs/log_exec_tail;echo $i;sleep 0.1;done
Flume console output:
Case 4: JSONHandler
a) Create the agent configuration file
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# Describe/configure the source
a1.sources.r1.type = org.apache.flume.source.http.HTTPSource
a1.sources.r1.port = 5142
a1.sources.r1.channels = c1
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
b) Start flume agent a1
/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/post_json.conf -n a1 -Dflume.root.logger=INFO,console
c) Send a JSON-formatted POST request (the default JSONHandler expects a JSON array of events, each with optional headers and a body)
curl -X POST -d '[{ "headers" :{"a" : "a1","b" : "b1"},"body" : "idoall.org_body"}]' http://localhost:5142
Flume console output:
Case 5: Hadoop Sink
a) Create the agent configuration file
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = org.apache.flume.source.http.HTTPSource
a1.sources.r1.port = 5142
a1.sources.r1.channels = c1
# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://itcast01:9000/user/root/user
a1.sinks.k1.hdfs.filePrefix = Syslog
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
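The round/roundValue/roundUnit settings above only take effect when hdfs.path (or hdfs.filePrefix) contains time escape sequences; with the literal path used here they are inert. A hedged variant using the escape sequences from the Flume HDFS sink documentation (hdfs.useLocalTimeStamp lets the sink use the agent's clock instead of requiring a timestamp header on every event):

```properties
a1.sinks.k1.hdfs.path = hdfs://itcast01:9000/user/root/user/%Y-%m-%d/%H%M
a1.sinks.k1.hdfs.useLocalTimeStamp = true
```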
b) Start flume agent a1
/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/hdfs_sink.conf -n a1 -Dflume.root.logger=INFO,console
c) Send a JSON-formatted POST request
curl -X POST -d '[{ "headers" :{"a" : "a1","b" : "b1"},"body" : "idoall.org_body"}]' http://localhost:5142
Flume console output:
Case 6: File Roll Sink
a) Create the agent configuration file
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = org.apache.flume.source.http.HTTPSource
a1.sources.r1.port = 5142
a1.sources.r1.channels = c1
# Describe the sink
a1.sinks.k1.type = file_roll
a1.sinks.k1.sink.directory = /usr/local/flume/flume150/logs
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
b) Start flume agent a1 (assuming the configuration above was saved as file_roll.conf)
/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/file_roll.conf -n a1 -Dflume.root.logger=INFO,console
c) Send a JSON-formatted POST request
curl -X POST -d '[{ "headers" :{"a" : "a1","b" : "b1"},"body" : "idoall.org_body"}]' http://localhost:5142
Check the files generated under /usr/local/flume/flume150/logs; by default the sink rolls to a new file every 30 seconds.
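The 30-second default comes from the file_roll sink's rollInterval property; per the Flume documentation it can be changed, or set to 0 to disable time-based rolling entirely:

```properties
# roll to a new file every 5 minutes (0 disables rolling)
a1.sinks.k1.sink.rollInterval = 300
```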
Case 7: Kafka Sink
Note: the Kafka sink is supported starting with Flume 1.6.
a) Create the agent configuration file
agent.sources = s1
agent.channels = c1
agent.sinks = k1
agent.sources.s1.type=exec
agent.sources.s1.command=tail -F /usr/local/flume/flume160/logs/kafka.log
agent.sources.s1.channels=c1
agent.channels.c1.type=memory
agent.channels.c1.capacity=10000
agent.channels.c1.transactionCapacity=100
# Kafka sink
agent.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
# Kafka broker address and port
agent.sinks.k1.brokerList = itcast01:9092
# Kafka topic
agent.sinks.k1.topic = slavetest
# Serializer class
agent.sinks.k1.serializer.class = kafka.serializer.StringEncoder
agent.sinks.k1.channel=c1
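Note that brokerList and topic are the Flume 1.6 property names; from Flume 1.7 on, the Kafka sink documentation uses the kafka.* names below (the old names are kept for backward compatibility):

```properties
agent.sinks.k1.kafka.bootstrap.servers = itcast01:9092
agent.sinks.k1.kafka.topic = slavetest
```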
b) Start Kafka, create the topic, and start a console consumer
Start ZooKeeper first, then Kafka:
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
bin/kafka-server-start.sh -daemon config/server.properties
Create the topic:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic slavetest
Produce data (optional, to test the topic):
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic slavetest
Consume data:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic slavetest --from-beginning
c) Start the flume agent
/usr/local/flume/flume160/bin/flume-ng agent --conf conf -f /usr/local/flume/flume160/conf/kafka_sink.conf -n agent -Dflume.root.logger=DEBUG,console
d) Generate log entries
for ((i=0; i<=1000; i++)); do
  echo "kafka_test-$i" >> /usr/local/flume/flume160/logs/kafka.log
done
e) Watch the messages arrive in the Kafka console consumer window