Flume Environment Deployment and Configuration Cases in Detail

Installation and Configuration

1. Extract the downloaded Flume package into /usr/local/flume
2. Edit the flume-env.sh configuration file; the main task is setting the JAVA_HOME variable
3. Verify the installation with flume-ng version (all three steps are sketched below)
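
A minimal shell sketch of these three steps (the archive name apache-flume-1.5.0-bin.tar.gz and the JDK path are assumptions; substitute your own):

# Extract the package and rename the directory to match the paths used below
mkdir -p /usr/local/flume
tar -zxvf apache-flume-1.5.0-bin.tar.gz -C /usr/local/flume
mv /usr/local/flume/apache-flume-1.5.0-bin /usr/local/flume/flume150

# Set JAVA_HOME in flume-env.sh (the JDK path is an assumption)
cd /usr/local/flume/flume150/conf
cp flume-env.sh.template flume-env.sh
echo "export JAVA_HOME=/usr/local/jdk1.7.0_79" >> flume-env.sh

# Verify the installation
/usr/local/flume/flume150/bin/flume-ng version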

 

Common Flume Log Collection Cases

Case 1: Avro
The avro-client tool can send a given file to Flume; the Avro source receives it over the Avro RPC mechanism.

a) Create the agent configuration file

a1.sources = r1
a1.sinks = k1
a1.channels = c1
  
# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.channels = c1
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141
  
# Describe the sink
a1.sinks.k1.type = logger
  
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
  
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

b) Start flume agent a1

/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/avro.conf -n a1 -Dflume.root.logger=INFO,console

c) Create the file to send

echo "hello world" > /usr/local/flume/flume150/logs/log.00

d) Send the file with avro-client

/usr/local/flume/flume150/bin/flume-ng avro-client -c .  -H itcast01 -p 4141 -F /usr/local/flume/flume150/logs/log.00

Flume console output:

 

Case 2: Spool
The spooling directory source watches the configured directory for new files and reads the data out of them. Two points to note:
    1) Files copied into the spool directory must not be opened and edited afterwards.
    2) The spool directory must not contain subdirectories.

a) Create the agent configuration file

# vi /usr/local/flume/flume150/conf/spool.conf
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.channels = c1
a1.sources.r1.spoolDir =/usr/local/flume/flume150/logs
a1.sources.r1.fileHeader = true

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1

b) Start flume agent a1

/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/spool.conf -n a1 -Dflume.root.logger=INFO,console

c) Drop a file into the /usr/local/flume/flume150/logs directory and watch the Flume console output
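
For example (the file name is arbitrary; writing the file elsewhere first and then moving it in avoids editing a file that is already inside the spool directory):

echo "spool test" > /tmp/spool_test.log
mv /tmp/spool_test.log /usr/local/flume/flume150/logs/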

Case 3: Exec
The exec source runs a given command and uses its output as the source of events. If you use the tail command, the file must receive enough content, otherwise there is no output to see.

a) Create the agent configuration file

a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -F /usr/local/flume/flume150/logs/log_exec_tail

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100


# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel =c1

b) Start flume agent a1

/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/exec_tail.conf -n a1 -Dflume.root.logger=INFO,console

c) Generate enough content in the file

for i in {1..100};do echo "exec tail$i" >> /usr/local/flume/flume150/logs/log_exec_tail;echo $i;sleep 0.1;done

Flume console output:

 

Case 4: JSONHandler
The HTTP source accepts Flume events over HTTP POST; its default handler, JSONHandler, expects a JSON array of events, each with a "headers" map and a "body" string.

a) Create the agent configuration file

a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Describe/configure the source
a1.sources.r1.type = org.apache.flume.source.http.HTTPSource
a1.sources.r1.port = 5142
a1.sources.r1.channels = c1

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1

b) Start flume agent a1

/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/post_json.conf -n a1 -Dflume.root.logger=INFO,console

c) Send a JSON-formatted POST request

curl -X POST -d '[{ "headers" :{"a" : "a1","b" : "b1"},"body" : "idoall.org_body"}]' http://localhost:5142
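
Since the handler takes a JSON array, several events can be posted in one request (a sketch; header names and body text are arbitrary):

curl -X POST -d '[{"headers":{"a":"a1"},"body":"event 1"},{"headers":{"b":"b1"},"body":"event 2"}]' http://localhost:5142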

Flume console output:

Case 5: HDFS Sink
The HDFS sink writes events into HDFS; here the same HTTP source as in case 4 is used to feed it.

a) Create the agent configuration file

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = org.apache.flume.source.http.HTTPSource
a1.sources.r1.port = 5142
a1.sources.r1.channels = c1

# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://itcast01:9000/user/root/user
a1.sinks.k1.hdfs.filePrefix = Syslog
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
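
Note that hdfs.round, hdfs.roundValue and hdfs.roundUnit only affect time escape sequences in hdfs.path; with the literal path above they have no visible effect. A variant with a timestamped directory layout (a sketch; the layout itself is an assumption):

a1.sinks.k1.hdfs.path = hdfs://itcast01:9000/user/root/user/%Y%m%d/%H%M
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.hdfs.fileType = DataStream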

b) Start flume agent a1

/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/hdfs_sink.conf -n a1 -Dflume.root.logger=INFO,console

c) Send a JSON-formatted POST request

curl -X POST -d '[{ "headers" :{"a" : "a1","b" : "b1"},"body" : "idoall.org_body"}]' http://localhost:5142

Flume console output:

Case 6: File Roll Sink
The file_roll sink writes events to files in a local directory, rolling over to a new file on a fixed interval.

a) Create the agent configuration file

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = org.apache.flume.source.http.HTTPSource
a1.sources.r1.port = 5142
a1.sources.r1.channels = c1

# Describe the sink
a1.sinks.k1.type = file_roll
a1.sinks.k1.sink.directory = /usr/local/flume/flume150/logs

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
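
b) Start flume agent a1 (the file name file_roll.conf is an assumption; use whatever name you saved the configuration above under):

/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/file_roll.conf -n a1 -Dflume.root.logger=INFO,console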

c) Send a JSON-formatted POST request

curl -X POST -d '[{ "headers" :{"a" : "a1","b" : "b1"},"body" : "idoall.org_body"}]' http://localhost:5142

Check the files generated under /usr/local/flume/flume150/logs; by default a new file is rolled every 30 seconds.
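
The roll interval is controlled by sink.rollInterval (in seconds; 0 disables time-based rolling and writes everything to one file), e.g.:

a1.sinks.k1.sink.rollInterval = 300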

 

Case 7: Kafka Sink

Note: the Kafka sink ships with Flume starting from version 1.6.

a) Create the agent configuration file

agent.sources = s1
agent.channels = c1
agent.sinks = k1

agent.sources.s1.type = exec
agent.sources.s1.command = tail -F /usr/local/flume/flume160/logs/kafka.log
agent.sources.s1.channels = c1

agent.channels.c1.type = memory
agent.channels.c1.capacity = 10000
agent.channels.c1.transactionCapacity = 100

# Kafka sink
agent.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
# Kafka broker address and port
agent.sinks.k1.brokerList = itcast01:9092
# Kafka topic
agent.sinks.k1.topic = slavetest
# Serializer
agent.sinks.k1.serializer.class = kafka.serializer.StringEncoder

agent.sinks.k1.channel = c1

b) Start the Kafka service, create the topic, and start a console consumer

To start Kafka, start ZooKeeper first:

bin/zookeeper-server-start.sh -daemon config/zookeeper.properties &

bin/kafka-server-start.sh -daemon config/server.properties &  

Create the topic
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic slavetest

Produce data
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic slavetest

Consume data
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic slavetest --from-beginning

c) Start the flume agent

/usr/local/flume/flume160/bin/flume-ng agent --conf conf -f /usr/local/flume/flume160/conf/kafka_sink.conf -n agent -Dflume.root.logger=DEBUG,console

d) Generate log data

for ((i=0; i<=1000; i++)); do
  echo "kafka_test-$i" >> /usr/local/flume/flume160/logs/kafka.log
done

e) Check the messages arriving in the Kafka consumer window
