Installation and Configuration
1. Extract the downloaded Flume package into the /usr/local/flume directory.
2. Edit the flume-env.sh configuration file; the main change is setting the JAVA_HOME variable.
3. Verify the installation: flume-ng version
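Step 2 usually amounts to a single line in conf/flume-env.sh (copied from flume-env.sh.template); the JDK path below is an example and should match your own installation:

```
export JAVA_HOME=/usr/local/jdk1.7.0_79
```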
Common Flume Log Collection Examples
Case 1: Avro
The avro-client tool can send a given file to Flume; the Avro source receives the events via the Avro RPC mechanism.
a) Create the agent configuration file
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.channels = c1
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
b) Start flume agent a1
/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/avro.conf -n a1 -Dflume.root.logger=INFO,console
c) Create the file to send
echo "hello world" > /usr/local/flume/flume150/logs/log.00
d) Send the file with avro-client
/usr/local/flume/flume150/bin/flume-ng avro-client -c . -H itcast01 -p 4141 -F /usr/local/flume/flume150/logs/log.00
Flume console output:
Case 2: Spool
The spooling directory source watches the configured directory for new files and reads the data out of each one. Two caveats:
1) Files must not be opened and edited after they have been copied into the spool directory.
2) The spool directory must not contain subdirectories.
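A common way to satisfy rule 1) is to write the file somewhere else first and then mv it into the spool directory: a rename within the same filesystem is atomic, so Flume never sees a half-written file. A minimal sketch, using temporary directories as stand-ins for the real paths:

```shell
# Stand-ins for the real spool dir (/usr/local/flume/flume150/logs)
# and a staging dir on the same filesystem.
SPOOL_DIR=$(mktemp -d)
STAGE_DIR=$(mktemp -d)

# Write the file completely in the staging dir first...
echo "spool test line" > "$STAGE_DIR/app.log"

# ...then move it into the spool dir in one atomic rename.
mv "$STAGE_DIR/app.log" "$SPOOL_DIR/app.log"
```

Once Flume has read a file it renames it with a .COMPLETED suffix by default, which is another reason not to touch files after they land in the directory.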
a) Create the agent configuration file
# vi /usr/local/flume/flume150/conf/spool.conf
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.channels = c1
a1.sources.r1.spoolDir = /usr/local/flume/flume150/logs
a1.sources.r1.fileHeader = true
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
b) Start flume agent a1
/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/spool.conf -n a1 -Dflume.root.logger=INFO,console
c) Add files to the /usr/local/flume/flume150/logs directory and watch the Flume console output
Case 3: Exec
The exec source runs a given command and uses its output as the event source. When using the tail command, make sure the file accumulates enough content, or nothing will show up on the console.
a) Create the agent configuration file
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -F /usr/local/flume/flume150/logs/log_exec_tail
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
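Note that the exec source offers no delivery guarantees: if the tail process dies, events are silently lost. The exec-source options documented in the Flume User Guide can at least restart the command automatically; the values below are illustrative:

```properties
# restart the command if it exits
a1.sources.r1.restart = true
# wait 10 s (in ms) before restarting it
a1.sources.r1.restartThrottle = 10000
# route the command's stderr through the agent's logger
a1.sources.r1.logStdErr = true
```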
b) Start flume agent a1
/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/exec_tail.conf -n a1 -Dflume.root.logger=INFO,console
c) Generate enough content in the file
for i in {1..100};do echo "exec tail$i" >> /usr/local/flume/flume150/logs/log_exec_tail;echo $i;sleep 0.1;done
Flume console output:
Case 4: JSONHandler
a) Create the agent configuration file
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# Describe/configure the source
a1.sources.r1.type = org.apache.flume.source.http.HTTPSource
a1.sources.r1.port = 5142
a1.sources.r1.channels = c1
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
b) Start flume agent a1
/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/post_json.conf -n a1 -Dflume.root.logger=INFO,console
c) Send a JSON-formatted POST request (the default JSONHandler expects a JSON array of events, each with optional headers and a body)
curl -X POST -d '[{ "headers" :{"a" : "a1","b" : "b1"},"body" : "idoall.org_body"}]' http://localhost:5142
Flume console output:
Case 5: Hadoop Sink
a) Create the agent configuration file
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = org.apache.flume.source.http.HTTPSource
a1.sources.r1.port = 5142
a1.sources.r1.channels = c1
# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://itcast01:9000/user/root/user
a1.sinks.k1.hdfs.filePrefix = Syslog
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
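The round/roundValue/roundUnit settings above only take effect when hdfs.path (or hdfs.filePrefix) contains time escape sequences; with the literal path used here they are inert. A hedged variant using the escape sequences from the Flume HDFS sink documentation (hdfs.useLocalTimeStamp lets the sink use the agent's clock instead of requiring a timestamp header on every event):

```properties
a1.sinks.k1.hdfs.path = hdfs://itcast01:9000/user/root/user/%Y-%m-%d/%H%M
a1.sinks.k1.hdfs.useLocalTimeStamp = true
```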
b) Start flume agent a1
/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/hdfs_sink.conf -n a1 -Dflume.root.logger=INFO,console
c) Send a JSON-formatted POST request
curl -X POST -d '[{ "headers" :{"a" : "a1","b" : "b1"},"body" : "idoall.org_body"}]' http://localhost:5142
Flume console output:
Case 6: File Roll Sink
a) Create the agent configuration file
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = org.apache.flume.source.http.HTTPSource
a1.sources.r1.port = 5142
a1.sources.r1.channels = c1
# Describe the sink
a1.sinks.k1.type = file_roll
a1.sinks.k1.sink.directory = /usr/local/flume/flume150/logs
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
b) Start flume agent a1 (assuming the configuration above was saved as file_roll.conf)
/usr/local/flume/flume150/bin/flume-ng agent -c . -f /usr/local/flume/flume150/conf/file_roll.conf -n a1 -Dflume.root.logger=INFO,console
c) Send a JSON-formatted POST request
curl -X POST -d '[{ "headers" :{"a" : "a1","b" : "b1"},"body" : "idoall.org_body"}]' http://localhost:5142
Check the files generated under /usr/local/flume/flume150/logs; by default the sink rolls to a new file every 30 seconds.
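The 30-second default comes from the file_roll sink's rollInterval property; per the Flume documentation it can be changed, or set to 0 to disable time-based rolling entirely:

```properties
# roll to a new file every 5 minutes (0 disables rolling)
a1.sinks.k1.sink.rollInterval = 300
```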
Case 7: Kafka Sink
Note: the Kafka sink is supported starting with Flume 1.6.
a) Create the agent configuration file
agent.sources = s1
agent.channels = c1
agent.sinks = k1
agent.sources.s1.type=exec
agent.sources.s1.command=tail -F /usr/local/flume/flume160/logs/kafka.log
agent.sources.s1.channels=c1
agent.channels.c1.type=memory
agent.channels.c1.capacity=10000
agent.channels.c1.transactionCapacity=100
# Kafka sink
agent.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
# Kafka broker address and port
agent.sinks.k1.brokerList = itcast01:9092
# Kafka topic
agent.sinks.k1.topic = slavetest
# Serializer class
agent.sinks.k1.serializer.class = kafka.serializer.StringEncoder
agent.sinks.k1.channel=c1
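Note that brokerList and topic are the Flume 1.6 property names; from Flume 1.7 on, the Kafka sink documentation uses the kafka.* names below (the old names are kept for backward compatibility):

```properties
agent.sinks.k1.kafka.bootstrap.servers = itcast01:9092
agent.sinks.k1.kafka.topic = slavetest
```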
b) Start Kafka, create the topic, and start a console consumer
Start ZooKeeper first, then Kafka:
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
bin/kafka-server-start.sh -daemon config/server.properties
Create the topic:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic slavetest
Produce data (optional, to test the topic):
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic slavetest
Consume data:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic slavetest --from-beginning
c) Start the flume agent
/usr/local/flume/flume160/bin/flume-ng agent --conf conf -f /usr/local/flume/flume160/conf/kafka_sink.conf -n agent -Dflume.root.logger=DEBUG,console
d) Generate log entries
for ((i=0; i<=1000; i++)); do
  echo "kafka_test-$i" >> /usr/local/flume/flume160/logs/kafka.log
done
e) Watch the messages arrive in the Kafka console consumer window