6. Flume Enterprise Development Cases and Architecture Design

Case 1: Multiplexing

1) Requirements
Flume-1 monitors a file for changes and passes the new content to Flume-2, which stores it in HDFS. At the same time, Flume-1 passes the content to Flume-3, which writes it to the local file system.
2) Architecture diagram:
(architecture diagram omitted)
Steps:
1. Create a group1 directory and create the flume-file-flume.conf file:

# Name the components on this agent
a1.sources=s1
a1.channels=c1 c2
a1.sinks=k1 k2

# TAILDIR source: tail all .log files under the group1 directory
a1.sources.s1.type=TAILDIR
a1.sources.s1.positionFile=/opt/flume/job/qiye/group1/positionFile1.json
a1.sources.s1.filegroups=f1
a1.sources.s1.filegroups.f1=/opt/flume/job/qiye/group1/.*log

# Two memory channels, one per downstream agent
a1.channels.c1.type=memory
a1.channels.c2.type=memory

# Avro sink to Flume-2 (the HDFS agent)
a1.sinks.k1.type=avro
a1.sinks.k1.hostname=h1
a1.sinks.k1.port=10000

# Avro sink to Flume-3 (the local-file agent)
a1.sinks.k2.type=avro
a1.sinks.k2.hostname=h1
a1.sinks.k2.port=10001

# Bind the source to both channels and each sink to its own channel
a1.sources.s1.channels=c1 c2
a1.sinks.k1.channel=c1
a1.sinks.k2.channel=c2
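Note that this configuration does not set a channel selector, so a1 falls back to Flume's default replicating selector, which copies every event to both c1 and c2. To spell it out, you could add the following (optional) line to the source:

a1.sources.s1.selector.type=replicating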
2. Create the flume-flume-hdfs.conf file:
# Name the components on this agent
a2.sources=s1
a2.sinks=k1
a2.channels=c1

# Avro source listening for events from Flume-1
a2.sources.s1.type=avro
a2.sources.s1.bind=h1
a2.sources.s1.port=10000

# HDFS sink; %Y%m%d/%H in the path requires a timestamp,
# supplied here by useLocalTimeStamp
a2.sinks.k1.type=hdfs
a2.sinks.k1.hdfs.path=hdfs://h1:9000/flume/%Y%m%d/%H
a2.sinks.k1.hdfs.filePrefix=flume-
a2.sinks.k1.hdfs.useLocalTimeStamp = true
a2.sinks.k1.hdfs.batchSize = 100
a2.sinks.k1.hdfs.rollInterval = 1000

# Describe the channel
a2.channels.c1.type=memory

# Bind the source and sink to the channel
a2.sources.s1.channels=c1
a2.sinks.k1.channel=c1
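In production you would usually also pin down when the HDFS sink rolls files, to avoid a flood of small files. A minimal sketch, with illustrative values not taken from the original config:

# roll a new file every ~128 MB, disable rolling by event count,
# and write plain text rather than SequenceFiles
a2.sinks.k1.hdfs.rollSize = 134217728
a2.sinks.k1.hdfs.rollCount = 0
a2.sinks.k1.hdfs.fileType = DataStream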

3. Create the flume-flume-dir.conf file:

a3.sources = r1
a3.sinks = k1 
a3.channels = c2

# Describe/configure the source 
a3.sources.r1.type = avro 
a3.sources.r1.bind = h1
a3.sources.r1.port = 10001

a3.sinks.k1.type = file_roll
# file_roll does not create the directory; it must already exist. Note that
# this is the same directory a1 tails for .*log files, so the rolled files
# must not match that pattern (file_roll's default timestamp names do not).
a3.sinks.k1.sink.directory = /opt/flume/job/qiye/group1

# Describe the channel 
a3.channels.c2.type = memory 

# Bind the source and sink to the channel 
a3.sources.r1.channels = c2 
a3.sinks.k1.channel = c2

4. Start a2 and a3 first, then start a1:

cd /opt/flume/bin

flume-ng agent --name a2 --conf conf --conf-file /opt/flume/job/qiye/group1/flume-flume-hdfs.conf
flume-ng agent --name a3 --conf conf --conf-file /opt/flume/job/qiye/group1/flume-flume-dir.conf
flume-ng agent --name a1 --conf conf --conf-file /opt/flume/job/qiye/group1/flume-file-flume.conf
5. Create a test.log file under /opt/flume/job/qiye/group1, then check HDFS and the local output files.
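A quick smoke test, assuming HDFS is reachable at h1 (file and directory names as above):

echo "hello flume" >> /opt/flume/job/qiye/group1/test.log
hdfs dfs -ls /flume                  # the dated HDFS output directory
ls -l /opt/flume/job/qiye/group1     # the rolled local files from a3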

Case 2: Load Balancing and Failover

1) Requirements
Flume-1 monitors a port; the two sinks in its sink group feed Flume-2 and Flume-3. A FailoverSinkProcessor provides the failover behavior.
2) Architecture diagram:
(architecture diagram omitted)

3) Steps:
1. Create flume-netcat-flume.conf
Configure 1 netcat source, 1 channel, and 1 sink group with 2 sinks that feed flume-flume-console1 and flume-flume-console2 respectively:
# Name the components on this agent
a1.sources = r1
a1.channels = c1
a1.sinkgroups = g1
a1.sinks = k1 k2

# Netcat source listening on localhost:44444
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Failover sink processor: the sink with the highest priority (k2 here)
# takes all traffic; if it fails, events fail over to k1
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
# Maximum back-off (ms) applied to a failed sink before it is retried
a1.sinkgroups.g1.processor.maxpenalty = 10000

a1.sinks.k1.type = avro
a1.sinks.k1.hostname = hadoop102
a1.sinks.k1.port = 4141
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = hadoop102
a1.sinks.k2.port = 4142

# Describe the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel 
a1.sources.r1.channels = c1 
a1.sinkgroups.g1.sinks = k1 k2 
a1.sinks.k1.channel = c1 
a1.sinks.k2.channel = c1

2. Create flume-flume-console1.conf:


# Name the components on this agent
a2.sources = r1
a2.sinks = k1
a2.channels = c1

# Describe/configure the source 
a2.sources.r1.type = avro 
a2.sources.r1.bind = hadoop102 
a2.sources.r1.port = 4141

# Describe the sink
a2.sinks.k1.type = logger

# Describe the channel 
a2.channels.c1.type = memory 
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel 
a2.sources.r1.channels = c1 
a2.sinks.k1.channel = c1

3. flume-flume-console2.conf for a3 is the same as a2's configuration except for the agent name (a3) and the port (4142); see the sketch below.
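For completeness, the corresponding flume-flume-console2.conf, assuming only the agent name and port change:

# Name the components on this agent
a3.sources = r1
a3.sinks = k1
a3.channels = c1

# Avro source on the failover port
a3.sources.r1.type = avro
a3.sources.r1.bind = hadoop102
a3.sources.r1.port = 4142

a3.sinks.k1.type = logger

a3.channels.c1.type = memory
a3.channels.c1.capacity = 1000
a3.channels.c1.transactionCapacity = 100

a3.sources.r1.channels = c1
a3.sinks.k1.channel = c1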

4) Start a2 and a3 first, then start a1:



bin/flume-ng agent --conf conf/ --name a3 --conf-file job/group2/flume-flume-console2.conf -Dflume.root.logger=INFO,console

bin/flume-ng agent --conf conf/ --name a2 --conf-file job/group2/flume-flume-console1.conf -Dflume.root.logger=INFO,console

bin/flume-ng agent --conf conf/ --name a1 --conf-file job/group2/flume-netcat-flume.conf

5) Send messages to port 44444:

$ nc localhost 44444
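Because k2 has the higher priority (10 vs. 5), the messages first show up in a3's console. To see the failover, kill the a3 agent and keep typing into nc; the events should switch over to a2's console. A minimal sketch (the grep pattern assumes the agents run under Flume's usual Application main class, and the PID is machine-specific):

jps | grep Application   # list running Flume agents
kill -9 <a3-pid>         # stop the higher-priority agent; traffic fails over to a2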

Case 3: Custom Interceptor

1) Requirements
Flume collects local server logs, and logs of different types must be routed to different analysis systems.
2) Analysis
In real development, a single server can produce many types of logs, and different types may need to go to different analysis systems. This calls for Flume's Multiplexing topology: a multiplexing channel selector routes each event to a channel according to the value of a chosen key in the event's headers.

In this case we use Kafka to send data simulating logs, with single digits and single letters standing in for different log types. We need a custom interceptor that tells digits and letters apart and tags each event so it is sent to the right analysis system (channel).
3) Architecture diagram:
(architecture diagram omitted)
4) Steps:

1. Create a Maven project and add the Flume dependency:
<dependency>
    <groupId>org.apache.flume</groupId>
    <artifactId>flume-ng-core</artifactId>
    <version>1.8.0</version>
</dependency>

2. Implement the Interceptor interface:

package com.flume;


import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

import java.util.List;

public class CustomInterceptor implements Interceptor {

    @Override
    public void initialize() {

    }

    @Override
    public Event intercept(Event event) {
        byte[] body = event.getBody();
        // Tag each event by its first byte; the multiplexing selector
        // later routes on this "type" header
        if (body[0] >= 'a' && body[0] <= 'z') {
            event.getHeaders().put("type", "letter");
        } else if (body[0] >= '0' && body[0] <= '9') {
            event.getHeaders().put("type", "number");
        }
        // Anything else gets no "type" header and matches no selector mapping
        return event;
    }

    @Override
    public List<Event> intercept(List<Event> list) {
        for(Event event:list){
            intercept(event);
        }
        return list;
    }

    @Override
    public void close() {

    }
    public static class Builder implements Interceptor.Builder{

        @Override
        public Interceptor build() {
            return new CustomInterceptor();
        }

        @Override
        public void configure(Context context) {

        }
    }
}

3. Package the project and copy the jar into Flume's lib directory.
4. Create kafka-flume.conf:

# Name the components on this agent
a1.sources = s1
a1.sinks = k1 k2
a1.channels = c1 c2

# Kafka source: consume the simulated log lines from topic_test
a1.sources.s1.type = org.apache.flume.source.kafka.KafkaSource
# Kafka broker address
a1.sources.s1.kafka.bootstrap.servers = h1:9092
a1.sources.s1.kafka.topics = topic_test

# Register the custom interceptor; Flume instantiates it via the nested
# Builder class, hence the $Builder suffix
a1.sources.s1.interceptors = i1
a1.sources.s1.interceptors.i1.type = com.flume.CustomInterceptor$Builder

# Multiplexing selector: route on the "type" header set by the interceptor
a1.sources.s1.selector.type = multiplexing
a1.sources.s1.selector.header = type
a1.sources.s1.selector.mapping.letter = c1
a1.sources.s1.selector.mapping.number = c2

# Avro sink to the "letter" analysis agent (a2)
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = h1
a1.sinks.k1.port = 10000

# Avro sink to the "number" analysis agent (a3)
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = h1
a1.sinks.k2.port = 10001

a1.channels.c1.type = memory
a1.channels.c2.type = memory

a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2
a1.sources.s1.channels = c1 c2
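Events whose first byte is neither a lowercase letter nor a digit get no "type" header, match no mapping, and with a multiplexing selector are effectively dropped. If you want a catch-all, the selector supports a default channel; sending unmatched events to c1 here is an illustrative choice:

a1.sources.s1.selector.default = c1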

5. Create logger1.conf:

a2.sources = r1 
a2.sinks = k1 
a2.channels = c1

a2.sources.r1.type = avro 
a2.sources.r1.bind = h1
a2.sources.r1.port = 10000

a2.sinks.k1.type = logger

a2.channels.c1.type = memory 

a2.sinks.k1.channel = c1 
a2.sources.r1.channels = c1

6. logger2.conf is the same as logger1.conf except that the agent is named a3 and the avro source listens on port 10001; a sketch follows.
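For completeness, a logger2.conf sketch, assuming only the agent name and port differ:

a3.sources = r1
a3.sinks = k1
a3.channels = c1

a3.sources.r1.type = avro
a3.sources.r1.bind = h1
a3.sources.r1.port = 10001

a3.sinks.k1.type = logger

a3.channels.c1.type = memory

a3.sinks.k1.channel = c1
a3.sources.r1.channels = c1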

7. Start a2 and a3 first, then a1, and finally start the Kafka producer to send messages; note the order:

flume-ng agent --name a3 --conf conf --conf-file /opt/flume/job/qiye/group1/inter/logger2.conf -Dflume.root.logger=INFO,console
flume-ng agent --name a2 --conf conf --conf-file /opt/flume/job/qiye/group1/inter/logger1.conf -Dflume.root.logger=INFO,console
flume-ng agent --name a1 --conf conf --conf-file /opt/flume/job/qiye/group1/inter/kafka-flume.conf -Dflume.root.logger=INFO,console


# Kafka console producer for sending test messages
bin/kafka-console-producer.sh --broker-list h1:9092 --topic topic_test
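Typing single characters into the producer then demonstrates the routing (assuming all three agents are up):

>a     # printed by a2's logger: type=letter -> c1 -> port 10000
>1     # printed by a3's logger: type=number -> c2 -> port 10001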