Using Flume, and a first-pass integration with Storm

 
For a basic introduction to Flume NG, see this post: http://blog.csdn.net/pelick/article/details/18193527 (the diagrams in this article also come from that blog).


 
 
 
After downloading Flume, you can follow the tutorial in the user guide at https://flume.apache.org/FlumeUserGuide.html to start an agent with console logging.
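For reference, the single-node example configuration from the user guide looks roughly like the following (a sketch: the netcat source is bound here to 172.16.79.12:44444 to match the log below, while the agent name a1 and the file name example.conf are simply the guide's defaults):

# example.conf: single-node configuration, adapted from the user guide
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# netcat source listening for telnet input
a1.sources.r1.type = netcat
a1.sources.r1.bind = 172.16.79.12
a1.sources.r1.port = 44444

# logger sink prints events to the console
a1.sinks.k1.type = logger

# in-memory channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# bind source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

The agent is then started with console logging enabled:

bin/flume-ng agent --conf conf --conf-file conf/example.conf --name a1 -Dflume.root.logger=INFO,console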
 
Once the agent has started, the console prints a log line like the following:
2016-06-21 13:00:06,890 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.source.NetcatSource.start(NetcatSource.java:164)] Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/172.16.79.12:44444]
 
 
You can then send data via telnet 172.16.79.12 44444; after sending, the event shows up in the running agent's log output, which completes this simple agent demo:
 
2016-06-21 13:00:28,905 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 61 62 63 64 65 0D                               abcde. }
 
 

Planning and configuring Flume for log collection

 
 
Our planned log-collection layout is shown in the diagram below: each web server runs its own agent that ships logs upstream to a single consolidated agent4.
 


 
 
 
For the agent on each web server we use an Exec source configured with a simple tail -F to pick up the log and, for now, print it to the console via a logger sink. The configuration is shown below: type must be declared as exec, and the command to run must be given (tail -F here; if needed you can also pipe it through grep or similar commands):
 
zhenmq-agent.sources = zhenmq-source
zhenmq-agent.sinks = zhenmq-sink
zhenmq-agent.channels = zhenmq-channel

# Describe/configure the source
zhenmq-agent.sources.zhenmq-source.type = exec
zhenmq-agent.sources.zhenmq-source.command = tail -F /usr/local/tomcat/tomcat-zhenmq/logs/apilog/common-all.log

# Describe the sink
zhenmq-agent.sinks.zhenmq-sink.type = logger

# Use a channel which buffers events in memory
zhenmq-agent.channels.zhenmq-channel.type = memory
zhenmq-agent.channels.zhenmq-channel.capacity = 1000
zhenmq-agent.channels.zhenmq-channel.transactionCapacity = 100

# Bind the source and sink to the channel
zhenmq-agent.sources.zhenmq-source.channels = zhenmq-channel
zhenmq-agent.sinks.zhenmq-sink.channel = zhenmq-channel
 
 
After passing through the channel (memory or file, whichever fits your requirements), the log stream has to be forwarded to a central collector. For this we rely on one of Flume's built-in serialization mechanisms; here we use the fairly universal Avro source/sink pair: the source receives log streams sent by other agents, and the sink ships the log data out.
 
If you want to organize Flume into tiers, this intermediate serialization can relay the collected logs to different servers; just use the bundled Avro source and sink components, setting type to avro and specifying the hostname and port:
 
# Describe the sink
zhenmq-agent.sinks.zhenmq-sink.type = avro
zhenmq-agent.sinks.zhenmq-sink.hostname = 192.168.1.163
zhenmq-agent.sinks.zhenmq-sink.port = 23004

collector-agent.sources.collector-source.type = avro
collector-agent.sources.collector-source.bind = 192.168.1.163
collector-agent.sources.collector-source.port = 23004
 
 
Note the startup order: the Flume service on the 163 server (the collector hosting the Avro source) must be started first. If the sending agent with the Avro sink is started before the collector source is listening, it reports a connection-refused error:
 
org.apache.flume.EventDeliveryException: Failed to send events
    at org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:392)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flume.FlumeException: NettyAvroRpcClient { host: 192.168.1.163, port: 23004 }: RPC connection error
    at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:182)
    at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:121)
    at org.apache.flume.api.NettyAvroRpcClient.configure(NettyAvroRpcClient.java:638)
    at org.apache.flume.api.RpcClientFactory.getInstance(RpcClientFactory.java:89)
    at org.apache.flume.sink.AvroSink.initializeRpcClient(AvroSink.java:127)
    at org.apache.flume.sink.AbstractRpcSink.createConnection(AbstractRpcSink.java:211)
    at org.apache.flume.sink.AbstractRpcSink.verifyConnection(AbstractRpcSink.java:272)
    at org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:349)
    ... 3 more
Caused by: java.io.IOException: Error connecting to /192.168.1.163:23004
    at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:261)
    at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:203)
    at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:152)
    at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:168)
    ... 10 more
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
    at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:496)
    at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:452)
    at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:365)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    ... 1 more
 
 
Once both sides are up, the collector service on 163 shows the following log, indicating everything has come up successfully:
 
2016-06-22 18:48:30,179 (New I/O server boss #1 ([id: 0xb85f59b4, /192.168.1.163:23004])) [INFO - org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream(NettyServer.java:171)] [id: 0xf57de901, /192.168.1.162:52778 => /192.168.1.163:23004] OPEN
2016-06-22 18:48:30,181 (New I/O  worker #1) [INFO - org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream(NettyServer.java:171)] [id: 0xf57de901, /192.168.1.162:52778 => /192.168.1.163:23004] BOUND: /192.168.1.163:23004
2016-06-22 18:48:30,181 (New I/O  worker #1) [INFO - org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream(NettyServer.java:171)] [id: 0xf57de901, /192.168.1.162:52778 => /192.168.1.163:23004] CONNECTED: /192.168.1.162:52778
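As a sketch of that startup order (the configuration file names below are only examples; the agent names match the property prefixes used in this article), the collector on 163 is started first, followed by the web-server agent:

# on 192.168.1.163: start the collector (Avro source) first
bin/flume-ng agent --conf conf --conf-file conf/collector-agent.conf --name collector-agent -Dflume.root.logger=INFO,console

# on 192.168.1.162: then start the web-server agent (Avro sink)
bin/flume-ng agent --conf conf --conf-file conf/zhenmq-agent.conf --name zhenmq-agent -Dflume.root.logger=INFO,console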
 
 
 
 
Flume load balancing and failover
 
 
In the diagram, agent4 is a single point of failure: if agent4 goes down, no logs can be delivered. To remove this single point we use Flume's load-balancing/failover modes, in which a sink is chosen from a group by some algorithm for each delivery. When the volume of log output is large, load balancing is well worth having, since writing through multiple paths relieves the delivery pressure.
 
Flume's built-in load-balancing algorithm defaults to round robin, i.e. sinks are selected in turn.
 
Events from the source flow through the channel into a sink group; within the group, a sink is picked according to the load-balancing algorithm (round_robin or random). Since each sink can target an agent on a different machine, this is what spreads the load.
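A load-balancing sink group over two collector sinks would look roughly like the sketch below; collector-sink1 and collector-sink2 are the two Avro sinks defined in the failover configuration further down, processor.selector may be round_robin or random, and processor.backoff temporarily blacklists sinks that fail:

zhenmq-agent.sinkgroups = g1
zhenmq-agent.sinkgroups.g1.sinks = collector-sink1 collector-sink2
zhenmq-agent.sinkgroups.g1.processor.type = load_balance
zhenmq-agent.sinkgroups.g1.processor.selector = round_robin
zhenmq-agent.sinkgroups.g1.processor.backoff = true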
 


 
 
 
If failover is used instead, the sinks in the group form a failover sink processor. When a sink fails to process events, Flume sets it aside for a cool-down period and only brings it back once it can handle events normally again. Events flow through a single channel into the sink group, within which a concrete sink is chosen by priority; when it fails, delivery switches to the next sink, as in the flow diagram below:
 


 
 
Given that our current log volume is not that large, we start with failover; if it later cannot keep up, we can move to load balancing.
 
 

Configuring failover

 
First define the sinkgroups: the group's processor type and a priority for each sink. Logs are sent to the higher-priority sink first; if that server becomes unavailable the sink is moved to the cool-down pool and the lower-priority sink takes over.
 
Mind the startup order: the Flume agent that others depend on must always be started first.
 
zhenmq-agent.sources = zhenmq-source
zhenmq-agent.sinks = collector-sink1 collector-sink2
zhenmq-agent.channels = zhenmq-channel

# Describe/configure the source
zhenmq-agent.sources.zhenmq-source.type = exec
zhenmq-agent.sources.zhenmq-source.command = tail -F /usr/local/tomcat/tomcat-zhenmq/logs/apilog/common-all.log

# Describe the sink
zhenmq-agent.sinks.collector-sink1.type = avro
zhenmq-agent.sinks.collector-sink1.channel= zhenmq-channel
zhenmq-agent.sinks.collector-sink1.hostname = 192.168.1.163
zhenmq-agent.sinks.collector-sink1.port = 23004

zhenmq-agent.sinks.collector-sink2.type = avro
zhenmq-agent.sinks.collector-sink2.channel= zhenmq-channel
zhenmq-agent.sinks.collector-sink2.hostname = 192.168.1.165
zhenmq-agent.sinks.collector-sink2.port = 23004

# Use a channel which buffers events in memory
zhenmq-agent.channels.zhenmq-channel.type = memory
zhenmq-agent.channels.zhenmq-channel.capacity = 1000
zhenmq-agent.channels.zhenmq-channel.transactionCapacity = 100

zhenmq-agent.sinkgroups = g1
zhenmq-agent.sinkgroups.g1.sinks = collector-sink1 collector-sink2

zhenmq-agent.sinkgroups.g1.processor.type = failover
zhenmq-agent.sinkgroups.g1.processor.priority.collector-sink1 = 10
zhenmq-agent.sinkgroups.g1.processor.priority.collector-sink2 = 11
 
 

Connecting Flume to Storm

 
Normally, Flume data goes through one more hop into Kafka, and Storm then reads the messages from Kafka for real-time analysis. For now, though, we can skip Kafka and feed Flume's output directly into Storm.
 
A reference open-source implementation is https://github.com/rvisweswara/flume-storm-connector. Reading its source shows that it achieves this by starting embedded Flume agent components inside the spout (a SourceRunner, a Channel and a SinkCounter) and receiving, over the Avro protocol, the stream that Flume sends out. The overall class diagram around FlumeSpout looks like this:
 


 
 
Because the original example was written three years ago, its jars are fairly old and it may not start; you can clone the fork below and run it locally instead (master branch): https://github.com/clamaa/flume-storm-connector
 
 
 
The entry point for the test case is FlumeConnectorTopology. Its main method first needs a topology.properties file specifying the source type and port for the agent started inside FlumeSpout (normally the type is avro, so only the corresponding bind and port have to be set):
 
flume-agent.source.type=avro
flume-agent.channel.type=memory
flume-agent.source.bind=127.0.0.1
flume-agent.source.port=10101
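For orientation, wiring such a spout into a topology with the plain Storm 0.9.x API looks roughly like the sketch below. This is not the project's actual entry point (that is FlumeConnectorTopology); the no-argument FlumeSpout construction and the PrintBolt consumer are assumptions made here purely for illustration:

import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.topology.TopologyBuilder;

public class FlumeTopologySketch {
    public static void main(String[] args) {
        TopologyBuilder builder = new TopologyBuilder();
        // FlumeSpout boots the embedded Flume source/channel described below
        // (its real configuration is handled by FlumeConnectorTopology).
        builder.setSpout("flume-spout", new FlumeSpout(), 1);
        // PrintBolt is a hypothetical stand-in for whatever bolt consumes the log events.
        builder.setBolt("print-bolt", new PrintBolt(), 2).shuffleGrouping("flume-spout");

        Config conf = new Config();
        new LocalCluster().submitTopology("flume-storm-demo", conf, builder.createTopology());
    }
}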
 
 
From the MaterializedConfigurationProvider and these properties, the MaterializedConfiguration (a Flume concept) for the embedded agent is produced. In FlumeSpout.open, this MaterializedConfiguration yields the sourceRunner (Avro type) and the channel (in-memory, so events can be taken from it directly).
 
Because this embedded Flume agent needs no sink, no SinkRunner is added when it is built; only a SinkCounter is registered for output counting (it is an MXBean, so its key metrics can be watched through a JMX console):
// Build the embedded agent configuration from the topology properties.
flumeAgentProps = StormEmbeddedAgentConfiguration.configure(
        FLUME_AGENT_NAME, flumeAgentProps);
MaterializedConfiguration conf = configurationProvider.get(
        getFlumePropertyPrefix(), flumeAgentProps);

// Exactly one channel and one source runner are expected.
Map<String, Channel> channels = conf.getChannels();
if (channels.size() != 1) {
    throw new FlumeException("Expected one channel and got "
            + channels.size());
}
Map<String, SourceRunner> sources = conf.getSourceRunners();
if (sources.size() != 1) {
    throw new FlumeException("Expected one source and got "
            + sources.size());
}

this.sourceRunner = sources.values().iterator().next();
this.channel = channels.values().iterator().next();

// Only a SinkCounter is created; no SinkRunner is attached.
if (sinkCounter == null) {
    sinkCounter = new SinkCounter(FlumeSpout.class.getName());
}
 
 
In the nextTuple method, the spout periodically takes from the internally started Flume channel to fetch the latest events:
for (int i = 0; i < this.batchSize; i++) {
    Event event = channel.take();
    if (event == null) {
        break;
    }
    batch.add(event);
}
 
 
These events are then wrapped into Values and emitted by the collector. Since the log format may come in several flavors, FlumeSpout lets you set a TupleProducer that builds the tuple contents for each event and declares the corresponding field names:
 
for (Event event : batch) {
    Values vals = this.getTupleProducer().toTuple(event);
    this.collector.emit(vals);
    this.pendingMessages.put(
            event.getHeaders().get(Constants.MESSAGE_ID), event);

    LOG.debug("NextTuple:"
            + event.getHeaders().get(Constants.MESSAGE_ID));
}
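As a minimal sketch of such a producer (the TupleProducer interface comes from the connector project and is assumed here to expose only the toTuple(Event) method used above; a real producer would also declare its output field names), one could simply emit the raw log line as a single-field tuple:

import java.nio.charset.StandardCharsets;

import org.apache.flume.Event;

import backtype.storm.tuple.Values;

// Hypothetical producer: emits the raw log line as a one-field tuple.
public class RawLineTupleProducer implements TupleProducer {
    @Override
    public Values toTuple(Event event) {
        // Event.getBody() returns the raw bytes of the log line shipped by Flume.
        return new Values(new String(event.getBody(), StandardCharsets.UTF_8));
    }
}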
 
 
Until a message is acknowledged it is kept in FlumeSpout.pendingMessages (a ConcurrentHashMap) so that message acknowledgement is supported: once the ack arrives the entry is removed; on failure the message is looked up by its id and resent:
 
   
    /*
     * When a message is succeeded remove from the pending list
     *
     * @see backtype.storm.spout.ISpout#ack(java.lang.Object)
     */
    public void ack(Object msgId) {
        this.pendingMessages.remove(msgId.toString());
    }

    /*
     * When a message fails, retry the message by pushing the event back to channel.
     * Note: Please test this situation...
     *
     * @see backtype.storm.spout.ISpout#fail(java.lang.Object)
     */
    public void fail(Object msgId) {
        // on a failure, push the message from pending back to the flume channel
        Event ev = this.pendingMessages.get(msgId.toString());
        if (null != ev) {
            this.channel.put(ev);
        }
    }
 
The connector also provides an AvroSinkBolt for sending messages produced by Storm back into Flume over Avro. It basically maintains an RpcClient connection to a Flume Avro agent, and you can plug in a custom Flume event producer that turns the tuples produced by Storm into the corresponding Flume events; we won't go into further detail here.
 
    private RpcClient rpcClient;
    private FlumeEventProducer flumeEventProducer; 
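A rough sketch of the forwarding path inside such a bolt is shown below; the real implementation lives in the connector project, the FlumeEventProducer.toEvent conversion and the collector field are assumptions made here, and RpcClient.append is the standard Flume client API:

    public void execute(Tuple tuple) {
        try {
            // Turn the Storm tuple into a Flume event (producer interface assumed).
            Event event = this.flumeEventProducer.toEvent(tuple);
            // Forward the event to the downstream Flume Avro source over the RPC connection.
            this.rpcClient.append(event);
            this.collector.ack(tuple);
        } catch (EventDeliveryException e) {
            // Delivery failed: fail the tuple so Storm can replay it.
            this.collector.fail(tuple);
        }
    }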
 
There is one more failure mode for the log-collecting agents: the command run by the exec source can itself die (here tail was killed, hence exit code 137). When that happens, an error like the following appears in the log:
 
2016-07-06 11:14:19,951 (pool-5-thread-1) [INFO - org.apache.flume.source.ExecSource$ExecRunnable.run(ExecSource.java:376)] Command [tail -F /usr/local/tomcat/tomcat-shopapi/logs/apilog/common-warn.log] exited with 137
 
The exec source has two properties for handling this, i.e. restarting the command when the process exits abnormally:
 
Property Name     Default   Description
restartThrottle   10000     Amount of time (in millis) to wait before attempting a restart
restart           false     Whether the executed cmd should be restarted if it dies
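For the web-server agent above, enabling the restart behaviour would look like this (10000 ms is simply the default throttle; adjust as needed):

zhenmq-agent.sources.zhenmq-source.restart = true
zhenmq-agent.sources.zhenmq-source.restartThrottle = 10000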
 
 
 
 
 
 