Collecting logs into HDFS with the open-source log collector fluentd

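Note: I originally looked at Flume as an open-source log system, but found its configuration rather cumbersome. Searching online turned up fluentd, another open-source log collector that is much simpler to configure and performs well, so I switched to it. The official site is fluentd.org. It supports 300+ plugins, which looks promising.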


fluentd talks to HDFS through Hadoop's WebHDFS interface, so before configuring fluentd, make sure that WebHDFS is reachable and that data can be written to HDFS through it.

The architecture is shown in the figure below:

[Figure: wKiom1SAB_PCQmAmAADSV4dSD3E785.jpg]


For WebHDFS configuration and testing, see this article: http://shineforever.blog.51cto.com/1429204/1585942
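As a quick sanity check before touching fluentd, WebHDFS can be exercised directly over its REST interface. The sketch below assumes the NameNode host and port used in the fluentd configuration later in this article (node1.test.com, 50070). The helper names `webhdfs_url` and `check_webhdfs` are made up for illustration and are not part of Hadoop or fluentd.

```shell
# Assumed values, taken from the fluentd config later in this article.
NAMENODE_HOST="node1.test.com"
NAMENODE_PORT="50070"

# Build a WebHDFS REST URL.
#   $1 = absolute HDFS path, $2 = operation name (e.g. LISTSTATUS)
webhdfs_url() {
    printf 'http://%s:%s/webhdfs/v1%s?op=%s\n' \
        "$NAMENODE_HOST" "$NAMENODE_PORT" "$1" "$2"
}

# List an HDFS directory through WebHDFS.
# -s silences the progress meter, -f makes curl fail on an HTTP error status.
check_webhdfs() {
    curl -sf "$(webhdfs_url "$1" LISTSTATUS)"
}
```

Running `check_webhdfs /` on the fluentd host should print a JSON directory listing if WebHDFS is enabled (`dfs.webhdfs.enabled` in hdfs-site.xml). A non-zero exit status suggests fixing WebHDFS first.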


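Overview of the installation environment:

1) fluentd and the Hadoop NameNode should be installed on the same physical machine;

2) OS: RHEL 5.7, 64-bit

3) Hadoop version: 1.2.1

4) JDK: 1.7.0_67

5) Ruby version: ruby 2.1.2p95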


1. Preparation: install Ruby, because fluentd is written in Ruby:

yum install openssl-devel zlib-devel gcc gcc-c++ make autoconf readline-devel curl-devel expat-devel gettext-devel


Remove the Ruby packages that ship with the system:

yum erase ruby ruby-libs ruby-mode ruby-rdoc ruby-irb ruby-ri ruby-docs


Install Ruby from source:

wget -c http://cache.ruby-lang.org/pub/ruby/2.1/ruby-2.1.2.tar.gz

Then unpack the tarball, compile it, install Ruby into /usr/local/ruby, and add it to the environment variables in your profile.
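The unpack, compile and install steps above can be sketched as follows. This is a minimal sketch, not the exact commands the author ran: it assumes the tarball downloaded above is in the current directory and uses the usual `./configure && make && make install` sequence. The function names `build_ruby` and `profile_line` are hypothetical.

```shell
# Assumed values: version from the wget command above, prefix from the text.
RUBY_VERSION="2.1.2"
RUBY_PREFIX="/usr/local/ruby"

# Unpack, configure, compile and install Ruby.
# The body runs in a subshell so the caller's working directory is unchanged.
build_ruby() (
    tar -xzf "ruby-${RUBY_VERSION}.tar.gz" &&
    cd "ruby-${RUBY_VERSION}" &&
    ./configure --prefix="$RUBY_PREFIX" &&
    make &&
    make install
)

# Print the line to append to /etc/profile so the new ruby is found first.
# The single quotes keep $PATH literal; it is expanded when the profile is read.
profile_line() {
    printf 'export PATH=%s/bin:$PATH\n' "$RUBY_PREFIX"
}
```

A possible invocation, run as root: `build_ruby && profile_line >> /etc/profile && . /etc/profile`.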

Test Ruby:

[root@node1 install]# ruby -v

ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-linux]


Output like the above means Ruby was installed successfully.


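2. Installing fluentd:

fluentd can be installed in three ways: from source, via gem, or via rpm;

This article uses the rpm method. The official documentation already provides an install script, so just run it: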


curl -L http://toolbelt.treasuredata.com/sh/install-redhat-td-agent2.sh | sh


After a successful install, start the service with: /etc/init.d/td-agent start

The configuration directory is: /etc/td-agent/


[root@node1 install]# cd /etc/td-agent/

[root@node1 td-agent]# pwd

/etc/td-agent

[root@node1 td-agent]# ls

logrotate.d  plugin  prelink.conf.d  td-agent.conf  


3. Install the fluentd plugin fluent-plugin-webhdfs with gem

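1) The default RubyGems source is blocked by the firewall in mainland China, so switch gem to a mirror. (The transcript below removes and re-adds the mirror because it was already configured on this machine; on a fresh install, remove the default source https://rubygems.org/ instead.)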

[root@node1 bin]# td-agent-gem source --remove https://ruby.taobao.org/

https://ruby.taobao.org/ removed from sources

[root@node1 bin]# td-agent-gem source -a https://ruby.taobao.org/      

https://ruby.taobao.org/ added to sources


2) Install the plugin:

td-agent-gem  install fluent-plugin-webhdfs


3) List the installed gems:

td-agent-gem list


*** LOCAL GEMS ***


bigdecimal (1.2.4)

bundler (1.7.7)

cool.io (1.2.4)

fluent-mixin-config-placeholders (0.3.0)

fluent-mixin-plaintextformatter (0.2.6)

fluent-plugin-webhdfs (0.4.1)

fluentd (0.12.0.pre.2)

http_parser.rb (0.6.0)

io-console (0.4.2)

json (1.8.1)

ltsv (0.1.0)

minitest (4.7.5)

msgpack (0.5.9)

psych (2.0.5)

rake (10.1.0)

rdoc (4.1.0)

sigdump (0.2.2)

string-scrub (0.0.5)

test-unit (2.1.2.0)

thread_safe (0.3.4)

tzinfo (1.2.2)

tzinfo-data (1.2014.10)

uuidtools (2.1.5)

webhdfs (0.6.0)

yajl-ruby (1.2.1)


4) Configure fluentd to load the fluent-plugin-webhdfs plugin;

Add the following section:

vim /etc/td-agent/td-agent.conf

<match hdfs.*.*>
  type webhdfs
  host node1.test.com
  port 50070
  path /log/%Y%m%d_%H/access.log.${hostname}
  flush_interval 1s
</match>
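The `path` setting mixes strftime-style time placeholders with a `${hostname}` placeholder, so each host writes to a separate file under an hourly directory. As an illustration only, the helper below (a hypothetical `expand_path`, built on GNU `date`) shows how such a pattern expands for a given Unix timestamp. It mimics the plugin's behaviour and is not the plugin's own code.

```shell
# Expand the path pattern from the config above for one point in time.
#   $1 = Unix timestamp in seconds, $2 = host name substituted for ${hostname}
# Requires GNU date (for -d "@epoch"); the time zone comes from TZ or the system.
expand_path() {
    date -d "@$1" +"/log/%Y%m%d_%H/access.log.$2"
}
```

For example, `expand_path "$(date +%s)" "$(hostname)"` prints the path for the current hour. The paths in the warning logs further down (such as /log/20141203_15/access.log.node1.test.com at 15:56 +0800) suggest that the plugin used local time in this setup.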


Restart the td-agent service;


5) Prepare HDFS:

Create the log directory

 hadoop fs -mkdir /log

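Set the permissions of the log directory to 777. Without this, no data can be written. The official documentation does not mention it, and it took a lot of testing to find out!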

hadoop fs -chmod 777 /log


6) Restart the td-agent service again and start testing with the following command:

curl -X POST -d 'json={"json":"message"}' http://172.16.41.151:8888/hdfs.access.test


After this, the files in Hadoop change:

[Figure: wKioL1SAGl3hvxMKAAEJZGry3HE760.jpg]

Errors encountered during installation and configuration:

1)

2014-12-03 15:56:12 +0800 [warn]: failed to communicate hdfs cluster, path: /log/20141203_15/access.log.node1.test.com

2014-12-03 15:56:12 +0800 [warn]: temporarily failed to flush the buffer. next_retry=2014-12-03 15:56:28 +0800 error_class="WebHDFS::ClientError" error="{\"RemoteException\":{\"exception\":\"IllegalArgumentException\",\"javaClassName\":\"java.lang.IllegalArgumentException\",\"message\":\"n must be positive\"}}" instance=23456251808160

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:313:in `request'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:231:in `operate_requests'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:45:in `create'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:189:in `rescue in send_data'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:186:in `send_data'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:205:in `write'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:296:in `write_chunk'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:276:in `pop'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:311:in `try_flush'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:132:in `run'


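If you see the output above, something is wrong with your HDFS file system itself (for example, it cannot accept writes). Test separately whether HDFS is running normally!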


2)

2014-12-04 14:44:55 +0800 [warn]: failed to communicate hdfs cluster, path: /log/20141204_14/access.log.node1.test.com

2014-12-04 14:44:55 +0800 [warn]: temporarily failed to flush the buffer. next_retry=2014-12-04 14:45:30 +0800 error_class="WebHDFS::IOError" error="{\"RemoteException\":{\"exception\":\"AccessControlException\",\"javaClassName\":\"org.apache.hadoop.security.AccessControlException\",\"message\":\"org.apache.hadoop.security.AccessControlException: Permission denied: user=webuser, access=WRITE, inode=\\\"\\\":hadoop:supergroup:rwxr-xr-x\"}}" instance=23456251808060

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:317:in `request'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:242:in `operate_requests'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:45:in `create'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:189:in `rescue in send_data'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:186:in `send_data'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:205:in `write'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:296:in `write_chunk'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:276:in `pop'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:311:in `try_flush'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:132:in `run'

2014-12-04 14:45:31 +0800 [warn]: failed to communicate hdfs cluster, path: /log/20141204_14/access.log.node1.test.com

2014-12-04 14:45:31 +0800 [warn]: temporarily failed to flush the buffer. next_retry=2014-12-04 14:46:26 +0800 error_class="WebHDFS::IOError" error="{\"RemoteException\":{\"exception\":\"AccessControlException\",\"javaClassName\":\"org.apache.hadoop.security.AccessControlException\",\"message\":\"org.apache.hadoop.security.AccessControlException: Permission denied: user=webuser, access=WRITE, inode=\\\"\\\":hadoop:supergroup:rwxr-xr-x\"}}" instance=23456251808060

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:317:in `request'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:242:in `operate_requests'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:45:in `create'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:189:in `rescue in send_data'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:186:in `send_data'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:205:in `write'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:296:in `write_chunk'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:276:in `pop'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:311:in `try_flush'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:132:in `run'


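If you see the output above, the HDFS permissions are usually not set correctly. Run chmod 777 on the HDFS directory that stores the logs and the problem goes away.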
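The AccessControlException message names the user that the write was attempted as (`user=webuser` in the log above). A small filter can pull that name out of the td-agent log, which helps when deciding whose permissions to fix. This is a sketch: `denied_users` is a hypothetical helper, and the log location mentioned below is an assumption about the td-agent package defaults.

```shell
# Print the distinct user names found in "Permission denied: user=..." messages.
# Reads log lines on stdin; prints one user name per line.
denied_users() {
    sed -n 's/.*Permission denied: user=\([^,]*\),.*/\1/p' | sort -u
}
```

Usage, assuming the log is at /var/log/td-agent/td-agent.log: `denied_users < /var/log/td-agent/td-agent.log`. As a narrower alternative to chmod 777, the HDFS superuser could give that user ownership of the directory, for example `hadoop fs -chown webuser /log`. The plugin's README also describes a `username` option for choosing the user that writes to HDFS.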


Once writes to HDFS succeed again, the td-agent log shows: 2014-12-04 14:48:40 +0800 [warn]: retry succeeded. instance=23456251808060


