Kafka Consumer and Its Monitoring

Kafka currently provides two consumer APIs for Java:

- high level consumer api

This consumer API wraps many of the advanced features a consumer needs, such as:


  • Auto/Hidden Offset Management
  • Auto(Simple) Partition Assignment
  • Broker Failover => Auto Rebalance
  • Consumer Failover => Auto Rebalance
  • If users do not want any of these, the simple consumer is sufficient
  • If users want control over offset management with everything else unchanged, one option is to expose the current ZK implementation of the high-level consumer and allow users to override it; another option is to change the high-level consumer API to return the offset vector associated with the messages
  • If users want to control partition assignment, one option is to change the high-level consumer API so that such config info can be passed in when creating the stream; another option is ad hoc: just make a single-partition topic and assign it to the consumer
  • If users just want the automatic partition assignment to be "smarter", taking co-location and similar factors into account, one option is to store the host/rack info in ZK and let the rebalance algorithm read it while doing the computation
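
To make the automatic offset management above a bit more concrete, here is a minimal sketch of the settings a high-level consumer is typically created with. It assumes the 0.8-era property names (`zookeeper.connect`, `group.id`, `auto.commit.enable`, `auto.commit.interval.ms`, `auto.offset.reset`). The class name, host, group name and values are placeholders of my own, and no Kafka classes are used, so it only shows the configuration side.

```java
import java.util.Properties;

// Hypothetical helper: builds the properties a 0.8-style high-level consumer would read.
final class HighLevelConsumerProps {

    static Properties build(String zkConnect, String groupId) {
        Properties props = new Properties();
        // Where the consumer finds ZooKeeper (placeholder address supplied by the caller).
        props.setProperty("zookeeper.connect", zkConnect);
        // Consumers sharing a group.id split the partitions of a topic among themselves.
        props.setProperty("group.id", groupId);
        // Let the consumer commit offsets by itself, once per interval.
        props.setProperty("auto.commit.enable", "true");
        props.setProperty("auto.commit.interval.ms", "1000");
        // Where to start when the group has no committed offset yet.
        props.setProperty("auto.offset.reset", "smallest");
        return props;
    }

    public static void main(String[] args) {
        build("localhost:2181", "demo-group").list(System.out);
    }
}
```

Setting `auto.commit.enable` to `false` is the usual first step when you want to decide yourself when an offset counts as consumed.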

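By default, this consumer writes its own information under the ZooKeeper path /consumers/<groupId>, which contains:

  • offsets: the offset value for each <partition_num> of the topic
  • owners: for each partition of the current <topic>, the unique ID of the consumer in this <groupId> that is allowed to receive its data
  • ids: the list of all consumers in the current <groupId>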
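
The layout above can be sketched as a small path builder. This assumes the 0.8 layout in which `offsets` and `owners` are further keyed by topic and then by partition number. The class and method names are hypothetical, and the code only builds strings without talking to ZooKeeper.

```java
// Hypothetical helper that builds the znode paths used by one consumer group.
final class ConsumerZkPaths {
    private final String groupId;

    ConsumerZkPaths(String groupId) {
        this.groupId = groupId;
    }

    /** Root of everything the group stores: /consumers/<groupId> */
    String groupRoot() {
        return "/consumers/" + groupId;
    }

    /** Directory listing the live consumers of the group. */
    String idsDir() {
        return groupRoot() + "/ids";
    }

    /** Node holding the committed offset of one partition (assumed 0.8 layout). */
    String offsetPath(String topic, int partition) {
        return groupRoot() + "/offsets/" + topic + "/" + partition;
    }

    /** Node holding the ID of the consumer that currently owns one partition (assumed 0.8 layout). */
    String ownerPath(String topic, int partition) {
        return groupRoot() + "/owners/" + topic + "/" + partition;
    }

    public static void main(String[] args) {
        ConsumerZkPaths paths = new ConsumerZkPaths("demo-group");
        System.out.println(paths.idsDir());
        System.out.println(paths.offsetPath("demo-topic", 0));
        System.out.println(paths.ownerPath("demo-topic", 0));
    }
}
```

With these paths, the zkCli `get` and `ls` commands are enough to inspect a group by hand.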

Under normal circumstances, the high-level consumer covers most of our everyday needs.


- simple consumer api

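It offers only the most basic connect and read functionality: you read the offsets yourself and decide how to read from a given offset. This makes it suitable for all kinds of customization.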
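
Managing the offset yourself boils down to two things: choosing where to start, and advancing the offset after every message. The sketch below illustrates only that bookkeeping. It uses an in-memory list as a stand-in for one partition rather than the real SimpleConsumer classes. The sentinels -2 and -1 mirror the "earliest" and "latest" values of the 0.8 offset request as I understand them, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration of manual offset handling; no Kafka classes involved.
final class ManualOffsetSketch {
    // Assumed sentinel values for "start of log" and "end of log".
    static final long EARLIEST = -2L;
    static final long LATEST = -1L;

    /** Turns a requested start position into a concrete offset within [logStart, logEnd]. */
    static long resolveStart(long requested, long logStart, long logEnd) {
        if (requested == EARLIEST) return logStart;
        if (requested == LATEST) return logEnd;
        return Math.max(logStart, Math.min(requested, logEnd));
    }

    /** Returns up to maxMessages entries of the stand-in partition, starting at offset. */
    static List<String> fetch(List<String> partitionLog, long offset, int maxMessages) {
        int from = (int) Math.max(0L, Math.min(offset, partitionLog.size()));
        int to = Math.min(partitionLog.size(), from + maxMessages);
        return new ArrayList<>(partitionLog.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("m0", "m1", "m2", "m3", "m4");
        long offset = resolveStart(EARLIEST, 0, log.size());
        while (offset < log.size()) {
            for (String message : fetch(log, offset, 2)) {
                // The caller, not the library, advances the offset after each message.
                System.out.println(offset++ + ": " + message);
            }
        }
    }
}
```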


There are currently two ways to monitor Kafka:

1. JMX

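Kafka ships with a built-in Mx4jLoader. If it finds mx4j-tools.jar on the classpath, it loads the jar, and the web pages provided by MX4J can then be viewed on port 8082.

Besides this built-in interface, you can also modify the Java start command yourself to enable JMX, and then use JMX to integrate with the major monitoring systems such as Zabbix and Ganglia. For the latter, there is already a project on GitHub.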
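
As a rough sketch of the JMX route: once the JVM is started with the standard `-Dcom.sun.management.jmxremote.port=<port>` family of options, any JMX client can read its MBeans. The example below uses only the JDK. The class name and arguments are placeholders, and it reads a JVM-level attribute (`VmName`) instead of a Kafka MBean, because Kafka's MBean names differ between versions.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical probe: reads one MBean attribute from a remote or the local JVM.
final class JmxProbe {

    /** Standard RMI-based JMX URL for a JVM that exposes JMX on the given port. */
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    /** Reads a single attribute through any MBean server connection. */
    static Object readAttribute(MBeanServerConnection conn, String objectName, String attribute)
            throws Exception {
        return conn.getAttribute(new ObjectName(objectName), attribute);
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 2) {
            // Remote mode: java JmxProbe <host> <port>
            JMXServiceURL url = new JMXServiceURL(serviceUrl(args[0], Integer.parseInt(args[1])));
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection conn = connector.getMBeanServerConnection();
                System.out.println(readAttribute(conn, "java.lang:type=Runtime", "VmName"));
            }
        } else {
            // Local fallback so the sketch runs without any broker.
            System.out.println(readAttribute(
                    ManagementFactory.getPlatformMBeanServer(), "java.lang:type=Runtime", "VmName"));
        }
    }
}
```

To read broker metrics, replace the object name with one of the names your Kafka version actually registers, which a tool such as jconsole can list.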


2. zookeeper

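Typical monitoring tools are kafkamonitor and kafka-web-console.

Both are fairly easy to install, so I won't go into detail here; refer directly to the projects themselves.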
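
As far as I understand, the core number that ZooKeeper-based monitors derive is consumer lag: the latest offset of each partition minus the offset the group has committed in ZooKeeper. Below is a minimal sketch of that arithmetic with the two offset maps passed in as plain data. The names are hypothetical, and treating a partition with no committed offset as 0 is an assumption made here for simplicity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lag calculation; fetching the offsets themselves is out of scope.
final class ConsumerLag {

    /** Lag per partition = log end offset - committed offset, never below zero. */
    static Map<Integer, Long> lagPerPartition(Map<Integer, Long> logEndOffsets,
                                              Map<Integer, Long> committedOffsets) {
        Map<Integer, Long> lag = new LinkedHashMap<>();
        for (Map.Entry<Integer, Long> entry : logEndOffsets.entrySet()) {
            // Assumption: a partition with no committed offset counts as committed at 0.
            long committed = committedOffsets.getOrDefault(entry.getKey(), 0L);
            lag.put(entry.getKey(), Math.max(0L, entry.getValue() - committed));
        }
        return lag;
    }

    /** Sum of the lag over all partitions. */
    static long totalLag(Map<Integer, Long> lag) {
        long total = 0L;
        for (long value : lag.values()) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Integer, Long> logEnd = new LinkedHashMap<>();
        logEnd.put(0, 120L);
        logEnd.put(1, 80L);
        Map<Integer, Long> committed = new LinkedHashMap<>();
        committed.put(0, 100L);
        committed.put(1, 80L);
        Map<Integer, Long> lag = lagPerPartition(logEnd, committed);
        System.out.println(lag + " total=" + totalLag(lag));
    }
}
```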


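According to the official wiki, the consumer API looks set for a major overhaul starting with 0.9, which I personally support. The current consumer APIs feel either too bare-bones or too deeply wrapped.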

wiki:https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Client+Re-Design
