Kafka

- use cases

  • log collection
  • messaging system
  • user activity tracking
  • stream processing
  • event sourcing

- design

  • Kafka broker leader (the controller): multiple brokers contend to become the leader by creating an ephemeral node in ZooKeeper. Only one can succeed; the others become followers and watch the node. Once the leader goes down, the followers try to become the new leader, and new partition leaders are chosen for the partitions that were led by the failed broker.
  • consumer group: within a group, each message of a topic is delivered to only one consumer. Create more consumer groups if you want the same messages of a topic to be consumed multiple times. A group as a whole consumes messages from all partitions of the topic, and all messages from one partition are consumed by the same consumer in the group, so the number of consumers should not exceed the number of partitions (extra consumers stay idle). A consumer needs an offset per partition so that messages are read sequentially. The offset can be maintained in ZooKeeper (high-level API) or by the client itself (low-level API).
       message delivery semantics: at most once (save the offset before processing), at least once (save the offset after processing), and exactly once (at least once, plus the consumer keeping the latest processed message number to discard duplicates).
  • consumer rebalance conditions: consumers joining or leaving the group; brokers being added or removed.
  • topic & partition: one topic can have multiple partitions. The number of partitions should be equal to or greater than the number of brokers, so that load can spread across all of them. Replicas of a partition should be placed on different machines. Partition leader and follower information is kept in ZooKeeper. Followers replicate by fetching messages from the partition leader (the leader does not push to them).
  • producer: the message delivery modes are just send (no ack), leader plus follower replication, and leader ack only. The producer gets the topic's partition leaders from any broker and opens a socket connection to each of them; messages are sent directly to the leader brokers over these sockets. A message can be consumed only after all in-sync replicas have logged it.
  • partition ack levels: 0 (no ack from the broker), 1 (leader ack), leader plus one follower ack, and -1 (ack from all in-sync replicas).
  • a partition consists of multiple segment files, each named after the offset of its first message (offset.kafka). All writes and reads go through the partition leader.
  • producers can push messages in batches, either synchronously or asynchronously.
  • messages are buffered between producer and receiver: on the producer side before sending and on the consumer side on receipt.
  • messages can be compressed. Off-heap memory (the OS page cache) is used to avoid holding duplicate copies of data in memory.
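
The difference between the three delivery semantics above comes down to when the offset is saved relative to processing. Below is a minimal in-memory sketch of that idea, not Kafka client code: the partition is a plain Python list of (sequence number, payload) pairs, and the names ToyConsumer, consume_with_one_crash and deduplicate are hypothetical helpers invented for this illustration.

```python
class ToyConsumer:
    """Toy consumer reading from an in-memory partition (a list of messages)."""

    def __init__(self, log, mode):
        self.log = log          # the partition: list of (seq, payload) tuples
        self.mode = mode        # "at_most_once" or "at_least_once"
        self.committed = 0      # saved offset; assumed to survive a crash
        self.processed = []     # stands in for the side effects of processing

    def run(self, crash_at=None):
        """Consume from the saved offset; simulate a crash at offset crash_at."""
        offset = self.committed
        while offset < len(self.log):
            if self.mode == "at_most_once":
                self.committed = offset + 1   # save offset BEFORE processing
                if offset == crash_at:
                    return                    # crash: this message is lost
                self.processed.append(self.log[offset])
            else:
                self.processed.append(self.log[offset])
                if offset == crash_at:
                    return                    # crash: offset was not saved
                self.committed = offset + 1   # save offset AFTER processing
            offset += 1


def consume_with_one_crash(mode, log, crash_at):
    """Run once with a crash, then restart from the saved offset."""
    consumer = ToyConsumer(log, mode)
    consumer.run(crash_at=crash_at)
    consumer.run()
    return consumer.processed


def deduplicate(messages):
    """Exactly-once on top of at-least-once: drop already-seen sequence numbers."""
    latest = -1
    result = []
    for seq, payload in messages:
        if seq > latest:        # only accept messages newer than the latest seen
            result.append((seq, payload))
            latest = seq
    return result


log = [(0, "a"), (1, "b"), (2, "c")]
lost = consume_with_one_crash("at_most_once", log, crash_at=1)
dup = consume_with_one_crash("at_least_once", log, crash_at=1)
print(lost)              # message 1 is missing
print(dup)               # message 1 appears twice
print(deduplicate(dup))  # duplicates removed
```

With a crash at offset 1, the at-most-once run skips that message after restart, the at-least-once run processes it twice, and the deduplication step recovers each message exactly once.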

================ Kafka components cooperation ================

1, producer, broker and consumer

  • The producer uses ZooKeeper to find all brokers as well as the partition leaders of a topic, then opens sockets to them. (This applies to early versions; later producers fetch this metadata from the brokers.)
  • The broker uses ZooKeeper to register itself and to monitor partition leaders. Its topics and partitions are kept in ZooKeeper too.
  • The consumer registers itself in ZooKeeper, including the partitions it consumes. It gets the list of brokers and opens socket connections to the partition leaders.
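
The lookups described above can be sketched with plain dictionaries. This is an illustration under stated assumptions, not the Kafka client API: METADATA, the topic name, the broker addresses and the functions route and assign_partitions are all hypothetical, the partitioner uses CRC32 only because it is deterministic and in the standard library (Kafka's real partitioner differs), and the round-robin assignment is one simple strategy among several.

```python
import zlib

# Hypothetical metadata a client might hold after discovery:
# topic -> partitions, each with the address of its leader broker.
METADATA = {
    "clicks": [
        {"partition": 0, "leader": "broker-1:9092"},
        {"partition": 1, "leader": "broker-2:9092"},
        {"partition": 2, "leader": "broker-3:9092"},
    ]
}


def route(topic, key):
    """Pick a partition for the key, then look up the leader to send to."""
    partitions = METADATA[topic]
    index = zlib.crc32(key) % len(partitions)   # same key -> same partition
    return index, partitions[index]["leader"]


def assign_partitions(partitions, consumers):
    """Spread partitions over the consumers of one group, round-robin."""
    ordered = sorted(consumers)
    assignment = {consumer: [] for consumer in ordered}
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].append(partition)
    return assignment


partition, leader = route("clicks", b"user-42")
print(partition, leader)
print(assign_partitions([0, 1, 2], ["c1", "c2"]))
print(assign_partitions([0, 1, 2], ["c1", "c2", "c3", "c4"]))
```

In the last call there are more consumers than partitions, so one consumer ends up with an empty list, which is why adding consumers beyond the partition count does not increase throughput.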


2, leader election
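
The election described in the design notes (brokers racing to create one ephemeral node in ZooKeeper) can be sketched roughly as follows. This is an in-memory simulation, not ZooKeeper client code: FakeZooKeeper and elect are hypothetical names, the node path is only illustrative, and in a real cluster the winner of the race is whichever broker's create request arrives first rather than the first item of a list.

```python
class FakeZooKeeper:
    """In-memory stand-in for ZooKeeper that only models ephemeral nodes."""

    def __init__(self):
        self.nodes = {}   # path -> id of the session that owns the node

    def create_ephemeral(self, path, owner):
        """Create the node if absent; return True only for the winner."""
        if path in self.nodes:
            return False
        self.nodes[path] = owner
        return True

    def close_session(self, owner):
        """Ephemeral nodes disappear when their owner's session ends."""
        self.nodes = {p: o for p, o in self.nodes.items() if o != owner}


def elect(zk, brokers, path="/controller"):
    """Every broker tries to create the node; only one create succeeds."""
    for broker in brokers:
        zk.create_ephemeral(path, broker)
    return zk.nodes.get(path)


zk = FakeZooKeeper()
first_leader = elect(zk, ["broker-1", "broker-2", "broker-3"])
print(first_leader)

# The leader dies: its ephemeral node vanishes and the followers race again.
zk.close_session(first_leader)
second_leader = elect(zk, ["broker-2", "broker-3"])
print(second_leader)
```

The key property is that node creation is atomic, so no coordination between brokers is needed beyond the single create call and a watch on the node.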




