[Kafka Notes] 5. Consuming Kafka Messages with Multiple Threads

Understanding Multi-threaded Consumption in Kafka

The Design of the Kafka Java Consumer

From the user's point of view the Kafka Java consumer is single-threaded, but since version 0.10.1.0 the entry class KafkaConsumer has actually used a dual-thread design: a user main thread plus a heartbeat thread.
The user main thread is the thread that runs the consumer application's main() method; the heartbeat thread only sends periodic heartbeat requests to the corresponding broker to signal that the consumer application is still alive.

The official documentation describes two ways to handle multi-threaded consumers:

  1. One consumer per thread

    Each thread gets its own consumer instance. The pros and cons of this approach:

    • PRO: It is the easiest to implement.
    • PRO: It is often the fastest, because no coordination between threads is needed.
    • CON: More consumers mean more TCP connections to the cluster (one per thread). Kafka generally handles connections very efficiently, so this is a small cost.
    • CON: More consumers mean more requests sent to the server, with slightly smaller data batches, which can cause some drop in I/O throughput.
    • CON: The total number of threads across all processes is limited by the total number of partitions.
  2. Decouple consumption and processing

    Another alternative is to have one or more consumer threads do all the consuming and hand the ConsumerRecords instances off, via a blocking queue, to a pool of processor threads that do the actual record processing. This option likewise has pros and cons:

    • PRO: The number of consumers and processors can be scaled independently, so a single consumer's data can be split among many processor threads, avoiding any limit imposed by partitions.
    • CON: Guaranteeing order across the processors requires special care: since the threads execute independently, a later message may be processed before an earlier one simply due to thread scheduling luck. If ordering does not matter, this is not a problem.
    • CON: Manually committing offsets becomes harder, because it requires coordinating all the threads to make sure processing for a partition is complete.

These are two different approaches to handling the problem.

Kafka Multi-threaded Consumption Examples

1. The consumer program runs multiple consumers, one per thread, on the same topic(s)

Explanation: the consumer program starts multiple threads, and each thread maintains its own dedicated KafkaConsumer instance, responsible for the complete fetch-and-process flow. (In other words, one consumer client starts several threads; each thread has its own Consumer consuming the same topic or topics, and these consumers (threads) together form one consumer group.)

Diagram (borrowed from the web):

(image omitted)

Sample topic data:

(image omitted)

Code:

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.WakeupException;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.atomic.AtomicBoolean;

public class KafkaConsumerThread implements Runnable {

    private final KafkaConsumer<String, String> consumer;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    // Constructor: each thread builds its own dedicated consumer
    public KafkaConsumerThread(Properties props) {
        this.consumer = new KafkaConsumer<>(props);
    }

    @Override
    public void run() {
        try {
            // All threads subscribe to the same topic
            consumer.subscribe(Collections.singletonList("six-topic"));
            String threadName = Thread.currentThread().getName();
            while (!closed.get()) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(3));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("Context: Thread-name= %s, topic= %s partition= %s, offset= %d, key= %s,value= %s\n",
                            threadName, record.topic(), record.partition(), record.offset(), record.key(), record.value());
                }
            }
        } catch (WakeupException e) {
            // Thrown by poll() after wakeup(); only unexpected if we are not shutting down
            if (!closed.get()) {
                throw e;
            }
        } finally {
            consumer.close();
        }
    }

    /**
     * Stop consuming.
     */
    public void shutdown() {
        closed.set(true);
        // wakeup() can safely interrupt an active poll() from another thread
        consumer.wakeup();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "XXXXXXX:9093");
        props.put("group.id", "thread-1"); // consumers sharing a group.id form one consumer group
        props.put("enable.auto.commit", "true"); // commit offsets automatically
        props.put("auto.offset.reset", "earliest");
        props.put("auto.commit.interval.ms", "1000");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("session.timeout.ms", "30000");
        props.put("max.poll.records", 6);
        // Run three threads consuming the same topic. The topic must have at least
        // three partitions, otherwise some consumers will receive no data.
        for (int i = 0; i < 3; i++) {
            new Thread(new KafkaConsumerThread(props), "Thread" + i).start();
        }
    }
}

Log output:

Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 0, key= ImKey-0-one,value= ImValue-0-one
Context: Thread-name= Thread2, topic= six-topic partition= 2, offset= 0, key= ImKey-1-one,value= ImValue-1-one
Context: Thread-name= Thread2, topic= six-topic partition= 2, offset= 1, key= ImKey-5-one,value= ImValue-5-one
Context: Thread-name= Thread2, topic= six-topic partition= 2, offset= 2, key= ImKey-8-one,value= ImValue-8-one
Context: Thread-name= Thread2, topic= six-topic partition= 2, offset= 3, key= ImKey-10-one,value= ImValue-10-one
Context: Thread-name= Thread2, topic= six-topic partition= 2, offset= 4, key= ImKey-13-one,value= ImValue-13-one
Context: Thread-name= Thread2, topic= six-topic partition= 2, offset= 5, key= ImKey-14-one,value= ImValue-14-one
Context: Thread-name= Thread1, topic= six-topic partition= 1, offset= 0, key= ImKey-4-one,value= ImValue-4-one
Context: Thread-name= Thread1, topic= six-topic partition= 1, offset= 1, key= ImKey-6-one,value= ImValue-6-one
Context: Thread-name= Thread1, topic= six-topic partition= 1, offset= 2, key= ImKey-7-one,value= ImValue-7-one
Context: Thread-name= Thread1, topic= six-topic partition= 1, offset= 3, key= ImKey-11-one,value= ImValue-11-one
Context: Thread-name= Thread1, topic= six-topic partition= 1, offset= 4, key= ImKey-15-one,value= ImValue-15-one
Context: Thread-name= Thread1, topic= six-topic partition= 1, offset= 5, key= ImKey-21-one,value= ImValue-21-one
Context: Thread-name= Thread1, topic= six-topic partition= 1, offset= 6, key= ImKey-25-one,value= ImValue-25-one
Context: Thread-name= Thread1, topic= six-topic partition= 1, offset= 7, key= ImKey-27-one,value= ImValue-27-one
Context: Thread-name= Thread1, topic= six-topic partition= 1, offset= 8, key= ImKey-29-one,value= ImValue-29-one
Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 1, key= ImKey-2-one,value= ImValue-2-one
Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 2, key= ImKey-3-one,value= ImValue-3-one
Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 3, key= ImKey-9-one,value= ImValue-9-one
Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 4, key= ImKey-12-one,value= ImValue-12-one
Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 5, key= ImKey-16-one,value= ImValue-16-one
Context: Thread-name= Thread2, topic= six-topic partition= 2, offset= 6, key= ImKey-17-one,value= ImValue-17-one
Context: Thread-name= Thread2, topic= six-topic partition= 2, offset= 7, key= ImKey-24-one,value= ImValue-24-one
Context: Thread-name= Thread2, topic= six-topic partition= 2, offset= 8, key= ImKey-32-one,value= ImValue-32-one
Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 6, key= ImKey-18-one,value= ImValue-18-one
Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 7, key= ImKey-19-one,value= ImValue-19-one
Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 8, key= ImKey-20-one,value= ImValue-20-one
Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 9, key= ImKey-22-one,value= ImValue-22-one
Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 10, key= ImKey-23-one,value= ImValue-23-one
Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 11, key= ImKey-26-one,value= ImValue-26-one
Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 12, key= ImKey-28-one,value= ImValue-28-one
Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 13, key= ImKey-30-one,value= ImValue-30-one
Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 14, key= ImKey-31-one,value= ImValue-31-one
Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 15, key= ImKey-33-one,value= ImValue-33-one
Context: Thread-name= Thread0, topic= six-topic partition= 0, offset= 16, key= ImKey-34-one,value= ImValue-34-one

As the log shows, with three threads in one consumer group, each thread's consumer is assigned one partition of the topic to consume.
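Note that shutdown() above is defined but never called from main. The cooperative-shutdown pattern it uses (a flag checked by the loop, flipped from another thread) can be sketched in a runnable form without a broker; here a plain counting loop stands in for the poll() loop, and with a real consumer shutdown() would additionally call consumer.wakeup():

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal sketch of the cooperative-shutdown pattern: the worker loops on an
// AtomicBoolean flag, and shutdown() flips the flag from another thread.
public class ShutdownDemo implements Runnable {
    private final AtomicBoolean closed = new AtomicBoolean(false);
    private volatile int iterations = 0;

    @Override
    public void run() {
        while (!closed.get()) {      // stands in for the poll() loop
            iterations++;
            try {
                Thread.sleep(10);    // stands in for the poll() timeout
            } catch (InterruptedException e) {
                return;
            }
        }
    }

    public void shutdown() {
        closed.set(true);            // with a real KafkaConsumer: also consumer.wakeup()
    }

    public int iterations() {
        return iterations;
    }

    public static void main(String[] args) throws InterruptedException {
        ShutdownDemo worker = new ShutdownDemo();
        Thread t = new Thread(worker, "demo-consumer");
        // A JVM shutdown hook is one natural place to request a clean stop
        Runtime.getRuntime().addShutdownHook(new Thread(worker::shutdown));
        t.start();
        Thread.sleep(100);
        worker.shutdown();           // request a clean stop
        t.join(2000);
        System.out.println("stopped after " + worker.iterations() + " iterations");
    }
}
```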

2. A single consumer, with multiple threads processing messages

Explanation: the consumer program uses one or more threads to fetch messages, and separately creates multiple processing threads to run the message-handling logic. There can be one or more fetching threads, each maintaining its own dedicated KafkaConsumer instance, while processing is handed off to a dedicated thread pool, truly decoupling message fetching from message processing.


The multi-threaded processing logic can be implemented in several ways; a few are listed here:

Store the messages in a queue and process the queue with multiple threads:

  1. Use LinkedBlockingQueue, a FIFO queue with separate locks for insertion and removal

    This queue is a thread-safe, first-in-first-out queue.

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    
    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.atomic.AtomicBoolean;
    
    public class KafkaConsumerThread2 implements Runnable {
        // Thread-safe FIFO queue holding the fetched batches
        private final LinkedBlockingQueue<ConsumerRecords<String, String>> list;
        private final AtomicBoolean closed = new AtomicBoolean(false);
    
        public KafkaConsumerThread2(LinkedBlockingQueue<ConsumerRecords<String, String>> list) {
            this.list = list;
        }
    
        @Override
        public void run() {
            String threadName = Thread.currentThread().getName();
            // Process batches taken from the queue
            while (!closed.get()) {
                try {
                    // take() blocks until a batch is available
                    ConsumerRecords<String, String> records = list.take();
                    System.out.println("batch size: " + records.count());
                    for (ConsumerRecord<String, String> record : records) {
                        Thread.sleep(3000); // simulate slow processing
                        System.out.printf("Context: Thread-name= %s, topic= %s partition= %s, offset= %d, key= %s,value= %s\n",
                                threadName, record.topic(), record.partition(), record.offset(), record.key(), record.value());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    
        public static void main(String[] args) {
            LinkedBlockingQueue<ConsumerRecords<String, String>> list = new LinkedBlockingQueue<>();
            Properties props = new Properties();
            props.put("bootstrap.servers", "10.33.68.68:9093");
            props.put("group.id", "thread-5"); // consumers sharing a group.id form one consumer group
            props.put("enable.auto.commit", "true"); // commit offsets automatically
            props.put("auto.offset.reset", "earliest");
            props.put("auto.commit.interval.ms", "1000");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("session.timeout.ms", "30000");
            props.put("max.poll.records", 5);
            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            // The single consumer subscribes to the topic
            consumer.subscribe(Collections.singletonList("six-topic"));
            // Start three threads to process batches from the queue
            for (int i = 0; i < 3; i++) {
                new Thread(new KafkaConsumerThread2(list), "thread-" + i).start();
            }
            // The main thread only fetches batches and enqueues them
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                try {
                    if (!records.isEmpty()) { // skip empty batches
                        list.put(records);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }
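The core of the example above is the blocking handoff: take() parks a worker thread until the fetching thread puts a batch. A runnable sketch of the same pattern, with plain strings standing in for ConsumerRecords batches (the "EOF" poison pill is an illustrative convention of this sketch, not a Kafka API):

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.LinkedBlockingQueue;

// One "fetcher" (the main thread) enqueues batches; worker threads block on
// take() until a batch arrives, mirroring the Kafka example above.
public class QueueHandoffDemo {

    static Set<String> runDemo() throws InterruptedException {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Set<String> processed = Collections.synchronizedSet(new HashSet<>());

        Runnable worker = () -> {
            try {
                while (true) {
                    String batch = queue.take();   // blocks while the queue is empty
                    if (batch.equals("EOF")) {     // poison pill: re-enqueue and stop
                        queue.put("EOF");
                        return;
                    }
                    processed.add(batch);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread t1 = new Thread(worker, "worker-1");
        Thread t2 = new Thread(worker, "worker-2");
        t1.start();
        t2.start();

        for (int i = 0; i < 5; i++) {
            queue.put("batch-" + i);               // stands in for the poll() loop
        }
        queue.put("EOF");                          // signal both workers to stop
        t1.join();
        t2.join();
        return processed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("processed: " + runDemo());
    }
}
```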
    
  2. Create a thread pool and use it to run the processing logic

    The processing task, ConsumerDealThread:

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    
    public class ConsumerDealThread implements Runnable {
        private final ConsumerRecord<String, String> record;
    
        public ConsumerDealThread(ConsumerRecord<String, String> record) {
            this.record = record;
        }
    
        @Override
        public void run() {
            try {
                Thread.sleep(2000); // simulate slow processing
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            System.out.printf("Context: Thread-name= %s, topic= %s partition= %s, offset= %d, key= %s,value= %s\n",
                    Thread.currentThread().getName(), record.topic(), record.partition(), record.offset(), record.key(), record.value());
        }
    }
    

    The driver class, KafkaConsumerThread3:

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    
    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;
    
    public class KafkaConsumerThread3 {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "10.33.68.68:9093");
            props.put("group.id", "thread-18"); // consumers sharing a group.id form one consumer group
            props.put("enable.auto.commit", "true"); // commit offsets automatically
            props.put("auto.offset.reset", "earliest");
            props.put("auto.commit.interval.ms", "1000");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("session.timeout.ms", "30000");
            props.put("max.poll.records", 5);
            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            // The single consumer subscribes to the topic
            consumer.subscribe(Collections.singletonList("six-topic"));
            // Fixed pool of three workers over a bounded queue; when the queue
            // fills up, CallerRunsPolicy makes the polling thread run the task itself
            ExecutorService executor = new ThreadPoolExecutor(3, 3, 0, TimeUnit.MILLISECONDS,
                    new ArrayBlockingQueue<>(1000), new ThreadPoolExecutor.CallerRunsPolicy());
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                try {
                    // Hand each record to the pool
                    for (ConsumerRecord<String, String> record : records) {
                        executor.submit(new ConsumerDealThread(record));
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                    executor.shutdown();
                    try {
                        if (!executor.awaitTermination(5000, TimeUnit.MILLISECONDS)) {
                            System.out.println("Timed out waiting for the pool to shut down");
                        }
                    } catch (InterruptedException e2) {
                        e2.printStackTrace();
                    }
                    consumer.close();
                    break;
                }
                BlockingQueue<Runnable> queue = ((ThreadPoolExecutor) executor).getQueue();
                System.out.println("pool queue size: " + queue.size());
            }
        }
    }
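Note the rejection policy chosen above: with CallerRunsPolicy, once the pool's bounded queue is full the submitting thread (here, the polling thread) runs the task itself, which naturally slows down polling and acts as back-pressure. A minimal runnable sketch of that behavior, independent of Kafka (the pool sizes and latch here are just for the demonstration):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// When the single worker is busy and the one-slot queue is full, the third
// task is rejected and CallerRunsPolicy executes it on the submitting thread.
public class CallerRunsDemo {

    static boolean ranOnCaller() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1, 0, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1), new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch block = new CountDownLatch(1);
        // Task 1 occupies the only worker thread until the latch is released
        pool.execute(() -> {
            try {
                block.await();
            } catch (InterruptedException ignored) {
            }
        });
        pool.execute(() -> { });                 // Task 2 fills the one-slot queue
        final Thread caller = Thread.currentThread();
        final boolean[] onCaller = {false};
        // Task 3 is rejected, so CallerRunsPolicy runs it right here, synchronously
        pool.execute(() -> onCaller[0] = (Thread.currentThread() == caller));
        block.countDown();                       // let the worker finish
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return onCaller[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("third task ran on caller thread: " + ranOnCaller());
    }
}
```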
    