Background
We recently needed to move data from MySQL into our big-data platform. The plan we settled on: Maxwell syncs the binlog into Kafka, Flink consumes the Kafka data and writes it into Kudu, and Impala views over Kudu and Hive provide unified querying. Kudu keeps the most recent seven days of data; anything older is rolled down into Hive tables.
The mapping between Maxwell instances and Kafka topics is one Maxwell task per MySQL instance, writing to multiple topics. To deal with ordering and data-skew problems we ended up with one topic per table, so the requirement became: the Maxwell producer should be able to auto-create topics, but the Flink consumer must not.
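For context on the one-topic-per-table routing: Maxwell's producer config supports a `kafka_topic` option with `%{database}` and `%{table}` placeholders, which is one way to get per-table topics. A sketch of the relevant `config.properties` entries; the broker address and topic prefix here are illustrative placeholders, not our actual values:

```properties
# Maxwell config.properties (illustrative)
producer=kafka
kafka.bootstrap.servers=broker1:9092
# one topic per table: Maxwell expands %{database} and %{table} at runtime
kafka_topic=maxwell_%{database}_%{table}
```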
Investigation
First, delete a topic and watch what happens. The moment it is deleted, the topic gets auto-created again. The kafka server log looks like this:
Log when the topic is deleted:
......
[2020-07-02 15:08:32,044] INFO Log for partition autoCreatTest-0 is renamed to /data/kafka-logs/autoCreatTest-0.f10b571f162249fa85dbcdfa2ae6fb01-delete and is scheduled for deletion (kafka.log.LogManager)
[2020-07-02 15:08:32,048] INFO Log for partition autoCreatTest-1 is renamed to /data/kafka-logs/autoCreatTest-1.3be2864a4fac4132826f09eecb1079d5-delete and is scheduled for deletion (kafka.log.LogManager)
[2020-07-02 15:08:32,049] INFO Log for partition autoCreatTest-2 is renamed to /data/kafka-logs/autoCreatTest-2.557a65f116284c4bb2560b1829a4ac3e-delete and is scheduled for deletion (kafka.log.LogManager)
Log when the topic is auto-created again:
[2020-07-02 15:08:32,555] INFO Creating topic autoCreatTest with configuration {} and initial partition assignment HashMap(0 -> ArrayBuffer(228, 229, 227), 1 -> ArrayBuffer(229, 227, 228), 2 -> ArrayBuffer(227, 228, 229)) (kafka.zk.AdminZkClient)
[2020-07-02 15:08:32,560] INFO [KafkaApi-228] Auto creation of topic autoCreatTest with 3 partitions and replication factor 3 is successful (kafka.server.KafkaApis)
......
Since we use flink-connector-kafka-0.11_2.11, the code to read is kafka-client 0.11.0.2. As the server log shows, topic creation is triggered by the kafka.server.KafkaApis class. Searching that class for the log message "Auto creation of topic" leads to the call site:
private def createTopic(topic: String,
                        numPartitions: Int,
                        replicationFactor: Int,
                        properties: Properties = new Properties()): MetadataResponse.TopicMetadata = {
  AdminUtils.createTopic(zkUtils, topic, numPartitions, replicationFactor, properties, RackAwareMode.Safe)
  info("Auto creation of topic %s with %d partitions and replication factor %d is successful"
    .format(topic, numPartitions, replicationFactor))
  ......
Searching for the other call sites of createTopic turns up three. Two of them create the internal topics __consumer_offsets and __transaction_state; the one our requests actually go through is here:
private def getTopicMetadata(allowAutoTopicCreation: Boolean, topics: Set[String], listenerName: ListenerName,
                             errorUnavailableEndpoints: Boolean): Seq[MetadataResponse.TopicMetadata] = {
  ......
  else if (allowAutoTopicCreation && config.autoCreateTopicsEnable) {
    createTopic(topic, config.numPartitions, config.defaultReplicationFactor)
  }
  ......
}
So whenever one of our consumers or producers fetches a topic's metadata, the topic gets auto-created, provided the cluster is configured with auto.create.topics.enable=true. But there is a second condition, allowAutoTopicCreation, and following the call chain upward shows where it comes from:
def handleTopicMetadataRequest(request: RequestChannel.Request) {
  val metadataRequest = request.body[MetadataRequest]
  ......
  else
    getTopicMetadata(metadataRequest.allowAutoTopicCreation, authorizedTopics, request.listenerName,
      errorUnavailableEndpoints)
  ......

def handle(request: RequestChannel.Request) {
  ......
  try {
    case ApiKeys.METADATA => handleTopicMetadataRequest(request)
  }
  ......
}
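The branch above only fires when both conditions hold: the broker has auto.create.topics.enable set and the incoming metadata request allows creation. The broker-side half of the switch is a plain server.properties entry (shown with its default value). Note that flipping it to false would disable auto-creation for all clients, which would also break the Maxwell producer's auto-creation that this setup relies on, so it is not an option here:

```properties
# server.properties: broker-side switch for topic auto-creation
# defaults to true; false disables auto-creation for ALL clients,
# including the maxwell producer, so we cannot use it in this setup
auto.create.topics.enable=true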
This pretty much confirms that the parameter arrives in the request sent by our client, so we can switch to the client side and look at the consumer code. Searching for metadata in kafka\clients\consumer\KafkaConsumer.java:
public class KafkaConsumer<K, V> implements Consumer<K, V> {
    private final Metadata metadata;
    ......
    // in the constructor: the third argument, allowAutoTopicCreation, is hard-coded to true
    this.metadata = new Metadata(retryBackoffMs, config.getLong(ConsumerConfig.METADATA_MAX_AGE_CONFIG),
            true, false, clusterResourceListeners);
}
And the Metadata constructor:
public Metadata(long refreshBackoffMs, long metadataExpireMs, boolean allowAutoTopicCreation,
                boolean topicExpiryEnabled, ClusterResourceListeners clusterResourceListeners)
Now it makes sense: in the 0.11.0.2 client, KafkaConsumer hard-codes allowAutoTopicCreation to true when fetching metadata, which is why the consumer also auto-creates topics. Once the cause is known, the fix is easy: change it to false and rebuild the jar. We tested with a consumer demo, and sure enough the topic is no longer created.
import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerDemo {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "10.30.130.227:9093");
        props.put("group.id", "test");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Arrays.asList("autoCreatTest"));
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records)
                System.out.printf("offset = %d, key = %s, value = %s%n",
                        record.offset(), record.key(), record.value());
        }
    }
}
I thought that would be the end of it: swap the kafka-client dependency in our flink job for the rebuilt one. But after repackaging, the topic was still being auto-created. Back to the drawing board. Using IDEA's debugger together with the kafka server log to pin down which line creates the topic, I first set a breakpoint on the first line of the run method in FlinkKafkaConsumerBase; by the time execution got there, the topic already existed. So I moved the breakpoint earlier, to the first line of open, and stepped through with F7 line by line:
FlinkKafkaConsumerBase-182: List<KafkaTopicPartition> allPartitions = this.partitionDiscoverer.discoverPartitions();
->
AbstractPartitionDiscoverer-51:newDiscoveredPartitions = this.getAllPartitionsForTopics(this.topicsDescriptor.getFixedTopics());
->
Kafka09PartitionDiscoverer-77:kafkaConsumer.partitionsFor(topic)
->
KafkaConsumer-1389: Map<String, List<PartitionInfo>> topicMetadata = fetcher.getTopicMetadata(new MetadataRequest.Builder(Collections.singletonList(topic), true), requestTimeoutMs);
->
MetadataRequest-49:
public Builder(List<String> topics, boolean allowAutoTopicCreation) {
    super(ApiKeys.METADATA);
    this.topics = topics;
    this.allowAutoTopicCreation = allowAutoTopicCreation;
}
Well, damn. Found it at last: flink is configured to auto-discover new kafka partitions, so before actually consuming it fetches partition info for every topic, and that is the step that auto-creates the topic, because KafkaConsumer line 1389 also hard-codes true in its MetadataRequest.Builder call. After changing that true to false as well and repackaging, the flink consumer finally stopped auto-creating topics.
Postscript
Having found MetadataRequest.Builder, I searched for every place the consumer calls it and changed each hard-coded true to false, to rule out any remaining path. Looking at newer kafka releases, it turns out Kafka 2.3 already addressed this with KAFKA-7320: you can simply pass the consumer the config allow.auto.create.topics=false to disable auto topic creation on the consumer side. Unfortunately the flink version we use does not support a client that new. We lost a fair amount of time, but we got a tour of the kafka consumer's metadata flow out of it, so not a bad trade.
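For reference, once on kafka-clients 2.3+ (with a Flink connector new enough to use it), the KAFKA-7320 switch is just one more entry in the consumer Properties. A minimal sketch, where the broker address and group id are placeholders; the property name allow.auto.create.topics only takes effect with brokers and clients at 2.3 or newer:

```java
import java.util.Properties;

public class NoAutoCreateConsumerProps {
    // Builds consumer Properties with consumer-side topic auto-creation disabled
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // placeholder address
        props.put("group.id", "test");                  // placeholder group id
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // KAFKA-7320: available since kafka-clients 2.3, defaults to true
        props.put("allow.auto.create.topics", "false");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(build().getProperty("allow.auto.create.topics")); // prints "false"
    }
}
```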