Kafka: let only the producer auto-create topics, while preventing the consumer from doing so

Background

    Recently we needed to move data from MySQL into our big data platform. The plan we settled on: Maxwell ships the binlog into Kafka, Flink consumes from Kafka and writes into Kudu, and Kudu plus Hive are exposed through Impala views for unified querying. Kudu keeps the most recent seven days of data; anything older rolls down into Hive tables.
    The mapping between Maxwell instances and Kafka topics is one Maxwell task per MySQL instance, writing to multiple topics. To deal with ordering and data skew we ended up with one topic per table (a sketch of the routing config follows), so the requirement became: the Maxwell producer must be able to auto-create topics, while the Flink consumer must not.
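
For context, a minimal sketch of the Maxwell producer settings for the one-topic-per-table layout; the broker address and topic prefix are illustrative, but kafka_topic with %{database}/%{table} placeholders is Maxwell's documented routing mechanism:

# config.properties (illustrative): route each table's binlog to its own topic
producer=kafka
kafka.bootstrap.servers=10.30.130.227:9093
kafka_topic=maxwell_%{database}_%{table}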

Investigation

    First, delete a topic and watch what happens: the instant it is deleted, the topic gets created again automatically. The Kafka server logs are below.
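
For reference, the delete was issued with the stock CLI; the ZooKeeper address is illustrative (on 0.11 the topics tool still talks to ZooKeeper):

# delete the test topic
bin/kafka-topics.sh --zookeeper 10.30.130.227:2181 --delete --topic autoCreatTest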

Server log when the topic is deleted
......
[2020-07-02 15:08:32,044] INFO Log for partition autoCreatTest-0 is renamed to /data/kafka-logs/autoCreatTest-0.f10b571f162249fa85dbcdfa2ae6fb01-delete and is scheduled for deletion (kafka.log.LogManager)
[2020-07-02 15:08:32,048] INFO Log for partition autoCreatTest-1 is renamed to /data/kafka-logs/autoCreatTest-1.3be2864a4fac4132826f09eecb1079d5-delete and is scheduled for deletion (kafka.log.LogManager)
[2020-07-02 15:08:32,049] INFO Log for partition autoCreatTest-2 is renamed to /data/kafka-logs/autoCreatTest-2.557a65f116284c4bb2560b1829a4ac3e-delete and is scheduled for deletion (kafka.log.LogManager)

Server log when the topic is auto-created again
[2020-07-02 15:08:32,555] INFO Creating topic autoCreatTest with configuration {} and initial partition assignment HashMap(0 -> ArrayBuffer(228, 229, 227), 1 -> ArrayBuffer(229, 227, 228), 2 -> ArrayBuffer(227, 228, 229)) (kafka.zk.AdminZkClient)
[2020-07-02 15:08:32,560] INFO [KafkaApi-228] Auto creation of topic autoCreatTest with 3 partitions and replication factor 3 is successful (kafka.server.KafkaApis)
......

Since our job depends on flink-connector-kafka-0.11_2.11, we read the matching Kafka 0.11.0.2 source. The log shows the creation is triggered by kafka.server.KafkaApis; searching that class for the logged message "Auto creation of topic" leads to this call site:

  private def createTopic(topic: String,
                          numPartitions: Int,
                          replicationFactor: Int,
                          properties: Properties = new Properties()): MetadataResponse.TopicMetadata = {
    AdminUtils.createTopic(zkUtils, topic, numPartitions, replicationFactor, properties)
    info("Auto creation of topic %s with %d partitions and replication factor %d is successful"
      .format(topic, numPartitions, replicationFactor))
    ......
  }

Searching for callers of createTopic turns up three call sites. Two of them create the internal topics __consumer_offsets and __transaction_state; the one our clients actually hit is here:

  private def getTopicMetadata(allowAutoTopicCreation: Boolean, topics: Set[String], listenerName: ListenerName,
                               errorUnavailableEndpoints: Boolean): Seq[MetadataResponse.TopicMetadata] = {
    ......
    else if (allowAutoTopicCreation && config.autoCreateTopicsEnable) {
      createTopic(topic, config.numPartitions, config.defaultReplicationFactor)
    }
    ......
  }

So whenever a consumer or producer fetches a topic's metadata, the topic is auto-created, provided the cluster has auto.create.topics.enable=true. But note the other half of the condition, allowAutoTopicCreation; following the call chain shows where that flag comes from:
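
As an aside, the broker-side half of that condition, along with the config.numPartitions and config.defaultReplicationFactor defaults used above, lives in server.properties; the values below would reproduce what the creation log showed (3 partitions, replication factor 3):

# server.properties
# cluster-wide switch checked in getTopicMetadata
auto.create.topics.enable=true
# defaults picked up by createTopic
num.partitions=3
default.replication.factor=3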

  def handleTopicMetadataRequest(request: RequestChannel.Request) {
    val metadataRequest = request.body[MetadataRequest]
    ......
      else
        getTopicMetadata(metadataRequest.allowAutoTopicCreation, authorizedTopics, request.listenerName,
          errorUnavailableEndpoints)
    ......
  }

  def handle(request: RequestChannel.Request) {
    ......
    try {
      ApiKeys.forId(request.requestId) match {
        ......
        case ApiKeys.METADATA => handleTopicMetadataRequest(request)
        ......
      }
    }
    ......
  }

From this it is fairly clear that the flag is sent over in the client's request, so we can switch to the client code. Searching for metadata in kafka/clients/consumer/KafkaConsumer.java:

public class KafkaConsumer<K, V> implements Consumer<K, V> {
    private final Metadata metadata;
    ......
    // in the constructor -- the third argument, allowAutoTopicCreation, is hardcoded to true
    this.metadata = new Metadata(retryBackoffMs, config.getLong(ConsumerConfig.METADATA_MAX_AGE_CONFIG),
            true, false, clusterResourceListeners);
    ......
}

And the Metadata constructor:

public Metadata(long refreshBackoffMs, long metadataExpireMs, boolean allowAutoTopicCreation,
                    boolean topicExpiryEnabled, ClusterResourceListeners clusterResourceListeners) 

Now it makes sense: the 0.11.0.2 client's KafkaConsumer hardcodes allowAutoTopicCreation to true when it fetches metadata, which is why consumers create topics too. Once the cause is known, the fix is straightforward: flip the hardcoded true to false and rebuild the client.
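
A minimal sketch of the patch, changing just the one constructor argument shown above:

// KafkaConsumer constructor (0.11.0.2), patched: the consumer no longer asks
// the broker to auto-create topics when it fetches metadata
this.metadata = new Metadata(retryBackoffMs, config.getLong(ConsumerConfig.METADATA_MAX_AGE_CONFIG),
        false, false, clusterResourceListeners);

We verify with the consumer demo below; with the patched client, the topic is indeed no longer created: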

import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class AutoCreateTopicDemo {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "10.30.130.227:9093");
        props.put("group.id", "test");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Arrays.asList("autoCreateTest"));
        while (true) {
            // subscribing and polling triggers the metadata fetch in question
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records)
                System.out.printf("offset = %d, key = %s, value = %s%n",
                        record.offset(), record.key(), record.value());
        }
    }
}
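
For completeness: since only KafkaConsumer was patched, the producer path is untouched and should still auto-create topics, which is exactly what Maxwell needs. A quick hypothetical smoke test (broker address illustrative):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ProducerStillCreatesDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "10.30.130.227:9093");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        // the first send fetches metadata, which auto-creates the topic broker-side
        producer.send(new ProducerRecord<>("autoCreateTest", "k", "v"));
        producer.close();
    }
}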

We thought that was the end of it: swap the patched kafka-client into our Flink job's dependencies. But after repackaging, the topic was still being auto-created. Back to debugging: stepping through in IDEA while watching the Kafka server log to see exactly which line creates the topic. With a breakpoint on the first line of FlinkKafkaConsumerBase.run(), the topic already existed by the time execution got there, so we moved the breakpoint earlier, to the first line of open(), and stepped through line by line (F7):

  FlinkKafkaConsumerBase-182: List<KafkaTopicPartition> allPartitions = this.partitionDiscoverer.discoverPartitions();
  ->
  AbstractPartitionDiscoverer-51:newDiscoveredPartitions = this.getAllPartitionsForTopics(this.topicsDescriptor.getFixedTopics());
  ->
  Kafka09PartitionDiscoverer-77:kafkaConsumer.partitionsFor(topic)
  ->
  KafkaConsumer-1389: Map<String, List<PartitionInfo>> topicMetadata = fetcher.getTopicMetadata(new MetadataRequest.Builder(Collections.singletonList(topic), true), requestTimeoutMs);
  ->
  MetadataRequest-49:
    public Builder(List<String> topics, boolean allowAutoTopicCreation) {
        super(ApiKeys.METADATA);
        this.topics = topics;
        this.allowAutoTopicCreation = allowAutoTopicCreation;
    }

There it is at last: Flink is configured to discover new Kafka partitions automatically (sketched below), so before consuming anything it fetches partition info for every subscribed topic, and that step auto-creates the topic. We changed the hardcoded true at KafkaConsumer line 1389 to false as well, rebuilt, and the Flink consumer finally stopped auto-creating topics.
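
For reference, here is a hypothetical sketch of how partition discovery gets enabled in a job like ours; the class and values are illustrative, but flink.partition-discovery.interval-millis is the connector's real switch (FlinkKafkaConsumerBase.KEY_PARTITION_DISCOVERY_INTERVAL_MILLIS):

import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

public class DiscoverySketch {
    public static FlinkKafkaConsumer011<String> buildSource() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "10.30.130.227:9093");
        props.setProperty("group.id", "test");
        // enabling periodic discovery makes discoverPartitions() run before the
        // job consumes anything -- the metadata request that created our topic
        props.setProperty("flink.partition-discovery.interval-millis", "30000");
        return new FlinkKafkaConsumer011<>("autoCreateTest", new SimpleStringSchema(), props);
    }
}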

Postscript

Having found MetadataRequest.Builder, we went through every place the consumer calls it and changed each hardcoded true to false, closing off any remaining paths. Looking at newer Kafka releases, it turns out Kafka 2.3 already fixed this properly (KAFKA-7320): the consumer accepts allow.auto.create.topics=false to disable auto-creation, no patching required (see the sketch below). Unfortunately the Flink version we run cannot use a client that new. We burned a fair amount of time on this, but we came away understanding the Kafka consumer's metadata flow, so it was not a bad trade.
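
A minimal sketch of the 2.3+ approach, assuming a kafka-clients dependency of 2.3 or later; the broker address is illustrative:

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class NoAutoCreateConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "10.30.130.227:9093");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "test");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        // KAFKA-7320: the broker will no longer auto-create topics for this consumer
        props.put(ConsumerConfig.ALLOW_AUTO_CREATE_TOPICS_CONFIG, "false");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        // subscribe and poll as usual ...
    }
}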
