Background
We recently needed to move data from MySQL into our big-data platform. The plan we settled on: Maxwell syncs the binlog into Kafka, Flink consumes the Kafka data and writes it into Kudu, and Impala views over Kudu and Hive provide unified querying. Kudu keeps the most recent seven days of data; anything older is rolled down into Hive tables.
The mapping between Maxwell instances and Kafka topics is one Maxwell task per MySQL instance, writing to multiple topics. To deal with ordering and data-skew problems we ended up with one topic per table, so the requirement became: the Maxwell producer should be able to auto-create topics, but the Flink consumer must not.
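For context on the one-topic-per-table routing: Maxwell's producer config supports a `kafka_topic` option with `%{database}` and `%{table}` placeholders, which is one way to get per-table topics. A sketch of the relevant `config.properties` entries; the broker address and topic prefix here are illustrative placeholders, not our actual values:

```properties
# Maxwell config.properties (illustrative)
producer=kafka
kafka.bootstrap.servers=broker1:9092
# one topic per table: Maxwell expands %{database} and %{table} at runtime
kafka_topic=maxwell_%{database}_%{table}
```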
Investigation
First, delete a topic and watch what happens. The moment it is deleted, the topic gets auto-created again. The kafka server log looks like this:
Log when the topic is deleted:
......
[2020-07-02 15:08:32,044] INFO Log for partition autoCreatTest-0 is renamed to /data/kafka-logs/autoCreatTest-0.f10b571f162249fa85dbcdfa2ae6fb01-delete and is scheduled for deletion (kafka.log.LogManager)
[2020-07-02 15:08:32,048] INFO Log for partition autoCreatTest-1 is renamed to /data/kafka-logs/autoCreatTest-1.3be2864a4fac4132826f09eecb1079d5-delete and is scheduled for deletion (kafka.log.LogManager)
[2020-07-02 15:08:32,049] INFO Log for partition autoCreatTest-2 is renamed to /data/kafka-logs/autoCreatTest-2.557a65f116284c4bb2560b1829a4ac3e-delete and is scheduled for deletion (kafka.log.LogManager)
Log when the topic is auto-created again:
[2020-07-02 15:08:32,555] INFO Creating topic autoCreatTest with configuration {} and initial partition assignment HashMap(0 -> ArrayBuffer(228, 229, 227), 1 -> ArrayBuffer(229, 227, 228), 2 -> ArrayBuffer(227, 228, 229)) (kafka.zk.AdminZkClient)
[2020-07-02 15:08:32,560] INFO [KafkaApi-228] Auto creation of topic autoCreatTest with 3 partitions and replication factor 3 is successful (kafka.server.KafkaApis)
......
Since we use flink-connector-kafka-0.11_2.11, the code to read is kafka-client 0.11.0.2. As the server log shows, topic creation is triggered by the kafka.server.KafkaApis class. Searching that class for the log message "Auto creation of topic" leads to the call site:
private def createTopic(topic: String,
                        numPartitions: Int,
                        replicationFactor: Int,
                        properties: Properties = new Properties()): MetadataResponse.TopicMetadata = {
  AdminUtils.createTopic(zkUtils, topic, numPartitions, replicationFactor, properties, RackAwareMode.Safe)
  info("Auto creation of topic %s with %d partitions and replication factor %d is successful"
    .format(topic, numPartitions, replicationFactor))
  ......
Searching for the other call sites of createTopic turns up three. Two of them create the internal topics __consumer_offsets and __transaction_state; the one our requests actually go through is here:
private def getTopicMetadata(allowAutoTopicCreation: Boolean, topics: Set[String], listenerName: ListenerName,
                             errorUnavailableEndpoints: Boolean): Seq[MetadataResponse.TopicMetadata] = {
  ......
  else if (allowAutoTopicCreation && config.autoCreateTopicsEnable) {
    createTopic(topic, config.numPartitions, config.defaultReplicationFactor)
  }
  ......
}
So whenever one of our consumers or producers fetches a topic's metadata, the topic gets auto-created, provided the cluster is configured with auto.create.topics.enable=true. But there is a second condition, allowAutoTopicCreation, and following the call chain upward shows where it comes from:
def handleTopicMetadataRequest(request: RequestChannel.Request) {
  val metadataRequest = request.body[MetadataRequest]
  ......
  else
    getTopicMetadata(metadataRequest.allowAutoTopicCreation, authorizedTopics, request.listenerName,
      errorUnavailableEndpoints)
  ......

def handle(request: RequestChannel.Request) {
  ......
  try {
    case ApiKeys.METADATA => handleTopicMetadataRequest(request)
  }
  ......
}
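The branch above only fires when both conditions hold: the broker has auto.create.topics.enable set and the incoming metadata request allows creation. The broker-side half of the switch is a plain server.properties entry (shown with its default value). Note that flipping it to false would disable auto-creation for all clients, which would also break the Maxwell producer's auto-creation that this setup relies on, so it is not an option here:

```properties
# server.properties: broker-side switch for topic auto-creation
# defaults to true; false disables auto-creation for ALL clients,
# including the maxwell producer, so we cannot use it in this setup
auto.create.topics.enable=true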
This pretty much confirms that the parameter arrives in the request sent by our client, so we can switch to the client side and look at the consumer code. Searching for metadata in kafka\clients\consumer\KafkaConsumer.java:
public class KafkaConsumer<K, V> implements Consumer<K, V> {
    private final Metadata metadata;
    ......
    // in the constructor: the third argument, allowAutoTopicCreation, is hard-coded to true
    this.metadata = new Metadata(retryBackoffMs, config.getLong(ConsumerConfig.METADATA_MAX_AGE_CONFIG),
            true, false, clusterResourceListeners);
}
And the Metadata constructor:
public Metadata(long refreshBackoffMs, long metadataExpireMs, boolean allowAutoTopicCreation,
                boolean topicExpiryEnabled, ClusterResourceListeners clusterResourceListeners)
Now it makes sense: in the 0.11.0.2 client, KafkaConsumer hard-codes allowAutoTopicCreation to true when fetching metadata, which is why the consumer also auto-creates topics. Once the cause is known, the fix is easy: change it to false and rebuild the jar. We tested with a consumer demo, and sure enough the topic is no longer created.
import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerDemo {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "10.30.130.227:9093");
        props.put("group.id", "test");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Arrays.asList("autoCreatTest"));
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records)
                System.out.printf("offset = %d, key = %s, value = %s%n",
                        record.offset(), record.key(), record.value());
        }
    }
}
I thought that would be the end of it: swap the kafka-client dependency in our flink job for the rebuilt one. But after repackaging, the topic was still being auto-created. Back to the drawing board. Using IDEA's debugger together with the kafka server log to pin down which line creates the topic, I first set a breakpoint on the first line of the run method in FlinkKafkaConsumerBase; by the time execution got there, the topic already existed. So I moved the breakpoint earlier, to the first line of open, and stepped through with F7 line by line:
FlinkKafkaConsumerBase-182: List<KafkaTopicPartition> allPartitions = this.partitionDiscoverer.discoverPartitions();
->
AbstractPartitionDiscoverer-51:newDiscoveredPartitions = this.getAllPartitionsForTopics(this.topicsDescriptor.getFixedTopics());
->
Kafka09PartitionDiscoverer-77:kafkaConsumer.partitionsFor(topic)
->
KafkaConsumer-1389: Map<String, List<PartitionInfo>> topicMetadata = fetcher.getTopicMetadata(new MetadataRequest.Builder(Collections.singletonList(topic), true), requestTimeoutMs);
->
MetadataRequest-49:
public Builder(List<String> topics, boolean allowAutoTopicCreation) {
    super(ApiKeys.METADATA);
    this.topics = topics;
    this.allowAutoTopicCreation = allowAutoTopicCreation;
}
Well, damn. Found it at last: flink is configured to auto-discover new kafka partitions, so before actually consuming it fetches partition info for every topic, and that is the step that auto-creates the topic, because KafkaConsumer line 1389 also hard-codes true in its MetadataRequest.Builder call. After changing that true to false as well and repackaging, the flink consumer finally stopped auto-creating topics.
Postscript
Having found MetadataRequest.Builder, I searched for every place the consumer calls it and changed each hard-coded true to false, to rule out any remaining path. Looking at newer kafka releases, it turns out Kafka 2.3 already addressed this with KAFKA-7320: you can simply pass the consumer the config allow.auto.create.topics=false to disable auto topic creation on the consumer side. Unfortunately the flink version we use does not support a client that new. We lost a fair amount of time, but we got a tour of the kafka consumer's metadata flow out of it, so not a bad trade.
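For reference, once on kafka-clients 2.3+ (with a Flink connector new enough to use it), the KAFKA-7320 switch is just one more entry in the consumer Properties. A minimal sketch, where the broker address and group id are placeholders; the property name allow.auto.create.topics only takes effect with brokers and clients at 2.3 or newer:

```java
import java.util.Properties;

public class NoAutoCreateConsumerProps {
    // Builds consumer Properties with consumer-side topic auto-creation disabled
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // placeholder address
        props.put("group.id", "test");                  // placeholder group id
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // KAFKA-7320: available since kafka-clients 2.3, defaults to true
        props.put("allow.auto.create.topics", "false");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(build().getProperty("allow.auto.create.topics")); // prints "false"
    }
}
```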