Problem description
TopicExistsException: Topic 'xxx' is marked for deletion.
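While writing a Kafka tool, I implemented two methods: batch-create topics and batch-delete topics.
A typical operations workflow is to batch-delete a set of topics and then recreate those same topics. During creation, the following error may occur: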
2022-09-19 08:37:55.150 INFO 20376 --- [nio-8080-exec-4] c.w.w.k.service.TopicManagerImpl : input topics num: 100, deleted topics num: 100
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TopicExistsException: Topic 'test-16' is marked for deletion.
at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165)
at cn.whu.wy.kafkamate.service.TopicManagerImpl.createTopics(TopicManagerImpl.java:56)
at cn.whu.wy.kafkamate.restapi.TopicController.createTopics(TopicController.java:24)
Caused by: org.apache.kafka.common.errors.TopicExistsException: Topic 'test-16' is marked for deletion.
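This happens because AdminClient returns once the server has accepted the delete request, without waiting for the deletion to finish; on the server side, deleting a topic is a complex, asynchronous process. If a topic with the same name is created before the server has actually finished deleting it, the error above occurs.
There is a related question on Stack Overflow, but it has no answer yet: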
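Because the deletion completes at some later, unknown point, one client-side workaround is to treat this error as temporary and retry the create call with a growing delay. Below is a minimal sketch using only the JDK; `RetryingCall` and `retry` are hypothetical names, not part of the Kafka client. In real use, the callable would presumably wrap `adminClient.createTopics(...).all().get()` and the predicate would check `e.getCause() instanceof TopicExistsException`. Note that the same exception is also thrown when a topic genuinely exists, so the attempt limit matters.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.function.Predicate;

/** Hypothetical helper (not Kafka API): retries a call while it fails with a retryable error. */
public class RetryingCall {

    /**
     * Runs the call up to maxAttempts times. After each retryable failure it sleeps,
     * starting at initialDelayMs and doubling the delay every time.
     */
    public static <T> T retry(Callable<T> call, Predicate<Exception> retryable,
                              int maxAttempts, long initialDelayMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                // Give up on non-retryable errors or when the attempts are exhausted.
                if (!retryable.test(e) || attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(delay);
                delay *= 2;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Stand-in for the real create call: fails twice, then succeeds.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new ExecutionException(new IllegalStateException("marked for deletion"));
            }
            return "created";
        }, e -> e.getCause() instanceof IllegalStateException, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

When a batch create fails partway, some topics may already have been created, so retrying per topic (rather than retrying the whole batch) is likely the more precise variant.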
java - How can I make sure that a Kafka topic has been deleted? - Stack Overflow
This article describes how a topic gets deleted:
16 | TopicDeletionManager: How is a topic deleted? (geekbang.org)
Some attempts
1 Create a foo topic after the batch deletion, trying to trigger Kafka's deletion mechanism
public Object deleteTopics(Set<String> topics) throws ExecutionException, InterruptedException {
    Object o = doDeleteTopics(topics);
    createFooTopic4TriggeringDelete();
    return o;
}

/**
 * Deleting a topic in Kafka does not take effect immediately; the topic is only marked for deletion.
 * This method creates a temporary topic and then deletes it, in an attempt to trigger Kafka's deletion mechanism quickly.
 */
private void createFooTopic4TriggeringDelete() throws ExecutionException, InterruptedException {
    NewTopic foo = new NewTopic("foo_" + System.currentTimeMillis(), 1, (short) 1);
    adminClient.createTopics(Collections.singleton(foo)).all().get();
    adminClient.deleteTopics(Collections.singleton(foo.name())).all().get();
}
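This approach helps somewhat, perhaps because createFooTopic4TriggeringDelete() takes some time to run and the topics happen to be fully deleted in the meantime.
Test logic: in each round, delete 500 topics, then create 500.
Test result: it held up until round 4, where the error first appeared.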
delete topics: input size=500, actually deleted size=376, use 2256 ms.
create topics: input size=500, actually created size=500, use 7956 ms.
delete topics: input size=500, actually deleted size=500, use 1707 ms.
create topics: input size=500, actually created size=500, use 8164 ms.
delete topics: input size=500, actually deleted size=500, use 2131 ms.
create topics: input size=500, actually created size=500, use 8641 ms.
delete topics: input size=500, actually deleted size=500, use 1708 ms.
create topics:
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TopicExistsException: Topic 'test-217' is marked for deletion.
delete topics: input size=500, actually deleted size=351, use 1705 ms.
Without this approach, the error occurred in round 1:
delete topics: input size=500, actually deleted size=500, use 2464 ms.
create topics:
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TopicExistsException: Topic 'test-145' is marked for deletion.
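2 Create a foo topic + sleep
Building on the previous attempt, add a 2-second sleep. In testing, 30 rounds of deleting and recreating 500 topics completed without any exception.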
public Object deleteTopics(Set<String> topics) throws ExecutionException, InterruptedException {
    Object o = doDeleteTopics(topics);
    createFooTopic4TriggeringDelete();
    TimeUnit.SECONDS.sleep(2);
    return o;
}
3 Delete topics by topic UUID
This had no effect at all.
private Object doDeleteTopics(Set<String> topics) throws ExecutionException, InterruptedException {
    long start = System.currentTimeMillis();
    Collection<TopicListing> existTopics = listTopics(false);
    List<Uuid> topicsToDelete = existTopics.stream()
            .filter(t -> topics.contains(t.name()))
            .map(TopicListing::topicId)
            .collect(Collectors.toList());
    adminClient.deleteTopics(TopicCollection.ofTopicIds(topicsToDelete)).all().get();
    long now = System.currentTimeMillis();
    String info = String.format("delete topics: input size=%d, actually deleted size=%d, use %d ms.",
            topics.size(), topicsToDelete.size(), now - start);
    log.info(info);
    return info;
}
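Conclusion
I looked at the TopicDeletionManager mentioned above; it is a server-side class written in Scala. On the client side, I could not find any API related to it. What I was hoping for is a client-side callback such as onTopicDeleted() that is invoked once the server has confirmed the topic is deleted. Unfortunately, I could not find one.
If anyone knows how to do this, I would appreciate your advice.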
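Lacking such a callback, something similar can be approximated on the client by polling. The sketch below uses only the JDK; `DeletionWatcher`, `whenGone`, and the `isGone` probe are hypothetical names, not Kafka API. The probe is deliberately left abstract: what counts as a trustworthy "topic is really gone" signal is exactly the open question here, so any concrete probe (for example a trial create, or a metadata lookup) is an assumption that would need to be verified.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical helper (not Kafka API): emulates an onTopicDeleted() callback by polling a probe. */
public class DeletionWatcher {

    /**
     * Polls isGone every pollMs milliseconds. The returned future completes normally when
     * the probe returns true, or exceptionally with TimeoutException after timeoutMs.
     */
    public static CompletableFuture<Void> whenGone(BooleanSupplier isGone, long pollMs, long timeoutMs) {
        CompletableFuture<Void> done = new CompletableFuture<>();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "deletion-watcher");
            t.setDaemon(true); // do not keep the JVM alive just for polling
            return t;
        });
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                if (isGone.getAsBoolean()) {
                    done.complete(null);
                } else if (System.nanoTime() - deadline >= 0) {
                    done.completeExceptionally(new TimeoutException("topic still pending deletion"));
                }
            } catch (RuntimeException e) {
                done.completeExceptionally(e);
            }
        }, 0, pollMs, TimeUnit.MILLISECONDS);
        // Stop polling once the future completes for any reason.
        done.whenComplete((v, e) -> scheduler.shutdown());
        return done;
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger probes = new AtomicInteger();
        // Stand-in probe: reports "gone" on the third check.
        CompletableFuture<Void> gone = whenGone(() -> probes.incrementAndGet() >= 3, 20, 2000);
        // The "callback": runs once the probe has reported the topic as gone.
        gone.thenRun(() -> System.out.println("onTopicDeleted")).get(5, TimeUnit.SECONDS);
    }
}
```

Compared with the fixed 2-second sleep in attempt 2, this waits only as long as the probe requires and fails explicitly on timeout, but it is only as reliable as the probe itself.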