A bug when running Kafka on Windows

A bug encountered when running Kafka on 64-bit Windows 7:

Problem description

Deleting a topic crashes the cluster, with the error: ERROR Shutdown broker because all log dirs in D:\tmp\kafka-logs have failed.

I tested four versions (kafka_2.11-1.1.0, kafka_2.13-2.5.0, kafka_2.13-2.6.2, and kafka_2.13-2.7.0), and all of them have this problem.

Searching online shows that this bug was reported long ago but had still not been fixed at the time of writing.

The problem does not occur on Linux.

It can be reproduced with the following commands:

D:\kafka_2.13-2.6.2>bin\windows\kafka-topics.bat --bootstrap-server 127.0.0.1:9092 --create --topic testwy
Created topic testwy.

D:\kafka_2.13-2.6.2>bin\windows\kafka-topics.bat --bootstrap-server 127.0.0.1:9092 --list
testwy

D:\kafka_2.13-2.6.2>bin\windows\kafka-topics.bat --bootstrap-server 127.0.0.1:9092 --delete --topic testwy

At this point the cluster crashes. ZooKeeper log:

[2021-07-05 16:32:52,965] WARN Exception causing close of session 0x1003d29550c0000: An existing connection was forcibly closed by the remote host. (org.apache.zookeeper.server.NIOServerCnxn)
[2021-07-05 16:33:11,170] INFO Expiring session 0x1003d29550c0000, timeout of 18000ms exceeded (org.apache.zookeeper.server.ZooKeeperServer)

ZooKeeper itself keeps running, however.

Kafka server log:

[2021-07-05 16:32:52,574] INFO [GroupCoordinator 0]: Removed 0 offsets associated with deleted partitions: testwy-0. (kafka.coordinator.group.GroupCoordinator)
[2021-07-05 16:32:52,599] INFO [ReplicaFetcherManager on broker 0] Removed fetcher for partitions Set(testwy-0) (kafka.server.ReplicaFetcherManager)
[2021-07-05 16:32:52,600] INFO [ReplicaAlterLogDirsManager on broker 0] Removed fetcher for partitions Set(testwy-0) (kafka.server.ReplicaAlterLogDirsManager)
[2021-07-05 16:32:52,607] INFO [ReplicaFetcherManager on broker 0] Removed fetcher for partitions Set(testwy-0) (kafka.server.ReplicaFetcherManager)
[2021-07-05 16:32:52,608] INFO [ReplicaAlterLogDirsManager on broker 0] Removed fetcher for partitions Set(testwy-0) (kafka.server.ReplicaAlterLogDirsManager)
[2021-07-05 16:32:52,615] ERROR Error while renaming dir for testwy-0 in log dir D:\tmp\kafka-logs (kafka.server.LogDirFailureChannel)
java.nio.file.AccessDeniedException: D:\tmp\kafka-logs\testwy-0 -> D:\tmp\kafka-logs\testwy-0.a61ed5f8a99e4df58e8bf86c6c5e537c-delete
        at java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:89)
        at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
        at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:395)
        at java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:288)
        at java.base/java.nio.file.Files.move(Files.java:1421)
        at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:917)
        at kafka.log.Log.$anonfun$renameDir$2(Log.scala:1012)
        at kafka.log.Log.renameDir(Log.scala:2387)
        at kafka.log.LogManager.asyncDelete(LogManager.scala:973)
        at kafka.log.LogManager.$anonfun$asyncDelete$3(LogManager.scala:1008)
        at kafka.log.LogManager.$anonfun$asyncDelete$2(LogManager.scala:1006)
        at kafka.log.LogManager.$anonfun$asyncDelete$2$adapted(LogManager.scala:1004)
        at scala.collection.mutable.HashSet$Node.foreach(HashSet.scala:435)
        at scala.collection.mutable.HashSet.foreach(HashSet.scala:361)
        at kafka.log.LogManager.asyncDelete(LogManager.scala:1004)
        at kafka.server.ReplicaManager.stopReplicas(ReplicaManager.scala:481)
        at kafka.server.KafkaApis.handleStopReplicaRequest(KafkaApis.scala:271)
        at kafka.server.KafkaApis.handle(KafkaApis.scala:142)
        at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:74)
        at java.base/java.lang.Thread.run(Thread.java:834)
        Suppressed: java.nio.file.AccessDeniedException: D:\tmp\kafka-logs\testwy-0 -> D:\tmp\kafka-logs\testwy-0.a61ed5f8a99e4df58e8bf86c6c5e537c-delete
                at java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:89)
                at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
                at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:309)
                at java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:288)
                at java.base/java.nio.file.Files.move(Files.java:1421)
                at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:914)
                ... 14 more
[2021-07-05 16:32:52,630] WARN [ReplicaManager broker=0] Stopping serving replicas in dir D:\tmp\kafka-logs (kafka.server.ReplicaManager)
[2021-07-05 16:32:52,641] WARN [ReplicaManager broker=0] Broker 0 stopped fetcher for partitions  and stopped moving logs for partitions  because they are in the failed log directory D:\tmp\kafka-logs. (kafka.server.ReplicaManager)
[2021-07-05 16:32:52,643] WARN Stopping serving logs in dir D:\tmp\kafka-logs (kafka.log.LogManager)
[2021-07-05 16:32:52,647] ERROR Shutdown broker because all log dirs in D:\tmp\kafka-logs have failed (kafka.log.LogManager)

The Kafka server shut down immediately.
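
The stack trace shows the broker failing while it renames testwy-0 to testwy-0.<id>-delete, after which the whole log directory is marked as failed. One plausible reading, which is an assumption and not something verified in this post, is that Windows refuses to rename a directory while files inside it are still open. The Python sketch below only imitates the naming pattern visible in the log; `mark_for_deletion` and `rename_with_open_file` are hypothetical names and none of this is Kafka's own code.

```python
import os
import tempfile
import uuid


def mark_for_deletion(partition_dir):
    """Rename <dir> to <dir>.<32-hex-id>-delete and return the new path.

    Hypothetical helper: it mimics the directory name seen in the log above.
    """
    target = f"{partition_dir}.{uuid.uuid4().hex}-delete"
    os.rename(partition_dir, target)
    return target


def rename_with_open_file():
    """Attempt the rename while a file inside the directory is still open.

    Returns the new directory name, or None if the OS refused the rename.
    """
    with tempfile.TemporaryDirectory() as log_dir:
        partition_dir = os.path.join(log_dir, "testwy-0")
        os.mkdir(partition_dir)
        segment_path = os.path.join(partition_dir, "00000000000000000000.log")
        # Assumption: an open handle stands in for whatever the broker
        # still holds on the partition's files at rename time.
        with open(segment_path, "wb") as segment:
            segment.write(b"payload")
            try:
                return os.path.basename(mark_for_deletion(partition_dir))
            except PermissionError:
                return None
```

On Linux the rename is expected to succeed and the function returns a name ending in -delete. On Windows it would typically return None, which would be consistent with the AccessDeniedException above, though that does not prove the broker fails for the same reason.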

Workaround

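Simply restarting the Kafka server still fails with the same error. The only way out I found is to delete all the log files under the tmp directory (D:\tmp\kafka-logs in this setup).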
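
The cleanup can be scripted. The following is a minimal sketch, assuming the broker is already stopped and that losing every topic's data in that directory is acceptable; `clear_log_dir` and `demo` are hypothetical names, and `demo` runs against a throwaway directory so nothing real is touched.

```python
import os
import shutil
import tempfile


def clear_log_dir(log_dir):
    """Delete every entry directly under log_dir; return how many were removed.

    The directory itself is kept so the broker can reuse it on restart.
    """
    removed = 0
    for name in os.listdir(log_dir):
        path = os.path.join(log_dir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)  # partition directories such as testwy-0
        else:
            os.remove(path)  # plain files, e.g. checkpoint files
        removed += 1
    return removed


def demo():
    """Exercise clear_log_dir on a temporary directory instead of real data."""
    with tempfile.TemporaryDirectory() as log_dir:
        os.mkdir(os.path.join(log_dir, "testwy-0"))
        open(os.path.join(log_dir, "testwy-0", "00000000000000000000.log"), "w").close()
        open(os.path.join(log_dir, "recovery-point-offset-checkpoint"), "w").close()
        removed = clear_log_dir(log_dir)
        return removed, os.listdir(log_dir)
```

For the setup in this post the call would be clear_log_dir(r"D:\tmp\kafka-logs"), run only while the Kafka server is not running. This wipes all topic data on the broker, so it is only reasonable on a test machine.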
