Problem
Submitting a Flink job fails immediately with:
```
java.lang.NoSuchMethodError: org.apache.hadoop.tracing.TraceUtils.wrapHadoopConf(Ljava/lang/String;Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/htrace/core/HTraceConfiguration;
	at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42)
	at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:689)
	at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:673)
	at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:155)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
	at org.apache.hadoop.fs.viewfs.ChRootedFileSystem.<init>(ChRootedFileSystem.java:103)
	at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.getTargetFileSystem(ViewFileSystem.java:173)
	at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.getTargetFileSystem(ViewFileSystem.java:167)
	at org.apache.hadoop.fs.viewfs.InodeTree.createLink(InodeTree.java:261)
	at org.apache.hadoop.fs.viewfs.InodeTree.<init>(InodeTree.java:333)
	at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.<init>(ViewFileSystem.java:167)
	at org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:167)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:172)
	at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:770)
	at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:593)
	at org.apache.flink.yarn.YarnClusterDescriptor.deployApplicationCluster(YarnClusterDescriptor.java:458)
	at org.apache.flink.client.deployment.application.cli.ApplicationClusterDeployer.run(ApplicationClusterDeployer.java:67)
	at org.apache.flink.client.cli.CliFrontend.runApplication(CliFrontend.java:213)
	at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1057)
	at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
	at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
	at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
```
Suspect library:
- hbase-shaded-client-1.4.3.jar
Analysis
Look at the 2.7.x source:
```java
import org.apache.htrace.HTraceConfiguration;

/**
 * This class provides utility functions for tracing.
 */
@InterfaceAudience.Private
public class TraceUtils {
  private static List<ConfigurationPair> EMPTY = Collections.emptyList();

  public static HTraceConfiguration wrapHadoopConf(final String prefix,
      final Configuration conf) {
    return wrapHadoopConf(prefix, conf, EMPTY);
  }
  // ... rest of class elided
}
```
Compare this with the method named in the error. Unless you look closely, you will miss it: the method is not actually missing; the package of the return type is wrong.
| In the `NoSuchMethodError` | Actual method in `TraceUtils` |
|---|---|
| `public static HTraceConfiguration wrapHadoopConf(final String prefix, final Configuration conf)` | `public static HTraceConfiguration wrapHadoopConf(final String prefix, final Configuration conf)` |
| return type `Lorg/apache/htrace/core/HTraceConfiguration;` | return type `org.apache.htrace.HTraceConfiguration` |
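This matters because the JVM links a call by the method's full descriptor, which includes the return type. A minimal sketch (illustrative names only, not Hadoop code) showing that the fully-qualified return type is part of a method's identity:

```java
// Sketch: the JVM resolves a method by name, parameter types, AND return
// type. Two wrapHadoopConf methods that differ only in the package of
// their return type (org.apache.htrace.core.HTraceConfiguration vs.
// org.apache.htrace.HTraceConfiguration) are distinct methods at link
// time, so the caller fails with NoSuchMethodError even though a
// same-named method exists.
public class DescriptorDemo {
    // Hypothetical stand-in for wrapHadoopConf; name and types are
    // illustrative only.
    static String wrap(String prefix) {
        return "[" + prefix + "]";
    }

    public static void main(String[] args) throws Exception {
        java.lang.reflect.Method m =
                DescriptorDemo.class.getDeclaredMethod("wrap", String.class);
        // The fully-qualified return type shows up in the method's identity.
        System.out.println(m.getReturnType().getName() + " " + m.getName());
        // prints: java.lang.String wrap
    }
}
```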
Differences between Hadoop release lines:
- Apache Hadoop Common 2.6.x does not have the TraceUtils class; it first appears in 2.7.x.
- Apache Hadoop Common 2.6.0-CDH5.12.1 does have the class.
TraceUtils in 2.6.0-CDH5.12.1:
```java
import org.apache.htrace.core.HTraceConfiguration;
```
TraceUtils in 2.7.x:
```java
import org.apache.htrace.HTraceConfiguration;
```
So the error happens because FsTracer from 2.6.0-cdh5.12.1 expects to load TraceUtils from its own hadoop-common, but the classloader actually resolves the 2.7.x TraceUtils, whose wrapHadoopConf returns a type from a different package.
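To confirm which jar wins, you can print where a class was actually loaded from. A small diagnostic sketch (the helper name `WhichJar` is mine; on a real cluster you would pass `org.apache.hadoop.tracing.TraceUtils`):

```java
import java.security.CodeSource;

// Diagnostic sketch: print the location a class was loaded from, to see
// whether TraceUtils comes from the CDH hadoop-common jar or the Apache
// 2.7.x one that a shaded dependency dragged in.
public class WhichJar {
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        // Bootstrap-loaded classes have no code source.
        return src == null ? "(bootstrap class path)" : src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // On a cluster, use:
        //   locationOf(Class.forName("org.apache.hadoop.tracing.TraceUtils"))
        // Here we locate this class itself so the sketch runs standalone.
        System.out.println(locationOf(WhichJar.class));
    }
}
```

Running that lookup inside the Flink client JVM shows which hadoop-common jar is first on the classpath.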
Fix
- Option 1: remove the 2.7.x hadoop-common jar, or any library that shades it, from the classpath.
- Option 2: add the 2.7.x hadoop-hdfs jar, or a library that shades it, so that the matching 2.7.x hadoop-common classes are used throughout.
Q&A
Why go to all this trouble to fix this?
- Because users run jars like flink-shaded-hadoop-2-uber-xxxx-2.6.0, and those jars are broken: they bundle hadoop-hdfs bugs that were only fixed in 2.7.1.
- And Flink stopped updating the flink-shaded-hadoop-2-uber jar as of 1.11+.