Troubleshooting java.net.UnknownHostException

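x帥's timeline project would not start: when connecting to Phoenix it failed with java.net.UnknownHostException: beh. I had run into this error before but never dug into it. Broadly, it means the HDFS host cannot be resolved, and that address is normally configured in hdfs-site.xml and core-site.xml.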

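On my own machine everything worked, so I used a compare-and-eliminate approach, replacing my configuration with x帥's piece by piece. This is where I fell into a trap: I was not fully rigorous during the replacement and overlooked the hbase-site.xml file under the resource path. As a result, my own project eventually failed to start as well, even though I believed its configuration had not been changed.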

I then switched to the 164 cluster for testing, and the result was the same.

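I still suspected a configuration problem, but with no leads left, the only option was to follow the stack trace and read the code.
The error was as follows: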

11:26:44.779 [main] WARN  c.b.m.t.p.DefaultPhoenixDataSource - Unable to connect to HBase store using Phoenix.
java.sql.SQLException: ERROR 103 (08004): Unable to establish connection.
    at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422)
    at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:393)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.access$300(ConnectionQueryServicesImpl.java:211)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:2272)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:2251)
    at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:78)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2251)
    at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:233)
    at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:135)
    at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:202)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:270)
    at com.bonc.manager.timeline.phoenix.DefaultPhoenixDataSource.getConnection(DefaultPhoenixDataSource.java:141)
    at com.bonc.manager.timeline.phoenix.PhoenixHBaseAccessor.getConnection(PhoenixHBaseAccessor.java:119)
    at com.bonc.manager.timeline.phoenix.PhoenixHBaseAccessor.initMetricSchema(PhoenixHBaseAccessor.java:143)
    at com.bonc.manager.timeline.phoenix.aggregate.TimelineMetricStore.init(TimelineMetricStore.java:36)
    at com.bonc.manager.timeline.listener.ApplicationStartup.initPhoenix(ApplicationStartup.java:35)
    at com.bonc.manager.timeline.listener.ApplicationStartup.onApplicationEvent(ApplicationStartup.java:28)
    at com.bonc.manager.timeline.listener.ApplicationStartup.onApplicationEvent(ApplicationStartup.java:1)
    at org.springframework.context.event.SimpleApplicationEventMulticaster.invokeListener(SimpleApplicationEventMulticaster.java:166)
    at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:138)
    at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:382)
    at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:336)
    at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:877)
    at org.springframework.boot.context.embedded.EmbeddedWebApplicationContext.finishRefresh(EmbeddedWebApplicationContext.java:144)
    at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:544)
    at org.springframework.boot.context.embedded.EmbeddedWebApplicationContext.refresh(EmbeddedWebApplicationContext.java:122)
    at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:761)
    at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:371)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:315)
    at com.bonc.manager.timeline.ApplicationService.main(ApplicationService.java:28)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
    at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:433)
    at org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:342)
    at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144)
    at org.apache.phoenix.query.HConnectionFactory$HConnectionFactoryImpl.createConnection(HConnectionFactory.java:47)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:390)
    ... 29 common frames omitted
Caused by: java.lang.reflect.InvocationTargetException: null
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
    ... 34 common frames omitted
Caused by: java.lang.ExceptionInInitializerError: null
    at org.apache.hadoop.hbase.ClusterId.parseFrom(ClusterId.java:64)
    at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:75)
    at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:919)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:657)
    ... 39 common frames omitted
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: beh
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:708)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:651)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2696)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2733)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2715)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:382)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
    at org.apache.hadoop.hbase.util.DynamicClassLoader.initTempDir(DynamicClassLoader.java:120)
    at org.apache.hadoop.hbase.util.DynamicClassLoader.<init>(DynamicClassLoader.java:98)
    at org.apache.hadoop.hbase.protobuf.ProtobufUtil.<clinit>(ProtobufUtil.java:241)
    ... 44 common frames omitted
Caused by: java.net.UnknownHostException: beh
    ... 59 common frames omitted
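
The innermost cause in the trace is a plain name-resolution failure for `beh`, and the `createNonHAProxy` frame suggests the HDFS client treated `beh` as an ordinary host name. A minimal sketch for checking that outside Hadoop, using only the JDK, is shown below. The class name `HostCheck` and its helper methods are hypothetical names chosen for illustration, and the default URI is the `hbase.rootdir` value reported in this post.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Hadoop or HBase.
class HostCheck {

    // Extracts the host part of a URI such as hdfs://beh/hbase.
    // Returns null when the URI has no server-based authority.
    static String hostOf(String uri) {
        return URI.create(uri).getHost();
    }

    // Returns true when the JVM can resolve the name to an address.
    // A null host is reported as unresolvable, because
    // InetAddress.getByName(null) would return the loopback address.
    static boolean canResolve(String host) {
        if (host == null) {
            return false;
        }
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String rootDir = args.length > 0 ? args[0] : "hdfs://beh/hbase";
        String host = hostOf(rootDir);
        System.out.println(host + " resolvable: " + canResolve(host));
    }
}
```

If this prints `false` for the host, the name is not known to DNS or the hosts file on that machine, so the client can only work if its configuration maps the name to real NameNode addresses.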

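According to the stack trace, the core problem is that the host name cannot be resolved while the HDFS file system is being created, which again pointed to the configuration files. I first confirmed that in the conf argument passed to org.apache.hadoop.fs.FileSystem.createFileSystem, hbase.rootdir was indeed hdfs://beh/hbase.
I then traced where conf is initialized: the ProtobufUtil class, a utility class for protobuf communication.
The key statement is the following: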

    Configuration conf = HBaseConfiguration.create();

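Tracing this call showed that conf includes four files: core-default.xml, hdfs-default.xml, hbase-default.xml, and hbase-site.xml.
Printing the values held in conf confirmed that these four were loaded. The open question was where these files actually live.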

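Because a Phoenix connection only needs the ZooKeeper address, I had always assumed the configuration was stored in ZooKeeper. Inspecting the data in ZooKeeper, however, turned up none of these settings.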

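With no other leads, I kept tracing the code and found that loadResource in the Configuration class is the method that ultimately loads the configuration.
Tracing how hbase-site.xml is loaded produced a surprise: execution enters the first else if, where the resource is handled as a string, that is, as a classpath resource name. The odd part was the url, which after lookup had the value
F:\work\svn\BEH-Manager\BEH-Manager\BEH-Manager-timeline\target\classes\hbase-site.xml.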

      if (resource instanceof URL) {                  // an URL resource
        doc = parse(builder, (URL)resource);
      } else if (resource instanceof String) {        // a CLASSPATH resource
        URL url = getResource((String)resource);
        doc = parse(builder, url);

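How embarrassing. The hbase-site.xml that had accidentally been placed in the resource directory was unexpectedly acting as a real classpath resource: HBase read it, and that caused the problem.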
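
A quick way to catch a stray file like this without stepping through Hadoop code is to ask the class loader where it finds each site file. The sketch below uses only the JDK and assumes the thread context class loader is the one that matters for the lookup. The class name `ClasspathResourceLocator` and the method `locate` are hypothetical names for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper: lists every classpath location that provides a resource.
class ClasspathResourceLocator {

    static List<URL> locate(String name) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        List<URL> found = new ArrayList<>();
        try {
            Enumeration<URL> urls = cl.getResources(name);
            while (urls.hasMoreElements()) {
                found.add(urls.nextElement());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        String[] names = {"core-site.xml", "hdfs-site.xml", "hbase-site.xml"};
        for (String name : names) {
            List<URL> urls = locate(name);
            System.out.println(name + " -> " + (urls.isEmpty() ? "not on classpath" : urls));
        }
    }
}
```

Run from inside the application's own classpath (for example from a startup hook), this would have reported the copy under target\classes straight away. More than one URL for the same name means several copies are competing.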

After moving that configuration file elsewhere, the problem disappeared as expected.

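Tracing the configuration loading again after the file was moved showed that only the default configuration was used throughout, not the cluster's actual configuration.
In the end, the root cause was that I did not understand how configuration files are loaded well enough.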
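
Once a suspicious site file has been found, it helps to see exactly which properties it sets. The sketch below parses the usual Hadoop-style layout (`<configuration>` containing `<property>` elements with `<name>` and `<value>`) using the JDK's DOM parser. It is only an illustration: `SiteFileReader` and `read` are hypothetical names, and the inline XML is a made-up sample that mirrors the hbase.rootdir value seen in this post, not the contents of the real file.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads name/value pairs from a Hadoop-style site file.
class SiteFileReader {

    static Map<String, String> read(InputStream in) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
            NodeList props = doc.getElementsByTagName("property");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                String name = text(p, "name");
                if (name != null) {
                    result.put(name, text(p, "value"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable site file", e);
        }
    }

    // Text of the first child element with the given tag, or null if absent.
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        // Made-up sample content for illustration only.
        String xml = "<configuration><property><name>hbase.rootdir</name>"
                + "<value>hdfs://beh/hbase</value></property></configuration>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(read(in).get("hbase.rootdir"));
    }
}
```

Pointing `read` at a `FileInputStream` for the located hbase-site.xml shows at a glance whether that file is the one contributing hdfs://beh/hbase.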

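Appendix:
Thanks are due to IDEA's powerful source navigation and debugging support. Two features were particularly useful:
1. Automatic decompilation: class files are decompiled into Java source, and breakpoints can be set and stepped through in the result.
2. Building on 1, it automatically downloads the matching source code, which is also very convenient.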
