1. Introduction
The HadoopDB, Hive, and Hadoop source trees contain many test entry points (main methods), and most of them rely on a Configuration object, e.g. one initialized when calling new JobConf(conf). If you run such a program directly, it may read only the configuration files packaged inside the jar and ignore the new, overriding files you have placed under the conf directory.
2. Solution
Remember to add the conf directory to the project's classpath (or to the application's classpath at launch time).
The following walkthrough uses Configuration itself as the test case: the org.apache.hadoop.conf.Configuration class provides a main method intended precisely for this kind of testing.
1. In Run -> Debug Configurations..., find Java Application in the left-hand sidebar and double-click it; Eclipse will prompt you to create a new debug configuration. Set Name to configuration, choose org.apache.hadoop.conf.Configuration as the main class, and click Debug.
The following default configuration is printed:
<?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration>
<property><name>fs.file.impl</name><value>org.apache.hadoop.fs.LocalFileSystem</value></property>
<property><name>hadoop.logfile.count</name><value>10</value></property>
<property><name>fs.har.impl.disable.cache</name><value>true</value></property>
<property><name>ipc.client.kill.max</name><value>10</value></property>
<property><name>fs.s3n.impl</name><value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value></property>
<property><name>io.mapfile.bloom.size</name><value>1048576</value></property>
<property><name>fs.s3.sleepTimeSeconds</name><value>10</value></property>
<property><name>fs.s3.block.size</name><value>67108864</value></property>
<property><name>fs.kfs.impl</name><value>org.apache.hadoop.fs.kfs.KosmosFileSystem</value></property>
<property><name>ipc.server.listen.queue.size</name><value>128</value></property>
<property><name>hadoop.util.hash.type</name><value>murmur</value></property>
<property><name>ipc.client.tcpnodelay</name><value>false</value></property>
<property><name>io.file.buffer.size</name><value>4096</value></property>
<property><name>fs.s3.buffer.dir</name><value>${hadoop.tmp.dir}/s3</value></property>
<property><name>hadoop.tmp.dir</name><value>/tmp/hadoop-${user.name}</value></property>
<property><name>fs.trash.interval</name><value>0</value></property>
<property><name>io.seqfile.sorter.recordlimit</name><value>1000000</value></property>
<property><name>fs.ftp.impl</name><value>org.apache.hadoop.fs.ftp.FTPFileSystem</value></property>
<property><name>fs.checkpoint.size</name><value>67108864</value></property>
<property><name>fs.checkpoint.period</name><value>3600</value></property>
<property><name>fs.hftp.impl</name><value>org.apache.hadoop.hdfs.HftpFileSystem</value></property>
<property><name>hadoop.native.lib</name><value>true</value></property>
<property><name>fs.hsftp.impl</name><value>org.apache.hadoop.hdfs.HsftpFileSystem</value></property>
<property><name>ipc.client.connect.max.retries</name><value>10</value></property>
<property><name>fs.har.impl</name><value>org.apache.hadoop.fs.HarFileSystem</value></property>
<property><name>fs.s3.maxRetries</name><value>4</value></property>
<property><name>topology.node.switch.mapping.impl</name><value>org.apache.hadoop.net.ScriptBasedMapping</value></property>
<property><name>hadoop.logfile.size</name><value>10000000</value></property>
<property><name>fs.checkpoint.dir</name><value>${hadoop.tmp.dir}/dfs/namesecondary</value></property>
<property><name>fs.checkpoint.edits.dir</name><value>${fs.checkpoint.dir}</value></property>
<property><name>topology.script.number.args</name><value>100</value></property>
<property><name>fs.s3.impl</name><value>org.apache.hadoop.fs.s3.S3FileSystem</value></property>
<property><name>ipc.client.connection.maxidletime</name><value>10000</value></property>
<property><name>io.compression.codecs</name><value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value></property>
<property><name>ipc.server.tcpnodelay</name><value>false</value></property>
<property><name>io.serializations</name><value>org.apache.hadoop.io.serializer.WritableSerialization</value></property>
<property><name>ipc.client.idlethreshold</name><value>4000</value></property>
<property><name>fs.hdfs.impl</name><value>org.apache.hadoop.hdfs.DistributedFileSystem</value></property>
<property><name>io.bytes.per.checksum</name><value>512</value></property>
<property><name>io.mapfile.bloom.error.rate</name><value>0.005</value></property>
<property><name>io.seqfile.lazydecompress</name><value>true</value></property>
<property><name>local.cache.size</name><value>10737418240</value></property>
<property><name>hadoop.security.authorization</name><value>false</value></property>
<property><name>hadoop.rpc.socket.factory.class.default</name><value>org.apache.hadoop.net.StandardSocketFactory</value></property>
<property><name>io.skip.checksum.errors</name><value>false</value></property>
<property><name>io.seqfile.compress.blocksize</name><value>1000000</value></property>
<property><name>fs.ramfs.impl</name><value>org.apache.hadoop.fs.InMemoryFileSystem</value></property>
<property><name>webinterface.private.actions</name><value>false</value></property>
<property><name>fs.default.name</name><value>file:///</value></property>
</configuration>
2. Open the Debug Configurations... dialog again and, on the Classpath tab of the configuration launch, add the $HADOOP_HOME/conf folder. If that conf directory contains custom configuration files, the output now shows the custom values overriding the defaults:
<?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration>
<property><name>hadoopdb.config.replication</name><value>false</value></property>
<property><name>fs.file.impl</name><value>org.apache.hadoop.fs.LocalFileSystem</value></property>
<property><name>hadoop.logfile.count</name><value>10</value></property>
<property><name>fs.har.impl.disable.cache</name><value>true</value></property>
<property><name>ipc.client.kill.max</name><value>10</value></property>
<property><name>fs.s3n.impl</name><value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value></property>
<property><name>io.mapfile.bloom.size</name><value>1048576</value></property>
<property><name>fs.s3.sleepTimeSeconds</name><value>10</value></property>
<property><name>fs.s3.block.size</name><value>67108864</value></property>
<property><name>fs.kfs.impl</name><value>org.apache.hadoop.fs.kfs.KosmosFileSystem</value></property>
<property><name>ipc.server.listen.queue.size</name><value>128</value></property>
<property><name>hadoop.util.hash.type</name><value>murmur</value></property>
<property><name>ipc.client.tcpnodelay</name><value>false</value></property>
<property><name>io.file.buffer.size</name><value>4096</value></property>
<property><name>fs.s3.buffer.dir</name><value>${hadoop.tmp.dir}/s3</value></property>
<property><name>hadoop.tmp.dir</name><value>/tmp/hadoop-${user.name}</value></property>
<property><name>fs.trash.interval</name><value>0</value></property>
<property><name>io.seqfile.sorter.recordlimit</name><value>1000000</value></property>
<property><name>fs.ftp.impl</name><value>org.apache.hadoop.fs.ftp.FTPFileSystem</value></property>
<property><name>fs.checkpoint.size</name><value>67108864</value></property>
<property><name>fs.checkpoint.period</name><value>3600</value></property>
<property><name>fs.hftp.impl</name><value>org.apache.hadoop.hdfs.HftpFileSystem</value></property>
<property><name>hadoopdb.fetch.size</name><value>1000</value></property> #changed
<property><name>hadoop.native.lib</name><value>true</value></property>
<property><name>fs.hsftp.impl</name><value>org.apache.hadoop.hdfs.HsftpFileSystem</value></property>
<property><name>ipc.client.connect.max.retries</name><value>10</value></property>
<property><name>fs.har.impl</name><value>org.apache.hadoop.fs.HarFileSystem</value></property>
<property><name>fs.s3.maxRetries</name><value>4</value></property>
<property><name>topology.node.switch.mapping.impl</name><value>org.apache.hadoop.net.ScriptBasedMapping</value></property>
<property><name>hadoop.logfile.size</name><value>10000000</value></property>
<property><name>fs.checkpoint.dir</name><value>${hadoop.tmp.dir}/dfs/namesecondary</value></property>
<property><name>fs.checkpoint.edits.dir</name><value>${fs.checkpoint.dir}</value></property>
<property><name>topology.script.number.args</name><value>100</value></property> #overridden by a new value
<property><name>fs.s3.impl</name><value>org.apache.hadoop.fs.s3.S3FileSystem</value></property>
<property><name>ipc.client.connection.maxidletime</name><value>10000</value></property>
<property><name>io.compression.codecs</name><value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value></property>
<property><name>hadoopdb.config.file</name><value>HadoopDB.xml</value></property> #new property added
<property><name>ipc.server.tcpnodelay</name><value>false</value></property>
<property><name>io.serializations</name><value>org.apache.hadoop.io.serializer.WritableSerialization</value></property>
<property><name>ipc.client.idlethreshold</name><value>4000</value></property>
<property><name>fs.hdfs.impl</name><value>org.apache.hadoop.hdfs.DistributedFileSystem</value></property>
<property><name>io.bytes.per.checksum</name><value>512</value></property>
<property><name>io.mapfile.bloom.error.rate</name><value>0.005</value></property>
<property><name>io.seqfile.lazydecompress</name><value>true</value></property>
<property><name>local.cache.size</name><value>10737418240</value></property>
<property><name>hadoop.security.authorization</name><value>false</value></property>
<property><name>hadoop.rpc.socket.factory.class.default</name><value>org.apache.hadoop.net.StandardSocketFactory</value></property>
<property><name>io.skip.checksum.errors</name><value>false</value></property>
<property><name>io.seqfile.compress.blocksize</name><value>1000000</value></property>
<property><name>fs.ramfs.impl</name><value>org.apache.hadoop.fs.InMemoryFileSystem</value></property>
<property><name>webinterface.private.actions</name><value>false</value></property>
<property><name>fs.default.name</name><value>hdfs://localhost:9000</value></property> #overridden by a new value
</configuration>
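The layering shown above (classpath files applied over jar defaults) can be mimicked with plain java.util.Properties. This is only an analogy to Hadoop's behavior, not Hadoop's actual code; the property names are taken from the dumps above:

```java
import java.util.Properties;

public class OverrideDemo {
    // Build the merged view: site properties are consulted first,
    // with the jar defaults as fallback (Properties' built-in chaining).
    static Properties merged() {
        // Defaults baked into the jar (analogous to hadoop-default.xml).
        Properties defaults = new Properties();
        defaults.setProperty("fs.default.name", "file:///");
        defaults.setProperty("topology.script.number.args", "100");

        // Site configuration found on the classpath (analogous to conf/core-site.xml).
        Properties site = new Properties(defaults);
        site.setProperty("fs.default.name", "hdfs://localhost:9000"); // overrides the default
        site.setProperty("hadoopdb.config.file", "HadoopDB.xml");     // brand-new property
        return site;
    }

    public static void main(String[] args) {
        Properties conf = merged();
        System.out.println(conf.getProperty("fs.default.name"));            // hdfs://localhost:9000
        System.out.println(conf.getProperty("topology.script.number.args")); // 100 (default survives)
        System.out.println(conf.getProperty("hadoopdb.config.file"));       // HadoopDB.xml
    }
}
```

The point of the analogy: a key present in the site layer wins, a key present only in the defaults layer survives, and a key present only in the site layer is simply added, which matches the three annotated lines in the dump.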
3. Summary
This behavior exists because, internally, the Configuration class locates its resources as follows:

ClassLoader cL = Thread.currentThread().getContextClassLoader();
if (cL == null) {
    cL = Configuration.class.getClassLoader();
}

It then calls cL.getResource("XXX.xml") to load each resource. ClassLoader.getResource resolves names relative to the roots of the classpath entries, not relative to the calling class's package.
For example, given the layout

classpath root
|- com/cn/test/
|  |- Test.class
|  |- test2.txt
|- test1.txt

Test.class.getClassLoader().getResource("test1.txt") returns the URL of test1.txt, while Test.class.getClassLoader().getResource("test2.txt") returns null, because test2.txt sits inside the package directory and would have to be requested as "com/cn/test/test2.txt". Therefore, to make additional resources visible, add the directories that contain them to the classpath.
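The lookup rule can be demonstrated with a small self-contained sketch: it builds a URLClassLoader over a temporary directory standing in for a conf folder. The file names are illustrative (mirroring Hadoop's core-site.xml), not tied to any real install:

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class ClasspathResourceDemo {
    // Returns true if `name` is visible at the root of the classpath entry `root`.
    static boolean found(Path root, String name) throws IOException {
        try (URLClassLoader cl = new URLClassLoader(new URL[] { root.toUri().toURL() })) {
            return cl.getResource(name) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a conf directory that will be placed on the classpath.
        Path confDir = Files.createTempDirectory("conf");
        Files.writeString(confDir.resolve("core-site.xml"), "<configuration></configuration>");

        // Found: the file sits at the root of a classpath entry.
        System.out.println(found(confDir, "core-site.xml"));  // true
        // Not found: no classpath entry contains this name at its root.
        System.out.println(found(confDir, "hdfs-site.xml"));  // false
    }
}
```

This is exactly why adding $HADOOP_HOME/conf to the classpath fixes the problem: the directory becomes a classpath root, so its XML files become resolvable by name through getResource.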