Hadoop Troubleshooting Notes

Eclipse Maven Missing artifact jdk.tools:jdk.tools:jar:1.6

Problem description:

In an earlier project I only fixed this enough to make it go away, without looking into the cause. Now, after reinstalling my system, the old configuration broke again, so I went looking for the reason.

Eclipse Maven Missing artifact jdk.tools:jdk.tools:jar:1.6

Cause:

An error like this means some dependency in pom.xml requires the jdk.tools jar, which ships inside a JDK (e.g. JDK 1.6/1.7) rather than in any Maven repository. If Maven cannot resolve it locally, it reports the jar as missing.

Other people's fix:

This solved the problem at the time (I no longer remember exactly what was set), but on the new system it no longer works, so the details matter less now. The idea is to declare tools.jar explicitly as a system-scoped dependency:

<dependency>
    <groupId>jdk.tools</groupId>
    <artifactId>jdk.tools</artifactId>
    <version>1.6</version>
    <scope>system</scope>
    <systemPath>${JAVA_HOME}/lib/tools.jar</systemPath>
</dependency>

My fix:

Because I pull in Hadoop's jars, which depend on jdk.tools (JDK 1.6), my approach is to stop Maven from resolving that JDK dependency at all by excluding it. The elided part excludes other transitive dependencies that conflict with my project.

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>${hadoop.version}</version>
    <exclusions>
        ......
        <exclusion>
            <artifactId>jdk.tools</artifactId>
            <groupId>jdk.tools</groupId>
        </exclusion>
    </exclusions>
</dependency>

Of course, I don't know which other dependencies may also require a different JDK (running mvn dependency:tree shows which ones pull in jdk.tools). This problem only appears in Eclipse; I have not seen it in IDEA so far.

------------------------------------------------------------------------------------------------------------------------------------

MapReduce job fails: insufficient resources

Description:

When the MapReduce job developed in Eclipse was packaged as a jar and run on the server, it kept failing with the insufficient-resources error below. During development I had hoped that a configuration file bundled with the program would take effect when the jar ran, to save effort, but after many attempts the error was always the same. My conclusion: configuration shipped with the program does not take effect here; you still have to change the server-side configuration and restart the service.

The error:

Container [pid=29704,containerID=container_1528184652168_0006_01_000002] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 3.1 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1528184652168_0006_01_000002 :
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- 29719 29704 29704 29704 (java) 4165 318 3246665728 265504 /usr/java/jdk1.8.0_131/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Djava.net.preferIPv4Stack=true -Xmx820m -Djava.io.tmpdir=/yarn/nm/usercache/hdfs/appcache/application_1528184652168_0006/container_1528184652168_0006_01_000002/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/yarn/container-logs/application_1528184652168_0006/container_1528184652168_0006_01_000002 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 10.10.201.222 57724 attempt_1528184652168_0006_m_000000_0 2
	|- 29704 29700 29704 29704 (bash) 0 0 108650496 310 /bin/bash -c /usr/java/jdk1.8.0_131/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Djava.net.preferIPv4Stack=true -Xmx820m -Djava.io.tmpdir=/yarn/nm/usercache/hdfs/appcache/application_1528184652168_0006/container_1528184652168_0006_01_000002/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/yarn/container-logs/application_1528184652168_0006/container_1528184652168_0006_01_000002 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 10.10.201.222 57724 attempt_1528184652168_0006_m_000000_0 2 1>/yarn/container-logs/application_1528184652168_0006/container_1528184652168_0006_01_000002/stdout 2>/yarn/container-logs/application_1528184652168_0006/container_1528184652168_0006_01_000002/stderr
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

The numbers in the message:

1.0 GB: the physical memory the task actually used
1 GB: the default value of mapreduce.map.memory.mb
3.1 GB: the virtual memory the program used
2.1 GB: mapreduce.map.memory.mb multiplied by yarn.nodemanager.vmem-pmem-ratio

Here yarn.nodemanager.vmem-pmem-ratio is the ratio of virtual memory to physical memory; it is set in yarn-site.xml and defaults to 2.1.
Clearly the container used 3.1 GB of virtual memory while only 2.1 GB had been allocated to it, so YARN killed the container.

The error above occurred in a map task; it can of course also occur in a reduce task, in which case the virtual memory limit is mapreduce.reduce.memory.mb * yarn.nodemanager.vmem-pmem-ratio.
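Concretely, these limits can be raised in the server-side configuration. A minimal sketch, with illustrative values rather than recommendations (the JVM heap set via mapreduce.map.java.opts should stay somewhat below the container size, since the container limit covers the whole process):

```xml
<!-- mapred-site.xml: per-container physical memory limits (illustrative values) -->
<property>
    <name>mapreduce.map.memory.mb</name>
    <value>2048</value>
</property>
<property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>
</property>
<property>
    <!-- heap for the map task JVM; keep it below mapreduce.map.memory.mb -->
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1638m</value>
</property>

<!-- yarn-site.xml: virtual-to-physical memory ratio (default 2.1) -->
<property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
</property>
```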

Note:

Physical memory: the real hardware device (RAM modules).
Virtual memory: a block of logical memory backed by disk space; the disk space used for it is called swap space. (It is a strategy to compensate for insufficient physical memory.)

When physical memory runs low, Linux uses the virtual memory in the swap partition: the kernel writes memory pages that are not currently needed out to swap space, which frees physical memory for other purposes; when the original contents are needed again, they are read back from swap into physical memory.

Fix:

You can edit the configuration files on the Linux server and restart the service. I did it through CDH's web management console (http://xx.xx.xx.xx:7180/cmf/home): open YARN, go to the configuration page, change the settings, and click restart in the browser.

Since my job is a map-only MapReduce, I only changed the map settings; after the change, the management page indicates which components need a restart, and clicking restart is enough.

Reference: https://www.2cto.com/net/201805/747075.html

-----------------------------------------------------------------------------------------------------------------------------------

Local Hadoop environment issues on Windows

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
	at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:355)
	at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:370)
	at org.apache.hadoop.util.Shell.<clinit>(Shell.java:363)
	at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:78)
	at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
	at org.apache.hadoop.security.Groups.<init>(Groups.java:77)
	at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:257)
	at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:234)
	at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:749)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:734)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:607)
	at org.apache.hadoop.mapreduce.task.JobContextImpl.<init>(JobContextImpl.java:72)
	at org.apache.hadoop.mapreduce.Job.<init>(Job.java:133)
	at org.apache.hadoop.mapreduce.Job.<init>(Job.java:123)
	at org.apache.hadoop.mapreduce.Job.<init>(Job.java:128)
	at SolrIndexMapReduce.SeaIndexMRByPerson.createSubmittableJob(SeaIndexMRByPerson.java:217)
	at SolrIndexMapReduce.SeaIndexMRByPerson.main(SeaIndexMRByPerson.java:244)

First configure a local Hadoop environment; reference: https://www.cnblogs.com/honey01/p/8027193.html

If you see the error above after the local Hadoop environment is configured, the "winutils.exe" file is missing; simply put that file into the bin directory of the Hadoop installation.

Mine is version 2.6.0; "winutils.exe" download: https://pan.baidu.com/s/1ISbhB427SQpP4Ez43R1vXw extraction code: q6ye
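Hadoop resolves winutils.exe via the hadoop.home.dir system property (or the HADOOP_HOME environment variable). If you prefer not to change system environment variables, one workaround is to set the property at the very top of main; the path below is a hypothetical install location, assuming winutils.exe sits in its bin subdirectory:

```java
public class HadoopHomeSetup {
    public static void main(String[] args) {
        // Hypothetical install path: the directory that contains bin\winutils.exe
        System.setProperty("hadoop.home.dir", "C:\\hadoop-2.6.0");
        // ...then build the Hadoop Configuration / Job as usual
        System.out.println(System.getProperty("hadoop.home.dir"));
    }
}
```

This must run before any Hadoop class is touched, because the path is read in the static initializer of org.apache.hadoop.util.Shell (visible at the top of the stack trace above).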

Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
	at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
	at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:570)
	at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:977)
	at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:173)
	at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:160)
	at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:94)
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:285)
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:344)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
	at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:131)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
	at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:432)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
	at SolrIndexMapReduce.HomePageIndexMR.main(HomePageIndexMR.java:283)

When the error above appears, the "hadoop.dll" file is missing; place that file into Hadoop's bin directory as well.

Mine is version 2.6.0; "hadoop.dll" download: https://pan.baidu.com/s/1JwQGt5KDPLesQSXDiqxvAg extraction code: rfmc
