Hadoop 2.0: Exception java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/v2/app/MRAppMaster

1. Problem

Running an MR job on YARN fails with the following error:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/v2/app/MRAppMaster
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.v2.app.MRAppMaster
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)

This problem has been discussed on the hadoop-mapreduce-user mailing list (http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201207.mbox/browser), but the discussion did not go very deep.

2. Analysis

This is obviously a class-loading failure, so the first thing to check is where the class lives. It is in hadoop-mapreduce-client-app-2.0.0-alpha.jar, under $HADOOP_HOME/share/hadoop/mapreduce (that is the layout in the 2.0 release; I expect it may change in later versions).
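
To confirm which jar actually contains the class, you can grep the jar listings. This is only a quick diagnostic sketch; it assumes unzip is installed and that $HADOOP_HOME points at the 2.0.0-alpha installation:

# Report every mapreduce jar that contains MRAppMaster.class
for jar in "$HADOOP_HOME"/share/hadoop/mapreduce/*.jar; do
  if unzip -l "$jar" | grep -q 'org/apache/hadoop/mapreduce/v2/app/MRAppMaster.class'; then
    echo "found in: $jar"
  fi
done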

My guess was a classpath problem, so I wanted to get hold of the exact parameters used when the container is launched.

We know the container is launched through a shell command. By debugging ContainerLaunch.java I finally found the launch parameters (the script below is eventually written to a file such as /tmp/nm-local-dir/nmPrivate/application_1350793073454_0005/container_1350793073454_0005_01_000001/launch_container.sh):

#!/bin/bash

export YARN_LOCAL_DIRS="/tmp/nm-local-dir/usercache/yarn/appcache/application_1350707900707_0003"
export NM_HTTP_PORT="8042"
export JAVA_HOME="/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre"
export NM_HOST="hd19-vm4.yunti.yh.aliyun.com"
export CLASSPATH="$PWD:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/share/hadoop/common/*:$HADOOP_COMMON_HOME/share/hadoop/common/lib/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*:$YARN_HOME/share/hadoop/mapreduce/*:$YARN_HOME/share/hadoop/mapreduce/lib/*:job.jar:$PWD/*"
export HADOOP_TOKEN_FILE_LOCATION="/tmp/nm-local-dir/usercache/yarn/appcache/application_1350707900707_0003/container_1350707900707_0003_01_000001/container_tokens"
export APPLICATION_WEB_PROXY_BASE="/proxy/application_1350707900707_0003"
export JVM_PID="$$"
export USER="yarn"
export PWD="/tmp/nm-local-dir/usercache/yarn/appcache/application_1350707900707_0003/container_1350707900707_0003_01_000001"
export NM_PORT="49111"
export HOME="/home/"
export LOGNAME="yarn"
export APP_SUBMIT_TIME_ENV="1350788662618"
export HADOOP_CONF_DIR="/home/yarn/hadoop-2.0.0-alpha/conf"
export MALLOC_ARENA_MAX="4"
export AM_CONTAINER_ID="container_1350707900707_0003_01_000001"
ln -sf "/tmp/nm-local-dir/usercache/yarn/appcache/application_1350707900707_0003/filecache/-5059634618081520617/job.jar" "job.jar"
mkdir -p jobSubmitDir
ln -sf "/tmp/nm-local-dir/usercache/yarn/appcache/application_1350707900707_0003/filecache/8471400424465082106/appTokens" "jobSubmitDir/appTokens"
ln -sf "/tmp/nm-local-dir/usercache/yarn/appcache/application_1350707900707_0003/filecache/-511993817008097803/job.xml" "job.xml"
mkdir -p jobSubmitDir
ln -sf "/tmp/nm-local-dir/usercache/yarn/appcache/application_1350707900707_0003/filecache/5917092335430839370/job.split" "jobSubmitDir/job.split"
mkdir -p jobSubmitDir
ln -sf "/tmp/nm-local-dir/usercache/yarn/appcache/application_1350707900707_0003/filecache/5764499011863329844/job.splitmetainfo" "jobSubmitDir/job.splitmetainfo"
exec /bin/bash -c "$JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.mapreduce.container.log.dir=/tmp/logs/application_1350707900707_0003/container_1350707900707_0003_01_000001 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/tmp/logs/application_1350707900707_0003/container_1350707900707_0003_01_000001/stdout 2>/tmp/logs/application_1350707900707_0003/container_1350707900707_0003_01_000001/stderr  "

The classpath is:

"$PWD:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/share/hadoop/common/*:$HADOOP_COMMON_HOME/share/hadoop/common/lib/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*:$YARN_HOME/share/hadoop/mapreduce/*:$YARN_HOME/share/hadoop/mapreduce/lib/*:job.jar:$PWD/*"


It is controlled by the yarn.application.classpath parameter, whose default value is:

  <property>
    <description>Classpath for typical applications.</description>
     <name>yarn.application.classpath</name>
     <value>
        $HADOOP_CONF_DIR,
        $HADOOP_COMMON_HOME/share/hadoop/common/*,
        $HADOOP_COMMON_HOME/share/hadoop/common/lib/*,
        $HADOOP_HDFS_HOME/share/hadoop/hdfs/*,
        $HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,
        $YARN_HOME/share/hadoop/mapreduce/*,
        $YARN_HOME/share/hadoop/mapreduce/lib/*
     </value>
  </property>

Comparing the two, $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.0.0-alpha.jar should be covered by the $YARN_HOME/share/hadoop/mapreduce/* entry, so the question becomes: what is the value of $YARN_HOME?

Running this on the command line:

[yarn@hd19-vm2 ~]$ echo $YARN_HOME
/home/yarn/hadoop-2.0.0-alpha
The value is correct.

So why does it not take effect while launch_container.sh is being executed? To answer that we have to talk about how Linux handles environment variables; see http://vbird.dic.ksu.edu.tw/linux_basic/0320bash_4.php for reference.

That page (by 鳥哥/Vbird) explains the difference between a login and a non-login shell: a non-login shell does not read ~/.bash_profile, it reads ~/.bashrc instead. (Most of us put our environment variables in ~/.bash_profile.)

Shells started remotely or spawned from Java never read ~/.bash_profile. That is also why launch_container.sh itself exports so many environment variables; they are written mainly by ContainerLaunch#sanitizeEnv().
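
A quick way to verify this on your own cluster is to compare what an interactive login shell and a non-interactive remote shell see. This is just a diagnostic sketch (the host name is taken from the script above; on some distributions a remote non-interactive bash does source ~/.bashrc, so the second result depends on where exactly the variable is set):

# Interactive login shell: ~/.bash_profile has been read
echo "login shell:      YARN_HOME=[$YARN_HOME]"

# A remote command runs in a non-login shell, so ~/.bash_profile is not
# read and YARN_HOME comes back empty if it is only set there. The single
# quotes matter: they make $YARN_HOME expand on the remote side.
ssh hd19-vm4.yunti.yh.aliyun.com 'echo "remote non-login: YARN_HOME=[$YARN_HOME]"'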

In the script we can see export JAVA_HOME="/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre" but no export YARN_HOME=xxx, so when launch_container.sh runs, YARN_HOME is actually empty.

The export is missing because YARN_HOME is not present in the NodeManager's System.getenv(), so sanitizeEnv() never writes it into launch_container.sh (see the source of ContainerLaunch.java#sanitizeEnv()).

Let's also look at the environment the NodeManager JVM was started with:

System.getenv()
	 (java.util.Collections$UnmodifiableMap<K,V>) {HADOOP_PREFIX=/home/yarn/hadoop-2.0.0-alpha, SHLVL=2, JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64, YARN_LOG_DIR=/home/yarn/hadoop-2.0.0-alpha/logs, XFILESEARCHPATH=/usr/dt/app-defaults/%L/Dt, SSH_CLIENT=10.249.197.55 47859 22, MAIL=/var/mail/yarn, PWD=/home/yarn/hadoop-2.0.0-alpha, LOGNAME=yarn, CVS_RSH=ssh, G_BROKEN_FILENAMES=1, NLSPATH=/usr/dt/lib/nls/msg/%L/%N.cat, LD_LIBRARY_PATH=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/../lib/amd64, SSH_CONNECTION=10.249.197.55 47859 10.249.197.56 22, MALLOC_ARENA_MAX=4, SHELL=/bin/bash, YARN_ROOT_LOGGER=INFO,RFA, YARN_LOGFILE=yarn-yarn-nodemanager-hd19-vm2.yunti.yh.aliyun.com.log, PATH=/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin, USER=yarn, HOME=/home/yarn, LESSOPEN=|/usr/bin/lesspipe.sh %s, HADOOP_CONF_DIR=/home/yarn/hadoop-2.0.0-alpha/conf, LS_COLORS=, SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass, LANG=en_US.UTF-8, YARN_IDENT_STRING=yarn, YARN_NICENESS=0}
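
The same information can be obtained without a debugger by dumping the NodeManager's process environment from /proc. A diagnostic sketch, assuming a Linux host and the standard NodeManager main class on the command line:

# Find the NodeManager JVM and print the variables we care about.
# /proc/<pid>/environ is NUL-separated, so translate to newlines first.
NM_PID=$(pgrep -f org.apache.hadoop.yarn.server.nodemanager.NodeManager | head -n 1)
tr '\0' '\n' < /proc/"$NM_PID"/environ | grep -E 'YARN_HOME|HADOOP_|JAVA_HOME'

If YARN_HOME does not show up here, sanitizeEnv() has nothing to pass on, which matches what we saw in launch_container.sh.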

These come mainly from ~/.bashrc and from the command used to start the Hadoop daemons. (Values can also be carried over from the machine that launches the daemons; try this experiment: export a=b; ssh h2 "echo $a>test"; ssh h2 "cat test". Note that with double quotes, $a is expanded on the local machine before the command is sent.)
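
The quoting is what makes that experiment behave the way it does: with double quotes the local shell substitutes the value into the command string before ssh sends it, while single quotes defer expansion to the remote shell. A small sketch (h2 stands for any reachable host, as in the experiment above):

export a=b

# Double quotes: $a is expanded locally, so the remote host receives the
# literal command "echo b > test"; the value "carries over" because it is
# baked into the command string.
ssh h2 "echo $a > test"
ssh h2 "cat test"        # prints: b

# Single quotes: $a is expanded on the remote host, where it is normally
# unset, so the file ends up empty.
ssh h2 'echo $a > test'
ssh h2 'cat test'        # prints an empty line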


One more point worth stressing: when a script is sourced with ". xx.sh" and a variable x is assigned without export, x is only visible in the calling process, not in any child process it starts. That is exactly what happens to YARN_HOME.
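
A minimal sketch of that behaviour (demo.sh is a made-up file name used only for illustration):

# demo.sh assigns a variable but does not export it
cat > demo.sh <<'EOF'
YARN_HOME=/home/yarn/hadoop-2.0.0-alpha
EOF

. ./demo.sh
echo "in this shell: [$YARN_HOME]"              # visible in the sourcing shell
bash -c 'echo "in a child:    [$YARN_HOME]"'    # empty: it was never exported

export YARN_HOME
bash -c 'echo "after export:  [$YARN_HOME]"'    # now the child sees it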

3. Fix

With that understood, the fix is straightforward: set YARN_HOME and the related variables in ~/.bashrc. The variables to set are mainly JAVA_HOME, HADOOP_COMMON_HOME, HADOOP_HDFS_HOME, HADOOP_CONF_DIR and YARN_HOME.

Of these, JAVA_HOME is usually carried over by ssh (which of course requires JAVA_HOME to be identical on every machine), and HADOOP_CONF_DIR is already exported.

Alternatively, modify the scripts under $HADOOP_HOME/libexec so that YARN_HOME and the other variables are exported.
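
A sketch of the ~/.bashrc approach, using the installation paths seen earlier on this cluster (it assumes the common, hdfs and mapreduce trees all live under the same root, as the classpath above suggests; adjust the paths to your own layout) and added on every NodeManager host:

# ~/.bashrc on every NodeManager host. The variables must be exported so
# that child processes such as launch_container.sh can see them.
export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64
export HADOOP_CONF_DIR=/home/yarn/hadoop-2.0.0-alpha/conf
export HADOOP_COMMON_HOME=/home/yarn/hadoop-2.0.0-alpha
export HADOOP_HDFS_HOME=/home/yarn/hadoop-2.0.0-alpha
export YARN_HOME=/home/yarn/hadoop-2.0.0-alpha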
