Packaging a MapReduce program in Eclipse and submitting it to a Hadoop cluster

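Once the program could be run on the Hadoop cluster from the command line, I set up the corresponding configuration in Eclipse and clicked Run on Hadoop.

The job ran successfully and the results showed up in HDFS, but it still had not been submitted to the real cluster.

I spent a long time searching and tried setting the remote JobTracker address directly in the code, still without success.

So I debugged the program in Eclipse and, once it ran, packaged it as a jar and uploaded it to the Hadoop cluster to run there: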

Export the project directly, making sure the jar's META-INF/MANIFEST.MF contains a Main-Class entry:

Main-Class: WordCount

In practice, simply clicking Next through the export wizard generates a manifest that already contains this entry.
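
To confirm that an exported jar really carries the Main-Class entry, the manifest can be read back with the JDK's java.util.jar classes. The sketch below builds a tiny in-memory jar purely for illustration; ManifestCheck, buildJar and readMainClass are hypothetical names, not part of the original program.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class ManifestCheck {

    /** Builds an in-memory jar that contains only a manifest naming mainClass. */
    static byte[] buildJar(String mainClass) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise the attributes are not written.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.MAIN_CLASS, mainClass);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            // No further entries; the manifest alone is enough for this check.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns the Main-Class attribute of the jar, or null if there is none. */
    static String readMainClass(byte[] jarBytes) {
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Main-Class: " + readMainClass(buildJar("WordCount")));
    }
}
```

For a jar on disk, java.util.jar.JarFile#getManifest returns the same kind of Manifest object, so the lookup in readMainClass applies unchanged.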

Upload the jar to the server. Assuming it is in /opt, the command is:

hadoop jar /opt/myWordCount.jar WordCount /test_in /output12

This failed with:

Exception in thread "main" java.lang.UnsupportedClassVersionError: WordCount : Unsupported major.minor version 52.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:205)

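After searching online, I suspected a Java version mismatch: Eclipse on Windows 7 was using Java 1.8 (class file version 52.0), while the server was running Java 1.7.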
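
The version in the error message comes from the class file header: four magic bytes, then a two-byte minor version and a two-byte major version. The sketch below reads that header from a byte array; ClassVersion and its method names are hypothetical, and the header bytes are written by hand rather than taken from the real WordCount.class.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ClassVersion {

    /** Returns the major version stored in a class file's header. */
    static int majorVersion(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readUnsignedShort();        // minor version, not needed here
            return in.readUnsignedShort(); // major version
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Maps a major version to a Java release (49 is Java 5, 51 is Java 7, 52 is Java 8). */
    static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) {
        // Hand-written header of a class compiled for Java 8: magic, minor 0, major 52.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        int major = majorVersion(header);
        System.out.println("major " + major + " = Java " + javaRelease(major));
    }
}
```

A Java 1.7 runtime accepts class files up to major version 51, which is why a class compiled at level 1.8 is rejected on the server.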

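In Eclipse, go to Window > Preferences > Java > Compiler and set the Compiler compliance level to 1.7.

Then I rebuilt the jar, uploaded it, and ran it again.

A new error appeared: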

14/11/07 10:33:46 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:47 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:48 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:49 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:50 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:51 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/11/07 10:33:52 INFO ipc.Client: Retrying connect to server: hadoop-05/192.168.0.7:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

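The client could not connect to the ResourceManager. I checked yarn-site.xml and everything appeared to be configured.

However, the port numbers differed from the defaults, so I changed them.

The configuration file now reads: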

        <property>
                <name>yarn.resourcemanager.address</name>
                <value>localhost:8032</value>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>localhost:8030</value>
        </property>
        <property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>localhost:8031</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>192.168.0.7</value>
        </property>
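
One way to check whether the ResourceManager port is actually accepting connections, independently of Hadoop, is a plain TCP probe. This is only a diagnostic sketch: PortProbe and canConnect are hypothetical names, and the default host and port are the ones that appear in the retry messages above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, timed out, or host could not be resolved
        }
    }

    public static void main(String[] args) {
        // Defaults come from the log lines; pass host and port to override them.
        String host = args.length > 0 ? args[0] : "192.168.0.7";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8032;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 2000));
    }
}
```

If the probe fails from the machine that submits the job, the problem lies with the ResourceManager process or the address it is bound to rather than with the job itself.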

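Running it again produced the same error, so I commented out the job.tracker setting that was specified explicitly in the code.

Surprisingly, yet another error appeared: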

Usage: wordcount <in> <out>

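Checking the code, this message is printed when the number of input arguments is not exactly two. I could not see anything wrong with the command, so I hard-coded the paths in the program and rebuilt the jar: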

  FileInputFormat.addInputPath(job, new Path("hdfs://192.168.0.7:9000/test_in"));
  FileOutputFormat.setOutputPath(job, new Path("hdfs://192.168.0.7:9000/out1"));

After submitting this jar to the Hadoop cluster, the job produced its results.


I still have not figured out why passing the paths on the command line did not work. Noting it here for now.
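
A possible explanation, not verified on this cluster: when the jar's manifest already names a Main-Class, hadoop jar uses that class and passes every token after the jar path to the program. The WordCount on the command line would then arrive as the first program argument, giving three arguments instead of two. The sketch below only imitates that selection logic in plain Java; RunJarArgs and programArgs are hypothetical names, not Hadoop code.

```java
import java.util.Arrays;

public class RunJarArgs {

    /**
     * Imitates how the launcher chooses the program arguments.
     * cliArgs is everything after "hadoop jar"; manifestMainClass is null
     * when the jar has no Main-Class entry.
     */
    static String[] programArgs(String[] cliArgs, String manifestMainClass) {
        int first = 1;                 // cliArgs[0] is the jar path
        if (manifestMainClass == null) {
            first++;                   // the next token is consumed as the class name
        }
        return Arrays.copyOfRange(cliArgs, first, cliArgs.length);
    }

    public static void main(String[] args) {
        String[] cli = {"/opt/myWordCount.jar", "WordCount", "/test_in", "/output12"};
        // Manifest has Main-Class: the class name leaks into the arguments.
        System.out.println(Arrays.toString(programArgs(cli, "WordCount")));
        // No Main-Class in the manifest: only the two paths remain.
        System.out.println(Arrays.toString(programArgs(cli, null)));
    }
}
```

If this is indeed the cause, either dropping WordCount from the command (hadoop jar /opt/myWordCount.jar /test_in /output12) or exporting the jar without a Main-Class entry should leave exactly two arguments.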





