Making Hive on Spark and Spark SQL coexist on the HUE platform (hadoop-3.1.2 + hive-3.1.1 + spark-2.3.3 + hue-4.3.0)

Background

First, some background on my setup. I had already got Hive on Spark working, with the computation running on three local VMs (5 GB + 3 GB + 4 GB of RAM; the 5 GB and 4 GB VMs sit on an NVMe SSD, a 500 GB WD Black rated at 3500 MB/s read / 2500 MB/s write). I also wanted to try Spark SQL, but after reading many blog posts I found that Spark SQL and Hive on Spark require two different ways of compiling Spark. I did not want to compile Spark a second time, so I looked for a way to get both features onto the same build.

The results, up front:

[screenshot: Spark SQL in HUE]

[screenshot: Hive on Spark in HUE]

Attempts

A Spark compiled for Hive on Spark does not ship the Hive integration jars, so I first simply ran ./start-thriftserver.sh --master yarn --deploy-mode client:

starting org.apache.spark.sql.hive.thriftserver.HiveThriftServer2, logging to /usr/local/spark-2.3.3/logs/spark-hadoop-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-master.out
failed to launch: nice -n 0 bash /usr/local/spark-2.3.3/bin/spark-submit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --name Thrift JDBC/ODBC Server --master yarn
  	at org.apache.spark.util.Utils$.classForName(Utils.scala:239)
  	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:851)
  	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
  	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
  	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
  	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
  Failed to load main class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.
  You need to build Spark with -Phive and -Phive-thriftserver.
  2019-03-14 18:53:42,524 INFO util.ShutdownHookManager: Shutdown hook called
  2019-03-14 18:53:42,525 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-40240fda-9b4e-47f1-801b-855dd8f7d4ea
full log in /usr/local/spark-2.3.3/logs/spark-hadoop-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-master.out

The error log:

2019-03-14 18:53:42,419 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
java.lang.ClassNotFoundException: org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.spark.util.Utils$.classForName(Utils.scala:239)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:851)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Failed to load main class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.
You need to build Spark with -Phive and -Phive-thriftserver.
2019-03-14 18:53:42,524 INFO util.ShutdownHookManager: Shutdown hook called
2019-03-14 18:53:42,525 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-40240fda-9b4e-47f1-801b-855dd8f7d4ea

The blog post https://blog.csdn.net/mnasd/article/details/80398975 mentions that after finishing his build, the author placed two files into jars/. So I took the same-named jars from the official Spark binary download (spark-hive-thriftserver_2.11-2.3.3.jar and spark-hive_2.11-2.3.3.jar) and copied them into my build's jars/ as well.
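The copy step, sketched with throwaway directories so it can be run anywhere; in practice OFFICIAL_SPARK is wherever you unpacked the official spark-2.3.3 binary download and MY_SPARK is the local no-Hive build (both paths are assumptions):

```shell
# Stand-in directories; replace with the real unpacked official download
# and the local no-Hive build (e.g. /usr/local/spark-2.3.3).
OFFICIAL_SPARK=$(mktemp -d)
MY_SPARK=$(mktemp -d)
mkdir -p "$OFFICIAL_SPARK/jars" "$MY_SPARK/jars"
touch "$OFFICIAL_SPARK/jars/spark-hive_2.11-2.3.3.jar" \
      "$OFFICIAL_SPARK/jars/spark-hive-thriftserver_2.11-2.3.3.jar"

# The actual step: copy the two Hive-integration jars into the local build.
for jar in spark-hive_2.11-2.3.3.jar spark-hive-thriftserver_2.11-2.3.3.jar; do
  cp "$OFFICIAL_SPARK/jars/$jar" "$MY_SPARK/jars/"
done
```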

Another error:

starting org.apache.spark.sql.hive.thriftserver.HiveThriftServer2, logging to /usr/local/spark-2.3.3/logs/spark-hadoop-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-master.out
failed to launch: nice -n 0 bash /usr/local/spark-2.3.3/bin/spark-submit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --name Thrift JDBC/ODBC Server --master yarn
  Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
  	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
  	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
  	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  	... 8 more
  Failed to load hive class.
  You need to build Spark with -Phive and -Phive-thriftserver.
  2019-03-14 18:51:26,873 INFO util.ShutdownHookManager: Shutdown hook called
  2019-03-14 18:51:26,875 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-0003a80c-d623-4d59-a9e3-ba4c842deda1
full log in /usr/local/spark-2.3.3/logs/spark-hadoop-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-master.out

Checking the log:

2019-03-14 18:51:26,666 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.spark.util.Utils$.classForName(Utils.scala:239)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:851)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 8 more
Failed to load hive class.
You need to build Spark with -Phive and -Phive-thriftserver.
2019-03-14 18:51:26,873 INFO util.ShutdownHookManager: Shutdown hook called
2019-03-14 18:51:26,875 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-0003a80c-d623-4d59-a9e3-ba4c842deda1

A search for org.apache.hadoop.hive.conf.HiveConf turned up nothing matching my situation. I wondered whether other jars were still missing, so I checked which Hive-related jars the official spark/jars folder contains that my no-Hive build lacks, and found five jars whose names start with hive.
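That comparison can be scripted. A sketch, again using throwaway directories and illustrative jar names (in practice OFFICIAL and LOCAL point at the official download's jars/ folder and the local build's jars/ folder):

```shell
# Stand-in directories; replace with the two real jars/ folders.
OFFICIAL=$(mktemp -d)
LOCAL=$(mktemp -d)
touch "$OFFICIAL/hive-exec-1.2.1.spark2.jar" \
      "$OFFICIAL/hive-metastore-1.2.1.spark2.jar" \
      "$OFFICIAL/spark-core_2.11-2.3.3.jar"
touch "$LOCAL/spark-core_2.11-2.3.3.jar"

# List jars present in the official distro but absent locally,
# then keep only the hive-* ones.
ls "$OFFICIAL" | sort > official.list
ls "$LOCAL"    | sort > local.list
comm -23 official.list local.list | grep '^hive'
# -> hive-exec-1.2.1.spark2.jar
#    hive-metastore-1.2.1.spark2.jar
```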

I copied all five in and tried again:

starting org.apache.spark.sql.hive.thriftserver.HiveThriftServer2, logging to /usr/local/spark-2.3.3/logs/spark-hadoop-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-master.out
failed to launch: nice -n 0 bash /usr/local/spark-2.3.3/bin/spark-submit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --name Thrift JDBC/ODBC Server --master yarn
  	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
  	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
  Caused by: java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.1.2
  	at org.apache.hadoop.hive.shims.ShimLoader.getMajorVersion(ShimLoader.java:174)
  	at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:139)
  	at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:100)
  	at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:368)
  	... 19 more
  2019-03-14 19:05:06,080 INFO util.ShutdownHookManager: Shutdown hook called
  2019-03-14 19:05:06,080 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-3ef664cd-a5a6-45db-8fde-2e7fff6de7c8
full log in /usr/local/spark-2.3.3/logs/spark-hadoop-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-master.out

The error changed: it no longer told me to rebuild Spark, but it still failed. Checking the log:

(One issue was skipped over when I reproduced this, so let me mention it here: sometimes the launch complains about a missing jline. Copy the jline jar from the official spark/jars into your local jars/ folder.)

This time the log pointed to:

2019-03-14 19:05:05,883 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-03-14 19:05:06,036 INFO thriftserver.HiveThriftServer2: Started daemon with process name: 18505@master
2019-03-14 19:05:06,038 INFO util.SignalUtils: Registered signal handler for TERM
2019-03-14 19:05:06,038 INFO util.SignalUtils: Registered signal handler for HUP
2019-03-14 19:05:06,038 INFO util.SignalUtils: Registered signal handler for INT
2019-03-14 19:05:06,045 INFO thriftserver.HiveThriftServer2: Starting SparkContext
Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:105)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.spark.util.Utils$.classForName(Utils.scala:239)
	at org.apache.spark.sql.SparkSession$.hiveClassesArePresent(SparkSession.scala:1079)
	at org.apache.spark.sql.SparkSession$Builder.enableHiveSupport(SparkSession.scala:866)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:48)
	at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:79)
	at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.1.2
	at org.apache.hadoop.hive.shims.ShimLoader.getMajorVersion(ShimLoader.java:174)
	at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:139)
	at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:100)
	at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:368)
	... 19 more
2019-03-14 19:05:06,080 INFO util.ShutdownHookManager: Shutdown hook called
2019-03-14 19:05:06,080 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-3ef664cd-a5a6-45db-8fde-2e7fff6de7c8

It said my Hadoop version was unsupported. The answers I found online suggested changing the Hadoop version, but Hive 3.1.1 targets Hadoop 3.x.y and Spark 2.3.0; dropping to Hadoop 2.8 would mean Hive 2.3 and a Spark no newer than 2.0. I did not want to rebuild the whole stack, so I kept digging. I tried most of the commonly suggested fixes; this version incompatibility really is a hard problem, and I was close to giving up when I happened upon this page: https://stackoverflow.com/questions/53915059/how-can-i-fix-java-lang-illegalargumentexception-unrecognized-hadoop-major-vers

The asker there had exactly my problem, hit during a query. Everyone answering blamed the Hadoop version, but he took a different route: he forcibly overrode the hive-exec version in his Maven dependencies, and that solved it. That gave me an idea. The official Spark jars folder I mentioned earlier contains a jar of the same name, hive-exec-1.2.1.spark2.jar, so I found the Hive 3.1 counterpart, hive-exec-3.1.1.jar, in Hive's lib directory and copied it into jars/:

starting org.apache.spark.sql.hive.thriftserver.HiveThriftServer2, logging to /usr/local/spark-2.3.3/logs/spark-hadoop-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-master.out
failed to launch: nice -n 0 bash /usr/local/spark-2.3.3/bin/spark-submit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --name Thrift JDBC/ODBC Server --master yarn
  	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
  	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
  Caused by: java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.1.2
  	at org.apache.hadoop.hive.shims.ShimLoader.getMajorVersion(ShimLoader.java:174)
  	at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:139)
  	at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:100)
  	at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:368)
  	... 19 more
  2019-03-14 19:24:38,106 INFO util.ShutdownHookManager: Shutdown hook called
  2019-03-14 19:24:38,107 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-b450cf31-1d45-4d43-a5d2-f288328f46dc
full log in /usr/local/spark-2.3.3/logs/spark-hadoop-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-master.out

The error was unchanged. Then I wondered whether it was a naming-convention problem: perhaps the jar was simply not being picked up.

So I renamed it to follow the naming scheme of the official jars folder, keeping the 3.1.1 version number: hive-exec-3.1.1.spark2.jar.
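Put together, the fix amounts to one copy-and-rename. A sketch with stand-in directories (the real Hive lib path is an assumption; adjust to your install):

```shell
# Stand-ins for Hive's lib/ and Spark's jars/; in practice these are
# something like /usr/local/hive-3.1.1/lib and /usr/local/spark-2.3.3/jars.
HIVE_LIB=$(mktemp -d)
SPARK_JARS=$(mktemp -d)
touch "$HIVE_LIB/hive-exec-3.1.1.jar"

# Copy Hive 3.1.1's hive-exec into Spark's jars/, renaming it to match the
# naming pattern the official distro uses for its bundled hive jars.
cp "$HIVE_LIB/hive-exec-3.1.1.jar" "$SPARK_JARS/hive-exec-3.1.1.spark2.jar"
```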

hadoop@master:/usr/local/spark-2.3.3/sbin$ ./start-thriftserver.sh --master yarn
starting org.apache.spark.sql.hive.thriftserver.HiveThriftServer2, logging to /usr/local/spark-2.3.3/logs/spark-hadoop-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-master.out
hadoop@master:/usr/local/spark-2.3.3/sbin$ 

To my delight, it started. I checked the log for any remaining errors and found none, then ran the tests shown in the screenshots at the top. Problem solved.

 

References

https://blog.csdn.net/mnasd/article/details/80398975

https://stackoverflow.com/questions/53915059/how-can-i-fix-java-lang-illegalargumentexception-unrecognized-hadoop-major-vers
