Problems encountered while using Spark

My environment:

Scala 2.10.6
Hadoop 2.6.2
JDK 8u66 (Linux x64)
Spark 1.5.2
One master, two slaves

Problem 1:

scala> val textFile = sc.textFile("README.md")

The error message is: error: not found: value sc

sc is the Spark context; it is normally created for you before you ever build an RDD, so clearly its creation failed.

Possible causes:

HDFS is unavailable because Hadoop was never started; spark-shell itself failed while loading; wrong permissions; environment variables misconfigured, i.e. wrong paths.

Note:

Spark context available as sc.

"sc" is simply the name the shell gives the Spark context.
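If spark-shell itself loads but sc was never created, one workaround while you investigate is to build a SparkContext by hand inside the shell. This is only a minimal sketch under the assumption that the Spark classes loaded correctly; the app name "shell-recovery" and the local[*] master are placeholder choices of mine, not from the original setup:

scala> import org.apache.spark.{SparkConf, SparkContext}
scala> val conf = new SparkConf().setAppName("shell-recovery").setMaster("local[*]")  // placeholder app name and local master
scala> val sc = new SparkContext(conf)          // create the context the shell failed to provide
scala> val textFile = sc.textFile("README.md")
scala> textFile.count()                         // simple action to confirm the context actually works

Of course, this only papers over the symptom; the real fix is still to find out why the shell could not create sc in the first place.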


Problem 2:

spark.SparkContext: error initializing SparkContext

The overall error message is: sparkDriver could not bind on port 0

Relevant excerpts:

starting remoting

         java.net.BindException: Failed to bind to: /10.1.4.221:0: shutting down Netty transport

         Service 'sparkDriver' failed after 16 retries!

The error appears on the master; slave2 does not report it.

Part of the output on slave2:

         Successfully started service 'sparkDriver' on port 42887

         Remoting started listening on addresses: [akka.tcp://sparkDriver@<slave2 IP>:42887]

         * The IP address in the line above is slave2's address.

Both nodes report the error:

         error: not found: value sqlContext

Solution:

export SPARK_LOCAL_IP=127.0.0.1

Note that export only affects the current shell session; it sets the variable temporarily. It is better to add the line to $SPARK_HOME/conf/spark-env.sh, for example as sketched below.
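The entry in $SPARK_HOME/conf/spark-env.sh could look like the sketch below. The value 127.0.0.1 follows the workaround above and only really makes sense on a single machine; on a multi-node cluster the node's real, reachable address is usually the safer choice, otherwise the slaves may not be able to talk to the driver:

# in $SPARK_HOME/conf/spark-env.sh
# pin the address Spark binds to, so an IP change does not break the driver
export SPARK_LOCAL_IP=127.0.0.1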

     Guessed cause of the error: the network cable was moved a few days ago, so the machine's IP changed?

Reference link for solving the above problem:

http://stackoverflow.com/questions/30085779/apache-spark-error-while-start

 

After that, the only remaining error was: error: not found: value sqlContext

Related information found via Google:

Looks like your Spark config may be trying to log to an HDFS path. Can you review your config settings?

While reading a local file which is not in HDFS through spark shell, does the HDFS need to be up and running?

The data may be spilled off to disk, hence HDFS is a necessity for Spark.

You can run Spark on a single machine & not use HDFS, but in distributed mode HDFS will be required.
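As an aside, once sc exists you can read a file that lives only on the local filesystem by giving an explicit file:// URI, without going through HDFS; a minimal sketch (the path is just an example, not the actual one used here):

scala> val localReadme = sc.textFile("file:///usr/local/spark/README.md")  // explicit local path, bypasses HDFS
scala> localReadme.count()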

So the likely cause of the problem is:

Hadoop had not been started.

         ** I had forgotten that the machines had been rebooted after a power outage.

So:

bin/hadoop namenode -format    (note: this formats HDFS and wipes its metadata; it is normally only needed on a brand-new cluster)

sbin/start-dfs.sh

sbin/start-yarn.sh
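Before moving on to Spark, it can help to confirm that the Hadoop daemons actually came up, e.g. with jps (it ships with the JDK); roughly speaking, you should see NameNode, SecondaryNameNode and ResourceManager on the master, and DataNode and NodeManager on the slaves:

jps    # lists the running Java processes, one per daemon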

Then start Spark again:

Go into the Spark directory and run sbin/start-all.sh

Then run bin/spark-shell, and everything works.
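Once the shell starts cleanly, a quick sanity check that both sc and sqlContext really exist (just a sketch; the numbers are arbitrary):

scala> sc.parallelize(1 to 10).sum()   // should print 55.0 if the SparkContext is alive
scala> sqlContext.range(5).count()     // should print 5 if the SQLContext is alive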

 




