My environment:
Problem 1:
scala> val textFile = sc.textFile("README.md")
The error message is: error: not found: value sc
sc is the SparkContext; it should already be available when creating an RDD, so it clearly failed to be created.
Possible causes:
HDFS is down because Hadoop was not started; spark-shell itself failed to load; wrong permissions; environment variables not configured properly => paths are wrong.
Note: on a successful start the shell prints "Spark context available as sc."
sc is the abbreviation for Spark context.
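As a sanity check once spark-shell starts cleanly, a minimal sketch, assuming README.md sits in the directory the shell was launched from:

scala> val textFile = sc.textFile("README.md")   // sc is predefined by the shell
scala> textFile.count()                          // returns a line count instead of "not found: value sc"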
Problem 2:
spark.SparkContext: error initializing
The overall error message is: sparkDriver could not bind on port 0
Summary of the details:
starting remoting
java.net.BindException: Failed to bind to: /10.1.4.221:0: shutting down Netty transport
Service 'sparkDriver' failed after 16 retries!
The error appears on the master; slave2 does not report it.
Part of the log on slave2:
Successfully started service 'sparkDriver' on port 42887
Remoting started listening on addresses: [akka.tcp://[email protected]:42887]
*The IP address above is the node's own IP address.
Both report the error:
error: not found: value sqlContext
Solution:
export SPARK_LOCAL_IP=127.0.0.1
Note that export only takes effect in the current shell window; it creates just a temporary environment variable, so it is better to add it to $SPARK_HOME/conf/spark-env.sh.
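For example, a persistent setting could look like this in spark-env.sh (the exact path is an assumption based on a default install):

# $SPARK_HOME/conf/spark-env.sh
export SPARK_LOCAL_IP=127.0.0.1   # picked up by spark-shell and the daemons started afterwards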
Guessed cause of the error: the network cable was moved a few days ago, so the IP address changed?
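If that guess is right, the current address can be compared with the 10.1.4.221 in the bind error (hostname -I and the /etc/hosts lookup below are assumptions about a typical Linux setup):

hostname -I                    # IP address(es) currently assigned to this node
grep $(hostname) /etc/hosts    # hostname-to-IP mapping that the driver resolves

If they no longer match, update the mapping or pin SPARK_LOCAL_IP as above.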
Reference link for solving the above problem:
http://stackoverflow.com/questions/30085779/apache-spark-error-while-start
After that, only the "error: not found: value sqlContext" error remained.
Related information found via Google:
Looks like your Spark config may be trying to log to an HDFS path. Can you review your config settings?
While reading a local file which is not in HDFS through spark-shell, does HDFS need to be up and running?
The data may be spilled off to disk, hence HDFS is a necessity for Spark.
You can run Spark on a single machine & not use HDFS but in distributed mode HDFS will be required.
So the cause of the problem should be:
Hadoop had not been started.
**I had forgotten that the machine had been rebooted after a power outage.
So:
bin/hadoop namenode -format
sbin/start-dfs.sh
sbin/start-yarn.sh
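Before moving on to Spark, a quick check that the Hadoop daemons actually came up (assuming the JDK's jps tool is on the PATH):

jps   # should list NameNode, SecondaryNameNode and ResourceManager here; DataNode/NodeManager appear on the workers, or locally in a single-machine setup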
Then start Spark again:
cd into the Spark directory and run sbin/start-all.sh
Then run bin/spark-shell and it succeeds.
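Once the shell is up, a quick confirmation that both values are now defined (a minimal sketch; the exact printed details vary by Spark version):

scala> sc           // prints a SparkContext instead of "not found: value sc"
scala> sqlContext   // prints a SQLContext instead of "not found: value sqlContext"

Re-running the textFile command from Problem 1 should now return a line count as well.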