Hue 3.10 already ships with the Spark notebook feature, but you have to configure it yourself.
Installation steps
The setup mainly follows run-hue-spark-notebook-on-cloudera.
Notes
In step four:
#download Livy
wget http://archive.cloudera.com/beta/livy/livy-server-0.2.0.zip
unzip livy-server-0.2.0.zip -d /<your_livy_dir>
#set environment variables for Livy
export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
export HADOOP_CONF_DIR=<your hadoop_conf_dir found in the previous step in Hue configuration>
export HUE_SECRET_KEY=<your Hue superuser password, this is usually the user you use when you log in to Hue Web UI the first time>
#run Livy. You must run Livy as a user who has access to hdfs, for example, the superuser hdfs.
su hdfs
/<your_livy_dir>/livy-server-0.2.0/bin/livy-server
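Note: the HADOOP_CONF_DIR path may change whenever CDH is restarted, so it has to be set again each time. Alternatively, set it to the fixed path /etc/hadoop/conf; see "Configuring and starting services in CDH".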
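The note above can be sketched as a small shell snippet to run before starting Livy. It is only an illustration: the helper name pick_hadoop_conf_dir is hypothetical, and the check for core-site.xml assumes that a usable Hadoop configuration directory contains that file. It prefers the fixed /etc/hadoop/conf path and falls back to whatever HADOOP_CONF_DIR is already set to.

```shell
# Hypothetical helper (not part of CDH or Livy): print the first candidate
# directory that contains core-site.xml, or fail if none does.
pick_hadoop_conf_dir() {
  for dir in "$@"; do
    if [ -f "$dir/core-site.xml" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Prefer the fixed path; fall back to the current value, if any.
if conf_dir=$(pick_hadoop_conf_dir /etc/hadoop/conf "${HADOOP_CONF_DIR:-}"); then
  export HADOOP_CONF_DIR="$conf_dir"
  echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR"
else
  echo "no Hadoop configuration directory found" >&2
fi
```

Because /etc/hadoop/conf does not change across restarts, exporting it this way avoids having to look up the generated directory each time.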
Test
val data = Array(1, 2, 3, 4, 5)
val distData = sc.parallelize(data)
distData.map(s=>s+1).collect()