Spark needs YARN (Hadoop version 2.7.7). The steps to configure it on Ubuntu 19 are as follows.
Configuration
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
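The notes do not say which file the JAVA_HOME export goes in. A minimal sketch, assuming it belongs in etc/hadoop/hadoop-env.sh under the Hadoop directory (the usual place in Hadoop 2.x); `set_java_home` is a hypothetical helper name, not part of Hadoop:

```shell
# Set (or replace) the JAVA_HOME export in a Hadoop env script.
# Usage: set_java_home <env-file> <java-home>
set_java_home() {
  local env_file="$1" java_home="$2"
  if grep -q '^export JAVA_HOME=' "$env_file"; then
    # An export line already exists: rewrite it in place (GNU sed).
    sed -i "s|^export JAVA_HOME=.*|export JAVA_HOME=${java_home}|" "$env_file"
  else
    # No export line yet: append one.
    echo "export JAVA_HOME=${java_home}" >> "$env_file"
  fi
}

# Example call (the file path is an assumption, run from the Hadoop directory):
# set_java_home etc/hadoop/hadoop-env.sh /usr/lib/jvm/java-1.8.0-openjdk-amd64
```

Replacing the line instead of always appending keeps the file from collecting several conflicting exports when the step is repeated.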
3. Edit core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
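To double-check a value after editing, the property can be read back from the file. This is only a sketch: `get_prop` is a hypothetical helper that assumes the one-tag-per-line layout used in these notes, and it is not a real XML parser. On a working install, `bin/hdfs getconf -confKey fs.defaultFS` reports the effective value instead.

```shell
# Print the <value> of a named property from a Hadoop *-site.xml file.
# Assumes <name> and <value> sit on consecutive lines, as in these notes.
get_prop() {
  local file="$1" name="$2"
  # Take the line after the matching <name> and strip the <value> tags.
  grep -A1 "<name>${name}</name>" "$file" \
    | sed -n 's|.*<value>\(.*\)</value>.*|\1|p'
}

# Example call (the file path is an assumption, run from the Hadoop directory):
# get_prop etc/hadoop/core-site.xml fs.defaultFS
```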
4. Edit hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
5. Add mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
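Hadoop 2.7.x ships etc/hadoop/mapred-site.xml.template rather than mapred-site.xml, which is presumably why this file is added rather than edited. One way to create it is to copy the template first and then put the property above into the copy. `ensure_mapred_site` is a hypothetical helper name:

```shell
# Create mapred-site.xml from the bundled template if it does not exist yet.
# Usage: ensure_mapred_site <conf-dir>
ensure_mapred_site() {
  local conf_dir="$1"
  if [ ! -f "$conf_dir/mapred-site.xml" ]; then
    cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
  fi
}

# Example call (run from the Hadoop directory):
# ensure_mapred_site etc/hadoop
```

The existence check means an already edited mapred-site.xml is never overwritten by the template.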
6. Edit yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
Installation
openssh-server must also be installed separately, because the start scripts launch the daemons over SSH:
sudo apt install openssh-server
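Installing the server is usually not enough on its own: the start scripts expect `ssh localhost` to work without a password prompt. A minimal sketch of the key setup, following the usual single-node Hadoop instructions; `authorize_key` is a hypothetical helper, and the key paths in the comments are the OpenSSH defaults, assumed here:

```shell
# Append a public key to an authorized_keys file unless it is already listed.
# Usage: authorize_key <public-key-file> <authorized_keys-file>
authorize_key() {
  local pub_file="$1" auth_file="$2"
  mkdir -p "$(dirname "$auth_file")"
  touch "$auth_file"
  # -x matches whole lines, -F treats the key as a fixed string.
  if ! grep -qxF -- "$(cat "$pub_file")" "$auth_file"; then
    cat "$pub_file" >> "$auth_file"
  fi
  # sshd ignores authorized_keys files that other users can write to.
  chmod 600 "$auth_file"
}

# Example session:
# ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
# authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
# ssh localhost   # should now log in without asking for a password
```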
Running
1. Format the file system
bin/hdfs namenode -format
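Formatting erases any existing HDFS metadata, so it is normally done only once. The sketch below guards against an accidental reformat. It assumes the default NameNode directory /tmp/hadoop-$USER/dfs/name, which should apply here because the configuration above sets neither hadoop.tmp.dir nor dfs.namenode.name.dir. `needs_format` is a hypothetical helper name:

```shell
# Succeed (exit 0) when the NameNode directory has not been formatted yet.
# Usage: needs_format <namenode-dir>
needs_format() {
  local name_dir="$1"
  # A formatted NameNode directory contains current/VERSION.
  [ ! -f "$name_dir/current/VERSION" ]
}

# Example call (run from the Hadoop directory; the path is the assumed default):
# needs_format "/tmp/hadoop-$USER/dfs/name" && bin/hdfs namenode -format
```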
2. Start DFS
sbin/start-dfs.sh
3. Start YARN
sbin/start-yarn.sh
The YARN ResourceManager web UI listens on port 8088 by default.
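After both start scripts have run, a quick way to confirm that everything came up is to compare the output of `jps` (shipped with the JDK) against the daemons a single-node setup normally runs. This is a sketch only: `missing_daemons` is a hypothetical helper, and the daemon list assumes the pseudo-distributed configuration described above.

```shell
# Read `jps` output on stdin and print each expected daemon that is absent.
missing_daemons() {
  local jps_out d
  jps_out="$(cat)"
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # jps prints "<pid> <ClassName>"; compare the class name as a whole field
    # so that NameNode does not match SecondaryNameNode.
    echo "$jps_out" \
      | awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' \
      || echo "$d"
  done
}

# Example session:
# jps | missing_daemons            # prints nothing when all five are running
# curl -sI http://localhost:8088/  # the ResourceManager web UI should answer
```

If a daemon is listed as missing, its log file under the logs directory of the Hadoop installation is the first place to look.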