Setting up a fully distributed Hadoop 2.7.2 cluster on CentOS 7
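I previously installed Hadoop 2.7.2 on CentOS 6.8, but it reported the following:
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
This is caused by the missing hadoop-native-64-2.7.0.tar, but the error persisted anyway, so I switched to CentOS 7.2 for the installation. I ran into new pitfalls there as well; see my post "CentOS 7 first experience".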
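Create the directory /usr/apache to hold the Hadoop family of software, which makes it easier to manage.
JDK installation:
Download JDK 1.8 from the official site (Hadoop 2.7 requires JDK 1.7 or later; to avoid problems I used the latest JDK version). Extract it and move it to the /usr/apache directory. Configure the environment variables: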
vi /etc/profile
Add the following:
#java
export JAVA_HOME=/usr/apache/jdk1.8.0_101
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
Then run source /etc/profile, and use java -version to check that Java is installed correctly.
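The version check can also be scripted. The following is a minimal sketch, assuming the legacy "1.x" version string that JDK 8 prints; the function names java_version_ok and parse_java_version are hypothetical helpers, not part of the JDK or Hadoop.

```shell
#!/usr/bin/env bash
# Returns 0 if a Java version string (e.g. "1.8.0_101") is at least 1.7,
# the minimum this article states for Hadoop 2.7.
java_version_ok() {
    local version="$1"
    local major rest minor
    major="${version%%.*}"      # text before the first dot
    rest="${version#*.}"        # text after the first dot
    minor="${rest%%.*}"         # text before the next dot
    # Newer JDKs report e.g. "11.0.2"; any major >= 2 is newer than 1.7.
    if [ "$major" -ge 2 ]; then return 0; fi
    [ "$major" -eq 1 ] && [ "$minor" -ge 7 ]
}

# Extracts the quoted version from the first line of `java -version` output.
parse_java_version() {
    sed -n '1s/.*version "\([^"]*\)".*/\1/p'
}
```

Since java -version writes to stderr, one way to use it is: java_version_ok "$(java -version 2>&1 | parse_java_version)" && echo "Java OK".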
Passwordless SSH configuration
For the passwordless SSH setup, see http://my.oschina.net/u/189445/blog/503525
You may get the error: -bash: ssh: command not found
Cause: this happens on a minimal CentOS installation.
Fix:
yum -y install openssh-clients
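The linked post covers the details; as a rough sketch of the usual approach, the commands for key-based login can be listed per host. The host names master, slave1 and slave2 are taken from the startup logs later in this article; the root user, the RSA key type and the helper name print_ssh_setup are assumptions. The script only prints the commands so they can be reviewed before being run by hand.

```shell
#!/usr/bin/env bash
# Prints the commands for key-based login from the current node to each host.
print_ssh_setup() {
    local user="$1"; shift
    # Generate a key pair with an empty passphrase.
    echo "ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"
    local host
    for host in "$@"; do
        # ssh-copy-id appends the public key to the remote authorized_keys.
        echo "ssh-copy-id ${user}@${host}"
    done
}

print_ssh_setup root master slave1 slave2
```

After running the printed commands on master, ssh slave1 should log in without asking for a password.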
Hadoop installation
Set the environment variables:
vi /etc/profile
#hadoop
export HADOOP_HOME=/usr/apache/hadoop-2.7.2
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Editing the Hadoop configuration files
The Hadoop 2.x configuration files are located under hadoop-2.7.2/etc/hadoop/:
Configure hadoop-env.sh and yarn-env.sh
# The java implementation to use.
export JAVA_HOME=/usr/apache/jdk1.8.0_101
export HADOOP_CONF_DIR=/usr/apache/hadoop-2.7.2/etc/hadoop/
The trailing / in HADOOP_CONF_DIR must be included, otherwise you will get the error:
master: Error: Cannot find configuration directory: /etc/hadoop
For yarn-env.sh, adding only the Java environment variable (JAVA_HOME) is enough.
core-site.xml configuration
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/usr/apache/hadoop-2.7.2/tmp</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
</configuration>
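To double-check what was written into a site file, a small lookup helper can be handy. This is a minimal sketch, assuming one tag per line as in the listings in this article; get_hadoop_prop is a hypothetical helper name, not a Hadoop command.

```shell
#!/usr/bin/env bash
# Prints the <value> that follows <name>KEY</name> in a Hadoop *-site.xml file.
# Assumes one tag per line; this is not a general XML parser.
get_hadoop_prop() {
    local file="$1" key="$2"
    awk -v key="$key" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, ""); sub(/<\/value>.*/, "")
            print; exit
        }
    ' "$file"
}
```

For example, get_hadoop_prop /usr/apache/hadoop-2.7.2/etc/hadoop/core-site.xml fs.defaultFS should print the NameNode URI. Once Hadoop is on the PATH, hdfs getconf -confKey fs.defaultFS reports the value Hadoop itself resolves.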
hdfs-site.xml configuration
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/apache/hadoop-2.7.2/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/apache/hadoop-2.7.2/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master:9001</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
mapred-site.xml configuration (this file must first be copied from mapred-site.xml.template)
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
</configuration>
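The copy step for mapred-site.xml can be sketched as below. The helper name ensure_mapred_site is hypothetical; it skips the copy when the target already exists so that an edited file is not overwritten.

```shell
#!/usr/bin/env bash
# Copies mapred-site.xml.template to mapred-site.xml unless the target exists.
ensure_mapred_site() {
    local conf_dir="$1"
    if [ ! -f "$conf_dir/mapred-site.xml" ]; then
        cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
    fi
}
```

With the layout used in this article, the call would be ensure_mapred_site /usr/apache/hadoop-2.7.2/etc/hadoop.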
yarn-site.xml configuration
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8088</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>768</value>
</property>
</configuration>
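The startup logs later in this article show DataNodes on slave1 and slave2. In Hadoop 2.x the start scripts take the worker host list from the slaves file in the configuration directory. The article does not show that step, so the following is only a sketch of how it might be done; write_slaves_file is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Writes one worker host name per line to the slaves file in a Hadoop 2.x
# configuration directory; start-dfs.sh and start-yarn.sh read this file.
write_slaves_file() {
    local conf_dir="$1"; shift
    printf '%s\n' "$@" > "$conf_dir/slaves"
}
```

For this cluster the call would be write_slaves_file /usr/apache/hadoop-2.7.2/etc/hadoop slave1 slave2. The same configuration directory then needs to be present on every node, for example by copying it with scp -r.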
Formatting the NameNode
The command is hdfs namenode -format, which is located in hadoop-2.7.2/bin:
INFO common.Storage: Storage directory /usr/apache/hadoop-2.7.2/dfs/name has been successfully formatted.
INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
INFO util.ExitUtil: Exiting with status 0
The output above shows that the format succeeded.
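When the format output is long, the success message is easy to miss. A minimal sketch that checks captured output for it; format_succeeded and the log file name format.log are hypothetical.

```shell
#!/usr/bin/env bash
# Reads captured `hdfs namenode -format` output on stdin and succeeds only
# if it contains the "successfully formatted" message.
format_succeeded() {
    grep -q 'has been successfully formatted'
}
```

One possible use is hdfs namenode -format 2>&1 | tee format.log, followed by format_succeeded < format.log && echo "format OK".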
Starting HDFS
The startup scripts are located in hadoop-2.7.2/sbin:
Start HDFS first: start-dfs.sh
master: starting namenode, logging to /usr/apache/hadoop-2.7.2/logs/hadoop-root-namenode-master.out
slave1: starting datanode, logging to /usr/apache/hadoop-2.7.2/logs/hadoop-root-datanode-slave1.out
slave2: starting datanode, logging to /usr/apache/hadoop-2.7.2/logs/hadoop-root-datanode-slave2.out
Starting secondary namenodes [master]
master: starting secondarynamenode, logging to /usr/apache/hadoop-2.7.2/logs/hadoop-root-secondarynamenode-master.out
Start YARN: start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/apache/hadoop-2.7.2/logs/yarn-root-resourcemanager-master.out
slave1: starting nodemanager, logging to /usr/apache/hadoop-2.7.2/logs/yarn-root-nodemanager-slave1.out
slave2: starting nodemanager, logging to /usr/apache/hadoop-2.7.2/logs/yarn-root-nodemanager-slave2.out
Use the jps command to check the processes on each node:
On master:
3458 ResourceManager
3299 SecondaryNameNode
3527 Jps
3115 NameNode
On slave1:
2852 Jps
2646 DataNode
On slave2:
9620 Jps
9414 DataNode
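The jps check can be automated per node. This is a minimal sketch, assuming the "pid name" output format shown above; check_daemons is a hypothetical helper that reports every expected daemon name that does not appear.

```shell
#!/usr/bin/env bash
# Reads `jps` output on stdin and reports any expected daemon that is missing.
# Returns 0 only if every name given as an argument appears as a process name.
check_daemons() {
    local output missing=0 name
    output="$(cat)"
    for name in "$@"; do
        # jps prints "<pid> <name>"; compare the name against the whole
        # second field so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$output" |
            awk -v n="$name" '$2 == n { ok = 1 } END { exit !ok }'; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

On master this could be run as jps | check_daemons NameNode SecondaryNameNode ResourceManager, and on each slave as jps | check_daemons DataNode. The start-yarn.sh log shows a nodemanager being started on each slave, so NodeManager can be added to the slave check if it is expected to be running.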
At this point, the Hadoop cluster setup is complete.