System environment:
Four virtual machines:
192.168.1.167 vm4.com
192.168.1.31 vm3.com
192.168.1.62 vm2.com
192.168.1.39 vm1.com
OS version:
[root@vm1 ~]# cat /etc/centos-release
CentOS Linux release 7.0.1406 (Core)
Pre-deployment preparation:
vm1 can log in to the other three machines over SSH without a password.
All four machines can resolve each other's hostnames:
[root@vm1 ~]# cat /etc/hosts
192.168.1.167 vm4.com
192.168.1.31 vm3.com
192.168.1.62 vm2.com
192.168.1.39 vm1.com
The firewall is stopped and disabled:
systemctl stop firewalld.service
systemctl disable firewalld.service
Create a hadoop user:
useradd -u 5000 hadoop
Set its password:
echo "hadoop"|passwd --stdin hadoop
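The passwordless SSH login mentioned above can be set up in two commands; a minimal sketch, assuming root is the deployment user (as in the prompts throughout) and the hostnames from /etc/hosts, run on vm1:

```shell
# Generate a key pair without a passphrase (skip if ~/.ssh/id_rsa already exists)
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# Push the public key to the other three nodes; each prompts for the password once
for h in vm2.com vm3.com vm4.com; do
    ssh-copy-id root@"$h"
done
```

Afterwards, `ssh vm2.com` from vm1 should log in without a password prompt.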
JDK deployment (required on all four machines):
Download jdk-8u151-linux-x64.tar.gz from the Oracle website, then unpack and move it:
tar zxvf /root/jdk-8u151-linux-x64.tar.gz
mv /root/jdk1.8.0_151 /usr/local/jdk1.8
Configure the environment variables:
vim /etc/profile.d/java.sh
####
# export the variables so child processes (e.g. the Hadoop scripts) inherit them
export JAVA_HOME=/usr/local/jdk1.8
export JRE_HOME=$JAVA_HOME/jre
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
export CLASSPATH=$JAVA_HOME/jre/lib:$JAVA_HOME/lib:$JAVA_HOME/jre/lib/charsets.jar
####
# source it so it takes effect in the current shell
source /etc/profile.d/java.sh
Test:
[root@vm1 /]# java -version
java version "1.8.0_151"
Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)
JDK deployment done.
Hadoop deployment:
1. Download and install Hadoop
Hadoop website: http://hadoop.apache.org
wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.6.5/hadoop-2.6.5.tar.gz
tar zxvf hadoop-2.6.5.tar.gz
cd hadoop-2.6.5
vim etc/hadoop/hadoop-env.sh
Change this line:
export JAVA_HOME=/usr/local/jdk1.8
Then test:
[root@vm2 hadoop]# ./bin/hadoop version
Hadoop 2.6.5
Subversion https://github.com/apache/hadoop.git -r e8c9fe0b4c252caf2ebf1464220599650f119997
Compiled by sjlee on 2016-10-02T23:43Z
Compiled with protoc 2.5.0
From source with checksum f05c9fa095a395faa9db9f7ba5d754
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.6.5.jar
2. Distributed cluster configuration
First move the unpacked directory into place and hand it to the hadoop user:
mv hadoop-2.6.5 /usr/local/hadoop
chown -R hadoop.hadoop /usr/local/hadoop/
Then add the Hadoop bin and sbin directories to PATH:
vim ~/.bashrc
Add this line:
export PATH=$PATH:/usr/local/hadoop/bin:/usr/local/hadoop/sbin
source it to take effect:
source ~/.bashrc
Edit the configuration files. They are all in this directory:
[root@vm2 hadoop]# pwd
/usr/local/hadoop/etc/hadoop
1. Edit the slaves file
Write the hostname of each machine that will act as a DataNode into this file, one per line:
[root@vm2 hadoop]# cat slaves
vm2.com
vm3.com
vm4.com
2. Edit core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://vm1.com:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/usr/local/hadoop/tmp</value>
<description>Abase for other temporary directories.</description>
</property>
</configuration>
3. Edit hdfs-site.xml
dfs.replication is the number of copies kept of each block; it is set to 3 here to match the three DataNodes, and must not exceed the number of DataNodes.
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>vm1.com:50090</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/tmp/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/tmp/dfs/data</value>
</property>
</configuration>
4. Edit mapred-site.xml
Create it from the shipped template first:
cp mapred-site.xml.template mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>vm1.com:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>vm1.com:19888</value>
</property>
</configuration>
5. Edit yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>vm1.com</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
At this point everything Hadoop needs to start has been configured.
Package the whole hadoop directory and send it to the other nodes with scp.
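The "package and scp" step can be scripted; a sketch, assuming the root SSH access set up earlier and the node names used throughout:

```shell
# Pack the configured tree once on vm1
tar -C /usr/local -czf /tmp/hadoop.tar.gz hadoop
# Ship and unpack it on each slave, then fix ownership there
for h in vm2.com vm3.com vm4.com; do
    scp /tmp/hadoop.tar.gz root@"$h":/tmp/
    ssh root@"$h" 'tar -C /usr/local -xzf /tmp/hadoop.tar.gz && chown -R hadoop.hadoop /usr/local/hadoop'
done
```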
Start Hadoop
On first start, format the NameNode on the Master node:
hdfs namenode -format
On success you will see "successfully formatted" and "Exiting with status 0"; "Exiting with status 1" means an error.
Then start the NameNode and DataNode daemons.
The following three scripts live in:
[root@vm1 sbin]# pwd
/usr/local/hadoop/sbin
This directory was already added to PATH above, so they can be run directly:
start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver
The jps command shows which processes each node has started. If all is well, the vm1.com node shows the NameNode, ResourceManager, SecondaryNameNode, and JobHistoryServer processes:
[root@vm1 ~]# jps
4050 NameNode
4229 SecondaryNameNode
4663 JobHistoryServer
4378 ResourceManager
7498 Jps
The other three DataNode machines show the DataNode and NodeManager processes:
[root@vm2 sbin]# jps
12301 NodeManager
13358 Jps
12207 DataNode
If any of these processes is missing, something went wrong.
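Instead of logging in to each machine, the jps check can be driven from vm1 over the passwordless SSH set up earlier; a sketch:

```shell
# Print the Java process list of every slave node from the master
for h in vm2.com vm3.com vm4.com; do
    echo "== $h =="
    ssh root@"$h" jps
done
```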
Also run hdfs dfsadmin -report on the Master node to check that the DataNodes started correctly:
[root@vm1 ~]# hdfs dfsadmin -report
Configured Capacity: 160982630400 (149.93 GB)
Present Capacity: 154752630784 (144.12 GB)
DFS Remaining: 154750676992 (144.12 GB)
DFS Used: 1953792 (1.86 MB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Live datanodes (3):
Name: 192.168.1.167:50010 (vm4.com)
Hostname: vm4.com
Decommission Status : Normal
Configured Capacity: 53660876800 (49.98 GB)
DFS Used: 651264 (636 KB)
Non DFS Used: 2076315648 (1.93 GB)
DFS Remaining: 51583909888 (48.04 GB)
DFS Used%: 0.00%
DFS Remaining%: 96.13%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Nov 09 17:40:28 CST 2017
Name: 192.168.1.31:50010 (vm3.com)
Hostname: vm3.com
Decommission Status : Normal
Configured Capacity: 53660876800 (49.98 GB)
DFS Used: 651264 (636 KB)
Non DFS Used: 2076905472 (1.93 GB)
DFS Remaining: 51583320064 (48.04 GB)
DFS Used%: 0.00%
DFS Remaining%: 96.13%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Nov 09 17:40:28 CST 2017
Name: 192.168.1.62:50010 (vm2.com)
Hostname: vm2.com
Decommission Status : Normal
Configured Capacity: 53660876800 (49.98 GB)
DFS Used: 651264 (636 KB)
Non DFS Used: 2076778496 (1.93 GB)
DFS Remaining: 51583447040 (48.04 GB)
DFS Used%: 0.00%
DFS Remaining%: 96.13%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Nov 09 17:40:28 CST 2017
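A quick way to confirm that all three DataNodes registered is to count the "Hostname:" lines in the report. The sketch below runs the check against a trimmed stand-in file, since in practice the input would come from `hdfs dfsadmin -report`:

```shell
# Stand-in for: hdfs dfsadmin -report > /tmp/dfs-report.txt
cat > /tmp/dfs-report.txt <<'EOF'
Live datanodes (3):
Hostname: vm4.com
Hostname: vm3.com
Hostname: vm2.com
EOF
# Count registered DataNodes; prints 3
grep -c '^Hostname:' /tmp/dfs-report.txt
```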