1 Preparation
1.1 Virtual machine plan
- Version: CentOS Linux release 7.6.1810
- Three virtual machines installed with VMware (hostname sketch follows this list):
192.168.159.133(linux-01.potato.com) NameNode DataNode ResourceManager NodeManager
192.168.159.128(linux-02.potato.com) SecondaryNameNode DataNode NodeManager
192.168.159.131(linux-03.potato.com) DataNode NodeManager
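If the hostnames are not set yet, a minimal sketch (run as root, one command per machine):
hostnamectl set-hostname linux-01.potato.com   # on 192.168.159.133
hostnamectl set-hostname linux-02.potato.com   # on 192.168.159.128
hostnamectl set-hostname linux-03.potato.com   # on 192.168.159.131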
1.2 User
- Hadoop should not be started as root; create a dedicated start-up user. This guide uses dehuab (a minimal creation sketch follows).
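A sketch of creating the user on each machine (granting sudo via the wheel group is an assumption; adjust to local policy):
useradd dehuab
passwd dehuab
usermod -aG wheel dehuab   # optional: allows the sudo chown steps below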
1.3 Passwordless SSH login
ssh-keygen -t rsa (press Enter three times)
ssh-copy-id -i ~/.ssh/id_rsa.pub [email protected] (copy the key to yourself as well)
ssh-copy-id -i ~/.ssh/id_rsa.pub [email protected]
ssh-copy-id -i ~/.ssh/id_rsa.pub [email protected]
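To verify, each of the following should print the remote hostname without asking for a password:
ssh [email protected] hostname
ssh [email protected] hostname
ssh [email protected] hostname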
1.4 JDK
- Version 1.8.0_181
- Extract to /usr/local/jdk (see the sketch below)
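A sketch of the extraction; the tarball name jdk-8u181-linux-x64.tar.gz and the extracted directory name are assumptions:
sudo tar -zxf jdk-8u181-linux-x64.tar.gz -C /usr/local/
sudo mv /usr/local/jdk1.8.0_181 /usr/local/jdk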
1.5 Hadoop
- Version Hadoop 3.2.0
- Extract to /usr/local/hadoop (extraction sketch after the chown commands below)
- Create the Hadoop data directory /usr/local/hadoop-data
- Make the start-up user (dehuab) the owner of both directories:
sudo chown -R dehuab:dehuab /usr/local/hadoop
sudo chown -R dehuab:dehuab /usr/local/hadoop-data
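For reference, a sketch of the extraction steps that precede the chown commands above (the tarball name is an assumption):
sudo tar -zxf hadoop-3.2.0.tar.gz -C /usr/local/
sudo mv /usr/local/hadoop-3.2.0 /usr/local/hadoop
sudo mkdir -p /usr/local/hadoop-data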
1.6 Configure hosts
- Append the following to /etc/hosts:
192.168.159.133 linux-01.potato.com
192.168.159.128 linux-02.potato.com
192.168.159.131 linux-03.potato.com
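A quick check that every machine resolves the others:
ping -c 1 linux-01.potato.com
ping -c 1 linux-02.potato.com
ping -c 1 linux-03.potato.com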
1.7 防火牆
- 關閉firewall:systemctl stop firewalld.service
- 停止firewall(禁止firewall開機啓動):systemctl disable firewalld.service
- 查看默認防火牆狀態(關閉後顯示notrunning,開啓後顯示running):firewall-cmd --state
1.8 Environment variables
- Append the following to /etc/profile:
JAVA_HOME=/usr/local/jdk
export JAVA_HOME
PATH=$JAVA_HOME/bin:$PATH
export PATH
HADOOP_HOME=/usr/local/hadoop
export HADOOP_HOME
PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export PATH
- Apply the changes: source /etc/profile (verification sketch below)
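After sourcing, both tools should be on the PATH and report the expected versions:
java -version      # expect 1.8.0_181
hadoop version     # expect Hadoop 3.2.0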
2 Configuration files
- Edit on the NameNode machine; the other machines get copies via scp (section 2.7)
2.1 hadoop-env.sh
- Explicitly declare JAVA_HOME once more in hadoop-env.sh:
export JAVA_HOME=/usr/local/jdk
2.2 hdfs-site.xml
<configuration>
<!-- Block replication factor; default is 3 -->
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<!-- HDFS permission checking; default is true (dfs.permissions is the pre-3.x name of this key) -->
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>linux-01.potato.com:50070</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>linux-02.potato.com:50090</value>
</property>
</configuration>
2.3 core-site.xml
<configuration>
<!-- Address of the HDFS master (NameNode); 9001 here is the RPC port -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://linux-01.potato.com:9001</value>
</property>
<!-- Directory where HDFS stores blocks and metadata; be sure to change this -->
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop-data</value>
</property>
</configuration>
2.4 mapred-site.xml (had to be created from mapred-site.xml.template in Hadoop 2.x; ships by default in 3.x)
<configuration>
<!-- Framework on which MapReduce jobs run -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
2.5 yarn-site.xml
<configuration>
<!-- The YARN ResourceManager node -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>linux-01.potato.com</value>
</property>
<!-- Auxiliary service the NodeManager needs to serve the MapReduce shuffle -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
2.6 workers
- This file lists the hostnames of all worker (DataNode) machines (see the sketch after the list)
- Before Hadoop 3.0 the file was named slaves; from 3.0 on it is named workers
linux-01.potato.com
linux-02.potato.com
linux-03.potato.com
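A sketch of writing the file (the path follows the layout above):
cat > /usr/local/hadoop/etc/hadoop/workers <<EOF
linux-01.potato.com
linux-02.potato.com
linux-03.potato.com
EOF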
2.7 Copy with scp (post-copy steps sketched after the commands)
scp -r /etc/hosts [email protected]:/etc/hosts
scp -r /etc/profile [email protected]:/etc/profile
scp -r /usr/local/hadoop/ [email protected]:/usr/local/
scp -r /usr/local/jdk/ [email protected]:/usr/local/
scp -r /etc/hosts [email protected]:/etc/hosts
scp -r /etc/profile [email protected]:/etc/profile
scp -r /usr/local/hadoop/ [email protected]:/usr/local/
scp -r /usr/local/jdk/ [email protected]:/usr/local/
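After copying, on each of linux-02 and linux-03, create the data directory, fix ownership, and reload the environment, roughly:
sudo mkdir -p /usr/local/hadoop-data
sudo chown -R dehuab:dehuab /usr/local/hadoop /usr/local/hadoop-data /usr/local/jdk
source /etc/profile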
3 Format the HDFS NameNode
hdfs namenode -format
Sign of success (the path follows hadoop.tmp.dir from core-site.xml): Storage directory /usr/local/hadoop-data/dfs/name has been successfully formatted.
4 Start the services
- Start everything with start-all.sh (once the environment variables are in place, start-all.sh can be run from any directory)
- Check the running processes with jps
Verify the 5 processes (per the plan in 1.1, SecondaryNameNode runs on linux-02.potato.com; the other four on linux-01.potato.com):
5022 NameNode
5314 SecondaryNameNode
5586 NodeManager
5476 ResourceManager
5126 DataNode
YARN: http://linux-01.potato.com:8088
HDFS: http://linux-01.potato.com:50070
- Startup log: /usr/local/hadoop/logs/hadoop-dehuab-datanode-linux-01.potato.com.log (press Shift+G in less/vim to jump to the end)
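As a final smoke test, a few commands that exercise HDFS and YARN (the /test path is an arbitrary example):
hdfs dfsadmin -report     # should list 3 live DataNodes
yarn node -list           # should list 3 NodeManagers
hdfs dfs -mkdir /test
hdfs dfs -put /etc/hosts /test
hdfs dfs -ls /test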