I. Environment
1. Three physical servers: 192.168.2.222 (master), 192.168.2.223 (slave1), 192.168.2.224 (slave2)
2. Linux distribution: Debian 9
3. JDK version: 1.8
4. Hadoop version: 3.1+
II. Passwordless SSH login
The master and slaves need passwordless SSH between them. Configure the master first.
1. Edit the SSH daemon configuration on the master
vim /etc/ssh/sshd_config
Change PermitRootLogin no (or PermitRootLogin without-password) to PermitRootLogin yes, then restart sshd (e.g. systemctl restart sshd) for the change to take effect.
2. Generate a key pair
ssh-keygen -t rsa
The -t flag selects the key type; rsa is used here. Press Enter at the location prompt to accept the default /root/.ssh/id_rsa, and press Enter again at the passphrase prompt to leave it empty.
3. Append the public key to authorized_keys in the .ssh directory
cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
4. Copy the public key to the two slave servers
ssh-copy-id -i ~/.ssh/id_rsa.pub [email protected]
ssh-copy-id -i ~/.ssh/id_rsa.pub [email protected]
Once the master is done, repeat the same steps on slave1 and slave2.
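The per-slave ssh-copy-id commands above can be wrapped in a loop. A minimal sketch, printed as a dry run so nothing connects out while you review it; swap the echo for an eval to actually push the key:

```shell
# Dry-run sketch: print the key-distribution command for each slave.
# SLAVES holds this guide's two slave IPs; use eval "$cmd" instead of echo to run them.
SLAVES="192.168.2.223 192.168.2.224"
cmds=""
for h in $SLAVES; do
  cmd="ssh-copy-id -i ~/.ssh/id_rsa.pub root@$h"
  echo "$cmd"
  cmds="$cmds $cmd;"
done
```

The same loop, with the host list swapped, also works when repeating the setup from slave1 and slave2.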
III. Deploying the Hadoop cluster
1. Edit the hosts file on all three servers
vim /etc/hosts
192.168.2.222 hadoop1
192.168.2.223 hadoop2
192.168.2.224 hadoop3
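The three mappings can be appended in one heredoc. A sketch that writes to a temporary file so the real /etc/hosts is untouched while trying it out:

```shell
# Sketch: append the cluster mappings in one shot. Written to a temp file here;
# point the redirect at /etc/hosts on a real node.
tmp_hosts=$(mktemp)
cat >> "$tmp_hosts" <<'EOF'
192.168.2.222 hadoop1
192.168.2.223 hadoop2
192.168.2.224 hadoop3
EOF
```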
2. Set the hostname on each of the three servers
vim /etc/hostname
3. Plan the master/slave roles
The cluster is one master with two slaves:
Role | IP | Hostname | User | HDFS | YARN |
---|---|---|---|---|---|
master | 192.168.2.222 | hadoop1 | hadoop | NameNode | NodeManager, ResourceManager |
slave1 | 192.168.2.223 | hadoop2 | hadoop | DataNode | NodeManager |
slave2 | 192.168.2.224 | hadoop3 | hadoop | DataNode | NodeManager |
4. Create the Hadoop data directories
mkdir -p /opt/data/hadoop/{datanode,hdfs,journal,log,namenode,tmp}
scp -r /opt/data/hadoop/ [email protected]:/opt/data/
scp -r /opt/data/hadoop/ [email protected]:/opt/data/
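The layout from step 4 can also be built with a plain loop, which works in any POSIX shell (brace expansion needs bash). Sketched under a temporary root so it can be tried anywhere; $root stands in for /opt/data:

```shell
# Sketch: create the six subdirectories Hadoop will use.
# $root stands in for /opt/data; replace it on a real node.
root=$(mktemp -d)
for d in datanode hdfs journal log namenode tmp; do
  mkdir -p "$root/hadoop/$d"
done
ls "$root/hadoop"
```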
5. Extract the Hadoop archive
tar -xvf hadoop-3.1.1.tar.gz -C /usr/local
6. Create the workers file (renamed from slaves in Hadoop 3.x) and list the slave hostnames in it
vim /usr/local/hadoop-3.1.1/etc/hadoop/workers
7. Edit the hadoop-env.sh configuration file
vim /usr/local/hadoop-3.1.1/etc/hadoop/hadoop-env.sh
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
export JAVA_HOME=/usr/local/jdk1.8.0_201
export HADOOP_HOME=/usr/local/hadoop-3.1.1
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_LOG_DIR=/opt/data/hadoop/log
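Before starting any daemons it is worth confirming that the JAVA_HOME exported above actually points at a JDK. A minimal sketch, assuming this guide's install path:

```shell
# Sanity check: does the exported JAVA_HOME contain a java binary?
# The path is this guide's assumed JDK location; adjust to yours.
JAVA_HOME=/usr/local/jdk1.8.0_201
if [ -x "$JAVA_HOME/bin/java" ]; then
  msg="JAVA_HOME ok"
else
  msg="JAVA_HOME has no bin/java - fix hadoop-env.sh"
fi
echo "$msg"
```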
8. Edit the core-site.xml file
vim /usr/local/hadoop-3.1.1/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop1:9000/</value>
</property>
<!-- ZooKeeper quorum address -->
<property>
<name>ha.zookeeper.quorum</name>
<value>192.168.2.222:2181,192.168.2.223:2181,192.168.2.224:2181</value>
</property>
<!-- Hadoop temporary directory -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/data/hadoop/tmp</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
</configuration>
9. Edit hdfs-site.xml
vim /usr/local/hadoop-3.1.1/etc/hadoop/hdfs-site.xml
<configuration>
<!-- Number of replicas per HDFS block; the default is 3 -->
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/opt/data/hadoop/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/opt/data/hadoop/datanode</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop1:9001</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<!-- Node that runs the SecondaryNameNode; ideally a different node than the NameNode -->
<!-- <property>
<name>dfs.secondary.http.address</name>
<value>hadoop2:50090</value>
</property>-->
</configuration>
10. Edit yarn-site.xml
First run `hadoop classpath` and save its output; it is needed for the yarn.application.classpath value below.
vim /usr/local/hadoop-3.1.1/etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop1</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoop1:18040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoop1:18030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoop1:18088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoop1:18025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>hadoop1:18141</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.application.classpath</name>
<value>paste the output of `hadoop classpath` here</value>
</property>
</configuration>
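Instead of pasting the classpath by hand, it can be substituted into the file mechanically. A sketch using a stand-in value so it runs without Hadoop installed; on a real node set cpath=$(hadoop classpath):

```shell
# Sketch: inject the classpath into a yarn-site.xml fragment.
# cpath is a stand-in string here; on a real node use: cpath=$(hadoop classpath)
cpath="/usr/local/hadoop-3.1.1/etc/hadoop:/usr/local/hadoop-3.1.1/share/hadoop/common/*"
f=$(mktemp)
cat > "$f" <<'EOF'
<property>
  <name>yarn.application.classpath</name>
  <value>__CLASSPATH__</value>
</property>
EOF
sed -i "s|__CLASSPATH__|$cpath|" "$f"
```

Using `|` as the sed delimiter avoids clashing with the `/` characters in the classpath.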
11. Edit mapred-site.xml
vim /usr/local/hadoop-3.1.1/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
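A missing closing </property> tag is an easy copy-paste mistake in these XML files and will break the daemons at startup. A crude balance check, sketched here on an inline sample so it runs anywhere; point f at the real config file on a node:

```shell
# Crude well-formedness check: count lines containing <property> vs </property>.
# Sample written to a temp file; set f to a real *-site.xml to check it.
f=$(mktemp)
cat > "$f" <<'EOF'
<configuration>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
</configuration>
EOF
opens=$(grep -c '<property>' "$f")
closes=$(grep -c '</property>' "$f")
[ "$opens" -eq "$closes" ] && echo "balanced"
```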
12. Copy the whole Hadoop directory to both slaves
cd /usr/local
scp -r hadoop-3.1.1 [email protected]:/usr/local
scp -r hadoop-3.1.1 [email protected]:/usr/local
IV. Starting the Hadoop cluster
1. Format the NameNode on the master
/usr/local/hadoop-3.1.1/bin/hdfs namenode -format
2. Start the NameNode on the master
/usr/local/hadoop-3.1.1/bin/hdfs --daemon start namenode
3. Start YARN
/usr/local/hadoop-3.1.1/bin/yarn --daemon start resourcemanager
/usr/local/hadoop-3.1.1/bin/yarn --daemon start nodemanager
4. Start the DataNode on each of the two slaves
/usr/local/hadoop-3.1.1/bin/hdfs --daemon start datanode
5. Check the running processes
/usr/local/jdk1.8.0_201/bin/jps
The older hadoop-daemon.sh and yarn-daemon.sh scripts still work in Hadoop 3.x but are deprecated; alternatively, sbin/start-all.sh starts all daemons at once.
6. Configure environment variables
vim /etc/profile
export HADOOP_HOME=/usr/local/hadoop-3.1.1
export PATH=$PATH:$HADOOP_HOME/bin
source /etc/profile
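The effect of the profile change can be checked in the current shell. A minimal sketch applying the two exports and confirming the Hadoop bin directory landed on PATH:

```shell
# Sketch: apply the exports and show the Hadoop entry now present on PATH.
export HADOOP_HOME=/usr/local/hadoop-3.1.1
export PATH=$PATH:$HADOOP_HOME/bin
echo "$PATH" | tr ':' '\n' | grep hadoop-3.1.1
```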
HDFS web UI: http://192.168.2.222:9870
YARN web UI: http://192.168.2.222:18088