Hadoop 3.1 Cluster Deployment

I. Prepare the environment

  1. Three physical servers: 192.168.2.222 (master), 192.168.2.223 (slave1), 192.168.2.224 (slave2)

  2. Linux distribution: Debian 9

  3. JDK version: 1.8

  4. Hadoop version: 3.1+

II. Configure passwordless SSH login

The master and slaves need passwordless SSH access between them. Start by configuring it on the master.

1. Edit the SSH configuration file on the master

vim /etc/ssh/sshd_config

Change PermitRootLogin no or PermitRootLogin without-password to PermitRootLogin yes.
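After saving the change, restart the SSH service so it takes effect (on Debian 9 the service is typically named ssh):

systemctl restart ssh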

2. Generate a key pair

ssh-keygen -t rsa

The -t option specifies the key type; rsa is used here. Press Enter at the save-location prompt to accept the default /root/.ssh/id_rsa, and press Enter again when asked for a passphrase to leave it empty.

3. Append the public key to authorized_keys under .ssh

cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys

4. Copy the public key to the two slave servers

ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.2.223
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.2.224

Once this is done, repeat the same setup on slave1 and slave2.
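To verify, the following commands should open a shell on each slave without prompting for a password:

ssh root@192.168.2.223
ssh root@192.168.2.224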

III. Deploy the Hadoop cluster

1. Edit the hosts file on all three servers

vim /etc/hosts
192.168.2.222 hadoop1 
192.168.2.223 hadoop2 
192.168.2.224 hadoop3
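After saving, you can check that the hostnames resolve, for example:

ping -c 1 hadoop2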

2. Set the hostname on each server

vim /etc/hostname
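Each server's /etc/hostname should contain only its own name from the plan below (hadoop1, hadoop2, or hadoop3). The change takes effect after a reboot, or immediately with hostnamectl, e.g. on the master:

hostnamectl set-hostname hadoop1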

3. Plan the master/slave layout

  The cluster currently uses one master and two slaves:

Role    IP              Hostname   User     HDFS       YARN
master  192.168.2.222   hadoop1    hadoop   NameNode   NodeManager, ResourceManager
slave1  192.168.2.223   hadoop2    hadoop   DataNode   NodeManager
slave2  192.168.2.224   hadoop3    hadoop   DataNode   NodeManager

4. Create the Hadoop data directories

mkdir -p /opt/data/hadoop/{datanode,hdfs,journal,log,namenode,tmp}

scp -r /opt/data/hadoop/ root@192.168.2.223:/opt/data/
scp -r /opt/data/hadoop/ root@192.168.2.224:/opt/data/
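Note that the target directory /opt/data must already exist on the slaves for scp to succeed; if it does not, create it first over SSH:

ssh root@192.168.2.223 "mkdir -p /opt/data"
ssh root@192.168.2.224 "mkdir -p /opt/data"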

5. Extract the Hadoop archive

tar -xvf hadoop-3.1.1.tar.gz -C /usr/local

6. Create the workers file (this file was called slaves in Hadoop 2.x)

vim /usr/local/hadoop-3.1.1/etc/hadoop/workers
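List the DataNode hosts from the plan in step 3, one per line:

hadoop2
hadoop3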

7. Edit hadoop-env.sh

vim /usr/local/hadoop-3.1.1/etc/hadoop/hadoop-env.sh 
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root


export JAVA_HOME=/usr/local/jdk1.8.0_201
export HADOOP_HOME=/usr/local/hadoop-3.1.1
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop

export HADOOP_LOG_DIR=/opt/data/hadoop/log

8. Edit core-site.xml

vim /usr/local/hadoop-3.1.1/etc/hadoop/core-site.xml
<configuration>
   <property>
     <name>fs.defaultFS</name>
     <value>hdfs://hadoop1:9000/</value>
   </property>
   <!-- ZooKeeper quorum address -->
   <property>
     <name>ha.zookeeper.quorum</name>
     <value>192.168.2.222:2181,192.168.2.223:2181,192.168.2.224:2181</value>
   </property>
   <!-- Hadoop temporary directory -->
   <property>
     <name>hadoop.tmp.dir</name>
     <value>/opt/data/hadoop/tmp</value>
   </property>
   <property>
     <name>io.file.buffer.size</name>
     <value>131072</value>
   </property>
   <property>
     <name>hadoop.proxyuser.root.hosts</name>
     <value>*</value>
   </property>
   <property>
      <name>hadoop.proxyuser.root.groups</name>
      <value>*</value>
   </property>
</configuration>

9. Edit hdfs-site.xml

vim /usr/local/hadoop-3.1.1/etc/hadoop/hdfs-site.xml
<configuration>
  <!-- Number of replicas for each HDFS block; the default is 3 -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/data/hadoop/namenode</value>
 </property>
 <property>
  <name>dfs.datanode.data.dir</name>
  <value>/opt/data/hadoop/datanode</value>
 </property>
 <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop1:9001</value>
 </property>
 <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>

  <!-- SecondaryNameNode host; ideally a different node than the NameNode -->
  <!-- <property>
    <name>dfs.secondary.http.address</name>
    <value>hadoop2:50090</value>
 </property>-->
</configuration>

10. Edit yarn-site.xml

First run the hadoop classpath command and copy its output; it will be pasted into the yarn.application.classpath property below.

hadoop classpath
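The command prints a single colon-separated list of configuration directories and jar globs. For illustration only (the exact entries depend on the installation), the output looks roughly like:

/usr/local/hadoop-3.1.1/etc/hadoop:/usr/local/hadoop-3.1.1/share/hadoop/common/lib/*:/usr/local/hadoop-3.1.1/share/hadoop/common/*:...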

vim /usr/local/hadoop-3.1.1/etc/hadoop/yarn-site.xml 
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop1</value>
  </property>

  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop1:18040</value>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hadoop1:18030</value>
  </property>

  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>hadoop1:18088</value>
  </property>

  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hadoop1:18025</value>
  </property>

  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>hadoop1:18141</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property> 
    <name>yarn.application.classpath</name>
    <value>paste the output of the hadoop classpath command here</value>
  </property>

</configuration>

11. Edit mapred-site.xml

vim /usr/local/hadoop-3.1.1/etc/hadoop/mapred-site.xml 
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
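Note: on Hadoop 3.x, MapReduce jobs submitted to YARN commonly fail to launch the MRAppMaster unless the MapReduce installation path is also exposed to the containers. A frequently used fix (not part of the original configuration above, so treat it as an optional addition) is to add the following to mapred-site.xml:

  <!-- expose the MapReduce installation to YARN containers (assumes the install path /usr/local/hadoop-3.1.1 used in this guide) -->
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=/usr/local/hadoop-3.1.1</value>
  </property>
  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=/usr/local/hadoop-3.1.1</value>
  </property>
  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=/usr/local/hadoop-3.1.1</value>
  </property>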

12. Copy the entire Hadoop directory to both slaves

cd /usr/local
scp -r hadoop-3.1.1 root@192.168.2.223:/usr/local
scp -r hadoop-3.1.1 root@192.168.2.224:/usr/local

IV. Start the Hadoop cluster

1. Format the NameNode on the master

 /usr/local/hadoop-3.1.1/bin/hadoop namenode -format

2. Start the NameNode on the master

/usr/local/hadoop-3.1.1/sbin/hadoop-daemon.sh start namenode

3. Start YARN

/usr/local/hadoop-3.1.1/sbin/yarn-daemon.sh start resourcemanager
/usr/local/hadoop-3.1.1/sbin/yarn-daemon.sh start nodemanager

4. Start the DataNode on the two slave servers

/usr/local/hadoop-3.1.1/sbin/hadoop-daemon.sh start datanode

5. Check the running processes

/usr/local/jdk1.8.0_201/bin/jps
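If everything started correctly, jps on the master should list roughly the following processes (PIDs omitted), and each slave should show at least DataNode; if NodeManager is also started on the slaves as planned in the layout table above, it will appear there as well:

master: NameNode, ResourceManager, NodeManager, Jps
slaves: DataNode, Jps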

Alternatively, start-all.sh (in the sbin directory) can start all daemons at once.

6. Configure environment variables

vim /etc/profile

export HADOOP_HOME=/usr/local/hadoop-3.1.1
export PATH=$PATH:$HADOOP_HOME/bin 

source /etc/profile

HDFS web UI: http://192.168.2.222:9870

YARN web UI: http://192.168.2.222:18088
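You can also confirm from the command line that both DataNodes have registered with the NameNode:

hdfs dfsadmin -report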
