Hadoop Study Notes: Installing and Deploying an HA Cluster

1 Runtime Environment

1.1 Software Environment

  • Four nodes
  • 64-bit CentOS 7.0
  • JVM: 64-bit JDK 1.8 or later, preinstalled

1.2 IP and Hostname Assignments

IP HOSTNAME
192.168.36.134 hadoop01
192.168.36.135 hadoop02
192.168.36.136 hadoop03
192.168.36.138 hadoop04

2 Installation Preparation

2.1 Prepare the Virtual Machines

  • Prepare four virtual machine nodes

Set the hostnames

  • On each of the four nodes, edit the /etc/hostname file as the root user and set its content to that node's hostname
  • The hostnames are hadoop01, hadoop02, hadoop03, hadoop04
vi /etc/hostname

2.2 Disable the Firewall

  • Disable the firewall on all four nodes with the following commands
systemctl stop firewalld.service     //stop the firewall
systemctl disable firewalld.service  //do not start the firewall at boot
systemctl status firewalld           //check the firewall status

2.3 Edit the hosts File

  • Edit the /etc/hosts file on each of the four nodes and add the entries below; the additions are identical on all four nodes
  • Afterwards, verify the configuration with ping [hostname]
vi /etc/hosts

192.168.36.134 hadoop01
192.168.36.135 hadoop02
192.168.36.136 hadoop03
192.168.36.138 hadoop04

ping hadoop04

2.4 Configure Clock Synchronization

  • The clocks of the four nodes must be synchronized
//check the current time
[lan@hadoop01 ~]$ date
  • If the clocks are out of sync, synchronize against a network time source with the following command
//synchronize with network time
ntpdate time.nuri.net
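  • Note: ntpdate is not always preinstalled on a minimal CentOS 7 image; if the command is missing, install it first as root on each node
yum install -y ntpdate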

2.5 Configure Passwordless SSH Login

  • Log in to hadoop01 as the lan user and generate a key pair with the following command
[lan@hadoop01 ~]$ ssh-keygen -t rsa
//press Enter three times to accept the defaults
  • Log in to hadoop02, generate a key pair, and copy the public key to hadoop01
[lan@hadoop02 ~]$ ssh-keygen -t rsa
[lan@hadoop02 ~]$ scp ~/.ssh/id_rsa.pub lan@hadoop01:~/.ssh/id_rsa.pub02
  • Log in to hadoop03, generate a key pair, and copy the public key to hadoop01
[lan@hadoop03 ~]$ ssh-keygen -t rsa
[lan@hadoop03 ~]$ scp ~/.ssh/id_rsa.pub lan@hadoop01:~/.ssh/id_rsa.pub03
  • Log in to hadoop04, generate a key pair, and copy the public key to hadoop01
[lan@hadoop04 ~]$ ssh-keygen -t rsa
[lan@hadoop04 ~]$ scp ~/.ssh/id_rsa.pub lan@hadoop01:~/.ssh/id_rsa.pub04
  • Log in to hadoop01 and combine all of the public keys
  • Remember to fix the file permissions afterwards
[lan@hadoop01 ~]$ cd ~/.ssh
[lan@hadoop01 .ssh]$ cat id_rsa.pub >> authorized_keys
[lan@hadoop01 .ssh]$ cat id_rsa.pub02 >> authorized_keys
[lan@hadoop01 .ssh]$ cat id_rsa.pub03 >> authorized_keys
[lan@hadoop01 .ssh]$ cat id_rsa.pub04 >> authorized_keys
[lan@hadoop01 .ssh]$ chmod 600 authorized_keys //fix the file permissions
  • From hadoop01, distribute the combined key file to the other nodes
[lan@hadoop01 .ssh]$ scp ~/.ssh/authorized_keys lan@hadoop02:~/.ssh/
[lan@hadoop01 .ssh]$ scp ~/.ssh/authorized_keys lan@hadoop03:~/.ssh/
[lan@hadoop01 .ssh]$ scp ~/.ssh/authorized_keys lan@hadoop04:~/.ssh/

//test passwordless login
ssh hadoop02
  • Note: every login and file transfer above still prompts for the lan user's login password on the target node
  • At this point passwordless login is configured and all nodes can SSH to one another without a password
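  • As a quick sanity check, loop over all four nodes and confirm each answers without a password prompt (a small sketch; assumes the hostnames above resolve)
for h in hadoop01 hadoop02 hadoop03 hadoop04; do ssh lan@$h hostname; done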

2.6 Install the JDK

  • Note: all Hadoop components run on the JVM, so the JDK must be installed before any other component

  • JDK 1.8 is the recommended version; download it from the official site

  • After downloading, upload the archive to the user's home directory on the server

  • As the root user, extract the JDK archive into the /usr/java/ directory

[root@hadoop01 ~]# mkdir /usr/java/
[root@hadoop01 ~]# mv /home/lan/jdk-8u144-linux-x64.tar.gz /usr/java
[root@hadoop01 ~]# cd /usr/java/
[root@hadoop01 java]# tar -zxvf jdk-8u144-linux-x64.tar.gz
  • Configure the environment variables as the lan user
[lan@hadoop01 ~]$ vim .bash_profile
  • Add the following lines
export JAVA_HOME=/usr/java/jdk1.8.0_144
export PATH=$JAVA_HOME/bin:$PATH
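  • Reload the profile so the new variables take effect in the current shell
[lan@hadoop01 ~]$ source ~/.bash_profile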
  • Test the installation
[lan@hadoop01 ~]$ java -version
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
  • Repeat the installation and configuration steps above on the other nodes

3 Installing the Other Components

3.1 Install ZooKeeper

  • Note: ZooKeeper is configured on hadoop01 only; the other nodes receive the already-configured installation directory from hadoop01, which avoids repeating the work
  1. Extract the package: unpack zookeeper-3.4.6.tar.gz
[lan@hadoop01 ~]$ tar -zxvf zookeeper-3.4.6.tar.gz
  2. Edit the configuration file
  • Rename the sample configuration /home/lan/zookeeper-3.4.6/conf/zoo_sample.cfg to zoo.cfg
  • Go into the conf directory and run:
[lan@hadoop01 conf]$ mv zoo_sample.cfg zoo.cfg
  • Edit zoo.cfg and add the following lines
[lan@hadoop01 conf]$ vi zoo.cfg

server.1=hadoop01:2888:3888
server.2=hadoop02:2888:3888
server.3=hadoop04:2888:3888
  3. Create the data directory
  • Create the /tmp/zookeeper directory (the dataDir already set in zoo_sample.cfg) and create a myid file in it; note that the sample config itself warns that /tmp is only suitable for testing, since it may be cleared on reboot
[lan@hadoop01 ~]$ mkdir /tmp/zookeeper
[lan@hadoop01 ~]$ cd /tmp/zookeeper
[lan@hadoop01 zookeeper]$ vim myid
  • Write the server's number into the file; it must match the N of this host's server.N line in zoo.cfg
1
  4. Distribute the ZooKeeper package
[lan@hadoop01 ~]$ scp -r ~/zookeeper-3.4.6 lan@hadoop02:~/
[lan@hadoop01 ~]$ scp -r ~/zookeeper-3.4.6 lan@hadoop04:~/
  5. Set the myid files on the other nodes
ssh lan@hadoop02
mkdir /tmp/zookeeper/
vim /tmp/zookeeper/myid
//set the number in the file to 2
2

ssh lan@hadoop04
mkdir /tmp/zookeeper/
vim /tmp/zookeeper/myid
//set the number in the file to 3
3
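  • To double-check, list the myid values on all three ZooKeeper nodes (a quick loop using the passwordless SSH configured earlier); the output should be 1, 2, 3 in that order
for h in hadoop01 hadoop02 hadoop04; do ssh lan@$h cat /tmp/zookeeper/myid; done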
  6. On hadoop01, hadoop02, and hadoop04, add the following lines to the environment variables (~/.bash_profile), then reload with source
export ZOOKEEPER_HOME=/home/lan/zookeeper-3.4.6
export PATH=$ZOOKEEPER_HOME/bin:$PATH
  7. Start ZooKeeper
  • Run the following on hadoop01, hadoop02, and hadoop04; hadoop01 is shown as the example
[lan@hadoop01 ~]$ zkServer.sh start
  • Check the process
[lan@hadoop01 ~]$ jps
17683 QuorumPeerMain
17701 Jps
  • Once ZooKeeper has been started on all three nodes, check its status
[lan@hadoop01 ~]$ zkServer.sh status
JMX enabled by default
Using config: /home/lan/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
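  • To confirm that leader election succeeded, check the status on all three nodes; exactly one should report Mode: leader (a quick loop; the full path is used because a non-interactive SSH session does not source .bash_profile)
for h in hadoop01 hadoop02 hadoop04; do ssh lan@$h /home/lan/zookeeper-3.4.6/bin/zkServer.sh status; done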

3.2 Install Hadoop

  • The Hadoop configuration has two parts: HDFS and YARN

3.2.1 HDFS

  1. Upload and extract the package (on hadoop01 only; the other nodes later receive the configured directory from hadoop01)
  • Upload hadoop-2.7.7.tar.gz to the server
  • Extract the package
[lan@hadoop01 ~]$ tar -zxvf hadoop-2.7.7.tar.gz
  2. Edit the configuration files
  • Edit core-site.xml
[lan@hadoop01 ~]$ vim ~/hadoop-2.7.7/etc/hadoop/core-site.xml
  • Replace the contents with the following
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://beh</value>
    <final>false</final>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/lan/hadoopdata</value>
    <final>false</final>
  </property>
  <!-- ZooKeeper nodes that take part in HA leader election -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop01:2181,hadoop02:2181,hadoop04:2181</value>
    <final>false</final>
  </property>
</configuration>
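  • Because fs.defaultFS points at the logical nameservice beh rather than at a single NameNode, clients keep working across failovers; once the cluster is running (section 3.2.4), the following two commands refer to the same filesystem
hdfs dfs -ls /
hdfs dfs -ls hdfs://beh/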
  • Edit hdfs-site.xml
[lan@hadoop01 ~]$ vim ~/hadoop-2.7.7/etc/hadoop/hdfs-site.xml
  • Replace the contents with the following
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>beh</value>
    <final>false</final>
  </property>
  <!-- Logical IDs for the two NameNodes -->
  <property>
    <name>dfs.ha.namenodes.beh</name>
    <value>nn1,nn2</value>
    <final>false</final>
  </property>
  <!-- Addresses for nn1 -->
  <property>
    <name>dfs.namenode.rpc-address.beh.nn1</name>
    <value>hadoop01:9000</value>
    <final>false</final>
  </property>
  <property>
    <name>dfs.namenode.http-address.beh.nn1</name>
    <value>hadoop01:50070</value>
    <final>false</final>
  </property>
  <!-- Addresses for nn2 -->
  <property>
    <name>dfs.namenode.rpc-address.beh.nn2</name>
    <value>hadoop02:9000</value>
    <final>false</final>
  </property>
  <property>
    <name>dfs.namenode.http-address.beh.nn2</name>
    <value>hadoop02:50070</value>
    <final>false</final>
  </property>
  <!-- JournalNode quorum that stores the shared edit log -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop01:8485;hadoop02:8485;hadoop04:8485/beh</value>
    <final>false</final>
  </property>
  <!-- Enable automatic failover -->
  <property>
    <name>dfs.ha.automatic-failover.enabled.beh</name>
    <value>true</value>
    <final>false</final>
  </property>
  <!-- Proxy provider class that clients use to locate the active NameNode -->
  <property>
    <name>dfs.client.failover.proxy.provider.beh</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    <final>false</final>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/lan/metadata/journal</value>
    <final>false</final>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
    <final>false</final>
  </property>
  <!-- Private key used by the sshfence fencing method -->
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/lan/.ssh/id_rsa</value>
    <final>true</final>
  </property>
  <!-- Block replication factor (2 here, matching the two DataNodes) -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <final>false</final>
  </property>
</configuration>
  • Edit slaves
[lan@hadoop01 ~]$ vim ~/hadoop-2.7.7/etc/hadoop/slaves
  • Replace the contents with the hostnames of the DataNode machines
hadoop03
hadoop04

3.2.2 YARN

  • Edit mapred-site.xml (first copy the template into place: cp mapred-site.xml.template mapred-site.xml)
[lan@hadoop01 ~]$ vim ~/hadoop-2.7.7/etc/hadoop/mapred-site.xml
  • Replace the contents with the following
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
  • Edit yarn-site.xml
[lan@hadoop01 ~]$ vim ~/hadoop-2.7.7/etc/hadoop/yarn-site.xml
  • Replace the contents with the following
<configuration>
  <!-- Enable ResourceManager HA -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- Cluster id for the RM pair -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>beh</value>
  </property>
  <!-- Logical IDs for the two RMs -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <!-- Hostnames of the two RMs -->
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>hadoop01</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>hadoop02</value>
  </property>
  <!-- ZooKeeper quorum addresses -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hadoop01:2181,hadoop02:2181,hadoop04:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Enable automatic failover -->
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- Service addresses for rm1 -->
  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>hadoop01:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>hadoop01:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>hadoop01:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
    <value>hadoop01:8031</value>
  </property>
  <!-- Service addresses for rm2 -->
  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>hadoop02:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>hadoop02:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>hadoop02:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
    <value>hadoop02:8031</value>
  </property>
</configuration>
  • Edit hadoop-env.sh and yarn-env.sh
[lan@hadoop01 ~]$ vim ~/hadoop-2.7.7/etc/hadoop/hadoop-env.sh
[lan@hadoop01 ~]$ vim ~/hadoop-2.7.7/etc/hadoop/yarn-env.sh
  • In each file, set JAVA_HOME as follows
export JAVA_HOME=/usr/java/jdk1.8.0_144

3.2.3 Distribute the Configuration

  • Distribute the configured installation to all of the other nodes
[lan@hadoop01 ~]$ scp -r ~/hadoop-2.7.7 lan@hadoop02:~/
[lan@hadoop01 ~]$ scp -r ~/hadoop-2.7.7 lan@hadoop03:~/
[lan@hadoop01 ~]$ scp -r ~/hadoop-2.7.7 lan@hadoop04:~/
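  • Note: the commands in the following sections are invoked by name alone, which assumes Hadoop's bin and sbin directories are on PATH; under the layout used in this guide, that means appending the following to ~/.bash_profile on every node and reloading it with source
export HADOOP_HOME=/home/lan/hadoop-2.7.7
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH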

3.2.4 Start HDFS

  • Start the JournalNodes (process name: JournalNode) on every node configured to run one (hadoop01, hadoop02, hadoop04) with the following command
hadoop-daemon.sh start journalnode
  • Format the HA state in ZooKeeper, on hadoop01
[lan@hadoop01 ~]$ hdfs zkfc -formatZK
  • Format and start the NameNode (process name: NameNode) on hadoop01
//format
[lan@hadoop01 ~]$ hdfs namenode -format
//start the namenode
[lan@hadoop01 ~]$ hadoop-daemon.sh start namenode
  • Bootstrap and start the NameNode on hadoop02
//copy the metadata from the formatted NameNode to initialize the standby
[lan@hadoop02 ~]$ hdfs namenode -bootstrapStandby
//start the namenode
[lan@hadoop02 ~]$ hadoop-daemon.sh start namenode
  • Start the ZKFC service (process name: DFSZKFailoverController) on hadoop01 and hadoop02; at this point one of the two NameNodes transitions to the active state
[lan@hadoop01 ~]$ hadoop-daemon.sh start zkfc
[lan@hadoop02 ~]$ hadoop-daemon.sh start zkfc
  • Start the DataNodes (process name: DataNode); run the following on hadoop01 (hadoop-daemons.sh starts the daemon on every host listed in slaves)
[lan@hadoop01 ~]$ hadoop-daemons.sh start datanode
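  • The HA state of each NameNode can also be queried from the command line with the standard hdfs haadmin tool; one should report active and the other standby
[lan@hadoop01 ~]$ hdfs haadmin -getServiceState nn1
[lan@hadoop01 ~]$ hdfs haadmin -getServiceState nn2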

3.2.5 Verify the Setup

  • Open a browser and visit 192.168.36.134:50070 and 192.168.36.135:50070; you will see two NameNodes, one active and the other standby
  • Then kill the active NameNode process; the standby NameNode will automatically transition to the active state
  • Kill the NameNode process on hadoop01
[lan@hadoop01 ~]$ jps
17683 QuorumPeerMain
19364 Jps
18487 JournalNode
19179 DFSZKFailoverController
18669 NameNode
[lan@hadoop01 ~]$ kill -9 18669
[lan@hadoop01 ~]$ jps
17683 QuorumPeerMain
18487 JournalNode
19417 Jps
19179 DFSZKFailoverController
  • The node that was previously standby has become active
  • Restart the killed NameNode on hadoop01; it rejoins the cluster as the new standby
[lan@hadoop01 ~]$ hadoop-daemon.sh start namenode
starting namenode, logging to /home/lan/hadoop-2.7.7/logs/hadoop-lan-namenode-hadoop01.out
[lan@hadoop01 ~]$ jps
19473 NameNode
17683 QuorumPeerMain
18487 JournalNode
19562 Jps
19179 DFSZKFailoverController

3.2.6 Start YARN

  • Start on hadoop01 (this script starts the ResourceManager on hadoop01 and the NodeManager on every slave node)
[lan@hadoop01 ~]$ start-yarn.sh
  • Start the ResourceManager on hadoop02
[lan@hadoop02 ~]$ yarn-daemon.sh start resourcemanager
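  • With both ResourceManagers up, a quick smoke test is to submit the example job shipped in the 2.7.7 tarball (the jar path below matches the layout used in this guide)
[lan@hadoop01 ~]$ hadoop jar ~/hadoop-2.7.7/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar pi 2 10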

3.2.7 Verify the Setup

  • Open a browser and visit 192.168.36.134:8088 and 192.168.36.135:8088
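  • The ResourceManager HA states can also be checked from the command line with the standard yarn rmadmin tool; one should report active and the other standby
[lan@hadoop01 ~]$ yarn rmadmin -getServiceState rm1
[lan@hadoop01 ~]$ yarn rmadmin -getServiceState rm2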

3.3 Shut Down the Cluster

  1. Stop YARN
//first, on hadoop01
[lan@hadoop01 ~]$ stop-yarn.sh
//then stop the resourcemanager on hadoop02
[lan@hadoop02 ~]$ yarn-daemon.sh stop resourcemanager
  2. Stop HDFS
  • Run on hadoop01
[lan@hadoop01 ~]$ stop-dfs.sh
  3. Stop ZKFC
  • On hadoop01 and hadoop02
[lan@hadoop01 ~]$ hadoop-daemon.sh stop zkfc
[lan@hadoop02 ~]$ hadoop-daemon.sh stop zkfc
  4. Stop ZooKeeper
  • On hadoop01, hadoop02, and hadoop04
[lan@hadoop01 ~]$ zkServer.sh stop
[lan@hadoop02 ~]$ zkServer.sh stop
[lan@hadoop04 ~]$ zkServer.sh stop
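  • After everything is stopped, jps on each node should list only the Jps process itself (a quick loop; the full JDK path is used because a non-interactive SSH session does not source .bash_profile)
for h in hadoop01 hadoop02 hadoop03 hadoop04; do ssh lan@$h /usr/java/jdk1.8.0_144/bin/jps; done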