Building a Big Data Platform (2): Hadoop HA Cluster Setup

Foreword

     This chapter sets up the ZooKeeper cluster and the Hadoop cluster.

1. Choosing a Hadoop version

    1. At present there are three main free Hadoop distributions (all from foreign vendors): Apache (the original version, on which every other distribution is based), Cloudera's (Cloudera's Distribution Including Apache Hadoop, CDH for short), and Hortonworks' (Hortonworks Data Platform, HDP for short). In China, the vast majority of users choose CDH.
    2. The paragraph above is quoted from the web. Since CDH is the most widely used, I decided to build the cluster on CDH, but not to install it through Cloudera Manager (CM); instead I install from the CDH tar packages.
    3. Reason for not using the Apache version: the jars CDH ships are stable and already version-matched, so there is no need to work out compatible component versions yourself (see the differences below for more).
    4. Reason for not installing CDH via CM: it is too automated and too black-box for my taste, so I settled on installing CDH5 from the tar packages.

2. Differences between CDH and the original Apache version

     1. CDH's version line is very clear (cdh3, cdh4, cdh5), while Apache's releases are far messier; CDH also improves on Apache Hadoop in compatibility, security, and stability.
     2. CDH promptly applies patches carrying the latest bug fixes and features, and ships them earlier than the equivalent Apache Hadoop release, so it updates faster than upstream.
     3. Security: CDH supports Kerberos authentication, whereas stock Apache Hadoop relies on simple username matching.
     4. CDH's documentation is clear; many users of the Apache version read CDH's docs, including the installation and upgrade guides.
     5. CDH can be installed from Yum/Apt packages, tar packages (the method used here), RPM packages, or via Cloudera Manager; Apache Hadoop supports only tar installation.

3. Choosing a CDH version

The Hadoop ecosystem components here all use CDH 5.9.3:
jdk-8u161-linux-x64.tar.gz
zookeeper-3.4.5-cdh5.9.3.tar.gz
hadoop-2.6.0-cdh5.9.3.tar.gz
hive-1.1.0-cdh5.9.3.tar.gz
sqoop2-1.99.5-cdh5.9.3.tar.gz
hbase-1.2.0-cdh5.9.3.tar.gz
... make sure to download the cdh5.9.3 builds from the official archive: http://archive.cloudera.com/cdh5/cdh/5/

4. Cluster plan

Hostname    IP             Installed software  Running processes
hadoop201   192.168.8.201  jdk, hadoop         NameNode, JournalNode, DFSZKFailoverController (zkfc), ResourceManager
hadoop202   192.168.8.202  jdk, hadoop         NameNode, JournalNode, DFSZKFailoverController (zkfc), ResourceManager
hadoop203   192.168.8.203  jdk, hadoop, zk     DataNode, NodeManager, QuorumPeerMain
hadoop204   192.168.8.204  jdk, hadoop, zk     DataNode, NodeManager, QuorumPeerMain
hadoop205   192.168.8.205  jdk, hadoop, zk     DataNode, NodeManager, QuorumPeerMain
  DFSZKFailoverController: monitors and manages a NameNode; it must run on the same host as that NameNode
  JournalNode: stores NameNode state, including the edits files (note: the official QJM docs call for an odd number of JournalNodes, at least three; with only two, as planned here, losing either one stops the active NameNode from writing edits)
  QuorumPeerMain: the ZooKeeper process. Why ZooKeeper? Mainly to use its leader election, automatic failover, and heartbeat monitoring to keep the cluster highly available

5. HDFS HA architecture

[Figure: HDFS HA architecture diagram]

   Following the official documentation, this install uses the QJM-based HA design:

[Figure: QJM-based HA deployment, from the official docs]

6. Setting up the ZooKeeper cluster

    1. Install on hadoop203: unpack the zk tarball and locate zoo.cfg; the distribution ships a sample config file, which just needs to be renamed (sketched below).
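    A sketch of this step, assuming the tarball was downloaded to /home/hadoop and the install path used throughout this article:

    tar -zxvf zookeeper-3.4.5-cdh5.9.3.tar.gz
    mv zookeeper-3.4.5-cdh5.9.3 /home/hadoop/zookeeper
    # the distribution ships a sample config; copy it to zoo.cfg
    cp /home/hadoop/zookeeper/conf/zoo_sample.cfg /home/hadoop/zookeeper/conf/zoo.cfg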
    2. vim zoo.cfg; the main changes are the data directory and the communication/election ports of the three cluster nodes:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
# data directory
dataDir=/home/hadoop/zookeeper/data
# transaction log directory (optional, left commented out here)
#dataLogDir=/home/hadoop/zookeeper/zkdatalog
# the port at which the clients will connect
clientPort=2181
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
# server.<id>=<hostname>:<quorum-communication-port>:<leader-election-port>
server.3=hadoop203:2888:3888
server.4=hadoop204:2888:3888
server.5=hadoop205:2888:3888

    3. scp the configured zookeeper directory to 204 and 205.
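    For example (assuming a hadoop user and the same target path on every node):

    scp -r /home/hadoop/zookeeper hadoop@hadoop204:/home/hadoop/
    scp -r /home/hadoop/zookeeper hadoop@hadoop205:/home/hadoop/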
    4. On each of the three nodes, cd into the data directory /home/hadoop/zookeeper/data and create a myid file containing that node's id (3, 4, or 5), matching the ids configured in zoo.cfg.
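    For example, on hadoop203 (write 4 on hadoop204 and 5 on hadoop205):

    mkdir -p /home/hadoop/zookeeper/data
    echo 3 > /home/hadoop/zookeeper/data/myid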
    5. Start zk: /home/hadoop/zookeeper/bin/zkServer.sh start
     6. Verify: jps shows the zk process QuorumPeerMain running. zkServer.sh status should report one leader and two followers; if so, the cluster is up. To double-check, kill one zk instance and watch the roles change.


7. Setting up the Hadoop cluster

    1. Install the NppFTP plugin for Notepad++; it lets you open and edit the config files on the servers directly in Notepad++.
    2. Download and unpack the hadoop tarball for the matching CDH version 5.9.3, install it on 201, and set the environment variables (sketched below).
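    A minimal sketch, appended to ~/.bashrc; the install paths are assumptions consistent with the packages listed in section 3:

    export JAVA_HOME=/home/hadoop/jdk1.8.0_161
    export HADOOP_HOME=/home/hadoop/hadoop
    export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

    Run source ~/.bashrc afterwards to apply the changes.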
    3. Configure hdfs: four files in total, following the official docs.
     hadoop-env.sh, which mainly sets the JDK:
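    A minimal sketch of the one required change; the JDK path is the same assumption as in the environment-variable sketch above:

    # hadoop-env.sh: point Hadoop explicitly at the JDK
    export JAVA_HOME=/home/hadoop/jdk1.8.0_161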
     core-site.xml
    <?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <!-- set the default filesystem to the HDFS nameservice mycluster -->
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://mycluster</value>
    </property>
    <!-- hadoop temp directory; must be the same path on every node (see the troubleshooting section) -->
    <property>  
        <name>hadoop.tmp.dir</name>  
        <value>/home/hadoop/hadoop/tmp</value>  
    </property>
    <!-- the ZooKeeper quorum (used by the failover controllers) -->
    <property>
       <name>ha.zookeeper.quorum</name>
       <value>hadoop203:2181,hadoop204:2181,hadoop205:2181</value>
     </property>
</configuration>
     hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
<!-- name the nameservice mycluster -->
    <property>
      <name>dfs.nameservices</name>
      <value>mycluster</value>
    </property>
    <!-- the NameNodes under this nameservice -->
    <property>
      <name>dfs.ha.namenodes.mycluster</name>
      <value>nn1,nn2</value>
    </property>
    <!-- RPC address of each NameNode -->
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn1</name>
      <value>hadoop201:9000</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn2</name>
      <value>hadoop202:9000</value>
    </property>
    <!-- HTTP (web UI) address of each NameNode -->
    <property>
      <name>dfs.namenode.http-address.mycluster.nn1</name>
      <value>hadoop201:50070</value>
    </property>
    <property>
      <name>dfs.namenode.http-address.mycluster.nn2</name>
      <value>hadoop202:50070</value>
    </property>
    <!-- the shared edits directory: the quorum journal URI the active NameNode writes to -->
    <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>qjournal://hadoop201:8485;hadoop202:8485/mycluster</value>
    </property>
    <!-- local directory where each JournalNode stores the edits -->
    <property>
      <name>dfs.journalnode.edits.dir</name>
      <value>/home/hadoop/hadoop/journal</value>
    </property>
    <!-- client-side failover proxy provider -->
    <property>
      <name>dfs.client.failover.proxy.provider.mycluster</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- fencing methods; the two main scenarios handled:
            1. nn1 malfunctions but is not actually down
            2. nn1 and its zkfc die together (so nothing reports to zk) -->
    <property>
      <name>dfs.ha.fencing.methods</name>
      <value>
            sshfence
            shell(/bin/true)
      </value>
    </property>
    <!-- sshfence needs passwordless SSH to the NameNode hosts -->
    <property>
      <name>dfs.ha.fencing.ssh.private-key-files</name>
      <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <!-- sshfence connection timeout, in milliseconds -->
    <property>
      <name>dfs.ha.fencing.ssh.connect-timeout</name>
      <value>30000</value>
    </property>
    <!-- enable automatic failover -->
    <property>
       <name>dfs.ha.automatic-failover.enabled</name>
       <value>true</value>
     </property>
</configuration>
     slaves
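    Per the cluster plan in section 4, slaves lists the three DataNode hosts:

    hadoop203
    hadoop204
    hadoop205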
    4. Configure yarn: two files in total, following the official docs.
     mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <!-- run MapReduce on the YARN framework -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
     yarn-site.xml
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
    <!-- enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
     </property>
     <!-- automatic RM failover -->
     <property>  
        <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>  
        <value>true</value>  
     </property> 
     <!-- enable RM state recovery -->
     <property>
           <name>yarn.resourcemanager.recovery.enabled</name>  
          <value>true</value>  
     </property>
     <!-- unique id for this yarn cluster -->
     <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>cluster1</value>
     </property>
     <!-- logical ids of the two RMs -->
     <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
     </property>
     <!-- hostnames of the two RMs -->
     <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop201</value>
     </property>
     <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop202</value>
     </property>
     <!-- rm1 RPC port -->
     <property>  
           <name>yarn.resourcemanager.address.rm1</name>  
          <value>hadoop201:8032</value>  
     </property>   
     <!-- rm1 scheduler port -->
     <property>  
          <name>yarn.resourcemanager.scheduler.address.rm1</name>  
          <value>hadoop201:8034</value>  
     </property>  
     <!-- rm1 web UI port -->
     <property>  
          <name>yarn.resourcemanager.webapp.address.rm1</name>  
          <value>hadoop201:8088</value>  
     </property>
    <!-- rm2 RPC port -->
     <property>  
           <name>yarn.resourcemanager.address.rm2</name>  
          <value>hadoop202:8032</value>  
     </property>   
     <!-- rm2 scheduler port -->
     <property>  
          <name>yarn.resourcemanager.scheduler.address.rm2</name>  
          <value>hadoop202:8034</value>  
     </property>  
     <!-- rm2 web UI port -->
     <property>  
          <name>yarn.resourcemanager.webapp.address.rm2</name>  
          <value>hadoop202:8088</value>  
     </property>  
     <!-- zk quorum addresses -->
     <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop203:2181,hadoop204:2181,hadoop205:2181</value>
     </property>
     <property>  
          <name>yarn.resourcemanager.zk.state-store.address</name>  
           <value>hadoop203:2181,hadoop204:2181,hadoop205:2181</value>  
     </property> 
     <!-- the shuffle auxiliary service MapReduce requires -->
     <property>  
           <name>yarn.nodemanager.aux-services</name>  
          <value>mapreduce_shuffle</value>  
     </property>  
     <property>  
           <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>  
          <value>org.apache.hadoop.mapred.ShuffleHandler</value>  
     </property>
</configuration>

For more configuration properties (for example yarn.resourcemanager.store.class, which selects the RM state store used when recovery is enabled), consult the *-default.xml default files.
That essentially completes the cluster configuration; what remains is initializing, starting, and stopping the cluster.
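One step the text leaves implicit, though the troubleshooting section below depends on it: the configured hadoop directory must reach every other node before initialization. A sketch, under the same user and path assumptions as earlier:

scp -r /home/hadoop/hadoop hadoop@hadoop202:/home/hadoop/
scp -r /home/hadoop/hadoop hadoop@hadoop203:/home/hadoop/
scp -r /home/hadoop/hadoop hadoop@hadoop204:/home/hadoop/
scp -r /home/hadoop/hadoop hadoop@hadoop205:/home/hadoop/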


8. Initializing the Hadoop cluster

    1. Start zk on 203, 204, and 205 (bin/zkServer.sh start on each).
    2. Start the JournalNodes: on 201 and 202, run hadoop-daemon.sh start journalnode
    3. Format hdfs: on 201, run hdfs namenode -format
    4. Format the HA state: on 201, run hdfs zkfc -formatZK. A new HA znode then appears under the zk root; a quick check is sketched below.
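    A sketch of the check (by default the HA parent znode is /hadoop-ha):

    /home/hadoop/zookeeper/bin/zkCli.sh
    ls /
    # expect output along the lines of: [hadoop-ha, zookeeper]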
    5. Sync the NameNode metadata:
     on 201, run hdfs namenode (leave it running in the foreground)
     on 202, run hdfs namenode -bootstrapStandby; once the sync completes, Ctrl+C the process on 201
    6. Start hdfs: on 201, run sbin/start-dfs.sh
    7. Start yarn:
     on 201, run sbin/start-yarn.sh
     on 202, run sbin/yarn-daemon.sh start resourcemanager (the standby RM has to be started separately)
    8. Processes on each node:
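    Based on the plan in section 4, jps on each node should show roughly:

    hadoop201, hadoop202: NameNode, JournalNode, DFSZKFailoverController, ResourceManager
    hadoop203-205:        DataNode, NodeManager, QuorumPeerMain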


    9. Test hdfs HA: kill one NameNode process (a sketch follows), then watch the active/standby states of 201 and 202 flip via http://hadoop201:50070/ and http://hadoop202:50070/.
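    A sketch of the kill step, run on whichever node hosts the active NameNode:

    jps             # find the NameNode pid
    kill -9 <pid>   # simulate a NameNode crash; the standby should become active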
    10. Test hdfs and yarn
     Create a file locally: vim a.txt
      Upload it to hdfs:
      hdfs dfs -mkdir /test
      hdfs dfs -put a.txt /test
      hdfs dfs -ls /test/


    Test yarn: check the RM state: bin/yarn rmadmin -getServiceState rm1
    Run an MR job, using the wordcount example:
     hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.9.3.jar wordcount /test/a.txt /test/out/
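     To read the result (the example job writes part-r-* files under the output directory):

     hdfs dfs -cat /test/out/part-r-00000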

[Screenshots: the running job's console output and the final job counters]

9. Cluster startup order

    1. Start zk: on 203, 204, 205, run ./zkServer.sh start from the bin directory
    2. Start hdfs: on 201, run sbin/start-dfs.sh
    3. Start yarn:
     on 201: sbin/start-yarn.sh
     on 202: sbin/yarn-daemon.sh start resourcemanager

10. Cluster shutdown order

    1. Stop yarn:
     on 201, run sbin/stop-yarn.sh
     on 202, run sbin/yarn-daemon.sh stop resourcemanager
    2. Stop hdfs: on 201, run sbin/stop-dfs.sh
    3. Stop zookeeper: on 203, 204, 205, run ./zkServer.sh stop

11. Observing the dfs startup and shutdown order:

[Screenshots: console output of start-dfs.sh and stop-dfs.sh]

12. Problems encountered during the setup

    1. Linux user permissions (not an issue when building as root):
      chown -R hadoop zookeeper/   # recursively change the directory's owner
      chgrp -R hadoop zookeeper/   # recursively change the directory's group
    2. hdfs formatting reports "connection refused"
      Cause: the JournalNode processes on the NameNode hosts had not been started
      Fix: start the JournalNodes on those hosts first, then format
    3. After starting dfs on 201, the NameNode on 202 does not come up, and its log shows "connection refused"
      Cause: core-site.xml on 202 did not set the hadoop temp directory to the same path as on 201
       Fix: set the same directory in core-site.xml on every node, delete the tmp folder, and re-format

Summary:

     This article walked through setting up the ZooKeeper and Hadoop clusters. Looking back, most of the work is in the configuration files; the rest is fairly simple. The next chapter continues to build on this.

Next: Building a Big Data Platform (3): Hive introduction, installation, and configuration
