Setting Up a High-Availability, Fully Distributed Hadoop Cluster on Linux CentOS 7.5

1. Linux Environment Preparation

  1.1 Disable the firewall (run on all three VMs)

firewall-cmd --state   # check the firewall status
 
systemctl start firewalld.service   # start the firewall
 
systemctl stop firewalld.service     # stop the firewall
 
systemctl disable firewalld.service  # do not start the firewall at boot

  1.2 Configure a static IP address (run on all three VMs)

Reference: configuring a static IP for the Linux VM

[root@node01 ~]# vim /etc/sysconfig/network-scripts/ifcfg-ens33

Full contents:

TYPE="Ethernet"
PROXY_METHOD="none"
BROWSER_ONLY="no"
BOOTPROTO="static"
DEFROUTE="yes"
IPV4_FAILURE_FATAL="no"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
IPV6_DEFROUTE="yes"
IPV6_FAILURE_FATAL="no"
IPV6_ADDR_GEN_MODE="stable-privacy"
NAME="ens33"
UUID="a5a7540d-fafb-47c8-bd59-70f1f349462e"
DEVICE="ens33"
ONBOOT="yes"

IPADDR="192.168.24.137"
GATEWAY="192.168.24.2"
NETMASK="255.255.255.0"
DNS1="8.8.8.8"

Note:

     ONBOOT is set to yes and BOOTPROTO is changed to static, switching from automatically assigned addresses to a static IP; then the static IP, gateway, netmask, and DNS are configured. The rest of the file is the same on all three VMs, and IPADDR is assigned .137 through .139 in turn.

Problem:

     At first I set the gateway to 192.168.24.1. Rebooting the VM and restarting the network interface did not help: I still could not ping 8.8.8.8, nor reach Baidu.

Solution:

   Open Edit > Virtual Network Editor in VMware and select your virtual NIC; check whether its subnet address is on the same network segment as the IP you configured, then click NAT Settings.

   There you can see that the subnet mask is 255.255.255.0 and the gateway IP is 192.168.24.2, not the 192.168.24.1 I had assumed. So I edited /etc/sysconfig/network-scripts/ifcfg-ens33 again, changed the gateway to 192.168.24.2, and restarted the network service.

[root@node01 yum.repos.d]# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         192.168.24.2    0.0.0.0         UG        0 0          0 ens33
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 ens33
192.168.24.0    0.0.0.0         255.255.255.0   U         0 0          0 ens33
[root@node01 yum.repos.d]# vim /etc/sysconfig/network-scripts/ifcfg-ens33
[root@node01 yum.repos.d]# systemctl restart network
[root@node01 yum.repos.d]# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=128 time=32.3 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=128 time=32.9 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=128 time=31.7 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=128 time=31.7 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=128 time=31.7 ms

  1.3 Change the hostname (on all three VMs)

Note: editing with vim /etc/sysconfig/network and setting HOSTNAME=node01 does not seem to work on CentOS 7.5. Instead, I use hostnamectl set-hostname node01, or edit /etc/hostname directly with vim.

After the change, a reboot is required for it to take full effect; use the reboot command.
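For reference, a minimal sketch of the commands, one per machine, using the hostnames assumed throughout this guide:

hostnamectl set-hostname node01   # run on the first VM
hostnamectl set-hostname node02   # run on the second VM
hostnamectl set-hostname node03   # run on the third VM
hostnamectl status                # verify, then reboot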

  1.4 Map IPs to hostnames in /etc/hosts (edit on all three VMs; the entries below are appended)

192.168.24.137 node01 node01.hadoop.com
192.168.24.138 node02 node02.hadoop.com
192.168.24.139 node03 node03.hadoop.com
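A small sketch of one way to append the mappings to /etc/hosts; run it on each of the three VMs (or copy the finished file over once SSH is set up):

cat >> /etc/hosts << 'EOF'
192.168.24.137 node01 node01.hadoop.com
192.168.24.138 node02 node02.hadoop.com
192.168.24.139 node03 node03.hadoop.com
EOF
# quick sanity check
ping -c 1 node02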

  1.5 Passwordless SSH login between the three machines (configure on all three VMs)

Why passwordless login is needed
  - Hadoop has many nodes, and the worker daemons are normally started from the master node, so the master has to log in to the workers automatically. Without passwordless login you would be prompted for a password every time, which is very tedious.
- How passwordless SSH login works
  1. Node A's public key is first placed on node B
  2. Node A asks node B to log in
  3. Node B encrypts a piece of random text with node A's public key
  4. Node A decrypts it with its private key and sends the result back to node B
  5. Node B checks that the text is correct

 Step 1: generate a key pair on all three machines

     Run the following command on all three machines to generate a public/private key pair: ssh-keygen -t rsa

 Step 2: copy the public keys to one machine

   Copy the public key from each of the three machines to the first machine; run the following command on all three:

ssh-copy-id node01

[root@node02 ~]# ssh-copy-id node01
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'node01 (192.168.24.137)' can't be established.
ECDSA key fingerprint is SHA256:GzI3JXtwr1thv7B0pdcvYQSpd98Nj1PkjHnvABgHFKI.
ECDSA key fingerprint is MD5:00:00:7b:46:99:5e:ff:f2:54:84:19:25:2c:63:0a:9e.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@node01's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'node01'"
and check to make sure that only the key(s) you wanted were added.

 Step 3: copy the first machine's authorized_keys to the other machines

  Copy the accumulated authorized_keys file from the first machine (192.168.24.137) to the other machines (192.168.24.138, 192.168.24.139); on the first machine run:

scp /root/.ssh/authorized_keys node02:/root/.ssh

scp /root/.ssh/authorized_keys node03:/root/.ssh

[root@node01 ~]# scp /root/.ssh/authorized_keys node02:/root/.ssh
The authenticity of host 'node02 (192.168.24.138)' can't be established.
ECDSA key fingerprint is SHA256:GzI3JXtwr1thv7B0pdcvYQSpd98Nj1PkjHnvABgHFKI.
ECDSA key fingerprint is MD5:00:00:7b:46:99:5e:ff:f2:54:84:19:25:2c:63:0a:9e.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node02,192.168.24.138' (ECDSA) to the list of known hosts.
root@node02's password:
authorized_keys                                                       100%  786   719.4KB/s   00:00
[root@node01 ~]# scp /root/.ssh/authorized_keys node03:/root/.ssh
The authenticity of host 'node03 (192.168.24.139)' can't be established.
ECDSA key fingerprint is SHA256:TyZdob+Hr1ZX7WRSeep1saPljafCrfto9UgRWNoN+20.
ECDSA key fingerprint is MD5:53:64:22:86:20:19:da:51:06:f9:a1:a9:a8:96:4f:af.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node03,192.168.24.139' (ECDSA) to the list of known hosts.
root@node03's password:
authorized_keys                                                       100%  786   692.6KB/s   00:00

 You can use the following commands to check, between the three VMs, that passwordless login now works:

[root@node02 hadoop-2.7.5]# cd ~/.ssh
[root@node02 .ssh]# ssh node01
Last login: Thu Jun 11 10:12:27 2020 from 192.168.24.1
[root@node01 ~]# ssh node02
Last login: Thu Jun 11 14:51:58 2020 from node03
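A quick way to confirm that every node can reach every other node without a password (a sketch assuming the hostnames above); each ssh call should print the remote hostname without prompting:

for h in node01 node02 node03; do ssh $h hostname; done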

1.6 Clock synchronization across the three machines (run on all three VMs)

Why time synchronization is needed

- Many distributed systems are stateful: if node A records a timestamp of 1 for a piece of data while node B records 2, things break.
## Install
[root@node03 ~]# yum install -y ntp
## Set up the cron job
[root@node03 ~]# crontab -e
no crontab for root - using an empty one
crontab: installing new crontab
## Add the following line to the crontab:
*/1 * * * * /usr/sbin/ntpdate ntp4.aliyun.com;
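To confirm the clocks actually converge, you can compare the time on all three nodes, or trigger a one-off sync by hand (a sketch assuming passwordless SSH from section 1.5):

for h in node01 node02 node03; do ssh $h date; done
/usr/sbin/ntpdate ntp4.aliyun.com    # one-off manual sync on the current node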

Note: if you hit the following error while running a yum install ... command:

/var/run/yum.pid is locked; another program with PID 5396 is running.
Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: yum
    Memory : 70 M RSS (514 MB VSZ)
    Started: Thu Jun 11 10:02:10 2020 - 18:48 ago
    State  : Traced/Stopped, pid: 5396
Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: yum
    Memory : 70 M RSS (514 MB VSZ)
    Started: Thu Jun 11 10:02:10 2020 - 18:50 ago
    State  : Traced/Stopped, pid: 5396
^Z
[1]+  Stopped               yum install -y ntp

It can be resolved by removing the stale lock file:

[root@node03 ~]# rm -f /var/run/yum.pid

If you want to switch the CentOS yum repositories to a mirror inside China, use the following commands:

Appendix: CentOS yum repo mirrors in China

Aliyun mirror

# back up the original repo file
cp /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
# if your CentOS version is 5
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-5.repo
# if your CentOS version is 6
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-6.repo
# if your CentOS version is 7 (the case for this guide)
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
yum clean all
yum makecache

NetEase (163) mirror:

cp /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
# CentOS 5:
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.163.com/.help/CentOS5-Base-163.repo
# CentOS 6:
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.163.com/.help/CentOS6-Base-163.repo
yum clean all
yum makecache

 2. Install the JDK

   2.1 Distribute the JDK to the other machines

  Run the following two commands on the first machine (192.168.24.137):

[root@node01 software]# ls
hadoop-2.7.5  jdk1.8.0_241  zookeeper-3.4.9  zookeeper-3.4.9.tar.gz
[root@node01 software]# java -version
java version "1.8.0_241"
Java(TM) SE Runtime Environment (build 1.8.0_241-b07)
Java HotSpot(TM) 64-Bit Server VM (build 25.241-b07, mixed mode)
[root@node01 software]# scp -r  /software/jdk1.8.0_241/ node02:/software/jdk1.8.0_241/
root@node02's password:
(output omitted.....)
[root@node01 software]# scp -r  /software/jdk1.8.0_241/ node03:/software/jdk1.8.0_241/
root@node03's password:
(output omitted.....)

  PS: JDK 1.8 was already installed and configured on my node01; see my separate notes on JDK installation.

       Once the commands finish, you can check node02 and node03: the /software/jdk1.8.0_241/ directory was created automatically and node01's JDK installation was copied over. Then configure the JDK on node02 and node03 with the following commands:

[root@node02 software]# vim /etc/profile
[root@node02 software]# source /etc/profile
[root@node02 software]# java -version
java version "1.8.0_241"
Java(TM) SE Runtime Environment (build 1.8.0_241-b07)
Java HotSpot(TM) 64-Bit Server VM (build 25.241-b07, mixed mode)

 Lines added to /etc/profile:

export JAVA_HOME=/software/jdk1.8.0_241
export CLASSPATH="$JAVA_HOME/lib"
export PATH="$JAVA_HOME/bin:$PATH"
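As a quick sanity check after node02 and node03 have been configured, the new JDK can be probed from node01 (a small sketch; it calls the java binary by its full path, so it works even before /etc/profile has been re-sourced on the remote side):

for h in node02 node03; do
  echo "== $h =="
  ssh $h /software/jdk1.8.0_241/bin/java -version
done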

3. ZooKeeper Cluster Installation

Server IP        Hostname   myid value
192.168.24.137   node01     1
192.168.24.138   node02     2
192.168.24.139   node03     3

  3.1 Download the ZooKeeper tarball

 Download it from the ZooKeeper release downloads page; I used version 3.4.9. It can be fetched with wget.

  3.2 Extract the tarball

[root@node01 software]# tar -zxvf zookeeper-3.4.9.tar.gz
[root@node01 software]# ls
hadoop-2.7.5  jdk1.8.0_241  zookeeper-3.4.9  zookeeper-3.4.9.tar.gz

  3.3 Edit the configuration file

Edit the configuration on the first machine (node01):

cd /software/zookeeper-3.4.9/conf/

cp zoo_sample.cfg zoo.cfg

mkdir -p /software/zookeeper-3.4.9/zkdatas/

vim zoo.cfg (entries added or changed):

dataDir=/software/zookeeper-3.4.9/zkdatas
# number of snapshots to retain
autopurge.snapRetainCount=3
# purge interval, in hours
autopurge.purgeInterval=1
# servers in the ensemble
server.1=node01:2888:3888
server.2=node02:2888:3888
server.3=node03:2888:3888

  3.4 Add the myid file

   On the first machine (node01), create a file named myid under /software/zookeeper-3.4.9/zkdatas/ with the content 1:

echo 1 > /software/zookeeper-3.4.9/zkdatas/myid

  3.5 Distribute the installation and set myid on each node

Run the following two commands on the first machine (node01) to distribute the installation to the other machines:

[root@node01 conf]# scp -r  /software/zookeeper-3.4.9/ node02:/software/zookeeper-3.4.9/
root@node02's password:
(output omitted.....)
[root@node01 conf]# scp -r  /software/zookeeper-3.4.9/ node03:/software/zookeeper-3.4.9/
root@node03's password:
(output omitted.....)

On the second machine, set myid to 2:

echo 2 > /software/zookeeper-3.4.9/zkdatas/myid

On the third machine, set myid to 3:

echo 3 > /software/zookeeper-3.4.9/zkdatas/myid

3.6 Start the ZooKeeper service (run on all three VMs)

# start the server
/software/zookeeper-3.4.9/bin/zkServer.sh start

# check its status
/software/zookeeper-3.4.9/bin/zkServer.sh status

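To check the whole ensemble at once, a sketch assuming passwordless SSH is set up; you should see one node report Mode: leader and the other two Mode: follower (if JAVA_HOME is not picked up over non-interactive SSH, run the status command locally on each node instead):

for h in node01 node02 node03; do
  echo "== $h =="
  ssh $h /software/zookeeper-3.4.9/bin/zkServer.sh status
done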

4. Install and Configure Hadoop

We use fully distributed mode, with high availability for both the NameNode and the ResourceManager. The roles are laid out across the three machines as follows:

 

Role        192.168.24.137 (node01)   192.168.24.138 (node02)   192.168.24.139 (node03)

ZooKeeper   zk                        zk                        zk

HDFS        JournalNode               JournalNode               JournalNode
            NameNode                  NameNode
            ZKFC                      ZKFC
            DataNode                  DataNode                  DataNode

YARN                                  ResourceManager           ResourceManager
            NodeManager               NodeManager               NodeManager

MapReduce                                                       JobHistoryServer

  4.1 Hadoop built from source on CentOS 7.5

    Here I do not use the prebuilt Hadoop distribution; I use a package I compiled myself. Stop all services of any previous Hadoop cluster and remove the old Hadoop installation from every machine first.

[root@localhost software]# cd /software/hadoop-2.7.5-src/hadoop-dist/target
[root@localhost target]# ls
antrun                    hadoop-2.7.5.tar.gz                 javadoc-bundle-options
classes                   hadoop-dist-2.7.5.jar               maven-archiver
dist-layout-stitching.sh  hadoop-dist-2.7.5-javadoc.jar       maven-shared-archive-resources
dist-tar-stitching.sh     hadoop-dist-2.7.5-sources.jar       test-classes
hadoop-2.7.5              hadoop-dist-2.7.5-test-sources.jar  test-dir
[root@localhost target]# cp -r hadoop-2.7.5 /software
[root@localhost target]# cd /software/
[root@localhost software]# ls
apache-maven-3.0.5             findbugs-1.3.9.tar.gz    jdk1.7.0_75                protobuf-2.5.0
apache-maven-3.0.5-bin.tar.gz  hadoop-2.7.5             jdk-7u75-linux-x64.tar.gz  protobuf-2.5.0.tar.gz
apache-tomcat-6.0.53.tar.gz    hadoop-2.7.5-src         mvnrepository              snappy-1.1.1
findbugs-1.3.9                 hadoop-2.7.5-src.tar.gz  mvnrepository.tar.gz       snappy-1.1.1.tar.gz
[root@localhost software]# cd hadoop-2.7.5
[root@localhost hadoop-2.7.5]# ls
bin  etc  include  lib  libexec  LICENSE.txt  NOTICE.txt  README.txt  sbin  share
[root@localhost hadoop-2.7.5]# cd etc
[root@localhost etc]# ls
hadoop
[root@localhost etc]# cd hadoop/
[root@localhost hadoop]# ls
capacity-scheduler.xml      hadoop-policy.xml        kms-log4j.properties        ssl-client.xml.example
configuration.xsl           hdfs-site.xml            kms-site.xml                ssl-server.xml.example
container-executor.cfg      httpfs-env.sh            log4j.properties            yarn-env.cmd
core-site.xml               httpfs-log4j.properties  mapred-env.cmd              yarn-env.sh
hadoop-env.cmd              httpfs-signature.secret  mapred-env.sh               yarn-site.xml
hadoop-env.sh               httpfs-site.xml          mapred-queues.xml.template
hadoop-metrics2.properties  kms-acls.xml             mapred-site.xml.template
hadoop-metrics.properties   kms-env.sh               slaves

Aside: you can edit files on the remote servers with the Notepad++ plugin NppFTP.

Look for the Show NppFTP Window icon in the toolbar; if it is not there:

go to Plugins > Plugins Admin, search for nppftp, tick it, and install it.

After restarting Notepad++ the extra icon appears; click Connect and you can edit remote files directly.

That said, I do not use the NppFTP plugin here; I use MobaXterm to edit files on the remote servers.

4.2 Edit the Hadoop configuration files

  4.2.1 Edit core-site.xml

cd /software/hadoop-2.7.5/etc/hadoop
<configuration>
	<!-- ZooKeeper quorum used for NameNode HA  -->
	<property>
		<name>ha.zookeeper.quorum</name>
		<value>node01:2181,node02:2181,node03:2181</value>
	</property>
	<!-- Logical nameservice URI used to access HDFS  -->
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://ns</value>
	</property>
	<!-- Directory for temporary files  -->
	<property>
		<name>hadoop.tmp.dir</name>
		<value>/software/hadoop-2.7.5/data/tmp</value>
	</property>
	<!-- Enable the HDFS trash; files in the trash are permanently deleted after seven days.
			The value is in minutes.
	 -->
	<property>
		<name>fs.trash.interval</name>
		<value>10080</value>
	</property>
</configuration>

 4.2.2 Edit hdfs-site.xml:

<configuration>
	<!--  Nameservice ID  -->
	<property>
		<name>dfs.nameservices</name>
		<value>ns</value>
	</property>
	<!--  The two NameNodes under this nameservice  -->
	<property>
		<name>dfs.ha.namenodes.ns</name>
		<value>nn1,nn2</value>
	</property>
	<!-- RPC address of the NameNode on the first server (nn1)  -->
	<property>
		<name>dfs.namenode.rpc-address.ns.nn1</name>
		<value>node01:8020</value>
	</property>
	<!--  RPC address of the NameNode on the second server (nn2)  -->
	<property>
		<name>dfs.namenode.rpc-address.ns.nn2</name>
		<value>node02:8020</value>
	</property>
	<!-- Service RPC address of nn1, used by DataNodes and other internal services -->
	<property>
		<name>dfs.namenode.servicerpc-address.ns.nn1</name>
		<value>node01:8022</value>
	</property>
	<!-- Service RPC address of nn2, used by DataNodes and other internal services -->
	<property>
		<name>dfs.namenode.servicerpc-address.ns.nn2</name>
		<value>node02:8022</value>
	</property>
	<!-- Web UI (HTTP) address of the first NameNode  -->
	<property>
		<name>dfs.namenode.http-address.ns.nn1</name>
		<value>node01:50070</value>
	</property>
	<!-- Web UI (HTTP) address of the second NameNode  -->
	<property>
		<name>dfs.namenode.http-address.ns.nn2</name>
		<value>node02:50070</value>
	</property>
	<!-- Shared edits directory on the JournalNodes; this must be configured -->
	<property>
		<name>dfs.namenode.shared.edits.dir</name>
		<value>qjournal://node01:8485;node02:8485;node03:8485/ns1</value>
	</property>
	<!--  Java class that clients use to find the active NameNode during failover -->
	<property>
		<name>dfs.client.failover.proxy.provider.ns</name>
		<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
	</property>
	<!-- Fencing method used during failover -->
	<property>
		<name>dfs.ha.fencing.methods</name>
		<value>sshfence</value>
	</property>
	<!-- SSH private key used by the sshfence fencing method  -->
	<property>
		<name>dfs.ha.fencing.ssh.private-key-files</name>
		<value>/root/.ssh/id_rsa</value>
	</property>
	<!-- Directory where the JournalNodes store their data  -->
	<property>
		<name>dfs.journalnode.edits.dir</name>
		<value>/software/hadoop-2.7.5/data/dfs/jn</value>
	</property>
	<!-- Enable automatic failover -->
	<property>
		<name>dfs.ha.automatic-failover.enabled</name>
		<value>true</value>
	</property>
	<!-- Directory where the NameNode stores its fsimage files -->
	<property>
		<name>dfs.namenode.name.dir</name>
		<value>file:///software/hadoop-2.7.5/data/dfs/nn/name</value>
	</property>
	<!-- Directory where the NameNode stores its edit logs -->
	<property>
		<name>dfs.namenode.edits.dir</name>
		<value>file:///software/hadoop-2.7.5/data/dfs/nn/edits</value>
	</property>
	<!-- Directory where DataNodes store block data -->
	<property>
		<name>dfs.datanode.data.dir</name>
		<value>file:///software/hadoop-2.7.5/data/dfs/dn</value>
	</property>
	<!-- Disable HDFS permission checking -->
	<property>
		<name>dfs.permissions</name>
		<value>false</value>
	</property>
	<!-- HDFS block size (128 MB) -->
	<property>
		<name>dfs.blocksize</name>
		<value>134217728</value>
	</property>
</configuration>

 4.2.3 Edit yarn-site.xml

Note: the configuration differs between node03 and node02 (see 4.2.9: yarn.resourcemanager.ha.id is rm1 on node03, rm2 on node02, and is commented out on node01).

<configuration>
	<!-- Site specific YARN configuration properties -->
	<!-- Whether to enable log aggregation. When an application finishes, its per-container logs are collected and moved to a file system such as HDFS, so the logs from all nodes can be viewed in one place. -->
	<!-- yarn.nodemanager.remote-app-log-dir and yarn.nodemanager.remote-app-log-dir-suffix control where the aggregated logs are stored. -->
	<!-- The logs can then be viewed through the application history server. -->
	<property>
		<name>yarn.log-aggregation-enable</name>
		<value>true</value>
	</property>
	<!-- Enable ResourceManager HA (default: false) -->
	<property>
		<name>yarn.resourcemanager.ha.enabled</name>
		<value>true</value>
	</property>
	<!-- Cluster ID; ensures this RM does not become active for a different cluster -->
	<property>
		<name>yarn.resourcemanager.cluster-id</name>
		<value>mycluster</value>
	</property>
	<!-- Logical IDs of the ResourceManagers -->
	<property>
		<name>yarn.resourcemanager.ha.rm-ids</name>
		<value>rm1,rm2</value>
	</property>
	<!-- Host of ResourceManager rm1 (node03) -->
	<property>
		<name>yarn.resourcemanager.hostname.rm1</name>
		<value>node03</value>
	</property>
	<!-- Host of ResourceManager rm2 (node02) -->
	<property>
		<name>yarn.resourcemanager.hostname.rm2</name>
		<value>node02</value>
	</property>
	<!-- Addresses of ResourceManager rm1 -->
	<property>
		<name>yarn.resourcemanager.address.rm1</name>
		<value>node03:8032</value>
	</property>
	<property>
		<name>yarn.resourcemanager.scheduler.address.rm1</name>
		<value>node03:8030</value>
	</property>
	<property>
		<name>yarn.resourcemanager.resource-tracker.address.rm1</name>
		<value>node03:8031</value>
	</property>
	<property>
		<name>yarn.resourcemanager.admin.address.rm1</name>
		<value>node03:8033</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.address.rm1</name>
		<value>node03:8088</value>
	</property>
	<!-- Addresses of ResourceManager rm2 -->
	<property>
		<name>yarn.resourcemanager.address.rm2</name>
		<value>node02:8032</value>
	</property>
	<property>
		<name>yarn.resourcemanager.scheduler.address.rm2</name>
		<value>node02:8030</value>
	</property>
	<property>
		<name>yarn.resourcemanager.resource-tracker.address.rm2</name>
		<value>node02:8031</value>
	</property>
	<property>
		<name>yarn.resourcemanager.admin.address.rm2</name>
		<value>node02:8033</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.address.rm2</name>
		<value>node02:8088</value>
	</property>
	<!-- Enable ResourceManager state recovery -->
	<property>
		<name>yarn.resourcemanager.recovery.enabled</name>
		<value>true</value>
	</property>
	<!-- rm1 is configured on node03 and rm2 on node02. Note: people usually copy a finished config file to the other machines, but this property must be changed on the other ResourceManager host, and should not be set at all on hosts that do not run a ResourceManager -->
	<property>
		<name>yarn.resourcemanager.ha.id</name>
		<value>rm1</value>
		<description>If we want to launch more than one RM in single node, we need this configuration</description>
	</property>
	<!-- Class used to persist ResourceManager state (here the ZooKeeper-based store) -->
	<property>
		<name>yarn.resourcemanager.store.class</name>
		<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
	</property>
	<property>
		<name>yarn.resourcemanager.zk-address</name>
		<value>node02:2181,node03:2181,node01:2181</value>
		<description>For multiple zk services, separate them with comma</description>
	</property>
	<!-- Enable automatic failover for the ResourceManager -->
	<property>
		<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
		<value>true</value>
		<description>Enable automatic failover; By default, it is enabled only when HA is enabled.</description>
	</property>
	<property>
		<name>yarn.client.failover-proxy-provider</name>
		<value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
	</property>
	<!-- Number of CPU vcores that can be allocated for containers on this node (default: 8) -->
	<property>
		<name>yarn.nodemanager.resource.cpu-vcores</name>
		<value>4</value>
	</property>
	<!-- Memory available for containers on each node, in MB -->
	<property>
		<name>yarn.nodemanager.resource.memory-mb</name>
		<value>512</value>
	</property>
	<!-- Minimum memory a single container can request (default: 1024 MB) -->
	<property>
		<name>yarn.scheduler.minimum-allocation-mb</name>
		<value>512</value>
	</property>
	<!-- Maximum memory a single container can request (default: 8192 MB) -->
	<property>
		<name>yarn.scheduler.maximum-allocation-mb</name>
		<value>512</value>
	</property>
	<!-- How long to retain aggregated logs before deleting them, in seconds -->
	<property>
		<name>yarn.log-aggregation.retain-seconds</name>
		<value>2592000</value>
		<!--30 day-->
	</property>
	<!-- How long, in seconds, to keep user logs on the NodeManager; only applies when log aggregation is disabled -->
	<property>
		<name>yarn.nodemanager.log.retain-seconds</name>
		<value>604800</value>
		<!--7 day-->
	</property>
	<!-- Compression type used for aggregated logs -->
	<property>
		<name>yarn.nodemanager.log-aggregation.compression-type</name>
		<value>gz</value>
	</property>
	<!-- Local directories where the NodeManager stores intermediate files -->
	<property>
		<name>yarn.nodemanager.local-dirs</name>
		<value>/software/hadoop-2.7.5/yarn/local</value>
	</property>
	<!-- Maximum number of completed applications the ResourceManager keeps -->
	<property>
		<name>yarn.resourcemanager.max-completed-applications</name>
		<value>1000</value>
	</property>
	<!-- Comma-separated list of auxiliary services; names may only contain a-zA-Z0-9_ and must not start with a digit -->
	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
	</property>
	<!-- Interval, in ms, between retries when reconnecting to the ResourceManager after losing contact -->
	<property>
		<name>yarn.resourcemanager.connect.retry-interval.ms</name>
		<value>2000</value>
	</property>
</configuration>

4.2.4 Edit mapred-site.xml

<configuration>
	<!-- Run MapReduce on YARN -->
	<property>
		<name>mapreduce.framework.name</name>
		<value>yarn</value>
	</property>
	<!-- MapReduce JobHistory Server IPC host:port -->
	<property>
		<name>mapreduce.jobhistory.address</name>
		<value>node03:10020</value>
	</property>
	<!-- MapReduce JobHistory Server Web UI host:port -->
	<property>
		<name>mapreduce.jobhistory.webapp.address</name>
		<value>node03:19888</value>
	</property>
	<!-- The directory where MapReduce stores control files. Default: ${hadoop.tmp.dir}/mapred/system -->
	<property>
		<name>mapreduce.jobtracker.system.dir</name>
		<value>/software/hadoop-2.7.5/data/system/jobtracker</value>
	</property>
	<!-- The amount of memory to request from the scheduler for each map task. Default: 1024 -->
	<property>
		<name>mapreduce.map.memory.mb</name>
		<value>1024</value>
	</property>
	<!-- <property>
                <name>mapreduce.map.java.opts</name>
                <value>-Xmx1024m</value>
        </property> -->
	<!-- The amount of memory to request from the scheduler for each reduce task. Default: 1024 -->
	<property>
		<name>mapreduce.reduce.memory.mb</name>
		<value>1024</value>
	</property>
	<!-- <property>
               <name>mapreduce.reduce.java.opts</name>
               <value>-Xmx2048m</value>
        </property> -->
	<!-- Total buffer memory, in megabytes, used while sorting files. By default this gives each merge stream 1 MB, which should minimize seeks. Default: 100 -->
	<property>
		<name>mapreduce.task.io.sort.mb</name>
		<value>100</value>
	</property>
	<!-- <property>
        <name>mapreduce.jobtracker.handler.count</name>
        <value>25</value>
        </property>-->
	<!-- Number of streams to merge at once while sorting files; this determines the number of open file handles. Default: 10 -->
	<property>
		<name>mapreduce.task.io.sort.factor</name>
		<value>10</value>
	</property>
	<!-- Number of parallel transfers run by a reduce during the copy (shuffle) phase. Default: 5 -->
	<property>
		<name>mapreduce.reduce.shuffle.parallelcopies</name>
		<value>25</value>
	</property>
	<property>
		<name>yarn.app.mapreduce.am.command-opts</name>
		<value>-Xmx1024m</value>
	</property>
	<!-- Total amount of memory for the MR ApplicationMaster. Default: 1536 -->
	<property>
		<name>yarn.app.mapreduce.am.resource.mb</name>
		<value>1536</value>
	</property>
	<!-- Local directories where MapReduce stores intermediate data files. Directories that do not exist are ignored. Default: ${hadoop.tmp.dir}/mapred/local -->
	<property>
		<name>mapreduce.cluster.local.dir</name>
		<value>/software/hadoop-2.7.5/data/system/local</value>
	</property>
</configuration>

4.2.5 Edit slaves

node01
node02
node03

4.2.6 Edit hadoop-env.sh

export JAVA_HOME=/software/jdk1.8.0_241

4.2.7 Send the Hadoop installation from the first machine (node01) to the other machines

[root@node01 software]# ls
hadoop-2.7.5  jdk1.8.0_241  zookeeper-3.4.9  zookeeper-3.4.9.tar.gz
[root@node01 software]# scp -r hadoop-2.7.5/ node02:$PWD
root@node02's password:
(output omitted.....)
[root@node01 software]# scp -r hadoop-2.7.5/ node03:$PWD
root@node03's password:
(output omitted.....)

4.2.8 Create directories (on all three VMs)

mkdir -p /software/hadoop-2.7.5/data/dfs/nn/name
mkdir -p /software/hadoop-2.7.5/data/dfs/nn/edits


4.2.9 Adjust yarn.resourcemanager.ha.id in yarn-site.xml on each node

node01: comment out the yarn.resourcemanager.ha.id block, since node01 does not run a ResourceManager:

<!-- 
<property>
	<name>yarn.resourcemanager.ha.id</name>
	<value>rm1</value>
	<description>If we want to launch more than one RM in single node, we need this configuration</description>
</property>
-->

node02:

<property>
	<name>yarn.resourcemanager.ha.id</name>
	<value>rm2</value>
	<description>If we want to launch more than one RM in single node, we need this configuration</description>
</property>

node03:

<property>
	<name>yarn.resourcemanager.ha.id</name>
	<value>rm1</value>
	<description>If we want to launch more than one RM in single node, we need this configuration</description>
</property>
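To double-check which ha.id each node ended up with after copying the configuration around, a small sketch (assuming the configuration path used above):

for h in node01 node02 node03; do
  echo "== $h =="
  ssh $h "grep -A 1 'yarn.resourcemanager.ha.id' /software/hadoop-2.7.5/etc/hadoop/yarn-site.xml"
done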

5. Start Hadoop

 5.1 Start HDFS

 Run the following commands on node01:

bin/hdfs zkfc -formatZK                            # format the HA state znode in ZooKeeper

sbin/hadoop-daemons.sh start journalnode           # start the JournalNodes on all nodes

bin/hdfs namenode -format                          # format the NameNode (first start only)

bin/hdfs namenode -initializeSharedEdits -force    # initialize the shared edits directory

sbin/start-dfs.sh                                  # start HDFS
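Once start-dfs.sh completes, you can check which NameNode is active and which is standby (the nn1/nn2 IDs come from the hdfs-site.xml above):

bin/hdfs haadmin -getServiceState nn1
bin/hdfs haadmin -getServiceState nn2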

If you see errors like the following while running these commands, passwordless login between the VMs is either not configured or configured incorrectly:

[root@node01 hadoop-2.7.5]# sbin/hadoop-daemons.sh start journalnode
The authenticity of host 'node01 (192.168.24.137)' can't be established.
ECDSA key fingerprint is SHA256:GzI3JXtwr1thv7B0pdcvYQSpd98Nj1PkjHnvABgHFKI.
ECDSA key fingerprint is MD5:00:00:7b:46:99:5e:ff:f2:54:84:19:25:2c:63:0a:9e.
Are you sure you want to continue connecting (yes/no)? root@node02's password: root@node03's password: Please type 'yes' or 'no':
node01: Warning: Permanently added 'node01' (ECDSA) to the list of known hosts.
root@node01's password:
node02: starting journalnode, logging to /software/hadoop-2.7.5/logs/hadoop-root-journalnode-node02.out


root@node03's password: node03: Permission denied, please try again.

root@node01's password: node01: Permission denied, please try again.

Run the following commands on node02:

[root@node02 software]# cd hadoop-2.7.5/
[root@node02 hadoop-2.7.5]# bin/hdfs namenode -bootstrapStandby
(output omitted....)
[root@node02 hadoop-2.7.5]# sbin/hadoop-daemon.sh start namenode
(output omitted....)

5.2 Start YARN

Run the following commands on node02 and node03:

[root@node03 software]# cd hadoop-2.7.5/
[root@node03 hadoop-2.7.5]# sbin/start-yarn.sh
[root@node02 hadoop-2.7.5]# sbin/start-yarn.sh
starting yarn daemons
resourcemanager running as process 11740. Stop it first.
The authenticity of host 'node02 (192.168.24.138)' can't be established.
ECDSA key fingerprint is SHA256:GzI3JXtwr1thv7B0pdcvYQSpd98Nj1PkjHnvABgHFKI.
ECDSA key fingerprint is MD5:00:00:7b:46:99:5e:ff:f2:54:84:19:25:2c:63:0a:9e.
Are you sure you want to continue connecting (yes/no)? node01: nodemanager running as process 15655. Stop it first.
node03: nodemanager running as process 13357. Stop it first.

If you run into the errors above during startup, resolve it as follows:

Most blog posts online suggest the commands below (not recommended; they seem problematic), and they print: This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh

#the daemons are already running, so run stop-all.sh first and then start-all.sh
[root@node02 sbin]# pwd
/software/hadoop-2.7.5/sbin
[root@node02 sbin]# ./stop-all.sh
[root@node02 sbin]# ./start-all.sh

However, those scripts are deprecated; use these instead:

 ./stop-yarn.sh
 ./stop-dfs.sh

 ./start-yarn.sh
 ./start-dfs.sh
[root@node03 sbin]# ./start-dfs.sh
Starting namenodes on [node01 node02]
node02: starting namenode, logging to /software/hadoop-2.7.5/logs/hadoop-root-namenode-node02.out
node01: starting namenode, logging to /software/hadoop-2.7.5/logs/hadoop-root-namenode-node01.out
node02: starting datanode, logging to /software/hadoop-2.7.5/logs/hadoop-root-datanode-node02.out
node01: starting datanode, logging to /software/hadoop-2.7.5/logs/hadoop-root-datanode-node01.out
node03: starting datanode, logging to /software/hadoop-2.7.5/logs/hadoop-root-datanode-node03.out
Starting journal nodes [node01 node02 node03]
node02: starting journalnode, logging to /software/hadoop-2.7.5/logs/hadoop-root-journalnode-node02.out
node01: starting journalnode, logging to /software/hadoop-2.7.5/logs/hadoop-root-journalnode-node01.out
node03: starting journalnode, logging to /software/hadoop-2.7.5/logs/hadoop-root-journalnode-node03.out
Starting ZK Failover Controllers on NN hosts [node01 node02]
node01: starting zkfc, logging to /software/hadoop-2.7.5/logs/hadoop-root-zkfc-node01.out
node02: starting zkfc, logging to /software/hadoop-2.7.5/logs/hadoop-root-zkfc-node02.out
[root@node03 sbin]# ./start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /software/hadoop-2.7.5/logs/yarn-root-resourcemanager-node03.out
node01: starting nodemanager, logging to /software/hadoop-2.7.5/logs/yarn-root-nodemanager-node01.out
node02: starting nodemanager, logging to /software/hadoop-2.7.5/logs/yarn-root-nodemanager-node02.out
node03: starting nodemanager, logging to /software/hadoop-2.7.5/logs/yarn-root-nodemanager-node03.out

Note: check the three VMs with jps:

Problem: something is wrong here; the DataNode process is missing on all three VMs, i.e. the DataNodes did not come up.

node01: 

[root@node01 hadoop-2.7.5]# jps
8083 NodeManager
8531 DFSZKFailoverController
8404 JournalNode
9432 Jps
1467 QuorumPeerMain
8235 NameNode

node02: 

[root@node02 sbin]# jps
7024 NodeManager
7472 DFSZKFailoverController
7345 JournalNode
7176 NameNode
8216 ResourceManager
8793 Jps
1468 QuorumPeerMain

node03:

[root@node03 hadoop-2.7.5]# jps
5349 NodeManager
5238 ResourceManager
6487 JobHistoryServer
6647 Jps
5997 JournalNode

Solution:

(1) First stop the services with stop-dfs.sh and stop-yarn.sh (run on any one node):

[root@node03 hadoop-2.7.5]# ./sbin/stop-dfs.sh
Stopping namenodes on [node01 node02]
node02: no namenode to stop
node01: no namenode to stop
node02: no datanode to stop
node01: no datanode to stop
node03: no datanode to stop
Stopping journal nodes [node01 node02 node03]
node02: no journalnode to stop
node01: no journalnode to stop
node03: no journalnode to stop
Stopping ZK Failover Controllers on NN hosts [node01 node02]
node02: no zkfc to stop
node01: no zkfc to stop
[root@node03 hadoop-2.7.5]# ./sbin/stop-yarn.sh
stopping yarn daemons
stopping resourcemanager
node01: stopping nodemanager
node02: stopping nodemanager
node03: stopping nodemanager
no proxyserver to stop

(2) Delete the files under the DataNode data directory (on all three VMs)

According to the configuration, that means deleting the contents of /software/hadoop-2.7.5/data/dfs/dn:

(3) Start the services again with start-dfs.sh and start-yarn.sh (on any one node)

[root@node01 hadoop-2.7.5]# rm -rf data/dfs/dn
[root@node01 hadoop-2.7.5]# sbin/start-dfs.sh
Starting namenodes on [node01 node02]
node02: starting namenode, logging to /software/hadoop-2.7.5/logs/hadoop-root-namenode-node02.out
node01: starting namenode, logging to /software/hadoop-2.7.5/logs/hadoop-root-namenode-node01.out
node02: starting datanode, logging to /software/hadoop-2.7.5/logs/hadoop-root-datanode-node02.out
node03: starting datanode, logging to /software/hadoop-2.7.5/logs/hadoop-root-datanode-node03.out
node01: starting datanode, logging to /software/hadoop-2.7.5/logs/hadoop-root-datanode-node01.out
Starting journal nodes [node01 node02 node03]
node02: starting journalnode, logging to /software/hadoop-2.7.5/logs/hadoop-root-journalnode-node02.out
node03: starting journalnode, logging to /software/hadoop-2.7.5/logs/hadoop-root-journalnode-node03.out
node01: starting journalnode, logging to /software/hadoop-2.7.5/logs/hadoop-root-journalnode-node01.out
Starting ZK Failover Controllers on NN hosts [node01 node02]
node02: starting zkfc, logging to /software/hadoop-2.7.5/logs/hadoop-root-zkfc-node02.out
node01: starting zkfc, logging to /software/hadoop-2.7.5/logs/hadoop-root-zkfc-node01.out
You have new mail in /var/spool/mail/root
[root@node01 hadoop-2.7.5]# sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /software/hadoop-2.7.5/logs/yarn-root-resourcemanager-node01.out
node02: starting nodemanager, logging to /software/hadoop-2.7.5/logs/yarn-root-nodemanager-node02.out
node03: starting nodemanager, logging to /software/hadoop-2.7.5/logs/yarn-root-nodemanager-node03.out
node01: starting nodemanager, logging to /software/hadoop-2.7.5/logs/yarn-root-nodemanager-node01.out

Check the three VMs with jps again (this time everything is correct):

node01:

[root@node01 dfs]# jps
10561 NodeManager
9955 DataNode
10147 JournalNode
9849 NameNode
10762 Jps
1467 QuorumPeerMain
10319 DFSZKFailoverController

node02:

[root@node02 hadoop-2.7.5]# jps
9744 NodeManager
9618 DFSZKFailoverController
9988 Jps
9367 NameNode
8216 ResourceManager
9514 JournalNode
1468 QuorumPeerMain
9439 DataNode

node03:

[root@node03 hadoop-2.7.5]# jps
7953 Jps
7683 JournalNode
6487 JobHistoryServer
7591 DataNode
7784 NodeManager

5.3 Check the ResourceManager state

Run on node03:

[root@node03 hadoop-2.7.5]# bin/yarn rmadmin -getServiceState rm1
active

Run on node02:

[root@node02 hadoop-2.7.5]# bin/yarn rmadmin -getServiceState rm2
standby
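The same check can be done for both ResourceManagers in one go from either RM host (a small sketch using the rm1/rm2 IDs defined in yarn-site.xml):

for id in rm1 rm2; do echo -n "$id: "; bin/yarn rmadmin -getServiceState $id; done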

5.4 Start the JobHistory server

node03:

[root@node03 hadoop-2.7.5]# sbin/mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /software/hadoop-2.7.5/logs/mapred-root-historyserver-node03.out

5.5 Check the HDFS status

node01 (screenshot omitted):

Open in a browser: http://192.168.24.137:50070/dfshealth.html#tab-overview

node02 (screenshot omitted):

Open in a browser: http://192.168.24.138:50070/dfshealth.html#tab-overview

5.6 Check the YARN cluster web UI

Open in a browser: http://192.168.24.139:8088/cluster/nodes

5.7 Job history web UI

Open in a browser: http://192.168.24.139:19888 (the mapreduce.jobhistory.webapp.address configured earlier)

6. Hadoop Command Line

Delete a file:

[root@node01 bin]# ./hdfs dfs -rm /a.txt
20/06/12 14:33:30 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 10080 minutes, Emptier interval = 0 minutes.
20/06/12 14:33:30 INFO fs.TrashPolicyDefault: Moved: 'hdfs://ns/a.txt' to trash at: hdfs://ns/user/root/.Trash/Current/a.txt
Moved: 'hdfs://ns/a.txt' to trash at: hdfs://ns/user/root/.Trash/Current

Create a directory:

[root@node01 bin]# ./hdfs dfs -mkdir /dir

Upload a file:

[root@node01 bin]# ./hdfs dfs -put /software/a.txt /dir
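A few more everyday commands, using the same /dir path as above (standard hdfs dfs usage):

./hdfs dfs -ls /dir               # list the directory
./hdfs dfs -cat /dir/a.txt        # print the file contents
./hdfs dfs -get /dir/a.txt /tmp   # download the file to the local filesystem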


Note:

  Clicking Download in the web UI actually points the browser at something like http://node02:50075.... Without a hosts entry on the host machine that link does not open; replacing node02 with its IP works. Inside the VMs, node01, node02, and node03 resolve directly.

So I updated the host machine's hosts file to match the VMs' hosts file. After that, clicking Download downloads the file straight from the browser.

With this, the high-availability Hadoop distributed cluster is up and running.

 