Fully Distributed HBase Cluster Setup
I. Preparation
1. Install Three Virtual Machines
Three virtual machines are the smallest practical cluster size; the VM installation process itself is omitted here.
2. Software Versions
1. Operating system: CentOS-7-x86_64-Minimal-1804.iso
2. Hadoop: hadoop-2.7.1.tar.gz
3. ZooKeeper: zookeeper-3.4.6.tar.gz
4. JDK: jdk-8u191-linux-x64.rpm
5. HBase: hbase-2.0.0-bin.tar.gz
3. Server Role Layout
Service | node1 | node2 | node3
---|---|---|---
NameNode | Y (active) | Y (standby) |
DataNode | Y | Y | Y
JournalNode | Y | Y | Y
ZooKeeper | Y | Y | Y
ResourceManager | | Y (active) | Y (standby)
NodeManager | Y | Y | Y
3.1 Configure Hostname Mapping
[root@nodeX ~]# vim /etc/hosts
Add entries for every node so the machines can reach one another by hostname.
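A minimal sketch of the expected entries; note that which IP belongs to which node is an assumption here, based on the addresses that appear in topology.data later in this guide:
192.168.19.149 node1
192.168.19.150 node2
192.168.19.151 node3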
3.2 Passwordless SSH Login
[root@nodeX ~]# ssh-keygen -t rsa
[root@nodeX ~]# ssh-copy-id node1
[root@nodeX ~]# ssh-copy-id node2
[root@nodeX ~]# ssh-copy-id node3
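After the keys are copied, logging in from any node to any other should no longer prompt for a password. A quick check (sample session, hostnames per this guide):
[root@node1 ~]# ssh node2 hostname
node2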
3.3 Synchronize Clocks (skip if they are already consistent)
[root@nodeX ~]# date -s '2020-07-02 10:34:24'
Thu Jul  2 10:34:24 CST 2020
[root@nodeX ~]# clock -w
[root@nodeX ~]# date
Thu Jul  2 10:34:24 CST 2020
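Clocks set by hand drift again quickly. An alternative sketch (an assumption, not part of the original steps) is to sync each node against a public NTP server instead:
[root@nodeX ~]# yum install -y ntpdate
[root@nodeX ~]# ntpdate pool.ntp.org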
3.4 Upload the Software Packages
3.5 Install the JDK
[root@nodeX opt]# rpm -ivh jdk-8u191-linux-x64.rpm
warning: jdk-8u191-linux-x64.rpm: Header V3 RSA/SHA256 Signature, key ID ec551f03: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
1:jdk1.8-2000:1.8.0_191-fcs ################################# [100%]
Unpacking JAR files...
tools.jar...
plugin.jar...
javaws.jar...
deploy.jar...
rt.jar...
jsse.jar...
charsets.jar...
localedata.jar...
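Verify the installation (only the first line of the version banner is shown):
[root@nodeX ~]# java -version
java version "1.8.0_191"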
II. Hadoop Distributed Cluster Setup
1. Install ZooKeeper
[root@nodeX ~]# tar -zxf zookeeper-3.4.6.tar.gz -C /usr
[root@nodeX ~]# vi /usr/zookeeper-3.4.6/conf/zoo.cfg
tickTime=2000
dataDir=/root/zkdata
clientPort=2181
initLimit=5
syncLimit=2
server.1=node1:2887:3887
server.2=node2:2887:3887
server.3=node3:2887:3887
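Each server.N line maps the myid value N to a host; the first port (2887) carries quorum traffic and the second (3887) leader election. The dataDir must exist before ZooKeeper starts, so create it on every node first:
[root@nodeX ~]# mkdir -p /root/zkdata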
# On node1
[root@node1 ~]# cd zkdata/
[root@node1 zkdata]# vi myid
1
# On node2
[root@node2 ~]# cd zkdata/
[root@node2 zkdata]# vi myid
2
# On node3
[root@node3 ~]# cd zkdata/
[root@node3 zkdata]# vi myid
3
Start ZooKeeper
[root@nodeX zookeeper-3.4.6]# ./bin/zkServer.sh start conf/zoo.cfg
# Confirm the ZooKeeper service is healthy: method one
[root@nodeX ~]# jps
1777 QuorumPeerMain
1811 Jps
# Confirm the ZooKeeper service is healthy: method two
[root@nodeX ~]# /usr/zookeeper-3.4.6/bin/zkServer.sh status /usr/zookeeper-3.4.6/conf/zoo.cfg
JMX enabled by default
Using config: /usr/zookeeper-3.4.6/conf/zoo.cfg
Mode: leader
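Only one node reports Mode: leader; the other two report Mode: follower. A third optional check (nc is provided by the nmap-ncat package on CentOS 7) is ZooKeeper's four-letter-word probe:
[root@nodeX ~]# echo ruok | nc node1 2181
imok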
2. Install Hadoop
# Create the target directory first if it does not exist: mkdir -p /opt/install
[root@nodeX opt]# tar -zxf hadoop-2.7.1.tar.gz -C install/
# Configure the Java and Hadoop environment variables
[root@nodeX ~]# vi ~/.bashrc
HADOOP_HOME=/opt/install/hadoop-2.7.1
JAVA_HOME=/usr/java/latest
CLASSPATH=.
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export JAVA_HOME
export CLASSPATH
export PATH
export HADOOP_HOME
[root@nodeX ~]# source ~/.bashrc
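Verify that both are now on the PATH (only the first line of output is shown):
[root@nodeX ~]# hadoop version
Hadoop 2.7.1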
2.1 Edit the Hadoop Configuration Files
- core-site.xml
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/install/hadoop-2.7.1/hadoop-${user.name}</value>
</property>
<property>
    <name>fs.trash.interval</name>
    <value>30</value>
</property>
<property>
    <name>net.topology.script.file.name</name>
    <value>/opt/install/hadoop-2.7.1/etc/hadoop/rack.sh</value>
</property>
Note: in an HA deployment fs.defaultFS must reference the nameservice (mycluster, defined in hdfs-site.xml below), not a single NameNode address.
- Create the rack-awareness script; given an IP address, it returns the machine's physical rack location.
#!/bin/bash
# Look up each IP address passed in by Hadoop in topology.data and print its rack.
while [ $# -gt 0 ] ; do
  nodeArg=$1
  exec < /opt/install/hadoop-2.7.1/etc/hadoop/topology.data
  result=""
  while read line ; do
    ar=( $line )
    if [ "${ar[0]}" = "$nodeArg" ] ; then
      result="${ar[1]}"
    fi
  done
  shift
  # Fall back to the default rack when the IP is not listed.
  if [ -z "$result" ] ; then
    echo -n "/default-rack "
  else
    echo -n "$result "
  fi
done
[root@nodeX hadoop]# chmod u+x rack.sh
[root@nodeX hadoop]# vi topology.data
192.168.19.149 /rack2
192.168.19.150 /rack1
192.168.19.151 /rack3
[root@nodeX hadoop]# /opt/install/hadoop-2.7.1/etc/hadoop/rack.sh 192.168.19.150
/rack1
- hdfs-site.xml
[root@nodeX hadoop]# vi hdfs-site.xml
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>
<property>
    <name>ha.zookeeper.quorum</name>
    <value>node1:2181,node2:2181,node3:2181</value>
</property>
<property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
</property>
<property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>node1:9000</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>node2:9000</value>
</property>
<property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://node1:8485;node2:8485;node3:8485/mycluster</value>
</property>
<property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
</property>
<property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
</property>
- slaves
[root@nodeX hadoop]# vi slaves
node1
node2
node3
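With the configuration finished on node1, one way to push it to the other nodes (a sketch, assuming this guide's directory layout) is:
[root@node1 ~]# scp -r /opt/install/hadoop-2.7.1/etc/hadoop node2:/opt/install/hadoop-2.7.1/etc/
[root@node1 ~]# scp -r /opt/install/hadoop-2.7.1/etc/hadoop node3:/opt/install/hadoop-2.7.1/etc/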
3. Start the Hadoop HA Cluster
# Start the JournalNodes first, on all three nodes
[root@nodeX ~]# hadoop-daemon.sh start journalnode
[root@node1 ~]# hdfs namenode -format
[root@node1 ~]# hadoop-daemon.sh start namenode
[root@node2 ~]# hdfs namenode -bootstrapStandby
[root@node2 ~]# hadoop-daemon.sh start namenode
# zkfc: ZooKeeper Failover Controller
[root@node1|2 ~]# hdfs zkfc -formatZK    # run on either node1 or node2 to register the NameNode info in ZooKeeper
[root@node1 ~]# hadoop-daemon.sh start zkfc    # the failover "sentinel"
[root@node2 ~]# hadoop-daemon.sh start zkfc    # the failover "sentinel"
[root@nodeX ~]# hadoop-daemon.sh start datanode    # on all three nodes
Note: CentOS 7 needs one extra dependency; psmisc provides the fuser command that the sshfence fencing method relies on.
[root@nodeX ~]# yum install -y psmisc
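At this point HDFS HA can be sanity-checked (which NameNode ends up active may vary):
[root@node1 ~]# hdfs haadmin -getServiceState nn1
active
[root@node1 ~]# hdfs haadmin -getServiceState nn2
standby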
4. YARN HA Cluster
- mapred-site.xml
[root@nodeX ~]# cp /opt/install/hadoop-2.7.1/etc/hadoop/mapred-site.xml.template /opt/install/hadoop-2.7.1/etc/hadoop/mapred-site.xml
[root@nodeX ~]# vi /opt/install/hadoop-2.7.1/etc/hadoop/mapred-site.xml
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
- yarn-site.xml
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>cluster1</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>node2</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>node3</value>
</property>
<property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>node1:2181,node2:2181,node3:2181</value>
</property>
- Start YARN
[root@node2 ~]# yarn-daemon.sh start resourcemanager
[root@node3 ~]# yarn-daemon.sh start resourcemanager
[root@nodeX ~]# yarn-daemon.sh start nodemanager
- Check the ResourceManager HA state
[root@node1 ~]# yarn rmadmin -getServiceState rm1
active
[root@node1 ~]# yarn rmadmin -getServiceState rm2
standby
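A quick end-to-end smoke test (a sketch; the examples jar ships with the Hadoop 2.7.1 tarball) is to run the bundled pi job on YARN:
[root@node1 ~]# hadoop jar /opt/install/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar pi 2 4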
III. HBase Fully Distributed Cluster Setup
- Extract the package
[root@nodeX opt]# tar -zxf hbase-2.0.0-bin.tar.gz -C install/
- Edit hbase-site.xml
<property>
    <name>hbase.rootdir</name>
    <value>hdfs://mycluster/hbase</value>
</property>
<property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
</property>
<property>
    <name>hbase.zookeeper.quorum</name>
    <value>node1,node2,node3</value>
</property>
<property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
</property>
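Because hbase.rootdir points at the HA nameservice rather than a single NameNode, HBase must be able to resolve mycluster. A common approach (an assumption here, not an explicit step in the original) is to copy Hadoop's client configuration into HBase's conf directory:
[root@nodeX ~]# cp /opt/install/hadoop-2.7.1/etc/hadoop/core-site.xml /opt/install/hadoop-2.7.1/etc/hadoop/hdfs-site.xml /opt/install/hbase-2.0.0/conf/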
- Edit regionservers
[root@nodeX ~]# vi /opt/install/hbase-2.0.0/conf/regionservers
node1
node2
node3
- Update the environment variables
[root@nodeX ~]# vi .bashrc
HBASE_MANAGES_ZK=false
HBASE_HOME=/opt/install/hbase-2.0.0
HADOOP_HOME=/opt/install/hadoop-2.7.1
JAVA_HOME=/usr/java/latest
CLASSPATH=.
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HBASE_HOME/bin
export JAVA_HOME
export CLASSPATH
export PATH
export HADOOP_HOME
export HBASE_HOME
export HBASE_MANAGES_ZK
[root@nodeX ~]# source .bashrc
HBASE_MANAGES_ZK=false tells HBase to use the external ZooKeeper ensemble instead of managing its own.
Start the Services
- Start the HMaster
[root@nodeX ~]# hbase-daemon.sh start master
- Start the HRegionServer
[root@nodeX ~]# hbase-daemon.sh start regionserver
Verify the result
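A quick sanity check (a sketch; PIDs will differ, and each node only shows the daemons started on it): jps should now list HMaster and HRegionServer alongside the Hadoop and ZooKeeper processes, and the HBase shell's status command should report three live region servers.
[root@nodeX ~]# jps | grep -E 'HMaster|HRegionServer'
2120 HMaster
2304 HRegionServer
[root@nodeX ~]# hbase shell
hbase(main):001:0> status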