Installation & Deployment (7): HBase Cluster Installation, Deployment, and Testing



Hadoop 2.7.2 
Spark 2.0.0
Kafka 0.10.0.0
HBase 1.2.2
Zookeeper 3.4.8


References:
http://www.tuicool.com/articles/VV7bam
http://blog.csdn.net/yinedent/article/details/48275407


1 Download:
http://mirrors.hust.edu.cn/apache/hbase/stable/
http://mirrors.hust.edu.cn/apache/hbase/stable/hbase-1.2.2-bin.tar.gz


2 Extract:
root@py-server:/server# tar xvzf hbase-1.2.2-bin.tar.gz
root@py-server:/server# mv hbase-1.2.2/ hbase


3 Environment variables:
vi ~/.bashrc
export HBASE_HOME=/server/hbase
export PATH=$PATH:$HBASE_HOME/bin
source ~/.bashrc


4 Configuration:
HBase depends on a ZooKeeper ensemble; for the ZooKeeper cluster setup, see the corresponding section of the Spark installation notes. We run an active Master plus a standby Master. Configure the 5 machines as follows:


4.1 Configure hbase-site.xml
vi $HBASE_HOME/conf/hbase-site.xml
Add the following:
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://py-server:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>py-server:2181,py-11:2181,py-12:2181,py-13:2181,py-14:2181</value>
  </property>
</configuration>
Note: the HDFS port can be checked in /server/hadoop/etc/hadoop/core-site.xml (the fs.defaultFS value); it was set to 9000 earlier.
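As a quick sanity check, the fs.defaultFS value (and hence the port that hbase.rootdir must use) can be pulled out of core-site.xml programmatically. A minimal Python sketch, using an inline sample standing in for the real /server/hadoop/etc/hadoop/core-site.xml:

```python
import xml.etree.ElementTree as ET

# Inline sample standing in for /server/hadoop/etc/hadoop/core-site.xml
core_site = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://py-server:9000</value>
  </property>
</configuration>"""

root = ET.fromstring(core_site)
# Pick out the value of the fs.defaultFS property
fs_default = next(p.findtext("value") for p in root.findall("property")
                  if p.findtext("name") == "fs.defaultFS")
port = fs_default.rsplit(":", 1)[1]
print(fs_default, port)  # this port must match the one in hbase.rootdir
```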


4.2 Configure active/standby Masters
To configure a standby Master, write its hostname into the backup-masters file. The file does not exist by default, so create it:
vi $HBASE_HOME/conf/backup-masters
py-12


4.3 Configure regionservers
py-server
py-11
py-12
py-13
py-14


4.4 Configure hbase-env.sh
Raise the heap size and tell HBase not to manage the ZooKeeper ensemble itself.
vi $HBASE_HOME/conf/hbase-env.sh
export HBASE_HEAPSIZE=4G
export HBASE_MANAGES_ZK=false


5 Distribute
root@py-server:/server# scp -r hbase/ [email protected]:/server/
root@py-server:/server# scp -r hbase/ [email protected]:/server/
root@py-server:/server# scp -r hbase/ [email protected]:/server/
root@py-server:/server# scp -r hbase/ [email protected]:/server/
Then update ~/.bashrc with the same environment variables on each of nodes 11-14.


6 Start the cluster


6.1 Start
On the primary node py-server, run:
root@py-server:/server# $HBASE_HOME/bin/start-hbase.sh
Output:
starting master, logging to /server/hbase/logs/hbase-root-master-py-server.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
py-11: starting regionserver, logging to /server/hbase/bin/../logs/hbase-root-regionserver-py-11.out
py-14: starting regionserver, logging to /server/hbase/bin/../logs/hbase-root-regionserver-py-14.out
py-13: starting regionserver, logging to /server/hbase/bin/../logs/hbase-root-regionserver-py-13.out
py-12: starting regionserver, logging to /server/hbase/bin/../logs/hbase-root-regionserver-py-12.out
py-11: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
py-11: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
py-13: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
py-13: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
py-12: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
py-12: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
py-12: SLF4J: Class path contains multiple SLF4J bindings.
py-12: SLF4J: Found binding in [jar:file:/server/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
py-12: SLF4J: Found binding in [jar:file:/server/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
py-12: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
py-12: SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
py-server: starting regionserver, logging to /server/hbase/logs/hbase-root-regionserver-py-server.out
py-server: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
py-server: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
py-12: starting master, logging to /server/hbase/bin/../logs/hbase-root-master-py-12.out
py-12: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
py-12: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0


6.2 Verification


6.2.1 Verification 1: Web UI
Open the Web UI, by default at http://py-server:16010/master-status (the host is the active Master).
Region Servers (Base Stats):
ServerName                     Start time                    Version  Requests Per Second  Num. Regions
py-11,16020,1470969413562      Fri Aug 12 10:36:53 CST 2016  Unknown  0                    0
py-12,16020,1470969406816      Fri Aug 12 10:36:46 CST 2016  Unknown  0                    0
py-13,16020,1470969425459      Fri Aug 12 10:37:05 CST 2016  Unknown  0                    0
py-14,16020,1470969407402      Fri Aug 12 10:36:47 CST 2016  Unknown  0                    0
py-server,16020,1470969419382  Fri Aug 12 10:36:59 CST 2016  Unknown  0                    0
Total: 5                       5 nodes with inconsistent version      0                    0




6.2.2 Verification 2: jps
This is the Master:
root@py-server:~# jps
18592 NodeManager
17894 DataNode
29959 Jps
18867 Worker
9780 Kafka
25303 Main
29623 HMaster
18073 SecondaryNameNode
18650 Master
20218 jar
17499 QuorumPeerMain
18269 ResourceManager
17725 NameNode




root@py-11:~# jps
23158 Worker
22664 QuorumPeerMain
24105 Jps
23898 HRegionServer
22971 NodeManager
22828 DataNode
20189 Kafka


root@py-12:~# jps
3846 QuorumPeerMain
13256 Jps
5161 Kafka
4282 Worker
13035 HMaster
4011 DataNode
4156 NodeManager




root@py-13:~# jps
20960 NodeManager
20817 DataNode
21126 Worker
17287 Kafka
20188 Jps
20653 QuorumPeerMain
19983 HRegionServer


root@py-14:~# jps
15958 Jps
551 DataNode
15753 HRegionServer
701 NodeManager
911 Worker
12095 ZooKeeperMain
383 QuorumPeerMain


6.2.3 Verification 3: ZooKeeper
$ZOOKEEPER_HOME/bin/zkCli.sh


[zk: localhost:2181(CONNECTED) 0] ls /
[controller_epoch, controller, brokers, zookeeper, admin, isr_change_notification, consumers, config, hbase]
The listing contains the hbase znode.


6.2.4 Verification 4: hbase shell
root@py-server:~# hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/server/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/server/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.2.2, r3f671c1ead70d249ea4598f1bbcc5151322b3a13, Fri Jul  1 08:28:55 CDT 2016


hbase(main):002:0> create 'test','cf'
0 row(s) in 2.4050 seconds


=> Hbase::Table - test
hbase(main):003:0> list
TABLE                                                                                                                                                                               
test                                                                                                                                                                                
1 row(s) in 0.0070 seconds


=> ["test"]
hbase(main):004:0> version
1.2.2, r3f671c1ead70d249ea4598f1bbcc5151322b3a13, Fri Jul  1 08:28:55 CDT 2016


hbase(main):005:0> status
1 active master, 1 backup masters, 5 servers, 0 dead, 0.6000 average load


hbase(main):006:0> put 'test','rowkey1','cf:id','1'
0 row(s) in 0.1390 seconds


hbase(main):007:0> put 'test','rowkey1','cf:name','zhang3'
0 row(s) in 0.0150 seconds


hbase(main):008:0> scan 'test'
ROW                                            COLUMN+CELL                                                                                                                          
 rowkey1                                       column=cf:id, timestamp=1470974366010, value=1                                                                                       
 rowkey1                                       column=cf:name, timestamp=1470974389934, value=zhang3                                                                                
1 row(s) in 0.0470 seconds


hbase(main):009:0> 


6.2.5 Verify automatic HMaster failover (active/standby)
See:
http://blog.csdn.net/yinedent/article/details/48275407
or the appendix of this article.




6.3 Shutdown
6.3.1 Run on the Master node:
root@py-12:~# stop-hbase.sh


6.3.2 Kill the processes (not recommended unless HBase refuses to stop)
Option 1:
root@py-server:~# jps
18592 NodeManager
17894 DataNode
31852 Jps
31089 HMaster
18867 Worker
9780 Kafka
25303 Main
18073 SecondaryNameNode
18650 Master
20218 jar
17499 QuorumPeerMain
30045 Main
18269 ResourceManager
17725 NameNode
31263 HRegionServer


kill -9 31089


Option 2:
root@py-server:~# netstat -nlp | grep java
Find the process id listening on HBase port 16000:
tcp6       0      0 10.1.1.6:16000          :::*                    LISTEN      31089/java      


kill -9 31089
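The pid can also be picked out of the netstat line programmatically rather than by eye; the last column of a `netstat -nlp` line is "pid/program". A small sketch parsing the line shown above:

```python
# Parse a `netstat -nlp` line: the last column has the form "pid/program"
line = "tcp6       0      0 10.1.1.6:16000          :::*                    LISTEN      31089/java"
pid = int(line.split()[-1].split("/")[0])
print(pid)
```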












##########################################
Common problems:
Problem 1: ERROR: Can't get master address from ZooKeeper; znode data == null
This is most likely caused by ZooKeeper instability: the nodes here are virtual machines that are frequently suspended. Running the hbase `list` command produced the error below. Several fixes are collected here; the first one was enough in this case.
ERROR: Can't get master address from ZooKeeper; znode data == null


Here is some help for this command:
List all tables in hbase. Optional regular expression parameter could
be used to filter the output. Examples:


  hbase> list
  hbase> list 'abc.*'


Fixes:


1. Restart HBase
stop-hbase.sh


then
start-hbase.sh
That solved the problem here. Other fixes found elsewhere are collected below.


2. Fix 2: reformat the namenode
The datanode log on node 2 showed:
Incompatible namespaceIDs in /home/hadoop/tmp/dfs/data: namenode namespaceID = 1780037790
The namenode log on node 1 showed: java.io.IOException: File /home/hadoop/tmp/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
Delete the namenode data and reformat it; after restarting, hbase worked again.
Source:
http://www.aboutyun.com/thread-8691-1-1.html


3. Fix 3: inconsistent HDFS port configuration
Here the failure to start HBase was caused by a wrong HDFS port in hbase-site.xml: the documentation used 8020, while this Hadoop cluster uses the default port 9000. Edit the config:
gedit hbase-site.xml
                <property>
                        <name>hbase.rootdir</name>
                        <value>hdfs://hadoop0:9000/hbase</value>
                </property>
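The root cause is that the port in hbase.rootdir must equal the port in fs.defaultFS. That consistency check can be sketched as follows (a hypothetical helper for illustration, not part of HBase):

```python
from urllib.parse import urlparse

def ports_match(fs_default, hbase_rootdir):
    """True when hbase.rootdir points at the same HDFS port as fs.defaultFS."""
    return urlparse(fs_default).port == urlparse(hbase_rootdir).port

print(ports_match("hdfs://hadoop0:9000", "hdfs://hadoop0:9000/hbase"))  # consistent
print(ports_match("hdfs://hadoop0:8020", "hdfs://hadoop0:9000/hbase"))  # the 8020-vs-9000 mismatch described above
```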


After editing, distribute the file to every node so that all configurations match, then restart each machine and start HBase with bin/start-hbase.sh.
        Start order: hadoop --> zookeeper --> hbase
        Start the hadoop cluster on hadoop0:
        /home/hadoop/hadoop-2.6.0/sbin/start-all.sh


        Start zookeeper on every machine:
        /home/hadoop/zookeeper-3.4.6/bin/zkServer.sh start

        Start the hbase cluster on hadoop0:
        /home/hadoop/hbase-1.0.1.1/bin/start-hbase.sh
Open http://hadoop0:16010/master-status; the page now loads.
Finally, `list` works again. Problem solved.
Source:
原文地址:
http://f.dataguru.cn/thread-519459-1-1.html




Problem 2: HBase will not stop; stop-hbase.sh hangs
http://www.cnblogs.com/jdksummer/articles/2506811.html
Mind the stop order: stop HBase first, then Hadoop; otherwise HBase may fail to stop.


 $stop-hbase.sh


 $stop-all.sh


Other issues encountered:


The official documentation lists which Hadoop versions HBase works with; the key question is whether the 0.20-append branch has been merged into the hadoop trunk. Hadoop 0.20.205.0 includes the merge, so use 0.20.205.0 or later.
Do replace lib/hadoop-core-….jar as the official documentation says; otherwise startup fails with an EOFException. Since the version numbers differ, simply move the old jar out and drop the new one in.
With 0.20.205.0, also copy commons-configuration-1.6.jar from hadoop/lib into hbase/lib; otherwise starting the master fails with a master.HMaster "NoClassDefFoundError" exception.
bin/start-hbase automatically starts a zookeeper; you can of course configure your own.
bin/stop-hbase appears to stop only zookeeper and the master; a regionserver is left behind on B (the master). Kill the process, or run bin/hbase-daemon.sh stop regionserver.
Likewise, use netstat -nlp | grep java to check port numbers; the HBase service ports there all started with 600.
If HBase will not stop because of a wrong stop order, you can forcibly kill the hbase daemons as follows:
        $netstat -nlp | grep java    # find the HMaster daemon's pid (port 60000 there; note it is 16000 in this deployment) — here pid 15562






        $sudo kill -9 15562    # kill the process


  Or simply run jps; the number at the start of each line is the pid, which you can kill with the same command.




Problem 3: when starting HBase, a RegionServer fails to start; its log shows:


  org.apache.hadoop.hbase.ClockOutOfSyncException: Server summer1,60020,1357384944077 has been rejected; Reported time is too far out of sync with master.  Time difference of 3549899ms > max allowed of 30000ms


   The RegionServer's clock is out of sync with the Master's. As the message shows, the maximum allowed skew between the two machines is 30000 ms; beyond that the RegionServer refuses to start.
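The check HBase performs here can be sketched: the absolute time difference must stay under the 30000 ms limit. An illustration only (the numbers come from the error above; this is not HBase's actual code):

```python
MAX_SKEW_MS = 30000  # max allowed skew, from the error message

def clock_in_sync(region_ms, master_ms, max_skew_ms=MAX_SKEW_MS):
    """True when the RegionServer clock is within the allowed skew of the Master."""
    return abs(region_ms - master_ms) <= max_skew_ms

# The failing case from the log: 3549899 ms of skew
print(clock_in_sync(3549899, 0))
```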


  Fix: run on every node: sudo ntpdate time.nist.gov   (time.nist.gov is a time server)
http://www.cnblogs.com/jdksummer/articles/2506811.html


Problem 4: the hbase shell reports ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
See:
http://www.cnblogs.com/suddoo/p/4986094.html
Synchronize the time on all machines:
root@py-server:~# ntpdate time.nist.gov
12 Aug 11:50:40 ntpdate[30326]: no server suitable for synchronization found
root@py-server:~# ntpdate 0.cn.pool.ntp.org
12 Aug 11:51:02 ntpdate[30361]: step time server 120.25.108.11 offset -10.398102 sec
root@py-server:~#
After that, stop HBase, start it again, enter the hbase shell, and `list` works.
Following the second linked post above: install ntpdate (sudo apt-get install ntpdate), then run ntpdate 0.cn.pool.ntp.org (the argument can be any time server address), and restart HBase: bin/stop-hbase.sh then bin/start-hbase.sh. A "can't get master address from ZooKeeper" error may still appear, probably due to ZooKeeper instability; restarting once more fixed it.






#####################################
Appendix: verifying automatic HMaster failover
Check the log on hadoop60:
cloud@hadoop60:~> tail /home/cloud/hbase0962/logs/hbase-cloud-master-hadoop60.log
2015-06-02 14:30:39,705 INFO [master:hadoop60:60000] http.HttpServer: Added global filter 'safety'(class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2015-06-02 14:30:39,710 INFO [master:hadoop60:60000] http.HttpServer: Added filter static_user_filter(class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) tocontext master
2015-06-02 14:30:39,711 INFO [master:hadoop60:60000] http.HttpServer: Added filter static_user_filter(class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) tocontext static
2015-06-02 14:30:39,731 INFO [master:hadoop60:60000] http.HttpServer: Jetty bound to port 60010
2015-06-02 14:30:39,731 INFO [master:hadoop60:60000] mortbay.log: jetty-6.1.26
2015-06-02 14:30:40,291 INFO [master:hadoop60:60000] mortbay.log: [email protected]:60010
2015-06-02 14:30:40,292 DEBUG [master:hadoop60:60000] master.HMaster: HMaster started in backup mode. Stalling until master znode is written.
2015-06-02 14:30:40,407 INFO [master:hadoop60:60000] zookeeper.RecoverableZooKeeper: Node /hbase/master already exists and this is not a retry
2015-06-02 14:30:40,408 INFO [master:hadoop60:60000] master.ActiveMasterManager: Adding ZNode for /hbase/backup-masters/hadoop60,60000,1433226638688 in backup master directory
2015-06-02 14:30:40,421 INFO [master:hadoop60:60000] master.ActiveMasterManager: Another master is the active master, hadoop59,60000,1433226634553; waiting to become the next active master
 
This shows that zookeeper has taken over and registered hadoop60 as a backup HBase master; note the hint "waiting to become the next active master". Now kill the hmaster process on hadoop59, either directly or with ./hbase-daemon.sh stop master.
Kill the hmaster process on hadoop59 and watch the log on hadoop60:
cloud@hadoop59:~> hbase0962/bin/hbase-daemon.sh stop master
stopping master.
cloud@hadoop59:~> jps
1320 Jps
45952 DFSZKFailoverController
49796 ResourceManager
43879 NameNode
cloud@hadoop59:~>
# the log on hadoop60 after the change
cloud@hadoop60:~> tail -n 50 /home/cloud/hbase0962/logs/hbase-cloud-master-hadoop60.log
(omitted ......)
2015-06-02 14:47:48,103 INFO [master:hadoop60:60000] master.RegionStates: Onlined c6541fc62282f10ad4206d626cc10f8b on hadoop36,60020,1433226640106
2015-06-02 14:47:48,105 DEBUG [master:hadoop60:60000] master.AssignmentManager: Found {ENCODED => c6541fc62282f10ad4206d626cc10f8b, NAME => 'test,,1433214676168.c6541fc62282f10ad4206d626cc10f8b.', STARTKEY => '', ENDKEY => ''} out on cluster
2015-06-02 14:47:48,105 INFO [master:hadoop60:60000] master.AssignmentManager: Found regions out on cluster or in RIT; presuming failover
2015-06-02 14:47:48,237 DEBUG [master:hadoop60:60000] hbase.ZKNamespaceManager: Updating namespace cache from node default with data: \x0A\x07default
2015-06-02 14:47:48,241 DEBUG [master:hadoop60:60000] hbase.ZKNamespaceManager: Updating namespace cache from node hbase with data: \x0A\x05hbase
2015-06-02 14:47:48,289 INFO [master:hadoop60:60000] zookeeper.RecoverableZooKeeper: Node /hbase/namespace/default already exists and this is not a retry
2015-06-02 14:47:48,308 INFO [master:hadoop60:60000] zookeeper.RecoverableZooKeeper: Node /hbase/namespace/hbase already exists and this is not a retry
2015-06-02 14:47:48,318 INFO [master:hadoop60:60000] master.HMaster: Master has completed initialization
In short: when the hmaster on hadoop59 is killed, the waiting hmaster thread is woken up, zookeeper takes over, and the hmaster on hadoop60 is switched from standby to active. (You can run more than one standby hmaster; just add them to the backup-masters configuration file.)
Now verify that the standby-turned-active hbase is usable:
cloud@hadoop60:~> hbase shell
2015-06-02 14:51:36,014 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.96.2-hadoop2, r1581096, Mon Mar 24 16:03:18 PDT 2014
 
hbase(main):001:0> status
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/cloud/hbase0962/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/cloud/hadoop220/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2015-06-02 14:51:40,658 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
8 servers, 0 dead, 0.3750 average load
hbase(main):002:0> scan 'test'
ROW                                              COLUMN+CELL                                                                                                                                  
 rowkey1                                        column=cf:id, timestamp=1433215397406, value=1                                                                                               
 rowkey1                                        column=cf:name, timestamp=1433215436532, value=zhangsan                                                                                       
1 row(s) in 0.2120 seconds
 
hbase(main):003:0>