HBase troubleshooting notes

1. [hbase] Hadoop exception: ERROR: org.apache.hadoop.hbase.MasterNotRunningException: Retried 7 times
#-----------------------------------------------------------------------------------------------------------------------------------
Delete the hadoop-core jar under hbase/lib, copy hadoop-0.20.2-core.jar from the hadoop directory into hbase/lib, then restart HBase.
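
A minimal sketch of the jar swap; HBASE_HOME and HADOOP_HOME are placeholders for the two installation directories, and the exact name of the bundled jar varies by HBase release:

rm $HBASE_HOME/lib/hadoop*core*.jar                              # remove the hadoop-core jar HBase shipped with
cp $HADOOP_HOME/hadoop-0.20.2-core.jar $HBASE_HOME/lib/          # use the jar from the running Hadoop instead
$HBASE_HOME/bin/stop-hbase.sh && $HBASE_HOME/bin/start-hbase.sh  # restart HBase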

####################################################################################################################################
2. [chukwa] Caused by: java.lang.ClassNotFoundException: org.mortbay.jetty.HandlerContainer
#-----------------------------------------------------------------------------------------------------------------------------------
The chukwa lib directory is missing jetty-6.1.26.jar.
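
One way to supply it, assuming Hadoop's lib directory ships the same jetty version (otherwise download jetty-6.1.26.jar separately); CHUKWA_HOME and HADOOP_HOME are placeholders:

cp $HADOOP_HOME/lib/jetty-6.1.26.jar $CHUKWA_HOME/lib/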
 
####################################################################################################################################
3. slf4j-api-1.5.8.jar / slf4j-log4j12-1.5.8.jar version mismatch (Hadoop 1.0.4 and HBase 0.92.1)

Replace slf4j-log4j12-1.4.3.jar and slf4j-api-1.4.3.jar under Hadoop with the 1.5.8 versions.
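
A sketch of the replacement, with HADOOP_HOME and HBASE_HOME as placeholders for the two installation directories:

rm $HADOOP_HOME/lib/slf4j-api-1.4.3.jar $HADOOP_HOME/lib/slf4j-log4j12-1.4.3.jar
cp $HBASE_HOME/lib/slf4j-api-1.5.8.jar $HBASE_HOME/lib/slf4j-log4j12-1.5.8.jar $HADOOP_HOME/lib/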
 
####################################################################################################################################
4. [Hadoop] Could not connect to remote log4j server: can be ignored.
 
####################################################################################################################################
5. Because Hadoop 1.0.0 is in use and HBase 0.90.5 does not support that version, the relevant jars must be replaced: find hadoop-core-0.20-append-r1056497.jar under hbase/lib and delete it, then copy hadoop-core-1.0.0.jar and commons-configuration-1.6.jar from the Hadoop installation into hbase/lib.
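
A sketch of that swap (HADOOP_HOME and HBASE_HOME are placeholders; hadoop-core-1.0.0.jar may sit in the Hadoop root directory rather than lib, depending on how the tarball was unpacked):

rm $HBASE_HOME/lib/hadoop-core-0.20-append-r1056497.jar
cp $HADOOP_HOME/hadoop-core-1.0.0.jar $HBASE_HOME/lib/
cp $HADOOP_HOME/lib/commons-configuration-1.6.jar $HBASE_HOME/lib/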
 
####################################################################################################################################
6. [hbase shell] ERROR: org.apache.hadoop.hbase.MasterNotRunningException: Retried 7 times

A: If hbase.rootdir is configured to use HDFS, check whether the HDFS daemons are actually running; if not, start Hadoop first.

B: If Hadoop is already running, check whether the HBase and Hadoop versions match. Delete the hadoop-core jar under hbase/lib, copy hadoop-0.20.2-core.jar from the hadoop directory into hbase/lib, then restart HBase.

C: Hadoop may have started in safe mode; run hadoop dfsadmin -safemode leave to exit it. Quick checks for causes A and C are sketched below.
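
Quick checks for causes A and C (cause B is the same jar swap as item 1):

jps | grep -E 'NameNode|DataNode'     # A: are the HDFS daemons running?
hadoop dfsadmin -safemode get         # C: reports whether safe mode is ON or OFF
hadoop dfsadmin -safemode leave       # leave safe mode if it is still ON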
 
####################################################################################################################################
7. Starting HRegionServer on a single HBase node fails
Error: starting regionserver, logging to /data/java/hbase-0.90.3/logs/hbase-root-regionserver-SFserver25.localdomain.out
Exception in thread "regionserver60020" java.lang.NullPointerException
        at org.apache.hadoop.hbase.regionserver.HRegionServer.join(HRegionServer.java:1417)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:683)
        at java.lang.Thread.run(Thread.java:662)

Possible cause: 1. The server clocks in the HBase cluster are out of sync, so the regionserver fails to start:
org.apache.hadoop.hbase.ClockOutOfSyncException: org.apache.hadoop.hbase.ClockOutOfSyncException: Server hadoop-node6,60020,1337908009841 has been rejected; Reported time is too far out of sync with master. Time difference of 882788ms > max allowed of 30000ms

Option 1
Add the following property to hbase-site.xml:
<property>
<name>hbase.master.maxclockskew</name>
<value>180000</value>
<description>Time difference of regionserver from master</description>
</property>

Option 2

The error says the node's clock differs from the master's by more than 30000 ms (30 seconds), so the regionserver refuses to start.
Adjust the clock on each node so the skew stays within 30 seconds, e.g. with ntpdate as sketched below.
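
One way to sync them, assuming an NTP server (pool.ntp.org is only an example) is reachable from every regionserver node:

/usr/sbin/ntpdate pool.ntp.org && /sbin/hwclock --systohc    # set system clock from NTP, then persist it to the hardware clock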

####################################################################################################################################
8. unexpected error, closing socket connection and attempting reconnect

Fix: service iptables stop (stop the firewall).
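
On RHEL/CentOS-style systems you may also want the firewall to stay off after a reboot, roughly:

service iptables stop      # stop the firewall now
chkconfig iptables off     # keep it from starting at boot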

####################################################################################################################################
9. ERROR: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to
ZooKeeper but the connection closes immediately. This could be a sign that the server has too many
connections (30 is the default).

The open-file limits in limits.conf need to be raised:

vi /etc/security/limits.conf

Append these two lines at the end:

hdfs  -       nofile  32768
hbase  -       nofile  32768
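
The new limit only applies to fresh login sessions; assuming the daemons run as the hdfs and hbase users named above, a quick verification looks like:

su - hbase -c 'ulimit -n'     # should print 32768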

Then restart HBase:
[root@master bin]# ./stop-hbase.sh 
stopping hbase.............................
[root@master bin]# ./start-hbase.sh
(If stop-hbase.sh hangs printing dots, one workaround is to run start-hbase.sh again: it will complain that HBase is still running and print the PID of the running process; kill -9 that PID and then run start-hbase.sh.)

hbase-site.xml configuration:
<configuration>
        <property>
                <name>hbase.rootdir</name>
                <value>file:///home/bell/software/HBase/hbase-0.90.4/data</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
</configuration>

####################################################################################################################################
10. Running $ hbase hbck produces the following:
Invalid maximum heap size: -Xmx4096m
The specified size exceeds the maximum representable size.
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
Cause: the JVM heap size is set too large. Reduce the setting in hbase-env.sh, for example:
export HBASE_HEAPSIZE=1024
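
The "maximum representable size" message usually means a 32-bit JVM, which cannot address a 4 GB heap; a rough check of which JVM is in use (output format varies by vendor):

java -version 2>&1 | grep -i '64-bit'    # prints a line such as '64-Bit Server VM' only on a 64-bit JVM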

####################################################################################################################################
11. HBase will not start; the regionserver log contains the error below, and ZooKeeper also logs initialization problems.
FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 10.210.70.57,60020,1340088145399: Initialization of RS failed. Hence aborting RS.
The setup had worked fine before, but daemons had since been killed forcibly, and because the error was about initialization, stale state seemed the likely culprit. Clearing the data under /tmp did not help: HRegionServer still would not start, although ZooKeeper did. In the end the HBase data in HDFS was wiped as well, /tmp was cleared again, leftover HBase processes on every node were found and killed, and HBase was restarted; after that everything was back to normal. It is still unclear exactly what caused it, and this brute-force approach is not recommended; if anyone knows the real reason, please share it.

####################################################################################################################################
12. The regionserver daemon fails to start, with the following error:
Exception in thread "main" java.lang.RuntimeException: Failed construction of Regionserver: class org.apache.hadoop.hbase.regionserver.HRegionServer

Caused by: java.net.BindException: Problem binding to /10.210.70.57:60020 : Cannot assign requested address
As the error suggests, check that the IP address really belongs to this machine; if the IP is correct, check whether port 60020 is already in use, as sketched below.
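
Two quick checks corresponding to the suggestions above (the address and port are the ones from the error message):

ip addr | grep 10.210.70.57     # is this IP actually configured on this machine?
netstat -anp | grep 60020       # is something already bound to the regionserver port?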

####################################################################################################################################
13. Running an HBase program or a shell command prints:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hbase-0.92.1/lib/slf4j-log4j12-1.5.8.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hadoop-1.0.3/lib/slf4j-log4j12-1.4.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
Both HBase and Hadoop ship this jar; removing either one of them fixes it.
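
For example, moving the older binding out of Hadoop's lib directory (the path comes from the SLF4J output above; keeping a backup is safer than deleting it outright):

mv /usr/hadoop-1.0.3/lib/slf4j-log4j12-1.4.3.jar /tmp/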

####################################################################################################################################
14. When running MapReduce jobs against HBase, some nodes execute them with no errors at all, while others keep failing with errors like Status : FAILED
java.lang.NullPointerException. The tasktracker log on those nodes shows:
WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect

Caused by: java.net.ConnectException: Connection refused
The official documentation explains this error:
Errors like this… are either due to ZooKeeper being down, or unreachable due to network issues.
When ZooKeeper was first set up, the only advice followed was to use an odd number of nodes so that losing one would not prevent leader election; judging from this problem, it seems every node expected to run tasks must also be able to reach ZooKeeper, which can be verified as sketched below.
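
A simple reachability check from a failing node (zk-host is a placeholder for one of the quorum servers; a healthy ZooKeeper answers "imok"):

echo ruok | nc zk-host 2181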

####################################################################################################################################
15. A method-not-found exception is reported, but the method is not one you defined or ever called; the message looks like:
java.lang.RuntimeException: java.lang.NoSuchMethodException: com.google.hadoop.examples.Simple$MyMapper.<init>()
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:45)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:32)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:53)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:209)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1210)
Caused by: java.lang.NoSuchMethodException: com.google.hadoop.examples.Simple$MyMapper.<init>()
at java.lang.Class.getConstructor0(Class.java:2705)
at java.lang.Class.getDeclaredConstructor(Class.java:1984)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:41)
… 4 more
A solution found online explains:
This is actually the <init>() function. The web page does not escape it for HTML but dumps plain text, so <init> is treated as a (nonexistent) tag by your browser. This function is created as a default initializer for non-static classes. This is most likely caused by having a non-static Mapper or Reducer class. Try adding the static keyword to your class declaration, i.e.:
In other words, the static keyword is missing; add it as follows:
public static class MyMapper extends MapReduceBase implements Mapper {…}

####################################################################################################################################
16. When a MapReduce job writes HFiles for HBase, you may see an error like:

java.lang.IllegalArgumentException: Can't read partitions file

Caused by: java.io.IOException: wrong key class: org.apache.hadoop.io.*** is not class org.apache.hadoop.hbase.io.ImmutableBytesWritable
Note that whichever stage (map or reduce) produces the final output, its output key and value types must be <ImmutableBytesWritable, KeyValue> or <ImmutableBytesWritable, Put>. Changing the output types accordingly fixes the error.

####################################################################################################################################
17. If regionservers fail to start when bringing up the HBase cluster and the log reports errors like the following, the cluster clocks are out of sync; synchronizing them resolves the problem.
FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 10.210.78.22,60020,1344329095415: Unhandled exception: org.apache.hadoop.hbase.ClockOutOfSyncException: Server 10.210.78.22,60020,1344329095415 has been rejected; Reported time is too far out of sync with master. Time difference of 90358ms > max allowed of 30000ms
org.apache.hadoop.hbase.ClockOutOfSyncException: org.apache.hadoop.hbase.ClockOutOfSyncException: Server 10.210.78.22,60020,1344329095415 has been rejected; Reported time is too far out of sync with master. Time difference of 90358ms > max allowed of 30000ms
……
Caused by: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.ClockOutOfSyncException: Server 10.210.78.22,60020,1344329095415 has been rejected; Reported time is too far out of sync with master. Time difference of 90358ms > max allowed of 30000ms
Running the following command syncs the clock against public NTP servers:
/usr/sbin/ntpdate tick.ucla.edu tock.gpsclock.com ntp.nasa.gov timekeeper.isi.edu usno.pa-x.dec.com;/sbin/hwclock --systohc > /dev/null
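
To keep the clocks from drifting apart again, one option is to repeat the sync from cron (an hourly entry is only an example; running the ntpd service is the more common long-term fix):

0 * * * * /usr/sbin/ntpdate ntp.nasa.gov && /sbin/hwclock --systohc > /dev/null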

