Common HBase errors:
1. Fixing java.lang.IllegalArgumentException: KeyValue size too large when inserting into HBase
2020-04-08 09:34:38,120 ERROR [main] ExecReducer: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":"0","_col1":"","_col2":"2020-04-08","_col3":"joyshebaoBeiJing","_col4":"105","_col5":"北京,"},"value":null}
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:253)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: KeyValue size too large
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:763)
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
... 7 more
Caused by: java.lang.IllegalArgumentException: KeyValue size too large
at org.apache.hadoop.hbase.client.HTable.validatePut(HTable.java:1577)
at org.apache.hadoop.hbase.client.BufferedMutatorImpl.validatePut(BufferedMutatorImpl.java:158)
at org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:133)
at org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:119)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1085)
at org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat$MyRecordWriter.write(HiveHBaseTableOutputFormat.java:146)
at org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat$MyRecordWriter.write(HiveHBaseTableOutputFormat.java:117)
at org.apache.hadoop.hive.ql.io.HivePassThroughRecordWriter.write(HivePassThroughRecordWriter.java:40)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:717)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1007)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:818)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:692)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:758)
... 8 more
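During an insert, the HBase client checks each cell of the Put in turn and compares its size with maxKeyValueSize. If a cell is larger than maxKeyValueSize, the client throws the KeyValue size too large exception.
hbase.client.keyvalue.maxsize is the maximum allowed size of a single KeyValue instance. It sets an upper bound on the size of a single entry in a store file. Because a KeyValue cannot be split, the limit helps avoid a region that can no longer be split because one entry is too large.
It is advisable to set it to a fraction of the maximum region size. Setting it to 0 or less disables the check.
Default: 10485760
That default is 10 MB, so any cell larger than 10 MB fails with the KeyValue size too large error.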
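The check described above can be modelled roughly as follows. This is a minimal sketch in plain Java, not HBase's actual source: the class name KeyValueSizeCheck and its methods are hypothetical, and the 20-byte per-cell overhead is an assumption about the serialized KeyValue layout for a cell without tags.

```java
// Simplified model of the client-side size check; not HBase's actual source.
final class KeyValueSizeCheck {

    // Default of hbase.client.keyvalue.maxsize: 10 MB expressed in bytes.
    static final int DEFAULT_MAX_KEYVALUE_SIZE = 10 * 1024 * 1024; // 10485760

    // Assumption: fixed per-cell overhead of the serialized KeyValue layout
    // (4-byte key length + 4-byte value length + 2-byte row length
    //  + 1-byte family length + 8-byte timestamp + 1-byte type = 20 bytes).
    static final int ASSUMED_OVERHEAD = 20;

    // Estimates the serialized size of one cell without tags.
    static long estimateCellSize(int rowLen, int familyLen, int qualifierLen, int valueLen) {
        return (long) ASSUMED_OVERHEAD + rowLen + familyLen + qualifierLen + valueLen;
    }

    // A limit of 0 or less disables the check; otherwise oversized cells are rejected.
    static void validateCell(long cellSize, int maxKeyValueSize) {
        if (maxKeyValueSize > 0 && cellSize > maxKeyValueSize) {
            throw new IllegalArgumentException("KeyValue size too large");
        }
    }

    public static void main(String[] args) {
        // A cell carrying a 12 MB value exceeds the 10 MB default.
        long size = estimateCellSize(16, 2, 4, 12 * 1024 * 1024);
        try {
            validateCell(size, DEFAULT_MAX_KEYVALUE_SIZE);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected with default limit: " + e.getMessage());
        }
        validateCell(size, 20971520); // passes with a 20 MB limit
        validateCell(size, 0);        // passes because the check is disabled
        System.out.println("accepted with 20 MB limit and with the check disabled");
    }
}
```

The same arithmetic explains the value used in the fixes: 20971520 is 20 * 1024 * 1024, that is, 20 MB.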
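Solutions:
Option 1: As the official documentation suggests, raise the value of hbase.client.keyvalue.maxsize in the configuration file. Put the override in the client's hbase-site.xml; hbase-default.xml only holds the shipped defaults and should not be edited: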
<property>
<name>hbase.client.keyvalue.maxsize</name>
<value>20971520</value>
</property>
Option 2: Change the code and set the value on the Configuration object:
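import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
Configuration conf = HBaseConfiguration.create(); // loads hbase-default.xml and hbase-site.xml from the classpath
conf.set("hbase.client.keyvalue.maxsize", "20971520"); // 20 MB; set before creating the connection or table from conf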
This approach is recommended.