Recently an error showed up in production: Spark failed to write data into Hive, throwing the following exception:
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Invalid partition for table orc_report_behavior
at org.apache.hadoop.hive.ql.metadata.Partition.initialize(Partition.java:208)
at org.apache.hadoop.hive.ql.metadata.Partition.<init>(Partition.java:106)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllPartitionsOf(Hive.java:2103)
... 194 more
Caused by: MetaException(message:Invalid partition key & values; keys [day, ], values [])
at org.apache.hadoop.hive.metastore.Warehouse.makePartName(Warehouse.java:550)
at org.apache.hadoop.hive.metastore.Warehouse.makePartName(Warehouse.java:483)
at org.apache.hadoop.hive.ql.metadata.Partition.initialize(Partition.java:192)
... 196 more
My first suspicion was the Hive process itself, so I queried the table directly through Hive, but it raised the same error. Inspecting the data source revealed garbled (mojibake) data. A quick Baidu search turned up a solution, which I am recording here.
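For example (a minimal check; the database name bigdata is taken from the cleanup script below), any query that forces the metastore to enumerate the table's partitions should trigger the same exception:
hive -e "use bigdata; select * from orc_report_behavior limit 1;"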
Solution
1. Drop all table partitions
The script below walks a date range and drops the day partition for each date:
#!/bin/bash
source /etc/profile
cd /r2/

if [ $# -ne 2 ]; then
    echo "Two arguments are required, e.g.: 20180801 20180811"
    exit 1
fi

startDate=$1
endDate=$2

# Convert both dates to epoch seconds so the range can be stepped through.
startSec=$(date -d "$startDate" "+%s")
endSec=$(date -d "$endDate" "+%s")
echo "$startSec == $endSec"

if [ "$startDate" -gt "$endDate" ]; then
    echo "The start date must be earlier than the end date"
    exit 1
fi
echo ""

# Walk the range one day (86400 seconds) at a time and drop that day's partition.
for ((i = startSec; i <= endSec; i += 86400)); do
    day=$(date -d "@$i" "+%Y%m%d")
    hive -e "use bigdata; alter table orc_report_behavior drop partition(day='$day')"
done
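Assuming the script is saved as, say, drop_partitions.sh (name hypothetical), it takes the start and end dates as its two arguments:
./drop_partitions.sh 20180801 20180811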
2. Delete the metadata from MySQL
Step 1 alone likely cannot remove the corrupt entry: its key values are empty, so no drop partition(day='...') predicate matches it, and the leftover rows have to be deleted from the metastore tables by hand. Log in to the MySQL instance that stores the Hive metastore and locate the table:
SELECT * FROM TBLS WHERE TBL_NAME='orc_report_behavior';
Note the TBL_ID, then use it to look up the table's partitions:
select * from PARTITIONS where tbl_id='6552';
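If the table has many partitions, a sketch like the following can narrow the list down to entries whose key values are empty or non-ASCII, which is exactly what the MetaException complains about (this assumes the standard metastore schema, where PARTITION_KEY_VALS holds one row per partition key):
select p.PART_ID, p.PART_NAME, kv.PART_KEY_VAL
from PARTITIONS p
join PARTITION_KEY_VALS kv on kv.PART_ID = p.PART_ID
where p.TBL_ID = 6552
  and (kv.PART_KEY_VAL = '' or kv.PART_KEY_VAL regexp '[^ -~]');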
Inspect one of the partitions (here PART_ID 11426):
select * from PARTITION_KEY_VALS where part_id=11426;
select * from PARTITION_PARAMS where part_id=11426;
Then delete the corrupt partition, child rows first:
delete from PARTITION_KEY_VALS where part_id=11426;
delete from PARTITION_PARAMS where part_id=11426;
delete from PARTITIONS where tbl_id='6552' and part_id=11426;
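As a quick sanity check, listing the partitions through Hive should now run cleanly, without the MetaException:
hive -e "use bigdata; show partitions orc_report_behavior;"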