[Hive] Dynamic Partition Inserts

Original

NextAction

2020-06-30 17:03

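With a dynamic partition insert, you do not specify partition key values in the statement; Hive derives each row's partition from the inserted data and creates the partitions automatically.
Keep the following points in mind when using dynamic partitions: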

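A source to select from is needed; this example uses a regular (non-partitioned) table with the same columns;
Partition key values are matched to the SELECT columns by position, not by name: the partition columns take their values from the last columns of the SELECT list, in the order they appear in the PARTITION clause;
Dynamic partitioning may be disabled or restricted by default, depending on the Hive version, so set the relevant parameters before use.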
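
To make the positional rule concrete, here is a minimal sketch. It assumes the `student` and `student_dynamic_partition` tables from the example that follows, with dynamic partitioning already enabled; the alias `cls` is hypothetical and exists only to show that names play no part in the matching.

```sql
-- Sketch only: the last SELECT expression supplies the partition key class_no.
-- The alias "cls" is a made-up name; Hive matches by position, so it is ignored.
insert overwrite table student_dynamic_partition
partition (class_no)
select stu_no, stu_name, class_no as cls
from student;
```

The same rule cuts the other way: if the SELECT list were reordered so that a different column came last, Hive would take that column as the partition value, so the column order deserves a second look before running the insert.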

Here is an example of a dynamic partition insert:

# Create the partitioned table and the regular table
create table myhive.student_dynamic_partition
(
  stu_no int,
  stu_name string
) partitioned by (class_no int)
row format delimited fields terminated by ' ';

create table myhive.student
(
  stu_no int,
  stu_name string,
  class_no int
) 
row format delimited fields terminated by ' ';

# Upload the data file to HDFS
[hadoop@node01 hiveData]$ hdfs dfs -put student.txt /
[hadoop@node01 hiveData]$ hdfs dfs -cat /student.txt
1001 john 1
1002 susan 1
1003 smith 2
1004 tom 2
1005 simen 3

# Load the data into the regular table (LOAD DATA INPATH moves the HDFS file into the table's directory)
hive (myhive)> load data inpath '/student.txt' overwrite into table student;
hive (myhive)> select * from student;
student.stu_no  student.stu_name        student.class_no
1001    john    1
1002    susan   1
1003    smith   2
1004    tom     2
1005    simen   3

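# Insert into the partitioned table using dynamic partitioning.
# The three settings below enable dynamic partitioning, switch the mode to nonstrict,
# and set the maximum number of dynamic partitions each mapper or reducer may create.
hive (myhive)> set hive.exec.dynamic.partition=true;
hive (myhive)> set hive.exec.dynamic.partition.mode=nonstrict;
hive (myhive)> set hive.exec.max.dynamic.partitions.pernode=1000;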
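
The per-node limit above is one of several caps Hive applies to dynamic partition inserts. The fragment below is a sketch of the related settings; the values are illustrative, not recommendations, and should be sized to the number of distinct partition values expected in the data.

```sql
-- Illustrative values only.

-- Total number of dynamic partitions a single statement may create.
set hive.exec.max.dynamic.partitions=1000;

-- Number of dynamic partitions each mapper or reducer may create.
set hive.exec.max.dynamic.partitions.pernode=100;

-- Total number of HDFS files all mappers and reducers of a job may create.
set hive.exec.max.created.files=100000;
```

If an insert exceeds one of these limits the job fails, so a failure on a table with many partition values is often a sign that a limit needs raising.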

hive (myhive)> insert overwrite table student_dynamic_partition
             >  partition (class_no)
             > select stu_no,stu_name,class_no
             > from student;
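
Once the insert has finished, the partitions Hive created can be inspected directly. This is a small sketch using the table from this example; with the sample data shown earlier it should list `class_no=1`, `class_no=2` and `class_no=3`.

```sql
-- List the partitions created from the distinct class_no values.
show partitions student_dynamic_partition;

-- Filtering on the partition column lets Hive read only the matching partition.
select * from student_dynamic_partition where class_no = 2;
```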
             
hive (myhive)> select * from student_dynamic_partition;
student_dynamic_partition.stu_no        student_dynamic_partition.stu_name      student_dynamic_partition.class_no
1001    john    1
1002    susan   1
1003    smith   2
1004    tom     2
1005    simen   3
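
The nonstrict mode was needed in this example because `class_no` is the only partition column and it is dynamic; strict mode requires at least one static partition column. The sketch below shows a mixed static and dynamic insert that works in strict mode. The table `student_by_year`, its `enroll_year` column and the value `2020` are hypothetical and not part of the original example.

```sql
-- Hypothetical two-level partitioned table, for illustration only.
create table myhive.student_by_year
(
  stu_no int,
  stu_name string
) partitioned by (enroll_year int, class_no int)
row format delimited fields terminated by ' ';

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=strict;

-- enroll_year is static (fixed in the PARTITION clause) and must come first;
-- class_no is dynamic and is taken from the last SELECT column.
insert overwrite table myhive.student_by_year
partition (enroll_year=2020, class_no)
select stu_no, stu_name, class_no
from myhive.student;
```

Static partition columns have to precede dynamic ones in the PARTITION clause, and only the dynamic ones appear in the SELECT list.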
