Hive version: Hive 1.1.0-cdh5.14.2
1. Loading data with INSERT
1.1 Inserting rows with VALUES
INSERT INTO TABLE tablename [PARTITION (partcol1[=val1], partcol2[=val2] …)] VALUES values_row [, values_row …]
For example:
0: jdbc:hive2://node03:10000> INSERT INTO TABLE score_part PARTITION (month='2020-02') VALUES ('01','01',98),('01','02',95);
1.2 Inserting from a query (INSERT ... SELECT)
Standard syntax:
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 …) [IF NOT EXISTS]] select_statement1 FROM from_statement;
INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 …)] select_statement1 FROM from_statement;

Hive extension (multiple inserts):
FROM from_statement
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 …) [IF NOT EXISTS]] select_statement1
[INSERT OVERWRITE TABLE tablename2 [PARTITION … [IF NOT EXISTS]] select_statement2]
[INSERT INTO TABLE tablename2 [PARTITION …] select_statement2] …;
FROM from_statement
INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 …)] select_statement1
[INSERT INTO TABLE tablename2 [PARTITION …] select_statement2]
[INSERT OVERWRITE TABLE tablename2 [PARTITION … [IF NOT EXISTS]] select_statement2] …;

Hive extension (dynamic partition inserts):
INSERT OVERWRITE TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] …) select_statement FROM from_statement;
INSERT INTO TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] …) select_statement FROM from_statement;
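The multiple-inserts form reads the source table once and fans out to several targets in a single job. A minimal sketch using this article's score and score_part tables, assuming score has the same s_id/c_id/s_score columns as score3 (the month values and the pass/fail filter are illustrative, not from the original):

```sql
-- One scan of score feeds two partitions of score_part.
FROM score
INSERT OVERWRITE TABLE score_part PARTITION (month='2020-01')
  SELECT s_id, c_id, s_score WHERE s_score >= 60
INSERT INTO TABLE score_part PARTITION (month='2020-02')
  SELECT s_id, c_id, s_score WHERE s_score < 60;
```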
1.2.1 Inserting into a specified partition
0: jdbc:hive2://node03:10000> INSERT INTO TABLE score_part PARTITION (month='2020-01') SELECT s_id, c_id, s_score FROM score3;
1.2.2 Dynamic partition insert
Notes:
★ Dynamic partitioning must be enabled;
★ The mode must be set to nonstrict (allows inserting without specifying a static partition value);
★ The last field(s) of the SELECT clause must be the partition field(s), in order;
-- enable dynamic partitioning
set hive.exec.dynamic.partition=true;
-- set Hive to nonstrict mode
set hive.exec.dynamic.partition.mode=nonstrict;
-- dynamically load data into the partitioned table
insert into table order_dynamic_partition partition(order_time)
select order_number, order_price, order_time from t_order;
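The statements above assume a source table t_order and a target table partitioned on order_time. A minimal sketch of DDL they would work against (column types and the field delimiter are assumptions, not from the original):

```sql
-- Source table: assumed comma-delimited text layout.
CREATE TABLE t_order (
  order_number STRING,
  order_price  DOUBLE,
  order_time   STRING
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- Target table: order_time is the dynamic partition column,
-- so it appears in PARTITIONED BY, not in the regular column list.
CREATE TABLE order_dynamic_partition (
  order_number STRING,
  order_price  DOUBLE
) PARTITIONED BY (order_time STRING);
```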
2. CTAS: create a table and load data in one step
0: jdbc:hive2://node03:10000> CREATE TABLE IF NOT EXISTS score_part AS SELECT * FROM score;
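CTAS infers the new table's schema from the SELECT and writes the result in a single job; note that the target of a CTAS cannot be a partitioned or external table. A sketch that also reshapes the data and picks a storage format (the score_avg name and Parquet choice are illustrative):

```sql
-- Schema (s_id, avg_score) is inferred from the SELECT list.
CREATE TABLE score_avg
STORED AS PARQUET
AS
SELECT s_id, AVG(s_score) AS avg_score
FROM score
GROUP BY s_id;
```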
3. Loading files with LOAD DATA
Note: the LOCAL keyword is required when loading a local (client-side) file; without it, filepath is interpreted as an HDFS path.
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 …)]
3.1 LOAD a local file
0: jdbc:hive2://node03:10000> LOAD DATA LOCAL INPATH '/home/hadoop/hive/data/score.csv' OVERWRITE INTO TABLE score_part PARTITION (month='2020-02');
3.2 LOAD an HDFS file
Note: loading from HDFS moves the source file into the table's directory, whereas LOCAL copies it.
0: jdbc:hive2://node03:10000> LOAD DATA INPATH '/hivedatas/score.csv' OVERWRITE INTO TABLE score_part PARTITION (month='2020-02');
4. Specifying LOCATION at table creation
Create the table
0: jdbc:hive2://node03:10000> CREATE EXTERNAL TABLE score8 LIKE score4
. . . . . . . . . . . . . . > LOCATION '/scoredatas';
Create the directory and upload the data file
0: jdbc:hive2://node03:10000> dfs -mkdir /scoredatas;
0: jdbc:hive2://node03:10000> dfs -put /home/hadoop/hive/data/score.csv /scoredatas/;
Query the data
0: jdbc:hive2://node03:10000> SELECT * FROM score8;
5. IMPORT a file (managed-table operation)
Note: IMPORT can only read data that was produced by the EXPORT command
0: jdbc:hive2://node03:10000> CREATE TABLE score9 LIKE score4;
No rows affected (0.123 seconds)
0: jdbc:hive2://node03:10000> EXPORT TABLE score4 TO '/score4';
No rows affected (0.203 seconds)
0: jdbc:hive2://node03:10000> IMPORT TABLE score9 FROM '/score4';