Using map, array, and struct types in Hive

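Two questions: (1) how do you load a text file, and what format does it need? (2) how do you query the data, and can these types be used in a join, in a subquery, and so on?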

Does anyone know how to load arrays into Hive?
For example, I want to load the array [1,2,3] and the array ["a","b","c"]
into table1:
create table table1 ( a array<int> , b array<string>);

How do I load the data so that
select * from table1;
returns:
[1,2,3] ["a","b","c"]

Likewise, how do you query a map in Hive?
For example:
create table table2 ( a MAP<STRING,ARRAY<STRING>>);
select * from table2; returns:
{"d01":["d011","d012"],"d02":["d021","d022"]}
{"d01":["d011","d012"],"d02":null}
{"d01":[null,"d012"],"d02":["d021","d022"]}
How do I get the value stored under the key d01?

Working with arrays:
drop table table2;

create table table2 (a array<string>, b array<string>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ',';


load data local inpath "../hive/examples/files/arraytest.txt"  overwrite into table table2;

The data in arraytest.txt looks like this (arrays are separated by a tab, \t; elements within one array are separated by commas):
b00,b01        b00,b01
b00,b01        b00,b01
b00,b01        b00,b01
b00,b01        b00,b01
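
The same delimiters should also cover the table1 case from the opening question. The following is a minimal sketch, assuming a tab between the two columns and commas inside each array. The path /tmp/table1.txt is hypothetical, and the INSERT ... SELECT variant assumes a Hive version that accepts SELECT without a FROM clause.

```sql
-- table1 from the question: one INT array and one STRING array per row
CREATE TABLE table1 (a ARRAY<INT>, b ARRAY<STRING>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ',';

-- /tmp/table1.txt (hypothetical path) holds one line per row, for example:
--   1,2,3<TAB>a,b,c
LOAD DATA LOCAL INPATH '/tmp/table1.txt' OVERWRITE INTO TABLE table1;

-- Alternative without a data file: build the arrays with the array() constructor.
-- INSERT ... VALUES does not accept complex-type values, so a SELECT is used instead.
INSERT INTO TABLE table1
SELECT array(1, 2, 3), array('a', 'b', 'c');

SELECT * FROM table1;
```

Use either the LOAD DATA statement or the INSERT statement; running both stores the row twice.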


hive> select * from table2;

OK
["b00","b01"]   ["b00","b01"]
["b00","b01"]   ["b00","b01"]
["b00","b01"]   ["b00","b01"]
["b00","b01"]   ["b00","b01"]
Time taken: 0.056 seconds

hive> select a from table2;
OK
["b00","b01"]
["b00","b01"]
["b00","b01"]
["b00","b01"]
Time taken: 15.903 seconds

hive> select a[0] from table2;
OK
b00
b00
b00
b00
Time taken: 12.913 seconds
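
Besides indexing, Hive ships a few built-in functions for arrays. The queries below are a sketch against the same table2 (two ARRAY<STRING> columns); the alias names t and elem are illustrative only, and the NULL behaviour of an out-of-range index is what I would expect rather than something shown in the transcript above.

```sql
-- Number of elements in each array
SELECT size(a) FROM table2;

-- Rows whose array a contains the value 'b01'
SELECT * FROM table2 WHERE array_contains(a, 'b01');

-- Indexes start at 0; an index past the end is expected to yield NULL, not an error
SELECT a[5] FROM table2;

-- One output row per array element
SELECT elem
FROM table2
LATERAL VIEW explode(a) t AS elem;
```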

hive> select * from table2 where a[0] = b[0];
OK
["b00","b01"]   ["b00","b01"]
["b00","b01"]   ["b00","b01"]
["b00","b01"]   ["b00","b01"]
["b00","b01"]   ["b00","b01"]
Time taken: 11.803 seconds
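
Since an array element works in a WHERE predicate, it should work the same way in a join condition or inside a subquery, which was part of the original question. This is a sketch only: arr_t stands for a table shaped like table2 above, and lookup_t (code STRING, label STRING) is a hypothetical plain table introduced just for the join.

```sql
-- Join: match the first array element against a column of another table
SELECT x.a, l.label
FROM arr_t x
JOIN lookup_t l
  ON x.a[0] = l.code;

-- Subquery in FROM: extract the element first, then filter on it
SELECT t.first_elem
FROM (SELECT a[0] AS first_elem FROM arr_t) t
WHERE t.first_elem = 'b00';
```

Each side of the ON condition refers to a single table, which keeps it a plain equi-join.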

 

Working with maps:
drop table table2;

hive> CREATE TABLE table2 (foo STRING , bar MAP<STRING, STRING>)
    > ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY '\t'
    > COLLECTION ITEMS TERMINATED BY ','
    > MAP KEYS TERMINATED BY ':'
    > STORED AS TEXTFILE;


hive> load data local inpath "../hive/examples/files/maptest.txt"  overwrite into table table2;
The format of maptest.txt is as follows (columns are separated by a single tab, each key and its value by a colon, and key/value pairs by commas):
a00        b0:b01,b1:b11
a01        b1:b11,b2:b12
a02        b2:b12,b3:b13
a03        b3:b13,b4:b14

hive> select bar from table2;
OK
{"b0":"b01","b1":"b11"}
{"b1":"b11","b2":"b12"}
{"b2":"b12","b3":"b13"}
{"b3":"b13","b4":"b14"}
Time taken: 19.237 seconds
How do you look up a value by key?
hive> select bar['b1'] from table2;
OK
b11
b11
NULL
NULL
Time taken: 11.65 seconds
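
The bracket syntax also chains, which answers the earlier question about a MAP<STRING,ARRAY<STRING>> column. The sketch below assumes a table named nested_t (a hypothetical name, chosen because table2 is reused above) with the single column a declared as in that question.

```sql
-- Whole array stored under the key 'd01'
SELECT a['d01'] FROM nested_t;

-- First element of that array; the brackets are applied left to right
SELECT a['d01'][0] FROM nested_t;

-- Rows where the key 'd02' is missing or its value is NULL
SELECT * FROM nested_t WHERE a['d02'] IS NULL;
```

For the three rows shown in the question, the second query should return d011, d011, and NULL.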

Count the key/value pairs in each map:
hive> select size(bar) from table2;
OK
2
2
2
2
Time taken: 12.137 seconds
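
A few more built-in functions are useful for maps. These queries are a sketch against the same table2 (foo STRING, bar MAP<STRING, STRING>); the alias names t, k, and v are illustrative only.

```sql
-- Keys and values of each map, returned as arrays
SELECT foo, map_keys(bar), map_values(bar) FROM table2;

-- One output row per key/value pair
SELECT foo, k, v
FROM table2
LATERAL VIEW explode(bar) t AS k, v;

-- Rows whose map contains the key 'b1'
SELECT foo FROM table2 WHERE array_contains(map_keys(bar), 'b1');
```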
