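Background:
When Hive expands a collection column into multiple rows (LATERAL VIEW with EXPLODE), a special case arises if the column being exploded holds an empty array or an empty MAP(): the rows that hold it disappear from the result.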
Source data:
SELECT
'1' AS id,
MAP() AS purchase_info
UNION ALL
SELECT
'2' AS id,
MAP() AS purchase_info
UNION ALL
SELECT
'3' AS id,
str_to_map('2019-11-28:100,2019-11-27:1') AS purchase_info
UNION ALL
SELECT
'3' AS id,
str_to_map('2019-11-28:200,2019-11-27:2') AS purchase_info
Exploding the source data with a plain LATERAL VIEW:
SELECT
id,
info.purchase_date,
info.amount
FROM
(
SELECT
'1' AS id,
MAP() AS purchase_info
UNION ALL
SELECT
'2' AS id,
MAP() AS purchase_info
UNION ALL
SELECT
'3' AS id,
str_to_map('2019-11-28:100,2019-11-27:1') AS purchase_info
UNION ALL
SELECT
'3' AS id,
str_to_map('2019-11-28:200,2019-11-27:2') AS purchase_info
) t LATERAL VIEW EXPLODE(purchase_info) info AS purchase_date, amount
In the result, every row whose purchase_info is an empty MAP has disappeared: only the rows for id 3 remain. This is not the expected outcome.
Solution: to keep the rows with empty data, add the OUTER keyword after LATERAL VIEW.
SELECT
id,
info.purchase_date,
info.amount
FROM
(
SELECT
'1' AS id,
MAP() AS purchase_info
UNION ALL
SELECT
'2' AS id,
MAP() AS purchase_info
UNION ALL
SELECT
'3' AS id,
str_to_map('2019-11-28:100,2019-11-27:1') AS purchase_info
UNION ALL
SELECT
'3' AS id,
str_to_map('2019-11-28:200,2019-11-27:2') AS purchase_info
) t LATERAL VIEW OUTER EXPLODE(purchase_info) info AS purchase_date, amount
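With OUTER, the rows for id 1 and id 2 are kept, with purchase_date and amount both NULL. This matches the expected outcome.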
All that remains is to convert the NULL values in purchase_date and amount to whatever defaults you need.
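The NULL conversion can be sketched with COALESCE. This is a minimal illustration rather than a definitive implementation: the default values 'N/A' and '0' and the subquery alias t are assumptions chosen for the example, and the source data is shortened to two rows.

```sql
-- Sketch: replace the NULLs produced by LATERAL VIEW OUTER with defaults.
-- 'N/A' and '0' are placeholder defaults (assumptions), not values from the original data.
SELECT
  id,
  COALESCE(info.purchase_date, 'N/A') AS purchase_date,
  COALESCE(info.amount, '0') AS amount  -- str_to_map yields string values, so the default is a string too
FROM
(
  SELECT
    '1' AS id,
    MAP() AS purchase_info
  UNION ALL
  SELECT
    '3' AS id,
    str_to_map('2019-11-28:100,2019-11-27:1') AS purchase_info
) t LATERAL VIEW OUTER EXPLODE(purchase_info) info AS purchase_date, amount
```

If amount is later used in arithmetic, casting it explicitly (for example CAST(... AS INT)) after the COALESCE is probably safer than relying on implicit conversion.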