Excluding Some Columns in a Hive SELECT Query

Original

2020-02-24 06:38

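This is a small trick for Hive queries. When a table has many columns and we want all of them except a few, listing every remaining column by hand is tedious and clutters the query. Hive can handle this directly: a backquoted regular expression in the select list picks the columns for us.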

1. Inspect the table schema

hive> desc tmp.xx_toutiao_userinfo;
OK
id                      string                                      
membership_level        int                                         
gender                  int                                         
country                 string                                      
province                string                                      
city                    string                                      
extra_info              string                                      
Time taken: 0.071 seconds, Fetched: 7 row(s)
hive>

2. Query without excluding any columns

select
*
from tmp.xx_toutiao_userinfo limit 3;
OK
1702200931xItNqL        0       0       中國    廣西    玉林    NULL
1702200735oSfSFb        0       1       中國    雲南    昆明    NULL
1702200107j01c2c        0       1       中國    新疆    烏魯木齊        NULL
Time taken: 0.115 seconds, Fetched: 3 row(s)

3. Exclude one column

set hive.support.quoted.identifiers=none;
select
`(membership_level)?+.+`
from tmp.xx_toutiao_userinfo limit 3;
OK
1702200931xItNqL        0       中國    廣西    玉林    NULL
1702200735oSfSFb        1       中國    雲南    昆明    NULL
1702200107j01c2c        1       中國    新疆    烏魯木齊        NULL
Time taken: 0.111 seconds, Fetched: 3 row(s)

4. Exclude multiple columns

set hive.support.quoted.identifiers=none;
select
`(membership_level|extra_info)?+.+`
from tmp.xx_toutiao_userinfo limit 3;
OK
1702200931xItNqL        0       中國    廣西    玉林
1702200735oSfSFb        1       中國    雲南    昆明
1702200107j01c2c        1       中國    新疆    烏魯木齊
Time taken: 0.11 seconds, Fetched: 3 row(s)
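
When the list of columns to exclude is produced by a script rather than typed by hand, the statement above can be generated. The following is a minimal sketch, not part of Hive: build_exclude_query is a hypothetical helper name, and the sketch assumes column names are plain identifiers (letters, digits, underscores), so nothing in them needs regex escaping.

```python
import re

# Hypothetical helper (not part of Hive). It only builds the query text;
# running the query is left to whatever Hive client you use.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def build_exclude_query(table, excluded, limit=None):
    """Return HiveQL text selecting every column of `table` except `excluded`."""
    if not excluded:
        raise ValueError("at least one column to exclude is required")
    for name in excluded:
        # Reject anything that is not a plain identifier so that no regex
        # metacharacter or backquote leaks into the pattern.
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"not a plain column name: {name!r}")
    pattern = "|".join(excluded)
    limit_clause = f" limit {limit}" if limit is not None else ""
    lines = [
        "set hive.support.quoted.identifiers=none;",
        "select",
        f"`({pattern})?+.+`",
        f"from {table}{limit_clause};",
    ]
    return "\n".join(lines)


print(build_exclude_query("tmp.xx_toutiao_userinfo",
                          ["membership_level", "extra_info"], limit=3))
```

The printed text is the same statement shown in this section. The table name is passed through unchecked, so it should come from a trusted source.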

5. Exclusion list that includes a nonexistent column

set hive.support.quoted.identifiers=none;
select
`(membership_level|extra_info_xxx)?+.+`
from tmp.xx_toutiao_userinfo limit 3;
OK
1702200931xItNqL        0       中國    廣西    玉林    NULL
1702200735oSfSFb        1       中國    雲南    昆明    NULL
1702200107j01c2c        1       中國    新疆    烏魯木齊        NULL
Time taken: 0.218 seconds, Fetched: 3 row(s)

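As the output shows, naming a column that does not exist in the table has no effect: Hive drops only the listed columns that actually exist and does not raise an error.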
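
One way to see why a nonexistent name is harmless: the backquoted string is a regular expression tested against each column name, so an alternative that equals no column name simply never rejects anything. The sketch below is a deliberately simplified model of that outcome in plain Python, not Hive's implementation. visible_columns is a hypothetical name, and the model ignores regex subtleties, for example one listed name being a prefix of another.

```python
def visible_columns(all_columns, excluded):
    """Simplified model: keep every column whose name is not in `excluded`."""
    excluded = set(excluded)
    return [name for name in all_columns if name not in excluded]


# Column names taken from the desc output in section 1.
columns = ["id", "membership_level", "gender", "country",
           "province", "city", "extra_info"]

# extra_info_xxx matches no column, so only membership_level is dropped.
print(visible_columns(columns, ["membership_level", "extra_info_xxx"]))
```

The six remaining names correspond to the six values per row in the query result above, with extra_info still present as the trailing NULL.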

6. Configuring it in hive-site.xml

Instead of running set hive.support.quoted.identifiers=none; before each query, the same setting can be made permanent by adding the following to hive-site.xml:

<property>
  <name>hive.support.quoted.identifiers</name>
  <value>none</value>
</property>
