Thanks for sharing: http://blog.csdn.net/chunsport/article/details/70808814
http://blog.csdn.net/yeweiouyang/article/details/41645315
Failing statement:
select * from a
where a.id IN (SELECT b.id FROM b WHERE b.x='1');
Cause:
Hive 1.1 supports IN, but not IN with a subquery.
Fix:
Replace the IN subquery with LEFT SEMI JOIN:
select * from a
LEFT SEMI JOIN b
ON(a.id=b.id AND b.x='1');
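One documented restriction of LEFT SEMI JOIN is worth keeping in mind: the right-hand table may be referenced only in the ON clause, not in SELECT or WHERE. The sketch below reuses the tables a and b from the statement above to show where the b.x filter has to live. It is illustrative and has not been run against the original author's cluster.

```sql
-- Sketch: filters on the right-hand table b belong in the ON clause.
SELECT a.*                      -- only columns of a can be selected
FROM a
LEFT SEMI JOIN b
  ON (a.id = b.id AND b.x = '1');

-- By contrast, this variant is expected to be rejected, because b is
-- referenced outside the ON clause:
-- SELECT a.* FROM a LEFT SEMI JOIN b ON (a.id = b.id) WHERE b.x = '1';
```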
=======================Below: one blog post on the cause of the error, and one on LEFT SEMI JOIN====================================
select cid as c from daily_conn where cid not in (select c from top_channel_log where v<>'5.8.7' and (objlabel='1' or objlabel is null or objlabel='13557')) and v<>'5.8.7' ;
Running this statement produced the following error:
Error: Error while compiling statement: FAILED: SemanticException [Error 10249]: Line 1:38 Unsupported SubQuery Expression 'cid': Correlating expression cannot contain unqualified column references. (state=42000,code=10249)
After searching for quite a while, I even rewrote the SQL into the following form:
select cid from daily_conn where v<>'5.8.7' except (select c from top_channel_log where v<>'5.8.7' and (objlabel=1 or objlabel is null or objlabel=13557)) as b;
Error: Error while compiling statement: FAILED: ParseException line 1:86 missing EOF at 'except' near '20170425' (state=42000,code=40000)
That produced the error above, so I finally changed it to:
select cid from daily_conn left outer join top_channel_log on daily_conn.cid=top_channel_log.c where daily_conn.v<>'5.8.7' and daily_conn.p_date>=20170423 and daily_conn.p_date<=20170425 and top_channel_log.v<>'5.8.7' and (top_channel_log.objlabel=1 or top_channel_log.objlabel is null or top_channel_log.objlabel=13557);
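One thing to check with a LEFT OUTER JOIN rewrite of NOT IN: any WHERE filter on top_channel_log columns discards the unmatched rows, because those columns are NULL for them, so the query then behaves like an inner join. A minimal anti-join sketch follows. It assumes cid and c contain no NULLs (NOT IN and an anti-join differ when NULLs are present), uses the illustrative aliases d and t, and leaves out the p_date range.

```sql
-- Sketch only: anti-join form of the NOT IN query.
SELECT d.cid AS c
FROM daily_conn d
LEFT OUTER JOIN (
    -- Build the exclusion set first, so its filters cannot drop
    -- unmatched rows of daily_conn.
    SELECT DISTINCT c
    FROM top_channel_log
    WHERE v <> '5.8.7'
      AND (objlabel = '1' OR objlabel IS NULL OR objlabel = '13557')
) t
  ON (d.cid = t.c)
WHERE d.v <> '5.8.7'
  AND t.c IS NULL;   -- keep only rows with no match in the exclusion set
```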
The cause: I am using Hive 1.3, which supports NOT IN but not NOT IN with a subquery, and does not support EXCEPT.
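The error text itself complains about an unqualified column reference, and the Hive language manual lists IN and NOT IN subqueries in the WHERE clause as supported from 0.13 onward. It may therefore be worth trying the original statement with every column qualified by a table alias before rewriting it. The aliases d and t below are illustrative, and this has not been verified on the author's Hive build.

```sql
-- Sketch: the same NOT IN query, with all column references qualified.
SELECT d.cid AS c
FROM daily_conn d
WHERE d.v <> '5.8.7'
  AND d.cid NOT IN (
      SELECT t.c
      FROM top_channel_log t
      WHERE t.v <> '5.8.7'
        AND (t.objlabel = '1' OR t.objlabel IS NULL OR t.objlabel = '13557')
  );
```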
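Replacing an IN subquery with LEFT SEMI JOIN
Running the following Hive SQL:
select * from trackinfo
where ds=$date and session_id in (select session_id from rcmd_track_path where ds=$date and add_cart_flag>0 and product_id>0);
produces the following error:
FAILED: ParseException line 2:39 cannot recognize input near 'select' 'session_id' 'from' in expression specification
Cause and fix:
The Hive version in use does not support IN subqueries (the Hive language manual lists them as supported in the WHERE clause only from 0.13 onward), so consider replacing IN with LEFT SEMI JOIN. The Hive SQL above can be rewritten as follows: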
select *
from trackinfo t1
left semi join rcmd_track_path t2
on (t1.session_id=t2.session_id and t2.add_cart_flag>0 and t2.product_id>0 and t1.ds=$date and t2.ds=$date);
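For comparison, the same filter can be written as an ordinary inner join against a de-duplicated subquery in the FROM clause, which does not rely on IN-subquery support. This is a sketch of an alternative, not the method from the post. `$date` is the post's own placeholder and is assumed to be substituted before the query runs.

```sql
-- Sketch: inner join against distinct session ids, equivalent in intent
-- to the LEFT SEMI JOIN rewrite.
SELECT t1.*
FROM trackinfo t1
JOIN (
    -- DISTINCT prevents duplicate session ids from multiplying rows of t1
    SELECT DISTINCT session_id
    FROM rcmd_track_path
    WHERE ds = $date
      AND add_cart_flag > 0
      AND product_id > 0
) t2
  ON (t1.session_id = t2.session_id)
WHERE t1.ds = $date;
```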
LEFT SEMI JOIN is Hive