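Overall, window functions in Hive are used in much the same way as window functions in MySQL 8.0. This article does not cover Hive window function usage in detail; for that, see "Window Functions in MySQL" (MySQL中的窗口函數).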
First, consider the following SQL:
select feature,feature_name,bins,all_rate,ind_rate,member_rate, week,date_range,
lead(all_rate,1)over(partition by feature,bins order by week asc range between 10 preceding and 10 following) as last_all_rate,
lead(ind_rate)over(partition by feature,bins order by week asc range between 10 preceding and 10 following) as last_ind_rate,
lead(member_rate)over(partition by feature,bins order by week asc range between 10 preceding and 10 following) as last_member_rate
from xy_jc.mashang_c_bins_rate_weekly
Running this query on Hive fails with the following error:
Error while compiling statement: FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies.
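Replacing range with rows in the window clause produces the same error. (If the window clause is changed to between unbounded preceding and unbounded following, the query runs normally.) I searched online for a long time and found no solution; other people who hit this error do not seem to have hit it because of window functions. So the only option was to write the query another way (my intuition is that lead() cannot be combined with a window clause). The final rewritten query is as follows: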
select feature,feature_name,bins,all_rate,ind_rate,member_rate, week,date_range,
first_value(all_rate)over(partition by feature,bins order by week asc range between 1 preceding and current row) as last_all_rate,
first_value(ind_rate)over(partition by feature,bins order by week asc range between 1 preceding and current row) as last_ind_rate,
first_value(member_rate)over(partition by feature,bins order by week asc range between 1 preceding and current row) as last_member_rate
from xy_jc.mashang_c_bins_rate_weekly
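If the goal, as the last_ aliases suggest, is to fetch the previous week's value within each (feature, bins) group, another option that may be worth trying is lag() with no window clause at all. The query below is only a minimal sketch and has not been run against the table above. It assumes that lag() is an acceptable stand-in for the original lead(), and it reuses the table and column names from the queries above. The two approaches are not strictly equivalent: lag(x, 1) steps back exactly one row, while range between 1 preceding and current row is based on the values of week. On the first week of a partition, or when the preceding week is missing, lag() returns NULL, whereas first_value() returns the current row's own value.

```sql
-- Sketch (assumption: the previous row's value is what is wanted).
-- lag() is used without any rows/range window clause.
select feature, feature_name, bins, all_rate, ind_rate, member_rate, week, date_range,
       lag(all_rate, 1)    over (partition by feature, bins order by week asc) as last_all_rate,
       lag(ind_rate, 1)    over (partition by feature, bins order by week asc) as last_ind_rate,
       lag(member_rate, 1) over (partition by feature, bins order by week asc) as last_member_rate
from xy_jc.mashang_c_bins_rate_weekly
```

If the NULL on the first row of each partition is not wanted, lag() also accepts a third argument as a default value, for example lag(all_rate, 1, all_rate).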