References:
1. Oracle functions: the LAG() and LEAD() analytic functions explained
https://blog.csdn.net/pelifymeng2/article/details/70313943
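A brief introduction to the LAG and LEAD functions
The LAG and LEAD analytic functions let a single query return, as a separate column, the value of a field from the row N rows before the current row (LAG) or N rows after it (LEAD).
In practice, LAG and LEAD are especially useful when you need, for example, the difference between today's and yesterday's value of a field. The same result can be obtained with a self-join (LEFT JOIN, RIGHT JOIN, and so on), but LAG and LEAD are generally more efficient and make the SQL more concise. The rest of this post gives a brief introduction to both functions.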
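Function syntax
lag(exp_str, offset, defval) over(partition by ... order by ...)
lead(exp_str, offset, defval) over(partition by ... order by ...)
exp_str is the field name (more generally, an expression) whose value is returned.
offset is the number of rows to step back (LAG) or forward (LEAD) from the current row, counted in the ORDER BY order within the partition; it defaults to 1 when omitted. For example, if the current row is the 5th row and offset is 3, LAG reads the 2nd row (5 - 3 = 2).
defval is the default value. When stepping N rows back or forward from the current row goes past the edge of the partition, the function returns defval; if no default is given, it returns NULL. Because arithmetic on NULL yields NULL, it is usually worth supplying a default when the result feeds a calculation.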
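The offset and default-value rules described above can be sketched in plain Python. This is only an illustration of the semantics over an already ordered list, not how Hive or Oracle implement the functions; the helper names `lag` and `lead` and the sample list are assumptions made for this example.

```python
def lag(values, offset=1, default=None):
    """Return, for each position i, values[i - offset], or default if that is before the start."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return [values[i - offset] if i - offset >= 0 else default
            for i in range(len(values))]


def lead(values, offset=1, default=None):
    """Return, for each position i, values[i + offset], or default if that is past the end."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    n = len(values)
    return [values[i + offset] if i + offset < n else default
            for i in range(n)]


rows = [10, 20, 30, 40, 50]   # hypothetical values, already in ORDER BY order

print(lag(rows, 3))       # the 5th row reads the 2nd row (5 - 3 = 2); earlier rows get None
print(lag(rows, 1, 0))    # the first row has no predecessor, so it gets the default 0
print(lead(rows, 1, 0))   # the last row has no successor, so it gets the default 0
```

Here `None` plays the role of SQL NULL: it appears only when no default is passed and the requested row falls outside the list.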
Example
Create the table and insert test data
use data_warehouse_test;
CREATE TABLE IF NOT EXISTS user_old_salary_info (
user_name STRING
,salary_vaild_date STRING
,salary BIGINT
)
;
INSERT OVERWRITE TABLE user_old_salary_info VALUES
('szh', '2011-11-06', 1000)
,('sx', '2011-11-07', 2000)
,('szh', '2015-06-11', 4000)
,('sx', '2016-07-12', 5000)
,('szh', '2017-08-20', 10000)
,('sg', '2017-08-20', 30000)
,('szh', '2020-06-20', 25000)
;
Run the queries
use data_warehouse_test;
SELECT *
FROM user_old_salary_info
;
SELECT user_name, salary, LAG(salary, 1, 0) OVER(PARTITION BY user_name ORDER BY salary_vaild_date) AS last_salary
FROM user_old_salary_info
;
SELECT user_name, salary, LEAD(salary, 1, 0) OVER() AS next_salary
FROM user_old_salary_info
;
Query all records in the table:
SELECT *
FROM user_old_salary_info
;
+---------------------------------+-----------------------------------------+------------------------------+
| user_old_salary_info.user_name | user_old_salary_info.salary_vaild_date | user_old_salary_info.salary |
+---------------------------------+-----------------------------------------+------------------------------+
| szh | 2011-11-06 | 1000 |
| sx | 2011-11-07 | 2000 |
| szh | 2015-06-11 | 4000 |
| sx | 2016-07-12 | 5000 |
| szh | 2017-08-20 | 10000 |
| sg | 2017-08-20 | 30000 |
| szh | 2020-06-20 | 25000 |
+---------------------------------+-----------------------------------------+------------------------------+
=============================
Each employee's current salary and previous salary
SELECT user_name, salary, LAG(salary, 1, 0) OVER(PARTITION BY user_name ORDER BY salary_vaild_date) AS last_salary
FROM user_old_salary_info
;
+------------+---------+--------------+
| user_name | salary | last_salary |
+------------+---------+--------------+
| sg | 30000 | 0 |
| sx | 2000 | 0 |
| sx | 5000 | 2000 |
| szh | 1000 | 0 |
| szh | 4000 | 1000 |
| szh | 10000 | 4000 |
| szh | 25000 | 10000 |
+------------+---------+--------------+
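As a cross-check of the result above, the following Python sketch emulates PARTITION BY user_name ORDER BY salary_vaild_date with an offset of 1 and a default of 0, and also computes the raise (salary minus last_salary) that the default value makes safe to calculate. The function name `lag_by_user` and the extra raise column are assumptions added for illustration; partitions are sorted by name only to make the printed order stable.

```python
from collections import defaultdict

# Same test data as the INSERT statement in this post.
rows = [
    ("szh", "2011-11-06", 1000),
    ("sx", "2011-11-07", 2000),
    ("szh", "2015-06-11", 4000),
    ("sx", "2016-07-12", 5000),
    ("szh", "2017-08-20", 10000),
    ("sg", "2017-08-20", 30000),
    ("szh", "2020-06-20", 25000),
]


def lag_by_user(rows, offset=1, default=0):
    """Emulate LAG(salary, offset, default) partitioned by user and ordered by date."""
    partitions = defaultdict(list)
    for user_name, valid_date, salary in rows:
        partitions[user_name].append((valid_date, salary))

    result = []
    for user_name in sorted(partitions):
        # ISO-formatted date strings sort chronologically as plain strings.
        ordered = sorted(partitions[user_name])
        for i, (_, salary) in enumerate(ordered):
            last_salary = ordered[i - offset][1] if i - offset >= 0 else default
            result.append((user_name, salary, last_salary, salary - last_salary))
    return result


for row in lag_by_user(rows):
    print(row)   # (user_name, salary, last_salary, raise)
```

The first three fields of each printed tuple match the rows of the table above; the fourth is the difference a real query could compute as salary - last_salary.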
=============================
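In row order: the current salary and the next salary. Because OVER() has no PARTITION BY or ORDER BY here, "next" simply follows whatever order the engine happens to return the rows in, which is not guaranteed.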
SELECT user_name, salary, LEAD(salary, 1, 0) OVER() AS next_salary
FROM user_old_salary_info
;
+------------+---------+--------------+
| user_name | salary | next_salary |
+------------+---------+--------------+
| szh | 25000 | 30000 |
| sg | 30000 | 10000 |
| szh | 10000 | 5000 |
| sx | 5000 | 4000 |
| szh | 4000 | 2000 |
| sx | 2000 | 1000 |
| szh | 1000 | 0 |
+------------+---------+--------------+
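The result above depends entirely on the order in which the rows happened to be read, since the window has no ORDER BY. The Python sketch below illustrates this: applying a LEAD-like helper to the salaries in the order shown in the output reproduces the next_salary column, while applying it to the same salaries in insertion order gives a different answer. The helper `lead` and both list names are assumptions made for this example.

```python
def lead(values, offset=1, default=None):
    """Return, for each position i, values[i + offset], or default if that is past the end."""
    n = len(values)
    return [values[i + offset] if i + offset < n else default
            for i in range(n)]


# Salaries in the order the query above returned them.
as_returned = [25000, 30000, 10000, 5000, 4000, 2000, 1000]
# The same salaries in the order of the INSERT statement.
as_inserted = [1000, 2000, 4000, 5000, 10000, 30000, 25000]

print(lead(as_returned, 1, 0))   # matches the next_salary column above
print(lead(as_inserted, 1, 0))   # same data, different order, different "next" values
```

If a stable notion of "next" is needed, adding an explicit ORDER BY (and usually a PARTITION BY) inside OVER() is the safer choice.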