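Problems such as finding each employee's longest run of consecutive check-in days can be solved in SQL with the following approach.
The sample data is as follows: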
id | date | success |
1 | 2020/4/2 | 1 |
1 | 2020/4/3 | 1 |
1 | 2020/4/4 | 1 |
1 | 2020/4/5 | 0 |
1 | 2020/4/6 | 1 |
1 | 2020/4/7 | 1 |
2 | 2020/4/2 | 1 |
2 | 2020/4/3 | 1 |
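1) Keep only the successful check-ins (success = 1) and use a window function to number each employee's rows in date order: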
id | date | rn |
1 | 2020/4/2 | 1 |
1 | 2020/4/3 | 2 |
1 | 2020/4/4 | 3 |
1 | 2020/4/6 | 4 |
1 | 2020/4/7 | 5 |
2 | 2020/4/2 | 1 |
2 | 2020/4/3 | 2 |
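2) Subtract the row number rn, taken as a number of days, from the check-in date. Within a run of consecutive check-ins the date and rn both increase by one per row, so the resulting label_date stays the same: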
id | date | rn | label_date |
1 | 2020/4/2 | 1 | 2020/4/1 |
1 | 2020/4/3 | 2 | 2020/4/1 |
1 | 2020/4/4 | 3 | 2020/4/1 |
1 | 2020/4/6 | 4 | 2020/4/2 |
1 | 2020/4/7 | 5 | 2020/4/2 |
2 | 2020/4/2 | 1 | 2020/4/1 |
2 | 2020/4/3 | 2 | 2020/4/1 |
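The labelling in steps 1 and 2 can be checked outside the database. The following is a minimal Python sketch, not part of the original solution: the `checkins` list and the `label_dates` helper are hypothetical names, and the data is just the successful rows of the sample table.

```python
from datetime import date, timedelta
from itertools import groupby

# Successful check-ins (success = 1) from the sample data, as (id, date) pairs.
checkins = [
    (1, date(2020, 4, 2)), (1, date(2020, 4, 3)), (1, date(2020, 4, 4)),
    (1, date(2020, 4, 6)), (1, date(2020, 4, 7)),
    (2, date(2020, 4, 2)), (2, date(2020, 4, 3)),
]


def label_dates(rows):
    """Return (id, date, rn, label_date) tuples, where label_date = date - rn days."""
    labeled = []
    ordered = sorted(rows)  # sort by id, then by date
    for emp_id, group in groupby(ordered, key=lambda r: r[0]):
        # rn restarts at 1 for each employee, mirroring
        # row_number() over (partition by id order by date)
        for rn, (_, day) in enumerate(group, start=1):
            labeled.append((emp_id, day, rn, day - timedelta(days=rn)))
    return labeled


for row in label_dates(checkins):
    print(row)
```

Rows that belong to the same streak end up with the same label_date, which is what makes the grouping in the next step possible.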
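3) Count the rows for each label_date per employee. The largest count_day for each id is that employee's longest run of consecutive check-in days: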
id | label_date | count_day |
1 | 2020/4/1 | 3 |
1 | 2020/4/2 | 2 |
2 | 2020/4/1 | 2 |
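The counting step can be illustrated the same way. This is a small Python sketch under the assumption that the (id, label_date) pairs from step 2 are already available; the variable names are illustrative, and the dates are kept as plain strings for brevity.

```python
from collections import Counter

# (id, label_date) pairs copied from the step 2 table.
labeled = [
    (1, "2020/4/1"), (1, "2020/4/1"), (1, "2020/4/1"),
    (1, "2020/4/2"), (1, "2020/4/2"),
    (2, "2020/4/1"), (2, "2020/4/1"),
]

# count_day: number of rows per (id, label_date), i.e. the length of each streak
count_day = Counter(labeled)

# max_day: the longest streak for each employee
max_day = {}
for (emp_id, _), n in count_day.items():
    max_day[emp_id] = max(max_day.get(emp_id, 0), n)

print(max_day)
```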
The SQL code is as follows:
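-- Step 3: the longest streak per employee
select
    c.id,
    max(c.count_day) as max_day
from (
    -- rows sharing an (id, label_date) pair form one streak
    select
        b.id,
        b.label_date,
        count(*) as count_day
    from (
        -- Step 2: date minus row number (in days) is constant within a streak
        -- date_sub(date, int) is Hive/Spark SQL syntax; other dialects differ
        select
            a.id,
            a.date,
            date_sub(a.date, cast(a.rn as int)) as label_date
        from (
            -- Step 1: number each employee's successful check-ins by date
            select
                id,
                date,
                row_number() over (partition by id order by date) as rn
            from events
            where success = 1
        ) a
    ) b
    group by b.id, b.label_date
) c
group by c.id;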
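To try the query end to end without a Hive or Spark cluster, one option is an in-memory SQLite database driven from Python. The sketch below is an adaptation, not the original query: it assumes an SQLite build with window-function support (3.25 or later), stores dates as ISO `YYYY-MM-DD` text, renames the `date` column to `day` for readability, and replaces `date_sub` with SQLite's `date(day, '-N days')` modifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table events (id integer, day text, success integer)")
conn.executemany(
    "insert into events values (?, ?, ?)",
    [
        (1, "2020-04-02", 1), (1, "2020-04-03", 1), (1, "2020-04-04", 1),
        (1, "2020-04-05", 0), (1, "2020-04-06", 1), (1, "2020-04-07", 1),
        (2, "2020-04-02", 1), (2, "2020-04-03", 1),
    ],
)

# Same three-level structure as the query above, in SQLite syntax.
query = """
select id, max(count_day) as max_day
from (
    select id, label_date, count(*) as count_day
    from (
        select id, date(day, '-' || rn || ' days') as label_date
        from (
            select id, day,
                   row_number() over (partition by id order by day) as rn
            from events
            where success = 1
        ) a
    ) b
    group by id, label_date
) c
group by id
order by id
"""

result = conn.execute(query).fetchall()
print(result)
```

On the sample data this returns a longest streak of 3 days for employee 1 and 2 days for employee 2, in line with the count_day table in step 3.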