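Preface
- This post records a few simple SQL solutions to the Top-K problem. If you know a better solution, or think any of the approaches here is flawed, feel free to discuss in the comments.
Test table
Table structure: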
CREATE TABLE employees(
`employee_id` int(6) NOT NULL auto_increment, -- employee number
`salary` double(10,2) DEFAULT NULL, -- monthly salary
`department_id` int(6) DEFAULT NULL, -- department ID
PRIMARY KEY(`employee_id`)
);
Table data (columns: employee_id, salary, department_id):
100 24000 90
101 17000 90
102 17000 90
103 9000 60
104 6000 60
...
199 2600 50
200 4400 10
201 13000 20
202 6000 20
203 6500 40
204 10000 70
205 12000 110
206 8300 110
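Problem: for each department, find the employees with the top 3 salaries
Method 1: Using a join query
1) For every employee, query the employee's own information together with the information of every employee in the same department who earns more, producing table t1.
That is, LEFT JOIN two copies of the original table, aliased a and b, on the condition a.department_id = b.department_id AND a.salary < b.salary.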
SELECT
a.employee_id,
a.salary,
a.department_id,
b.employee_id as r_employee_id,
b.salary as r_salary
FROM
employees a
LEFT JOIN
employees b
ON a.department_id = b.department_id AND a.salary < b.salary;
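2) Based on table t1, keep the employees whose salary ranks in the top 3 of their department.
That is, group t1 by department_id and employee_id, and keep only the groups in which at most 2 rows have a non-NULL r_employee_id (at most 2 colleagues in the department earn more).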
SELECT
department_id,
count(r_employee_id)+1 AS `rank`, -- rank is a reserved word in MySQL 8.0+, so the alias is quoted
employee_id,
salary
FROM (
SELECT
a.employee_id,
a.salary,
a.department_id,
b.employee_id as r_employee_id,
b.salary as r_salary
FROM
employees a
LEFT JOIN
employees b
ON a.department_id = b.department_id AND a.salary < b.salary
) AS t1
GROUP BY t1.department_id,t1.employee_id
HAVING count(r_employee_id)<=2
ORDER BY department_id,`rank`; # sorted to make the result easier to read
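The two steps above can be tried end to end with a minimal Python sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for MySQL, uses a handful of hypothetical rows rather than the article's data, and renames the alias to rnk. The function name top3_by_self_join is also made up for illustration.

```python
import sqlite3

# Hypothetical sample rows (employee_id, salary, department_id); not the article's data.
ROWS = [
    (1, 24000, 90),
    (2, 17000, 90),
    (3, 17000, 90),
    (4, 9000, 90),
    (5, 9000, 60),
    (6, 6000, 60),
]


def top3_by_self_join(rows):
    """Return (department_id, rnk, employee_id, salary) for the top 3 per department."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees ("
        "employee_id INTEGER PRIMARY KEY, salary REAL, department_id INTEGER)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
    # For each employee a, count colleagues b in the same department who earn more.
    # An employee with at most 2 such colleagues is in the top 3.
    sql = """
        SELECT a.department_id,
               COUNT(b.employee_id) + 1 AS rnk,
               a.employee_id,
               a.salary
        FROM employees a
        LEFT JOIN employees b
          ON a.department_id = b.department_id AND a.salary < b.salary
        GROUP BY a.department_id, a.employee_id
        HAVING COUNT(b.employee_id) <= 2
        ORDER BY a.department_id, rnk, a.employee_id
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


self_join_result = top3_by_self_join(ROWS)
for row in self_join_result:
    print(row)
```

In this sample, employees 2 and 3 tie on salary and both get rank 2, while employee 4 is dropped because three colleagues earn more. Counting higher earners this way gives the same tie behaviour as rank().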
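Method 2: Using window functions
Window functions require MySQL 8.0 or later. Depending on the situation, choose a different window function, that is, a different ranking scheme:
- row_number() over(): ranks by row number after partitioning and sorting; ranks never repeat. For example, the values [1,1,3] get ranks [1,2,3]
- rank() over(): ranks by field value after partitioning and sorting; ranks can repeat, and the positions taken up by tied rows are skipped. For example, the values [1,1,3] get ranks [1,1,3]
- dense_rank() over(): ranks by field value after partitioning and sorting; ranks can repeat, and no rank is skipped. For example, the values [1,1,3] get ranks [1,1,2]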
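The three ranking schemes listed above can be compared side by side on the values [1,1,3]. The following is a small sketch that assumes SQLite 3.25 or later (the version that added window functions) accessed through Python's sqlite3 module, not MySQL. The table t and the column aliases are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (1,), (3,)])

# Apply all three ranking functions over the same ascending order.
ranking_rows = conn.execute(
    """
    SELECT v,
           ROW_NUMBER() OVER (ORDER BY v) AS row_num,
           RANK()       OVER (ORDER BY v) AS rnk,
           DENSE_RANK() OVER (ORDER BY v) AS dense_rnk
    FROM t
    ORDER BY v, row_num
    """
).fetchall()
conn.close()

for row in ranking_rows:
    print(row)
# (1, 1, 1, 1)
# (1, 2, 1, 1)
# (3, 3, 3, 2)
```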
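1) For every employee, query the employee's information and compute the rank of their salary within the department, producing table t1.
That is, use the window function rank(), partitioned by department and ordered by salary.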
SELECT
department_id,
employee_id,
salary,
rank() over(partition by department_id order by salary DESC) as `rank` -- alias quoted: rank is a reserved word in MySQL 8.0+
FROM
employees;
2) Based on table t1, keep the employees whose salary ranks in the top 3 of their department.
That is, filter with a WHERE clause.
SELECT
*
FROM (
SELECT
department_id,
employee_id,
salary,
rank() over(PARTITION BY department_id ORDER BY salary DESC) as `rank`
FROM
employees
) AS t1
WHERE t1.`rank` <= 3;
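To see how the choice of window function changes the result, here is a hedged sketch that runs the same query with each of the three functions. It assumes SQLite 3.25 or later via Python's sqlite3 module instead of MySQL, hypothetical sample rows, and a made-up helper top3_by_window. The alias is named rnk to avoid the reserved word.

```python
import sqlite3

# Hypothetical sample rows (employee_id, salary, department_id); not the article's data.
ROWS = [
    (1, 24000, 90),
    (2, 17000, 90),
    (3, 17000, 90),
    (4, 9000, 90),
    (5, 9000, 60),
    (6, 6000, 60),
]

ALLOWED_FUNCS = ("ROW_NUMBER", "RANK", "DENSE_RANK")


def top3_by_window(rows, func="RANK"):
    """Return (department_id, employee_id, salary, rnk) rows with rnk <= 3."""
    if func not in ALLOWED_FUNCS:
        # Only whitelisted names are interpolated into the SQL text.
        raise ValueError(f"unsupported ranking function: {func}")
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees ("
        "employee_id INTEGER PRIMARY KEY, salary REAL, department_id INTEGER)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
    sql = f"""
        SELECT department_id, employee_id, salary, rnk
        FROM (
            SELECT department_id,
                   employee_id,
                   salary,
                   {func}() OVER (
                       PARTITION BY department_id ORDER BY salary DESC
                   ) AS rnk
            FROM employees
        ) AS t1
        WHERE rnk <= 3
        ORDER BY department_id, rnk, employee_id
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


for func in ALLOWED_FUNCS:
    ids = sorted(row[1] for row in top3_by_window(ROWS, func))
    print(func, ids)
```

With these rows, DENSE_RANK also keeps employee 4, because 9000 is the third distinct salary in department 90, whereas RANK and ROW_NUMBER stop after three rows there.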
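Method 3: Correlated subquery in the WHERE clause
1) Query every employee's information, and use a correlated subquery to find the employees in the same department who earn more than the current employee. The subquery's SELECT clause counts those rows (call the count num), and the outer query's WHERE clause keeps only the rows where num is less than or equal to 2.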
SELECT
department_id,
employee_id,
salary
FROM
employees t1
WHERE (
SELECT
count(t2.employee_id)
FROM
employees t2
WHERE
t1.department_id = t2.department_id
AND
t1.salary < t2.salary
) <= 2
ORDER BY department_id,salary DESC; # sorted to make the result easier to read
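The correlated-subquery approach can be checked the same way. This is a minimal sketch assuming SQLite through Python's sqlite3 module rather than MySQL, with hypothetical sample rows and a made-up function name top3_by_correlated_subquery.

```python
import sqlite3

# Hypothetical sample rows (employee_id, salary, department_id); not the article's data.
ROWS = [
    (1, 24000, 90),
    (2, 17000, 90),
    (3, 17000, 90),
    (4, 9000, 90),
    (5, 9000, 60),
    (6, 6000, 60),
]


def top3_by_correlated_subquery(rows):
    """Return (department_id, employee_id, salary) for the top 3 per department."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees ("
        "employee_id INTEGER PRIMARY KEY, salary REAL, department_id INTEGER)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
    # The subquery is re-evaluated for each outer row t1 and counts the
    # colleagues in the same department who earn strictly more.
    sql = """
        SELECT department_id, employee_id, salary
        FROM employees t1
        WHERE (
            SELECT COUNT(t2.employee_id)
            FROM employees t2
            WHERE t1.department_id = t2.department_id
              AND t1.salary < t2.salary
        ) <= 2
        ORDER BY department_id, salary DESC, employee_id
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


subquery_result = top3_by_correlated_subquery(ROWS)
for row in subquery_result:
    print(row)
```

Because the subquery runs once per outer row, this form can be slow on large tables unless the lookup on department_id and salary is supported by an index.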