經常寫SQL語句的人應該知道Group by語句的主要用法是進行分類彙總,下面是一種它最常見的用法(根據部門、職位分別統計業績):
SELECT a.dname,b.job,SUM(b.sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY a.dname,b.job;
DNAME JOB SUM_SAL
-------------- --------- ----------
SALES MANAGER 2850
SALES CLERK 950
SALES SALESMAN 5600
ACCOUNTING MANAGER 2450
ACCOUNTING PRESIDENT 5000
ACCOUNTING CLERK 1300
RESEARCH MANAGER 2975
RESEARCH ANALYST 6000
RESEARCH CLERK 1900
這時候,如果有人跑過來跟你說:我除了以上數據之外,還要每個部門總的業績以及所有部門加起來的業績,這時候你很可能會想到如下的笨方法(union all):
select * from (
SELECT a.dname,b.job,SUM(b.sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY a.dname,b.job
UNION ALL
--實現了部門的小計
SELECT a.dname,NULL, SUM(b.sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY a.dname
UNION ALL
--實現了所有部門總的合計
SELECT NULL,NULL, SUM(b.sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno)
order by dname;
DNAME JOB SUM_SAL
-------------- --------- ----------
ACCOUNTING CLERK 1300
ACCOUNTING MANAGER 2450
ACCOUNTING PRESIDENT 5000
ACCOUNTING 8750
RESEARCH CLERK 1900
RESEARCH MANAGER 2975
RESEARCH ANALYST 6000
RESEARCH 10875
SALES CLERK 950
SALES MANAGER 2850
SALES SALESMAN 5600
SALES 9400
29025
union all 合併笨辦法產生的執行計劃
-------------------------------------------------------------------------------
Plan hash value: 2979078843
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 29 | 812 | 23 (22)| 00:00:01 |
| 1 | SORT ORDER BY | | 29 | 812 | 23 (22)| 00:00:01 |
| 2 | VIEW | | 29 | 812 | 22 (19)| 00:00:01 |
| 3 | UNION-ALL | | | | | |
| 4 | HASH GROUP BY | | 14 | 756 | 8 (25)| 00:00:01 |
|* 5 | HASH JOIN | | 14 | 756 | 7 (15)| 00:00:01 |
| 6 | TABLE ACCESS FULL| DEPT | 4 | 88 | 3 (0)| 00:00:01 |
| 7 | TABLE ACCESS FULL| EMP | 14 | 448 | 3 (0)| 00:00:01 |
| 8 | HASH GROUP BY | | 14 | 672 | 8 (25)| 00:00:01 |
|* 9 | HASH JOIN | | 14 | 672 | 7 (15)| 00:00:01 |
| 10 | TABLE ACCESS FULL| DEPT | 4 | 88 | 3 (0)| 00:00:01 |
| 11 | TABLE ACCESS FULL| EMP | 14 | 364 | 3 (0)| 00:00:01 |
| 12 | SORT AGGREGATE | | 1 | 39 | | |
|* 13 | HASH JOIN | | 14 | 546 | 7 (15)| 00:00:01 |
| 14 | TABLE ACCESS FULL| DEPT | 4 | 52 | 3 (0)| 00:00:01 |
| 15 | TABLE ACCESS FULL| EMP | 14 | 364 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------
其實,如果你知道Group By的Rollup擴展的話,這種需求只是小case:
SELECT a.dname,b.job, SUM(b.sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY ROLLUP(a.dname,b.job);
DNAME JOB SUM_SAL
-------------- --------- ----------
SALES CLERK 950
SALES MANAGER 2850
SALES SALESMAN 5600
SALES 9400
RESEARCH CLERK 1900
RESEARCH ANALYST 6000
RESEARCH MANAGER 2975
RESEARCH 10875
ACCOUNTING CLERK 1300
ACCOUNTING MANAGER 2450
ACCOUNTING PRESIDENT 5000
ACCOUNTING 8750
29025
rollup寫法產生的執行計劃
-----------------------------------------------------------------------------
Plan hash value: 1037965942
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 756 | 8 (25)| 00:00:01 |
| 1 | SORT GROUP BY ROLLUP| | 14 | 756 | 8 (25)| 00:00:01 |
|* 2 | HASH JOIN | | 14 | 756 | 7 (15)| 00:00:01 |
| 3 | TABLE ACCESS FULL | DEPT | 4 | 88 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL | EMP | 14 | 448 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------
可以發現,這種方法不但SQL書寫方便,性能也能得到提高。這時候,如果又有人跑過來說:除了以上數據,他還需要每個職位總的業績,你只要把rollup換成cube就可以了,如下所示:
-- CUBE分組
SELECT a.dname,b.job, SUM(b.sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY CUBE(a.dname,b.job);
DNAME JOB SUM_SAL
-------------- --------- ----------
29025
CLERK 4150
ANALYST 6000
MANAGER 8275
SALESMAN 5600
PRESIDENT 5000
SALES 9400
SALES CLERK 950
SALES MANAGER 2850
SALES SALESMAN 5600
RESEARCH 10875
RESEARCH CLERK 1900
RESEARCH ANALYST 6000
RESEARCH MANAGER 2975
ACCOUNTING 8750
ACCOUNTING CLERK 1300
ACCOUNTING MANAGER 2450
ACCOUNTING PRESIDENT 5000
從上面可以看出:cube比rollup的展現的粒度更細一些。這時候,如果又有人跑過來說:他不需要那麼細的數據,只需要彙總的數據,可以使用Grouping Sets:
---GROUPING SETS分組
SELECT to_char(b.hiredate,'yyyy') hire_year,a.dname,b.job, SUM(sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY GROUPING SETS(to_char(b.hiredate,'yyyy'),a.dname,b.job);
HIRE DNAME JOB SUM_SAL
---- -------------- --------- ----------
1987 4100
1980 800
1982 1300
1981 22825
ACCOUNTING 8750
RESEARCH 10875
SALES 9400
CLERK 4150
SALESMAN 5600
PRESIDENT 5000
MANAGER 8275
ANALYST 6000