通过GROUP BY grouping sets提升数据分组统计效率

使用 GROUPING SETS 的 GROUP BY 子句可以生成一个等效于由多个简单 GROUP BY 子句的 UNION ALL 生成的结果集
示例:(sql server 2008 R2)
创建测试数据库及表并插入测试数据

use master
CREATE DATABASE db_sales
go
use db_sales
go
CREATE TABLE [dbo].[tb_sale](
	[id] [int] IDENTITY(1,1) NOT NULL,
	[server] [nvarchar](50) NULL,
	[pname] [nvarchar](50) NULL,
	[pinpai] [nvarchar](50) NULL,
	[dates] [smalldatetime] NULL,
	[cnt] [int] NULL
) ON [PRIMARY]

go

INSERT INTO [db_ControlManager_ft].[dbo].[test]([server],[pname],[pinpai],[dates],[cnt])
     VALUES('A','computer','hp','2012-01-01',1),
		   ('A','computer','hp','2012-01-02',3),
		   ('A','computer','hp','2012-01-03',5),
		   ('A','computer','hp','2012-01-04',1),
		   ('A','computer','hp','2012-01-05',3),
		   ('A','computer','hp','2012-01-06',5),
		   ('A','computer','dell','2012-01-01',2),
		   ('A','computer','dell','2012-01-02',4),
		   ('A','computer','dell','2012-01-03',6),
		   ('A','computer','dell','2012-01-04',7),
		   ('A','computer','dell','2012-01-05',2),
		   ('A','computer','dell','2012-01-06',4),
		   ('B','computer','hp','2012-01-01',3),
		   ('B','computer','hp','2012-01-02',3),
		   ('B','computer','hp','2012-01-03',3),
		   ('B','computer','hp','2012-01-04',3),
		   ('B','computer','hp','2012-01-05',3),
		   ('B','computer','hp','2012-01-06',2),
		   ('B','computer','dell','2012-01-01',2),
		   ('B','computer','dell','2012-01-02',2),
		   ('B','computer','dell','2012-01-03',2),
		   ('B','computer','dell','2012-01-04',2),
		   ('B','computer','dell','2012-01-05',1),
		   ('B','computer','dell','2012-01-06',1),
		   ('A','TV','hp','2012-01-01',1),
		   ('A','TV','hp','2012-01-02',3),
		   ('A','TV','hp','2012-01-03',5),
		   ('A','TV','hp','2012-01-04',1),
		   ('A','TV','hp','2012-01-05',3),
		   ('A','TV','hp','2012-01-06',5),
		   ('A','TV','dell','2012-01-01',2),
		   ('A','TV','dell','2012-01-02',4),
		   ('A','TV','dell','2012-01-03',6),
		   ('A','TV','dell','2012-01-04',7),
		   ('A','TV','dell','2012-01-05',2),
		   ('A','TV','dell','2012-01-06',4),
		   ('B','TV','hp','2012-01-01',3),
		   ('B','TV','hp','2012-01-02',3),
		   ('B','TV','hp','2012-01-03',3),
		   ('B','TV','hp','2012-01-04',3),
		   ('B','TV','hp','2012-01-05',3),
		   ('B','TV','hp','2012-01-06',2),
		   ('B','TV','dell','2012-01-01',2),
		   ('B','TV','dell','2012-01-02',2),
		   ('B','TV','dell','2012-01-03',2),
		   ('B','TV','dell','2012-01-04',2),
		   ('B','TV','dell','2012-01-05',1),
		   ('B','TV','dell','2012-01-06',1)
go

现在要求分别计算出每天的销量,总销量,每个销售员的总销量,每个产品的总销量,每个品牌的总销量,及每个销售员按品牌的产品销量

使用普通的UNION ALL语句

select N'总销量',null,null,null,null,SUM(cnt) from tb_sale
union all
select N'每日销量',null,null,null,dates,SUM(cnt) from tb_sale group by dates
union all
select N'按人员总销量',[server],null,null,null,SUM(cnt) from tb_sale group by [server]
union all
select N'按品牌总销量',null,null,pinpai,null,SUM(cnt) from tb_sale group by pinpai
union all
select N'按产品总销量',null,pname,null,null,SUM(cnt) from tb_sale group by pname
union all
select N'按产品及品牌总销量',null,pname,pinpai,null,SUM(cnt) from tb_sale group by pname,pinpai
union all
select N'按人员产品及品牌总销量',[server],pname,pinpai,null,SUM(cnt) from tb_sale group by [server],pname,pinpai

使用grouping sets 和grouping_id实现

select 
[server] as 销售人员 ,
pname as 产品名称,
pinpai as  品牌 ,
dates as 销售时间,
SUM(cnt) as 销售数量,
(case
when GROUPING_ID([server],pname,pinpai,dates)=15 then N'总销量'
when GROUPING_ID([server],pname,pinpai,dates)=14 then N'每日销量'
when GROUPING_ID([server],pname,pinpai,dates)=13 then N'按品牌总销量'

when GROUPING_ID([server],pname,pinpai,dates)=11 then N'按产品总销量'
when GROUPING_ID([server],pname,pinpai,dates)=9 then N'按产品及品牌总销量'
when GROUPING_ID([server],pname,pinpai,dates)=7 then N'按人员总销量'

when GROUPING_ID([server],pname,pinpai,dates)=1 then N'按人员产品及品牌总销量'
end
)
as 项目说明
  from tb_sale
group by GROUPING sets(([server],pname,pinpai),(pname,pinpai),[server],pname,pinpai,dates,())
order by [server],pname,pinpai,dates


使用union操作会增加IO开销,会减少cpu和内存的开销,使用grouping sets会减少IO开销,会增加cpu和内存的消耗.

GROUPING SETS在遇到多个条件时,聚合是一次性从数据库中取出所有需要操作的数据,在内存中对数据库进行聚合操作并生成结果。而UNION ALL是多次扫描表,将返回的结果进行UNION操作.这也就是为什么GROUPING SETS和UNION操作所返回的数据顺序是不同的.
grouping sets的执行方式在group by后面有多列的时候,grouping sets带来的性能提升非常明显


 


 

 

 

 

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章