SQL Server Columnstore Index Performance Summary (10): The Impact of Row Group Size

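Continuing from the previous post, SQL Server Columnstore Index Performance Summary (9): Memory Required to Rebuild and Reorganize a Clustered Columnstore Index, we know that for best performance a row group should ideally hold 1,048,576 rows, or at least not fall below roughly 100,000 rows. When row groups cannot reach a good size, queries that read large amounts of data get little benefit from columnstore.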

Introduction

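   In a columnstore index, the two most important concepts are row groups and segments: a row group is a horizontal slice of the table's rows, and a segment stores the values of one column within one row group. Whether a segment holds 1 row or 1 million rows, it is read by page or extent, so a segment with very few rows wastes I/O.
   If you filter on a column whose data is not sorted, the query may have to touch many extra segments, because segments work best when the data is already sorted; unsorted values can end up spread across many segments. The more pages are read, the worse the final performance. For small tables (millions to tens of millions of rows) it is of course hard to read just one segment, let alone a single page or extent, but at that scale the impact is limited. For a table with billions of rows, however, unnecessary segment reads become a performance killer.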
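   To make the point about unsorted data concrete, the following is a minimal sketch that lists the minimum and maximum value ids stored in each segment of one column. It assumes the clustered columnstore index on dbo.FactOnlineSales used in this series already exists, picks ProductKey purely as an example column, and assumes that the segment's column_id matches sys.columns.column_id (the same mapping the dictionary query later in this post relies on). Note that min_data_id and max_data_id are value ids and may not equal the raw column values for every data type.

```sql
-- Sketch: per-segment min/max ids for one column of a columnstore index.
-- Heavily overlapping [min_data_id, max_data_id] ranges suggest the column
-- is not sorted, so a filter on it can skip few (if any) segments.
SELECT seg.segment_id,
       seg.row_count,
       seg.min_data_id,
       seg.max_data_id
FROM sys.column_store_segments AS seg
INNER JOIN sys.partitions AS p
    ON seg.hobt_id = p.hobt_id
INNER JOIN sys.columns AS c
    ON c.object_id = p.object_id
   AND c.column_id = seg.column_id   -- assumed mapping, see the note above
WHERE p.object_id = OBJECT_ID(N'dbo.FactOnlineSales')
  AND c.name = N'ProductKey'         -- example column, pick your filter column
ORDER BY seg.segment_id;
```

   If most segments report nearly the same range, a predicate on that column will probably read almost every segment regardless of how selective it is.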

Environment Setup

   We continue to use ContosoRetailDW for the demo and set its compatibility level to 150, that is, SQL Server 2019 behavior:


USE [master]
GO
ALTER DATABASE [ContosoRetailDW] SET COMPATIBILITY_LEVEL = 150
GO

-- Create the clustered columnstore index:
create clustered columnstore Index CCI  on dbo.FactOnlineSales;


select * into dbo.FactOnlineSales_SmallGroups_Test from dbo.FactOnlineSales;

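   Here comes the trick to pay attention to: I lower SQL Server's Max Server Memory, for example to 300 MB, to force row groups to be created with only a small number of rows. Do this only in your own lab environment; 300 MB of memory would make any production system slow or even unresponsive.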

EXEC sys.sp_configure N'show advanced options', N'1'  RECONFIGURE WITH OVERRIDE
GO
EXEC sys.sp_configure N'max server memory (MB)', N'300'
GO
RECONFIGURE WITH OVERRIDE
GO
EXEC sys.sp_configure N'show advanced options', N'0'  RECONFIGURE WITH OVERRIDE
GO
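
   To confirm that the new limit is really active before building the index, a quick check against sys.configurations can help. This is only a sketch; it reads the setting and changes nothing.

```sql
-- value        = configured value
-- value_in_use = value the running instance is currently using
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = N'max server memory (MB)';
```

   Both columns should show 300 once RECONFIGURE has run. It is also worth noting the value your instance used before the change, so that it can be restored at the end.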

   Next, create a clustered columnstore index on the test table. Because of the memory limit this takes a while, about 3 minutes:

create clustered columnstore index CCI on dbo.FactOnlineSales_SmallGroups_test;
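
   To check whether memory pressure was really what kept the row groups small, one option is the sys.dm_db_column_store_row_group_physical_stats DMV (available from SQL Server 2016), which records a trim reason for each compressed row group. The query below is a sketch that is not part of the original test; it only assumes the two tables created above.

```sql
-- Sketch: row group sizes and trim reasons for both tables.
-- A trim_reason_desc of MEMORY_LIMITATION would indicate row groups that
-- were closed early because the build ran short of memory.
SELECT OBJECT_NAME(rg.object_id) AS TableName,
       rg.trim_reason_desc,
       COUNT(*)           AS RowGroups,
       MIN(rg.total_rows) AS MinRows,
       AVG(rg.total_rows) AS AvgRows,
       MAX(rg.total_rows) AS MaxRows
FROM sys.dm_db_column_store_row_group_physical_stats AS rg
WHERE rg.object_id IN (OBJECT_ID(N'dbo.FactOnlineSales'),
                       OBJECT_ID(N'dbo.FactOnlineSales_SmallGroups_test'))
  AND rg.state_desc = N'COMPRESSED'
GROUP BY rg.object_id, rg.trim_reason_desc
ORDER BY TableName, rg.trim_reason_desc;
```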

   Then compare the space used by the two tables:

exec sp_spaceused '[dbo].[FactOnlineSales]';
exec sp_spaceused '[dbo].[FactOnlineSales_SmallGroups_test]';


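   There is a difference, but it is not dramatic: the source table takes 163 MB and the test table 189 MB. Once the number of row groups becomes very large, though, the gap will be much more noticeable. Let's look at the row group counts of the two tables: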

SELECT object_name(i.object_id) as TableName, count(*) as RowGroupsCount
	FROM sys.indexes AS i
	INNER JOIN sys.column_store_row_groups AS rg with(nolock)
		ON i.object_id = rg.object_id
	AND i.index_id = rg.index_id 
	WHERE object_name(i.object_id) in ( 'FactOnlineSales','FactOnlineSales_SmallGroups_test')
	group by object_name(i.object_id)
	ORDER BY object_name(i.object_id);

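   The row group counts differ a lot: the test table has 79 row groups while the source table has only 15, more than five times as many. Next, let's see how a query behaves (with the actual execution plan turned on):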

dbcc freeproccache;
dbcc dropcleanbuffers;
set statistics io on
set statistics time on

select prod.ProductName, sum(sales.SalesAmount)
	from dbo.FactOnlineSales sales
		inner join dbo.DimProduct prod
			on sales.ProductKey = prod.ProductKey
		inner join dbo.DimCurrency cur
			on sales.CurrencyKey = cur.CurrencyKey
		inner join dbo.DimPromotion prom
			on sales.PromotionKey = prom.PromotionKey
	where cur.CurrencyName = 'USD' and prom.EndDate >= '2004-01-01' 
	group by prod.ProductName;
-- Clear the caches so the second run is not affected by the first
dbcc freeproccache;
dbcc dropcleanbuffers;

select prod.ProductName, sum(sales.SalesAmount)
	from dbo.FactOnlineSales_SmallGroups_test sales
		inner join dbo.DimProduct prod
			on sales.ProductKey = prod.ProductKey
		inner join dbo.DimCurrency cur
			on sales.CurrencyKey = cur.CurrencyKey
		inner join dbo.DimPromotion prom
			on sales.PromotionKey = prom.PromotionKey
	where cur.CurrencyName = 'USD' and prom.EndDate >= '2004-01-01' 
	group by prod.ProductName;

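   The execution plans show no obvious difference; each query accounts for 50% of the batch cost.
   Judging by the statistics output, the first execution (source table) is slower than the second in elapsed time (618 ms vs. 322 ms), although it uses less CPU time (756 ms vs. 931 ms). In the STATISTICS IO output, the LOB reads (logical plus read-ahead) also differ somewhat: 19,640 for the source table vs. 22,808 for the test table. As for total execution time, the source table with 15 row groups took 1371 ms and the test table with 79 row groups took 1253 ms, which is not a big difference.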



(2516 rows affected)
Table 'FactOnlineSales'. Scan count 4, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 6420, lob physical reads 33, lob page server reads 0, lob read-ahead reads 13220, lob page server read-ahead reads 0.
Table 'FactOnlineSales'. Segment reads 15, segment skipped 0.
Table 'DimProduct'. Scan count 5, logical reads 370, physical reads 1, page server reads 0, read-ahead reads 123, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'DimPromotion'. Scan count 5, logical reads 4, physical reads 1, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'DimCurrency'. Scan count 5, logical reads 4, physical reads 1, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

(1 row affected)

 SQL Server Execution Times:
   CPU time = 756 ms,  elapsed time = 618 ms.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.


(2516 rows affected)
Table 'FactOnlineSales_SmallGroups_Test'. Scan count 4, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 8609, lob physical reads 158, lob page server reads 0, lob read-ahead reads 14199, lob page server read-ahead reads 0.
Table 'FactOnlineSales_SmallGroups_Test'. Segment reads 75, segment skipped 0.
Table 'DimProduct'. Scan count 5, logical reads 370, physical reads 1, page server reads 0, read-ahead reads 126, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'DimCurrency'. Scan count 5, logical reads 4, physical reads 1, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'DimPromotion'. Scan count 5, logical reads 4, physical reads 1, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

(1 row affected)

 SQL Server Execution Times:
   CPU time = 931 ms,  elapsed time = 322 ms.

   Next, use the following query for a closer look at the segments:

SELECT i.name, object_name(p.object_id) tablename, p.index_id, i.type_desc 
   	,sum(p.rows)/count(seg.segment_id) as 'rows'
	,sum(seg.on_disk_size) as 'size in Bytes'
	,cast( sum(seg.on_disk_size) / 1024. / 1024. / 1024 as decimal(8,3)) as 'size in GB'
	,count(distinct seg.segment_id) as 'Segments'
	,count(distinct p.partition_id) as 'Partitions'
	FROM sys.column_store_segments AS seg 
		INNER JOIN sys.partitions AS p 
			ON seg.hobt_id = p.hobt_id 
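		INNER JOIN sys.indexes AS i 
			ON p.object_id = i.object_id
			AND p.index_id = i.index_id -- match the index as well, otherwise tables with several indexes produce duplicate rows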
	WHERE i.type in (5, 6)
	GROUP BY i.name, p.object_id, p.index_id, i.type_desc;

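   The results show that the number of segments does not affect the overall size very much, thanks to the efficient columnar compression.
   Now let's look at the dictionaries: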

select 
	OBJECT_NAME(t.object_id) as 'Table Name',
	sum(dict.on_disk_size)/1024./1024 as DictionarySizeMB
	from sys.column_store_dictionaries dict
	inner join sys.partitions as p 
		ON dict.partition_id = p.partition_id
	inner join sys.tables t
		ON t.object_id = p.object_id
	inner join sys.indexes i
		ON i.object_id = t.object_id
		AND i.index_id = p.index_id -- match the index as well to avoid double counting
	where i.type in (5,6) -- Clustered and Nonclustered Columnstore
	group by t.object_id;

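   At the dictionary level, the test table's dictionaries take up more space. The following query breaks down the number and type of dictionaries for each column: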

select t.name as 'Table Name'
	,dict.column_id
	,col.name
	,tp.name
	,case dict.dictionary_id
		when 0 then 'Global Dictionary'
		else 'Local Dictionary'
	end as 'Dictionary Type'
	,count(dict.type) as 'Count'
	,sum(dict.on_disk_size) as 'Size in Bytes'
	,cast(sum(dict.on_disk_size) / 1024.0 / 1024 as Decimal(16,3)) as 'Size in MBytes'
	from sys.column_store_dictionaries dict
	inner join sys.partitions as p 
		ON dict.partition_id = p.partition_id
	inner join sys.tables t
		ON t.object_id = p.object_id
	inner join sys.all_columns col
		on col.column_id = dict.column_id and col.object_id = t.object_id
	inner join sys.types tp 
		ON col.system_type_id = tp.system_type_id AND col.user_type_id = tp.user_type_id   
	where t.[is_ms_shipped] = 0 
		and col.name in ('SalesAmount','ProductKey','CurrencyKey','PromotionKey')
	group by t.name,
			 case dict.dictionary_id
				when 0 then 'Global Dictionary'
				else 'Local Dictionary'
			 end, 
			 col.name,
			 tp.name,
			 dict.column_id
	order by dict.column_id, t.name;


   Comparing the sizes, the gap between the two tables is actually quite large, especially for the local dictionaries, where it is close to 10x.

Summary

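   The results above show that small and large row groups each come out ahead on certain metrics, so we cannot generalize. As always, each case has to be analyzed on its own terms.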
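   To fix this kind of row group count difference, rebuilding the clustered columnstore index is enough. It is clear that Microsoft still wants you to use large row groups: rebuild is a routinely used maintenance operation, and once it succeeds the row groups return to roughly the same level as the source table.
   Finally, remember to set Max Server Memory back.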
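
   The two cleanup steps can be sketched as follows. The value 2147483647 is SQL Server's default for max server memory and is used here only as an assumption; substitute whatever your instance was configured with before the test. The script restores memory first and rebuilds afterwards, on the assumption that a rebuild under the 300 MB limit would produce small row groups again.

```sql
-- Restore the memory setting (2147483647 MB is the SQL Server default;
-- replace it with the value your instance used before the test).
EXEC sys.sp_configure N'show advanced options', N'1';
RECONFIGURE WITH OVERRIDE;
GO
EXEC sys.sp_configure N'max server memory (MB)', N'2147483647';
RECONFIGURE WITH OVERRIDE;
GO
EXEC sys.sp_configure N'show advanced options', N'0';
RECONFIGURE WITH OVERRIDE;
GO

-- Rebuild the test table's clustered columnstore index (named CCI above)
-- so that its row groups are recreated with enough memory available.
ALTER INDEX CCI ON dbo.FactOnlineSales_SmallGroups_test REBUILD;
GO
```

   Re-running the row group count query from earlier in this post afterwards shows whether the test table is back to roughly the same number of row groups as the source table.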
   Next post: SQL Server Columnstore Index Performance Summary (11): Columnstore Maintenance
