Hi Friends,
By this time, many of you already know about columnstore indexes, a column store architecture that groups and stores data for each column on separate pages and then joins all the columns to return data, as required.
Note that you can have only one columnstore index per table and as per Microsoft guidelines, you may want to put many columns in that one index. I am a little scared of putting ‘many’ columns in that one index 🙂 – thus keeping an eye on the size of your columnstore index is critical. Here is a simple example to show how you can determine the size of a columnstore index. The query is widely available on MSDN & TechNet sites.
Here is my customized demo – fill in the blanks and analyze the size of the objects yourself and do post your observations as comments…
USE AdventureWorksDWDenali; GO -- observe the size of the table sp_spaceused N'dbo.FactProductInventory' --what's the size?? -- let us create a non clustered index CREATE NONCLUSTERED INDEX [NCI_FactProductInventory_DateKey] ON [dbo].[FactProductInventory] ([DateKey],[ProductKey]) INCLUDE ([UnitCost],[UnitsOut]); GO -- observe the size of the Non-clustered index select (page_count * 8)/1024 as IndexSizeInMB from sys.dm_db_index_physical_stats (db_id(),object_id(N'dbo.FactProductInventory'), NULL, NULL, 'SAMPLED') where index_id > 0 --whats the size?? --create a non clustered columnstore index CREATE NONCLUSTERED COLUMNSTORE INDEX [IX_CS_FactProductInventory] ON dbo.FactProductInventory ( ProductKey, DateKey, UnitCost, UnitsIn, UnitsOut, UnitsBalance ); -- determine the size of the columnstore index SELECT SUM(on_disk_size_MB) AS TotalSizeInMB FROM ( (SELECT SUM(css.on_disk_size)/(1024.0*1024.0) on_disk_size_MB FROM sys.indexes AS i JOIN sys.partitions AS p ON i.object_id = p.object_id JOIN sys.column_store_segments AS css ON css.hobt_id = p.hobt_id WHERE i.object_id = object_id('FactProductInventory') AND i.type_desc = 'NONCLUSTERED COLUMNSTORE') UNION ALL (SELECT SUM(csd.on_disk_size)/(1024.0*1024.0) on_disk_size_MB FROM sys.indexes AS i JOIN sys.partitions AS p ON i.object_id = p.object_id JOIN sys.column_store_dictionaries AS csd ON csd.hobt_id = p.hobt_id WHERE i.object_id = object_id('FactResellerSalesPtnd') AND i.type_desc = 'NONCLUSTERED COLUMNSTORE') ) AS SegmentsPlusDictionary -- whats the size??
Will wait for your observations as comments 🙂