The title for this post might sound ironical since rebuild of the index is used to remove the fragmentation of the index so how does rebuild of index affect the fragmentation of index? Well, this is what I would like to explain in this blog post.
Generally, for rebuilding of indexes (alter index … rebuild) optimizer tries to generate a serial plan with no parallelism, however for large tables, the index rebuild compels the optimizer to generate parallel plan to distribute the keys across multiple threads to perform parallel sorts so that indexes rebuild can be performed faster at the expense of higher CPU.
The Parallel Index rebuild Query Plan uses range partitioning where in each of the thread is given a range of keys which it sorts and build a smaller index subunits.Finally there is Gather stream operator which links all the index subunits together (logically) to form a logically contiguous index page links but physically non-contiguous.
Thus when the optimizer decides to go for parallel plan to rebuild the index, the parallel threads does extent allocation in noncontiguous allocation and thereby doesn’t eliminate the fragmentation completely and still leads to some fragmentation due to extent allocation in non-contiguous location.
Let me illustrate this with an example
create table T1 ( a int primary key, b char(80) )
Performing large number of random inserts in the table which will cause fragmentation in the table
Declare @i int; SET @i = 1 while (@i <=500000) begin insert into T1 values(@i,'Test') insert into T1 values(1000000-@i,'Test') set @i = @i + 1 end
Let us know check the fragmentation in the clustered index of the table caused due to random inserts
select database_id,object_id,index_id,index_level,avg_fragmentation_in_percent,fragment_count,avg_fragment_size_in_pages,page_count from sys.dm_db_index_physical_stats(DB_ID(),OBJECT_ID('T1'),1,1,'DETAILED')
database_id object_id index_id index_level avg_fragmentation_in_percent fragment_count avg_fragment_size_in_pages page_count
12 197575742 1 0 66.7894587701899 2358 1.49660729431722 3529
12 197575742 1 1 90.9090909090909 11 1 11
12 197575742 1 2 0 1 1 1
As we see in the above output, the leaf level of the clustered index (level 0) is around 67% fragmentated with 2358 fragments with around 1.5 pages per fragments. Hence the total clustered index takes around 3541 pages (3529+11+1).
Now let me rebuild the index by using the following command
alter index all on T1 rebuild WITH(MAXDOP=2)
Even if I do not specific MAXDOP=2 , SQL instance on my laptop will use parallelism of 2 by default since it is a dual-core machine.
After rebuild the index, let us check the fragmentation again.
select database_id,object_id,index_id,index_level,avg_fragmentation_in_percent,fragment_count,avg_fragment_size_in_pages,page_count from sys.dm_db_index_physical_stats(DB_ID(),OBJECT_ID('T1'),1,1,'DETAILED')
database_id object_id index_id index_level avg_fragmentation_in_percent fragment_count avg_fragment_size_in_pages page_count
12 197575742 1 0 0.0868809730668983 8 143.875 1151
12 197575742 1 1 0 2 1 2
12 197575742 1 2 0 1 1 1
As expected we see in the above output, the fragmentation in the leaf level of the index is reduced to 0.09% with 8 fragments at leaf level and 2 fragments at the intermediate level. The leaf level has on an average 144 pages per fragments which intermediatel level has 1 page per fragments
Now let us truncate the table,populate the table again with the same script above and rebuild the index but this time WITH MAXDOP=1 setting and check the fragmentation of the index
TRUNCATE TABLE t1; GO Declare @i int; SET @i = 1 while (@i <=50000) begin insert into T1 values(@i,'Test') insert into T1 values(100000-@i,'Test') set @i = @i + 1 end alter index all on T1 rebuild WITH(MAXDOP=1) GO selectdatabase_id,object_id,index_id,index_level,avg_fragmentation_in_percent,fragment_count,avg_fragment_size_in_pages,page_count fromsys.dm_db_index_physical_stats(DB_ID(),OBJECT_ID('T1'),1,1,'DETAILED')
database_id object_id index_id index_level avg_fragmentation_in_percent fragment_count avg_fragment_size_in_pages page_count
12 197575742 1 0 0 3 383.333333333333 1150
12 197575742 1 1 0 1 2 2
12 197575742 1 2 0 1 1 1
As we see in the output now, the leaf level of the clustered index has now only 3 fragments with each fragments having around 384 pages on an average per fragment. Further the entire leaf level now requires only 1150 pages as compared 1151 pages when the index is rebuild WITH MAXDOP=2. if we compare the intermediate level we now have only 1 fragments which consists of 2 pages as compared to 2 fragments observed previously
A fragment is a chunk or a segment in the datafile with continuous allocation of pages. So more number of fragments is an indication of more number of dis-continuous allocation of segments hence more fragmentation.
My laptop didn’t had enough resources nor did I have patience for data load, but if you want to see difference more prominent, you can load more rows in the above script and if you have more cpus and hence more parallel threads used for rebuild, the amount of fragments will be huge
Using the above learning we conclude the following
Conclusion
For large databases/datawarehouse generally of the order of TBs which are hosted on the servers with large number of CPUs i.e SMP architecture, while rebuilding indexes it is recommended to rebuild with (MAXDOP=1). This can lead to index rebuild being slower but at the same time we ensure the indexes are completely defragmentated and further serial plan uses less resources (CPU and memory) as compared to parallel plan which causes high utilization of CPU and Memory.
Further, due to same reason in SQL 2008 R2 Fastrack Datawarehouse, Microsoft recommends to rebuild the index WITH (MAXDOP=1).
So, I would not recommend to use serial plan for rebuild indexes always. The answer is our favourite “Its depends”, if the parallel index rebuild doesn’t use much resources and reduces the fragmentation to a tolerable limit we can go for the parallel plan which is default. However if either the resource utilization caused by the index rebuild is high or if the fragmentation of the index is not reduced within the desired limit, we should go for MAXDOP=1
Further Reading
——————
http://msdn.microsoft.com/en-us/library/ms191292(v=sql.90).aspx
Hope this helps!!!
Regards
Parikshit Savjani
Like us on FaceBook | Follow us on Twitter | Join the fastest growing SQL Server group on FaceBook
Follow me on Twitter | Follow me on FaceBook
Nice Post.
Good one..nicely explained.
Nice one
Excellent post! Very helpful.
Thanks for this. Good learning. Will help me in the environment I support.
Good Post.
Nice post