Hi Geeks,
First of all I would like to thank Amit Sir and SQLServerGeeks.com for giving me this opportunity to blog.
This is my 1st blog on SQL Server Statistics Only Database, so I am pretty much excited about it, hope you all will like it.
In this blog, I will show you a technique of re-producing a production server execution plan on dev or test environment without replicating real data. Let me explain: On your production box, you have millions or perhaps billions of records in your table and you are running dozens of queries against that table. Of course, for each query you have an execution plan. You want to generate the same execution plan for tuning or query performance investigation purposes, but on your dev or test box. You don’t needs to import the entire table or millions of rows into your dev/test box. You can only create a statistics-only database and mimic the same query plan behavior. Yes, no user/real data is required.
What is Statistics-Only Database?
As the name suggest it’s a database that contain statistics of the database but no real user data. Yes you read it right “No Real User Data”. This is how you can create a statistics-only database:
How to create a statistics only database?
1. Open SQL Server Management Studio.
2. In the Object Explorer, expand Databases, and then locate the database that you want to script.
3. Right-click the database, point to Tasks, and then click Generate Scripts.
4. In the Script Wizard, verify that the correct database is selected. Click to select the Script entire database and all objects to script radio button, and then click Next.
5. In Output Type select Save scripts to a specific location radio button.
6. Click on Advanced and Choose Advanced Scripting Options dialog box, change the following settings from the default value to the value that is listed in the following table.
Scripting option Value to select
Ansi Padding True
Continue Scripting on Error True
Generate Script for Dependent Objects True
Include System Constraint Names True
Script Collation True
Script Database Create True
Script Logins True
Script Object Level Permissions True
Script Statistics Script Statistics and histograms
Script Indexes True
Script Triggers True
7. Note The Script Logins option and the Script Object Level Permissions option may not be required unless the schema contains objects that are owned by logins other than dbo.
8. Select Save to file radio buttonand thenselect Single file radio button. In File name select the location where the output will be generated and Save as will be Unicode.
9. Click Next.
10. Click Next.
11. Click Finish.
Note: Generation of scripts of huge database with complex subsystems will created load on the system so do this in non-peak hours.
Demo:
I have created a simple database with one table having 15 records.
use [master] go CREATEDATABASE [PWI_TESTDB] CONTAINMENT =NONE ONPRIMARY ( NAME =N'PWI_TESTDB',FILENAME=N'E:\temp\PWI_TESTDB.mdf', SIZE = 10240KB , FILEGROWTH = 1024KB ) LOGON ( NAME =N'PWI_TESTDB_log',FILENAME=N'E:\temp\PWI_TESTDB_log.ldf', SIZE = 1024KB , FILEGROWTH = 10%) GO --Create table testtab use [PWI_TESTDB] go createtable testtab(id int, name varchar(50), dept varchar(10), sal int); go --Insert data into the testtab insertinto testtab values(1,'debjeet','acc',10000); go insertinto testtab values(2,'abhisek','acc',20000); go insertinto testtab values(3,'arup','it',30000); go insertinto testtab values(4,'sumit','hr',10000); go insertinto testtab values(5,'anil','it',50000); go insertinto testtab values(6,'rahul','it',80000); go insertinto testtab values(7,'suhasini','acc',60000); go insertinto testtab values(8,'ramol','it',70000); go insertinto testtab values(9,'sudesh','it',20000); go insertinto testtab values(10,'debjeet','hr',30000); go insertinto testtab values(11,'mahek','it',23000); go insertinto testtab values(12,'amol','it',90000); go insertinto testtab values(13,'debjeet','it',50000); go insertinto testtab values(14,'susmit','it',13000); go insertinto testtab values(15,'komal','it',44000); go
When I run the following query (actual execution plan turned on), observe the cardinality estimation:
SELECT name FROM testtab WHERE name ='debjeet'; GO
Note: Actual Number of Rows = 3 and Estimated Number of Rows = 3
Now, I have scripted out this database (using the technique mentioned above) and created it with another name pwi_statdb. This database only contains the objects, metadata & statistics but no user data. Here is what I executed to create this new stats-only database:
USE [master] GO CREATEDATABASE [PWI_STATDB] CONTAINMENT =NONE ONPRIMARY ( NAME =N'PWI_STATDB',FILENAME=N'E:\temp\PWI_STATDB.mdf', SIZE = 10240KB , MAXSIZE =UNLIMITED, FILEGROWTH = 1024KB ) LOGON ( NAME =N'PWI_STATDB_log',FILENAME=N'E:\temp\PWI_STATDB_log.ldf', SIZE = 1024KB , MAXSIZE = 2048GB , FILEGROWTH = 10%) GO ALTERDATABASE [PWI_STATDB] SETAUTO_CREATE_STATISTICSOFF GO ALTERDATABASE [PWI_STATDB] SETAUTO_UPDATE_STATISTICSOFF GO --Create table testtab with no records in it USE [PWI_STATDB] GO CREATETABLE [dbo].[testtab]( [id] [int] NULL, [name] [varchar](50)NULL, [dept] [varchar](10)NULL, [sal] [int] NULL )ON [PRIMARY] GO --Update the stats for empty table testtab UPDATESTATISTICS [dbo].[testtab] WITHROWCOUNT= 15,PAGECOUNT= 1 GO
Following things to observe in the above code:
1. I have turned off auto create stats and auto update stats
2. No user data is entered in to the table
3. I have updated the statistics of the table with Update STATS command (notice ROWCOUNT & PAGE COUNT) (this is automatically generate by Generate Scripts Wizard)
4. Note that there is not stats for column ‘name.’
Now, I execute the same query on the new stats-only db:
USE [PWI_STATDB] GO SELECT name FROM testtab WHERE name ='debjeet'; GO
And, you can see the following execution plan:
Note: Actual Number of Rows = 0(Because no user data is there in the table) and Estimated Number of Rows = 7.62199
This is the default selectivity of the optimizer. But what happed to the previous estimated number of rows = 3? We will get that only when we create statistics stream for column ‘name’ that was scripted out by the wizard. Here it is:
CREATESTATISTICS [_WA_Sys_00000002_0EA330E9] ON [dbo].[testtab]([name]) WITHSTATS_STREAM= 0x01000000010000000000000000000000B2AD2CC000000000FE03000000000000BE03000000000000A702FFFFA7000000320000000000000008D000340000000007000000A91E120054A200000F000000000000000F000000000000000000803FD9899D3D00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000C0000000C00000001000000100000005555B54000007041000000005555B540000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000013000000000000000000000000000000B6010000000000004A02000000000000520200000000000060000000000000007E000000000000009900000000000000B400000000000000CF00000000000000ED000000000000000901000000000000250100000000000041010000000000005D010000000000007A010000000000009901000000000000300010000000803F000000000000803F04000001001E006162686973656B300010000000803F000000000000803F04000001001B00616D6F6C300010000000803F000000000000803F04000001001B00616E696C300010000000803F000000000000803F04000001001B00617275703000100000004040000000000000803F04000001001E006465626A656574300010000000803F000000000000803F04000001001C006B6F6D616C300010000000803F000000000000803F04000001001C006D6168656B300010000000803F000000000000803F04000001001C00726168756C300010000000803F000000000000803F04000001001C0072616D6F6C300010000000803F000000000000803F04000001001D00737564657368300010000000803F000000000000803F04000001001F007375686173696E69300010000000803F0000803F0000803F04000001001D007375736D6974FF01000000000000000A00000009000000280000002800000000000000000000002E0000006162686973656B6E696C7275706465626A6565746D6168656B726168756C737564657368686173696E69736D69740C0000004000000000C0010000008106010000810307000001030A000082070D00008105140000810519000040021E00008104200000810624000001042A0000000F00000000000000 GO DBCC FREEPROCCACHE GO
Now, if you run the same query again, this is the execution plan:
So, here it is: A statistics only database that contains the objects, metadata, statistics but no user data. And you can mimic the same query plan for tuning. Monitoring and/or any other query plan performance investigation purposes.
I recently learnt this technique and wanted to share with the SQL community. Hope you enjoyed.
Regards
Debjeet Bhowmik
Like us on FaceBook | Follow us on Twitter
Join the fastest growing SQL Server group on FaceBook
Interesting technique Debjeet. Thanks for writing about it. I’m curious though, how do you go about testing the performance of queries which rely on data to join tables, when the tables don’t have any data in them? Or, am I missing something here?