SQL Server – ROW_NUMBER versus TOP N

Hello Friends,

This blog post is the result of question asked by some professionals working on databases. While writing T-SQL code they generally use TOP N in the code to get the top n number of rows in the result set. They knew the another way to achieve the same result by using ROW_NUMBER. Now the question is “Which option is better to use?”. In today’s blog pots we will do the comparison of ROW_NUMBER versus TOP N.

Step 1: Create a database for the test:

USE [Master]
GO
DROP DATABASE IF EXISTS MayTest;
GO
CREATE DATABASE MayTest;
GO

Step 2: Create a table and insert some data:

USE [MayTest]
GO
CREATE TABLE tbl_RowNumberVsTop
(
RecordID INT IDENTITY(1,1) NOT NULL,
FName VARCHAR(50),
LName VARCHAR(50),
City VARCHAR(100),
DeptID INT NOT NULL
) ON [PRIMARY]
GO

CREATE CLUSTERED INDEX CIX_tbl_RowNumberVsTop_RecordID ON tbl_RowNumberVsTop(RecordID)
GO

SET NOCOUNT ON;

INSERT INTO tbl_RowNumberVsTop VALUES('Kabir','Khan','Gurgaon',3)
GO 1000
INSERT INTO tbl_RowNumberVsTop VALUES('Gurpreet','Singh','Gurgaon',7)
GO 1000
INSERT INTO tbl_RowNumberVsTop VALUES('Prince','Rastogi','Noida',4)
GO 1000
INSERT INTO tbl_RowNumberVsTop VALUES('Robin','Sharma','Delhi',5)
GO 1000

Step 3: Run below queries where first one is using ROW_NUMBER and second is using TOP N to get the same result set, with actual execution Plan:

DBCC DROPCLEANBUFFERS()
GO
DBCC FREEPROCCACHE()
GO
USE [MayTest]
GO
SELECT RecordID
FROM (
    SELECT RecordID, ROW_NUMBER() OVER (ORDER BY RecordID) AS RN
    FROM dbo.tbl_RowNumberVsTop
    ) AS T
WHERE T.RN <= 100

GO
SELECT 
	TOP 100 RecordID
FROM dbo.tbl_RowNumberVsTop
ORDER BY RecordID
GO

SQL Sever – ROW_NUMBER versus TOP N

For both the plans estimated cost is similar i.e. 50%. If you will check the estimated number of rows to read in both Clustered Index Scan operators then you will notice that both are showing the same value i.e. 100. We can say that estimated and actual number of rows for both Clustered Index Scan are same i.e. 100.

Step 4: Run queries again but with a small change, specify 1000 number of rows rather than 100, with actual execution Plan:

--Do not run below DBCC commands in production
DBCC DROPCLEANBUFFERS()
GO
DBCC FREEPROCCACHE()
GO
USE [MayTest]
GO
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT RecordID
FROM (
    SELECT RecordID, ROW_NUMBER() OVER (ORDER BY RecordID) AS RN
    FROM dbo.tbl_RowNumberVsTop
    ) AS T
WHERE T.RN <= 1000

GO
SELECT 
	TOP 1000 RecordID
FROM dbo.tbl_RowNumberVsTop
ORDER BY RecordID
GO

SQL Sever – ROW_NUMBER versus TOP N

   

From above screenshot you will say query with ROW_NUMBER is much faster as compare to TOP N i.e. 31% versus 69%. But wait, before any conclusion because the cost you are seeing here is just the estimated cost. If estimates are not correct for operators then query plan estimation cost will also not be correct. Check the properties of Clustered Index scan in both the plans:

SQL Sever – ROW_NUMBER versus TOP N

SQL Sever – ROW_NUMBER versus TOP N

You can see that estimated number of rows for query with ROW_NUMBER is 100 while actual is 1000. Query plan estimated cost was calculated based on estimated number of rows i.e. 100. We can not trust on estimated plan cost here. Let’s check the STATISTICS data:

SQL Sever – ROW_NUMBER versus TOP N

After watching above stats, I can say that take the decision based on statistics data rather than comparing estimated cost of execution plans.

HAPPY LEARNING!

Regards:
Prince Kumar Rastogi

Like us on FaceBook | Join the fastest growing SQL Server group on FaceBook
Follow Prince Rastogi on Twitter | Follow Prince Rastogi on FaceBook

   

About Prince Rastogi

Prince Rastogi is working as Database Administrator at Elephant Insurance, Richmond. He is having more than 8 years of experience and worked in ERP Domain, Wealth Management Domain. Currently he is working in Insurance domain. In the starting of his career he was working on SQL Server, Internet Information Server and Visual Source Safe. He is post graduate in Computer Science. Prince is ITIL certified professional. Prince likes to explore technical things for Database World and Writing Blogs. He is Technical Editor and Blogger at SQLServerGeeks.com. He is a regular speaker at DataPlatformDay events in Delhi NCR. He has also presented some in depth sessions about SQL Server in SQL Server Conferences in Bangalore.

View all posts by Prince Rastogi →

One Comment on “SQL Server – ROW_NUMBER versus TOP N”

  1. so in conclusion we can say, ROW_NUMBER is expensive for big datasets comparing to TOP
    and Looking at steps Optimizer took to fetch rows in ROW_NUMBER it is certain that TOp_Number will consume more source.

Leave a Reply

Your email address will not be published.