Percy Reyes

Friday 12 November 2010

SQL11 (Denali): OFFSET and FETCH (New options for ORDER BY)

It is known that ORDER BY clause has been improved in 'Denali' (code name of the upcoming version SQL Server 2012) adding more options as a result from looking for new solutions to work on specific business cases such as fetching top N rows per window of results from the result set which we used to do by combining TOP, OVER, and ORDER BY. Nevertheless, the performance of this approach was poor and not even close to being as good as OFFSET and FETCH in 'Denali'.

OFFSET keyword is used for excluding first N rows (default value is zero) whereas FETCH is used for reading/fetching M rows per each window of results starting from position (N+1). Therefore, OFFSET and FETCH allow to get only rows from position (N+1) to (N+M). I do recommend creating one Index (preferably Index Clustered) on ORDER BY columns and using these columns on WHERE clause as much as possible in order to avoid costly operations such as Index Scans or Table Scans. I have performed many tests on 10 millions of rows, and to be to honest, it worked very well. To illustrate, the following query will start fetching from position 4+1=5 (OFFSET position=4 for SQL Server) to the end of the table.

SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address   
FROM dbo.Customers    
ORDER BY CustomerID 
   OFFSET (4) ROWS

Now with DESC the results will be different (excluding last 4 rows)

SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address   
FROM dbo.Customers    
ORDER BY CustomerID  DESC
   OFFSET (4) ROWS

Basically, each execution of the query is done independently, which means that not only are the results sent back to the client side, but also immediately the resources are released. Here is another example how to get only the top 3 rows excluding first 4 rows (OFFSET=4), which means that it only reads the fifth, sixth, and seventh rows.

SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../  
FROM dbo.Customers    
ORDER BY CustomerID  
   OFFSET (4) ROWS 
   FETCH NEXT 3 ROWS ONLY --  FETCH FIRST 3 ROWS ONLY
GO

Having a need to read the next three rows (eighth, ninth, and tenth rows), we have to increase OFFSET value in three.

SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../  
FROM dbo.Customers    
ORDER BY CustomerID  
   OFFSET (7) ROWS ---ó OFFSET (7) ROW 
   FETCH NEXT 3 ROWS ONLY -- ó FETCH FIRST 3 ROWS ONLY
GO

Here the script to read dynamically each result by using variables (and optionally OPTIMIZE FOR hint to optimise this Ad-Hoc query).

DECLARE @Start INT, @Next INT
SET @Start=0 

SET @Next=3
SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../  
FROM dbo.Customers    
ORDER BY CustomerID 
     OFFSET (@Start) ROWS    
     FETCH NEXT @Next ROWS ONLY 

 OPTION(OPTIMIZE FOR (@Start=4,@Next=5))
GO

Now the same code inside an stored procedure.

CREATE PROCEDURE dbo.PageFetch(@PageNumber INT, @PageSize INT)
AS
BEGIN
SELECT ROW_NUMBER() OVER (ORDER BY CustomerID) AS 'Nro', 
CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../  
FROM dbo.Customers    
ORDER BY CustomerID 
     OFFSET (@PageSize * (@PageNumber -1) ) ROWS
     FETCH NEXT @PageSize ROWS ONLY 
END

Querying the results for second, third and fifth windows of results with OFFSET.

EXEC dbo.PageFetch 2,3;
GO
EXEC dbo.PageFetch 3,3;
GO
EXEC dbo.PageFetch 5,3;

Luckily, OFFSET and FETCH are supported by subqueries, functions, derived tables, but not by indexed views and using directly with TOP, OVER, INSERT, UPDATE, MERGE, o DELETE (except inside of independent queries at lower level such as subqueries). Here an illustration.

SELECT  TOP(2) ROW_NUMBER() OVER(ORDER BY CustomerID) AS 'NRO', CompanyName, ContactName, ContactTitle  FROM (
        SELECT  CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../  
        FROM dbo.Customers    
        ORDER BY CustomerID  
           OFFSET (4) ROWS ---ó OFFSET (4) ROW 
           FETCH NEXT 3 ROWS ONLY -- ó FETCH FIRST 3 ROWS ONLY
) AS T1

Finally, using it with Views.

CREATE VIEW dbo.v1
as
SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../  
FROM dbo.Customers  
ORDER BY CustomerID 
   OFFSET (0) ROWS ---ó OFFSET (4) ROW 
GO
SELECT CustomerID, CompanyName, ContactName FROM dbo.v1
WHERE  CustomerID LIKE 'AN%'

To sum up, OFFSET and FETCH are excellent improvements which allow to implement solutions to fetch only specific pages/windows of results from the result set with ease. Let me know any remarks you may have. Thanks for reading!

Monday 18 October 2010

SQL11 (Denali): How to migrate SQL Server 2000 databases to SQL Server 2012?

Having found 'SQL Server 2000 (80)' inside the compatibility levels supported by SQL Server 2012, I need to say that the upcoming version of SQL Server is not going to support it. we do know that each new version only supports compatibility levels of two below versions, which means that it is an error as this is a CTP build.

What is going on if we try to restore a SQL Server 2000 database on SQL Server 2012? well, we will get the following error:

Msg 3169, Level 16, State 1, Line 1
The database was backed up on a server running version 8.00.0760. That version is incompatible with this server, which is running version 11.00.1103. Either restore the database on a server that supports the backup, or use a backup that is compatible with this server.
Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally.

Attaching the database neither works.

Msg 1813, Level 16, State 2, Line 1
Could not open new database ‘Northwind’. CREATE DATABASE is aborted.
Msg 1813, Level 16, State 2, Line 1
Could not open new database ‘Northwind’. CREATE DATABASE is aborted.
Msg 950, Level 20, State 1, Line 1
Database ‘Northwind’ cannot be upgraded because its non-release version (0) is not supported by this version of SQL Server. You cannot open a database that is incompatible with this version of sqlservr.exe. You must re-create the database.

Therefore, SQL Server 2000 databases have to be migrated to SQL Server 2012 following the next steps: (1) restore SQL Server 2000 databases on SQL Server 2005 or SQL Server 2008; (2) take Full Backups of the new SQL Server 2005/2008 databases, and finally; (3) restore them on SQL Server 2012. In other words, there will not be direct migration from SQL Server 2000 to SQL Server 2012. That is all for now, let me know any remarks you may have. Thanks for reading again!

Wednesday 5 May 2010

Geo-Replication Performance Gains with Microsoft SQL Server 2008 Running on Windows Server 2008

MSCOM Ops Team has done a tremendous and excellent work in testing the performance of SQL Server 2005 replication environments on Windows Server 2003 and SQL Server 2008 on Windows Server 2008, and good benefits were achieved as a result of the performance enhancements in the next generation TCP/IP stack. If you open the link, you will read that the team concludes the following:

“… the team discovered that SQL Server 2008 running on Windows Server 2008 yielded up to 100 times faster performance without requiring any expensive wide area network (WAN) acceleration hardware”

Furthermore, scalability and performance have been improved in SQL Server 2008 Native Client, exactly, in the way of invoking stored procedures with ODBC call syntax and OLE DB remote procedure call (RPC) syntax. To find out more details about it, check out Geo-Replication Performance Gains with Microsoft SQL Server 2008 Running on Windows Server 2008. It is also found best practices and baselines, here are some of the them:

"Testing showed that using transactional replication with SQL Server 2008 running on Windows Server 2008 dramatically outperformed SQL Server 2005 running on Windows Server 2003. As illustrated in Table 2, the most substantial performance gains occurred when the Publisher and Subscriber were both running SQL Server 2008 on Windows Server 2008…
…Testing also showed that the scope of the performance gains correlated with the type of replication and the type of data. Push subscription replication of character data with SQL Server 2008 running on Windows Server 2008 yielded a 104 percent increase over SQL Server 2005 running on Windows Server 2003, and pull subscription replication of the same data yielded a 1,298 percent gain.
Both lab and real-life testing by the MSCOM Ops team indicate that highly trafficked Web sites can gain the benefits of geo-replication most effectively when the site is built on SQL Server 2008 running on Windows Server 2008. Based on solid evidence of the feasibility of WAN–based geo-replication, MSCOM Ops plans to expand its implementation of this solution.
In addition, the MSCOM Ops team learned several valuable lessons because of its extensive performance testing of SQL Server 2008 running on Windows Server 2008, including:

Windows Server 2008 and SQL Server 2008 with the TCP/IP stack improvements and partnering with application development teams can bolster global user experiences, produce higher availability, higher scalability, and better resiliency for sites, services, and applications through WAN-based geo-replication.
Replication performance is significantly better for pull subscription scenarios than push subscriptions.
The solution identified in this paper will not work for all applications, particularly applications that cannot handle the inherit latency involved with replicating data between geographically dispersed data centers."

To sum up, there are significant gains in terms of performance, scalability and disaster recovery implemented in SQL Server 2008 on Windows Server 2008. Not only will it be much faster than SQL Server 2005 on Windows Server 2003, but also much secure and cheaper. Therefore, SQL Server Replication technology will be considered a stronger solution inclusive for high availability purposes.
I hope you enjoy reading both document as they are a good read. That is all for now, let me know any remark you may have. Thanks for reading.

Sunday 21 February 2010

How to schedule a SQL Job to run every second in SQL Server 2005

It is well known that it is not possible to schedule a SQL Job to run every second or some seconds in SQL Server 2005 by using the wizard (I mean less than one minute). Nevertheless, when there is a need to do it we may have one way to achieve it. Let me expand on what I am saying. Firstly, we have to create the SQL Job in SQL Server 2008 by using the Wizard and then we need to generate the script to create the SQL Job in SQL Server 2005.

Having successfully completing that will you be able to have a SQL Job in SQL Server 2005 to run every second with any hitches. I hope you find this tip very useful for you. That is all for now. Thanks for reading. Stay tuned.

Saturday 16 February 2008

SQL Server 2008: Row Value Constructor

Definitely, we were sometimes in a need of a new technique to insert data massively in SQL Server 2005 and previous versions. We used to do it by using traditional techniques like UNION ALL and individual INSERT statements. This story has ended up with the arrival of a new feature in SQL Server 2008 which allows to insert much data with only one INSERT statement, that is, Row Value Constructor.

To begin with, we are going to illustrate this new feature by creating a new table where four data rows will be inserted.

CREATE TABLE [Production].[Document](
    [DocumentID] [int] IDENTITY(1,1) NOT NULL,
    [Title] [nvarchar](50) COLLATE Latin1_General_CS_AS NOT NULL,
    [FileName] [nvarchar](400) COLLATE Latin1_General_CS_AS NOT NULL,
    [FileExtension] [nvarchar](8) COLLATE Latin1_General_CS_AS NOT NULL,
    [Revision] [nchar](5) COLLATE Latin1_General_CS_AS NOT NULL,
    [ChangeNumber] [int] NOT NULL CONSTRAINT [DF_Document_ChangeNumber]  DEFAULT ((0)),
    [Status] [tinyint] NOT NULL,
    [DocumentSummary] [nvarchar](max) COLLATE Latin1_General_CS_AS NULL,
    
    [ModifiedDate] [datetime] NOT NULL CONSTRAINT [DF_Document_ModifiedDate]  DEFAULT (getdate()),
 CONSTRAINT [PK_Document_DocumentID] PRIMARY KEY CLUSTERED 
(
    [DocumentID] ASC
)
) ON [PRIMARY]

At first, we will do it in SQL Server 2005. This is the traditional and old-fashioned way (one INSERT statement per row):

INSERT INTO [Production].[Document]([Title],[FileName],[FileExtension],[Revision],[ChangeNumber],[Status],[DocumentSummary])
VALUES ('Installing Replacement Pedals','C:DocumentsInstalling Replacement Pedals.doc','.doc','0',32,2,'Detailed instructions ...')

INSERT INTO [Production].[Document]([Title],[FileName],[FileExtension],[Revision],[ChangeNumber],[Status],[DocumentSummary])
VALUES ('Introduction 1','C:DocumentsIntroduction 1.doc','.doc','4',28,2,NULL)

INSERT INTO [Production].[Document]([Title],[FileName],[FileExtension],[Revision],[ChangeNumber],[Status],[DocumentSummary])
VALUES ('Lubrication Maintenance','C:DocumentsLubrication Maintenance.doc','.doc','2',11,1,'Guidelines and recommendations...')

INSERT INTO [Production].[Document]([Title],[FileName],[FileExtension],[Revision],[ChangeNumber],[Status],[DocumentSummary])
VALUES ('Seat Assembly','C:DocumentsSeat Assembly.doc','.doc','8',55,2,'Worn or damaged seats...')

This is the cumbersome way by using UNION ALL clause which has sometimes poor performance:

INSERT INTO [Production].[Document]([Title],[FileName],[FileExtension],[Revision],[ChangeNumber],[Status],[DocumentSummary])
SELECT 'Installing Replacement Pedals','C:DocumentsInstalling Replacement Pedals.doc','.doc','0',32,2,'Detailed instructions ...'
UNION ALL
SELECT 'Introduction 1','C:DocumentsIntroduction 1.doc','.doc','4',28,2,NULL
UNION ALL 
SELECT 'Lubrication Maintenance','C:DocumentsLubrication Maintenance.doc','.doc','2',11,1,'Guidelines and recommendations...'
UNION ALL
SELECT 'Seat Assembly','C:DocumentsSeat Assembly.doc','.doc','8',55,2,'Worn or damaged seats...'

Now in SQL Server 2008 we can use Row Value Constructor feature (only one INSERT statement to insert four data rows):

INSERT INTO [Production].[Document](  
      [Title]
      ,[FileName]
      ,[FileExtension]
      ,[Revision]
      ,[ChangeNumber]
      ,[Status]
      ,[DocumentSummary]
)
VALUES ('Installing Replacement Pedals','C:DocumentsInstalling Replacement Pedals.doc','.doc','0',32,2,'Detailed instructions ...'),
('Introduction 1','C:DocumentsIntroduction 1.doc','.doc','4',28,2,NULL),
('Lubrication Maintenance','C:DocumentsLubrication Maintenance.doc','.doc','2',11,1,'Guidelines and recommendations...'),
('Seat Assembly','C:DocumentsSeat Assembly.doc','.doc','8',55,2,'Worn or damaged seats...')

As you have noticed, we do not need to use UNION ALL any longer, it allow to save a lot of time and work. It is truly useful when we need to insert much data and it also offers a remarkable performance. I hope you enjoy this new feature. Thanks for reading again!

Pages