Saturday 16 December 2017

Implementing dynamic PIVOT in SQL Server

PIVOT, one of the features introduced in SQL Server 2005, converts rows into columns. But what if we want to make it dynamic, so that the number of columns is not fixed in advance? In this post I am going to show you how to do it (applies to SQL Server 2005 through SQL Server 2017). To begin with, we are going to use the following query to look at some data, which will be pivoted shortly after.

USE AdventureWorks 
GO
SELECT CustomerID,YEAR(DueDate) [Year], TotalDue 
FROM Sales.SalesOrderHeader
ORDER BY CustomerID

According to the result set, there are many sales per customer between the years 2001 and 2004.



To pivot 'TotalDue' per 'Year', each year must be listed explicitly in the PIVOT clause, which means we need to know in advance which years to include. For instance, this query pivots 'TotalDue' for the years 2001, 2002, 2003, and 2004.

SELECT CustomerID, [2001] AS '2001', [2002] AS '2002', [2003] AS '2003', [2004]  AS '2004'
FROM (
      SELECT CustomerID,YEAR(DueDate) [Year], TotalDue FROM Sales.SalesOrderHeader
     ) pvt
PIVOT (SUM(TotalDue) FOR [Year] IN ([2001],[2002],[2003],[2004])) AS Child
ORDER BY CustomerID

Having successfully executed the query, we will get the following nice result:


Up to now, everything seems perfect. But what happens if we want to pivot over many more years? We immediately hit a big limitation: the years have to be added to the query by hand, which does not make sense for a real business case. In other words, PIVOT is not 'dynamic' by design. Luckily, the purpose of this post is to show how to simulate a dynamic PIVOT in SQL Server by using the native PIVOT clause and sp_executesql.

The algorithm is quite simple and does not need much explanation: it builds the list of years as a string, embeds that list in the IN clause of a PIVOT query on 'TotalDue', and finally executes the generated code via sp_executesql.

-- Collect the distinct years present in the data
DECLARE @TableYears AS TABLE([Year] INT NOT NULL)
DECLARE @Year INT, @YearsPVT NVARCHAR(MAX)
INSERT INTO @TableYears SELECT DISTINCT YEAR(DueDate) AS [Year] FROM Sales.SalesOrderHeader
-- Walk the years in ascending order, appending ',[yyyy]' for each one
SET @Year = (SELECT MIN([Year]) FROM @TableYears)
SET @YearsPVT = N''
WHILE @Year IS NOT NULL
BEGIN
  SET @YearsPVT = @YearsPVT + N',' + QUOTENAME(CONVERT(NVARCHAR(10), @Year))
  SET @Year = (SELECT MIN([Year]) FROM @TableYears WHERE [Year] > @Year)
END
-- Remove the leading comma
SET @YearsPVT = SUBSTRING(@YearsPVT, 2, LEN(@YearsPVT))
PRINT @YearsPVT
-- Build the PIVOT query around the generated column list and run it
DECLARE @SQL NVARCHAR(MAX)
SET @SQL = N'SELECT *
            FROM (
                SELECT CustomerID, YEAR(DueDate) [Year], TotalDue FROM Sales.SalesOrderHeader
                ) pvt
            PIVOT (SUM(TotalDue) FOR [Year] IN (' + @YearsPVT + N')) AS Child
            ORDER BY CustomerID'

EXECUTE sp_executesql @SQL
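
The WHILE loop above is not the only way to build the column list. As a rough sketch, and not part of the original algorithm, the same string can be produced in a single statement with FOR XML PATH and STUFF, both available since SQL Server 2005. The aliases Y, src and pvt are placeholders chosen for this illustration, and the script is meant to run as its own batch against the same AdventureWorks database.

```sql
-- Sketch only: builds the same column list as the WHILE loop, without a loop.
DECLARE @YearsPVT NVARCHAR(MAX), @SQL NVARCHAR(MAX);

-- Concatenate the distinct years as ',[2001],[2002],...' and let STUFF
-- delete the leading comma (1 character starting at position 1).
SET @YearsPVT = STUFF(
    (SELECT N',' + QUOTENAME(CONVERT(NVARCHAR(10), Y.[Year]))
     FROM (SELECT DISTINCT YEAR(DueDate) AS [Year]
           FROM Sales.SalesOrderHeader) AS Y
     ORDER BY Y.[Year]
     FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'),
    1, 1, N'');

-- Embed the generated list in the IN clause of the PIVOT query.
SET @SQL = N'SELECT *
FROM (SELECT CustomerID, YEAR(DueDate) AS [Year], TotalDue
      FROM Sales.SalesOrderHeader) AS src
PIVOT (SUM(TotalDue) FOR [Year] IN (' + @YearsPVT + N')) AS pvt
ORDER BY CustomerID;';

EXECUTE sp_executesql @SQL;
```

The TYPE directive together with .value() returns the text without XML entity escaping, which matters little for years but is the safer habit when the pivoted values are names.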

Here is another example, this time using data from the Northwind database.

SELECT P.ProductID, C.CategoryName, OD.UnitPrice * OD.Quantity AS TotalAmount
FROM Products P
    INNER JOIN dbo.[Order Details] OD
        ON P.ProductID = OD.ProductID
    INNER JOIN Categories C
        ON C.CategoryID = P.CategoryID

Here is the plain result, without PIVOT.


Using PIVOT the static way, with every category hard-coded:

SELECT ProductID, [Beverages], [Condiments], [Confections], [Dairy Products], 
       [Grains/Cereals], [Meat/Poultry],[Produce],[Seafood] 
FROM 
(
    SELECT P.ProductID, C.CategoryName, (OD.UnitPrice * OD.Quantity) AS Monto
    FROM Products P 
        INNER JOIN dbo.[Order Details] OD
            ON P.ProductID=OD.ProductID
        INNER JOIN Categories C
            on C.CategoryID=P.CategoryID
) PIV
PIVOT (SUM(Monto) FOR  CategoryName IN ([Beverages], [Condiments], [Confections], [Dairy Products], 
       [Grains/Cereals], [Meat/Poultry],[Produce],[Seafood])) AS Child

Running the code above returns the pivoted result.



Finally, the dynamic PIVOT version returns the same result.

DECLARE @CatPVT AS NVARCHAR(MAX), @Categorias AS NVARCHAR(128)
DECLARE @CatID INT
-- Start with the category that has the lowest CategoryID
SET @CatID = (SELECT MIN(CategoryID) FROM Categories)
SET @CatPVT = N''
WHILE @CatID IS NOT NULL
BEGIN
  SET @Categorias = (SELECT CategoryName FROM Categories WHERE CategoryID = @CatID)
  -- QUOTENAME brackets the name and escapes any ']' it contains
  SET @CatPVT = @CatPVT + N',' + QUOTENAME(@Categorias)
  -- Move on to the next CategoryID; NULL ends the loop
  SET @CatID = (SELECT MIN(CategoryID) FROM Categories WHERE CategoryID > @CatID)
END
-- Remove the leading comma
SET @CatPVT = SUBSTRING(@CatPVT, 2, LEN(@CatPVT))
PRINT @CatPVT

-- Build the PIVOT query around the generated column list and run it
DECLARE @sql AS NVARCHAR(MAX)
SET @sql = N'SELECT *
            FROM (SELECT P.ProductID, C.CategoryName, (OD.UnitPrice * OD.Quantity) AS Monto
                  FROM Products P
                    INNER JOIN dbo.[Order Details] OD
                        ON P.ProductID = OD.ProductID
                    INNER JOIN Categories C
                        ON C.CategoryID = P.CategoryID
            ) PIV
            PIVOT (SUM(Monto) FOR CategoryName IN (' + @CatPVT + N')) AS Child'

EXEC sp_executesql @sql
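
On SQL Server 2017 the loop can also be replaced by STRING_AGG, which was introduced in that version, so this variant does not cover the older releases mentioned at the start of the post. The following is only a sketch under that assumption, meant to run as its own batch against Northwind. The CONVERT to NVARCHAR(MAX) is a precaution so that a long category list is not limited by the default result size.

```sql
-- Sketch only: requires SQL Server 2017 or later for STRING_AGG.
DECLARE @CatPVT NVARCHAR(MAX), @sql NVARCHAR(MAX);

-- Build '[Beverages],[Condiments],...' in CategoryID order.
-- QUOTENAME brackets each name and escapes any ']' inside it.
SELECT @CatPVT = STRING_AGG(CONVERT(NVARCHAR(MAX), QUOTENAME(CategoryName)), N',')
                 WITHIN GROUP (ORDER BY CategoryID)
FROM Categories;

-- Embed the generated list in the IN clause of the PIVOT query.
SET @sql = N'SELECT *
FROM (SELECT P.ProductID, C.CategoryName, (OD.UnitPrice * OD.Quantity) AS Monto
      FROM Products P
          INNER JOIN dbo.[Order Details] OD ON P.ProductID = OD.ProductID
          INNER JOIN Categories C ON C.CategoryID = P.CategoryID
     ) PIV
PIVOT (SUM(Monto) FOR CategoryName IN (' + @CatPVT + N')) AS Child;';

EXEC sp_executesql @sql;
```

STRING_AGG places the separator only between values, so there is no leading comma to trim afterwards.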

As you have seen, dynamic PIVOT is truly useful in a real business case, and I hope you make the most of this algorithm. That is all for now. Let me know any remarks you may have. Thanks for reading. Stay tuned.


HELLO, I'M PERCY REYES! I've been working as a senior SQL Server Database Engineer for over 20 years; I'm a three-time Microsoft Data Platform MVP. I'm a cryptographer conducting research on cryptographic Boolean functions and their applications.