Pivoting on Multiple Columns

January 31, 2009/33 Comments/in SQL Server/by Plamen Ratchev

Pivoting data on more than one column is not a common request. But there are times when it is a very useful technique for reporting purposes. Here is one example to demonstrate different methods to handle that.

The goal is to report product data by pivoting columns with value and quantity for each product based on year. Here is the script to create the table and insert sample data.

CREATE TABLE Products (

 product VARCHAR(30),

 market_year INT,

 value INT,

 quantity INT,

 CONSTRAINT pk_products

 PRIMARY KEY (product, market_year));

INSERT INTO Products VALUES('Corn', 2003, 100, 20);
INSERT INTO Products VALUES('Corn', 2004, 200, 25);

INSERT INTO Products VALUES('Corn', 2005, 150, 30);

INSERT INTO Products VALUES('Corn', 2006, 150, 10);

The request is to produce the following output:

product v2003 q2003 v2004 q2004 v2005 q2005 v2006 q2006

------- ----- ----- ----- ----- ----- ----- ----- -----

Corn    100   20    200   25    150   30    150   10

As usual, the first suspect for pivoting solution is the CASE function. Using CASE expressions is the most flexible method to manipulate data for pivoting. There is not much difference when pivoting a single or multiple columns. Here is the solution with CASE:

SELECT product,

       SUM(CASE WHEN market_year = 2003 THEN value ELSE 0 END) AS v2003,

       SUM(CASE WHEN market_year = 2003 THEN quantity ELSE 0 END) AS q2003,

       SUM(CASE WHEN market_year = 2004 THEN value ELSE 0 END) AS v2004,

       SUM(CASE WHEN market_year = 2004 THEN quantity ELSE 0 END) AS q2004,

       SUM(CASE WHEN market_year = 2005 THEN value ELSE 0 END) AS v2005,

       SUM(CASE WHEN market_year = 2005 THEN quantity ELSE 0 END) AS q2005,

       SUM(CASE WHEN market_year = 2006 THEN value ELSE 0 END) AS v2006,

       SUM(CASE WHEN market_year = 2006 THEN quantity ELSE 0 END) AS q2006

FROM Products

GROUP BY product;

Next, let's look at the PIVOT operator that was added in SQL Server 2005. The PIVOT operator has a few limitations, and one of them is that it supports pivoting only on a single column. However, T-SQL allows to specify multiple PIVOT operators in the FROM clause, that way providing a solution for pivoting on multiple columns. There is one catch: the second PIVOT consumes the temporary result set output of the first PIVOT operator. Because of that the year column is not available (since it is used as pivoting column in the first PIVOT) and there is a need to define a virtual column that replicates the values for the year. Also, since the first PIVOT operator uses the year values as column names, the new virtual column subtracts 2000 from the year value to generate different column names for the second PIVOT operator. Here is the query using two PIVOT operators:

SELECT product,

       MAX([2003]) AS v2003,

       MAX([3]) AS q2003,

       MAX([2004]) AS v2004,

       MAX([4]) AS q2004,

       MAX([2005]) AS v2005,

       MAX([5]) AS q2005,

       MAX([2006]) AS v2006,

       MAX([6]) AS q2006

FROM (SELECT product, market_year, 

             market_year - 2000 AS market_year2,

             SUM(value) AS value, 

             SUM(quantity) AS quantity

      FROM Products

      GROUP BY product, market_year) AS T

      PIVOT

     (SUM(value) FOR market_year IN 

     ([2003], [2004], [2005], [2006])) AS P1

      PIVOT

     (SUM(quantity) FOR market_year2 IN 

     ([3], [4], [5], [6])) AS P2

GROUP BY product;

In this particular case there is more elegant solution using a single PIVOT operator. With little math the value and the quantity can be combined into single column and then after the pivoting split back to two separate columns. Here is the solution using a single PIVOT operator:

SELECT product,

       [2003] / 1000 AS v2003,

       [2003] % 1000 AS q2003,

       [2004] / 1000 AS v2004,

       [2004] % 1000 AS q2004,

       [2005] / 1000 AS v2005,

       [2005] % 1000 AS q2005,

       [2006] / 1000 AS v2006,

       [2006] % 1000 AS q2006

FROM (SELECT product, market_year, 

             value * 1000 + quantity AS value

      FROM Products) AS T

      PIVOT

     (SUM(value) FOR market_year IN 

     ([2003], [2004], [2005], [2006])) AS P;

33 replies

nessik says:
April 6, 2009 at 9:57 am
Thanks!
Good, but not dynamic 🙂
Reply
Plamen Ratchev says:
April 6, 2009 at 10:14 am
Hi nessik,
You can vote here https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=127071 to implement dynamic pivoting in SQL Server. Until then, this may work: http://pratchev.blogspot.com/2008/12/dynamic-pivoting.html
Reply
samdc says:
May 7, 2009 at 3:46 pm
Thank you for taking time to write this. It saved me tons effort and time!
Reply
Brad Schulz says:
May 23, 2009 at 11:01 am
Hi Plamen…
Here's another approach (sorry about the formatting):
select
  P1.product,v2003,q2003,v2004,q2004,v2005,q2005,v2006,q2006
from
  (select product, pivotkey='v'+convert(char,market_year), value
   from Products) I
   pivot (sum(value) for pivotkey in ([v2003],[v2004],[v2005],[v2006])) P1
full join
  (select product, pivotkey='q'+convert(char,market_year), quantity
   from Products) I
   pivot (sum(quantity) for pivotkey in ([q2003],[q2004],[q2005],[q2006])) P2
on P1.product=P2.product;
Reply
Brad Schulz says:
May 23, 2009 at 11:08 am
Follow-up to my last comment…
No need for a FULL JOIN. I don’t know how that got in there. A regular INNER JOIN would do.
Reply
Plamen Ratchev says:
May 23, 2009 at 11:18 am
Hi Brad,
Yes, this method works and I have used that in the past. But in essence my goal was to accomplish the task without multiple subqueries, which drags down performance. I am convinced the real solution is to enhance the PIVOT operator to allow pivoting on multiple columns. Until then, using CASE expressions is my choice.
Reply
Brad Schulz says:
May 23, 2009 at 11:45 am
Yet another approach just popped into my head (though its cost is about twice the previous approach I posted):
select *
from
  (select product, pivotkey='v'+convert(char,market_year), pivotamt=value
   from Products
   union all
   select product, pivotkey='q'+convert(char,market_year), pivotamt=quantity
   from Products) I
pivot
  (sum(pivotamt) for pivotkey in (v2003,q2003,v2004,q2004,v2005,q2005,v2006,q2006)) P
Reply
anurenj says:
June 10, 2009 at 4:47 am
thankyou thankou..so .soo.sooo much…..its an excellent piece of logic that i ever seen in my coding life…
Reply
anurenj says:
June 10, 2009 at 4:48 am
thankyou soooooooooooo much.its the excellent piece of code that i ever seen in my coding life….great great great
Reply
Paddy says:
November 25, 2009 at 7:49 am
Genious! You're a lifesaver!
Reply
Matt Poland says:
December 18, 2009 at 11:14 am
I'm not a SQL expert by any means so feel free to tear this query apart but I've found that this works cleanly well when you have a lot of pivots to pull off. The idea here is to use a CTE for each pivot and simply join them together afterwards.
;
WITH SourceProduct AS
(
SELECT DISTINCT product AS ProductName
FROM Products
),
ValueTotals AS
(
SELECT
product,
COALESCE(SUM(PivotData.[2003]), 0) AS Year2003Value,
COALESCE(SUM(PivotData.[2004]), 0) AS Year2004Value,
COALESCE(SUM(PivotData.[2005]), 0) AS Year2005Value,
COALESCE(SUM(PivotData.[2006]), 0) AS Year2006Value
FROM Products
INNER JOIN SourceProduct ON Products.product = SourceProduct.ProductName
PIVOT
(
SUM(value)
FOR market_year IN ([2003], [2004], [2005], [2006])
) AS PivotData
GROUP BY product
),
QuantityTotals AS
(
SELECT
product,
COALESCE(SUM(PivotData.[2003]), 0) AS Year2003Quantity,
COALESCE(SUM(PivotData.[2004]), 0) AS Year2004Quantity,
COALESCE(SUM(PivotData.[2005]), 0) AS Year2005Quantity,
COALESCE(SUM(PivotData.[2006]), 0) AS Year2006Quantity
FROM Products
INNER JOIN SourceProduct ON Products.product = SourceProduct.ProductName
PIVOT
(
SUM(quantity)
FOR market_year IN ([2003], [2004], [2005], [2006])
) AS PivotData
GROUP BY product
)
SELECT
ProductName,
Year2003Value,
Year2004Value,
Year2005Value,
Year2006Value,
Year2003Quantity,
Year2004Quantity,
Year2005Quantity,
Year2006Quantity
FROM SourceProduct
INNER JOIN ValueTotals ON SourceProduct.ProductName = ValueTotals.product
INNER JOIN QuantityTotals ON SourceProduct.ProductName = QuantityTotals.product
Reply
Plamen Ratchev says:
December 18, 2009 at 11:47 am
Hi Matt,
This is the same approach that Brad posted earlier in his comment. Yes, it works, but it is very inefficient, especially if you have to pivot on many columns.
Reply
Matt Poland says:
December 18, 2009 at 4:28 pm
I just came to that same realization and your point is enforced from my perspective. The ability to pivot on multiple columns in a single query is definitely a desired feature and your SUM(CASE) approach seems the preferable method.
Reply
Mohammad Salimabadi says:
August 4, 2010 at 4:54 pm
Hi Plamen,
Following query is suitable for pivoting om multiple columns.
;with c1(product, sum_value, sum_quantity) as
(select product, sum(value), sum(quantity)
from products
where market_year=2003
group by product),
c2(product, sum_value, sum_quantity) as
(select product, sum(value), sum(quantity)
from products
where market_year=2004
group by product),
c3(product, sum_value, sum_quantity) as
(select product, sum(value), sum(quantity)
from products
where market_year=2005
group by product),
c4(product, sum_value, sum_quantity) as
(select product, sum(value), sum(quantity)
from products
where market_year=2006
group by product)
select c1.product, c1.sum_value as v2003, c1.sum_quantity as q2003,
c2.sum_value as v2004, c2.sum_quantity as q2004,
c3.sum_value as v2005, c3.sum_quantity as q2005,
c4.sum_value as v2006, c4.sum_quantity as q2006
from (select distinct product from products) d
left join c1
on d.product = c1.product
left join c2
on d.product=c2.product
left join c3
on d.product=c3.product
left join c4
on d.product=c4.product;
Reply
Plamen Ratchev says:
August 4, 2010 at 10:10 pm
Hi Mohammad,
This approach will work too. It will still access the table multiple times, which is what the method with CASE expressions avoids (since it does a single scan).
Reply
Mohammad Salimabadi says:
August 8, 2010 at 11:59 am
Hi Plamen,
An approach with single PIVOT and concatenating concept, where has same performance with CASE solution:
select product,
cast(substring([2003],1,4) as int) as v2003,
cast(substring([2003],5,4) as int) as q2003,
cast(substring([2004],1,4) as int) as v2004,
cast(substring([2004],5,4) as int) as q2004,
cast(substring([2005],1,4) as int) as v2005,
cast(substring([2005],5,4) as int) as q2005,
cast(substring([2006],1,4) as int) as v2006,
cast(substring([2006],5,4) as int) as q2006
from
(
select product,
market_year,
cast(sum(value) as binary(4)) +
cast(sum(quantity) as binary(4)) as value
from Products
group by product, market_year
)d
pivot (max(value) for market_year in ([2003],[2004],[2005],[2006])
)p
Reply
Mohammad Salimabadi says:
August 8, 2010 at 12:14 pm
For casting and formatting columns data is better help from a UDF.
so after creating this UDF:
create function dbo.formatting (@market_year binary(8), @col_nbr bit)
returns integer as
begin return cast(substring(@market_year,case @col_nbr when 1 then 1 else 5 end, 4) as int) end
The query code will be a bit simpler:
select product,
dbo.formatting([2003],1) as v2003,
dbo.formatting([2003],2) as q2003,
dbo.formatting([2004],1) as v2004,
dbo.formatting([2004],2) as q2004,
dbo.formatting([2005],1) as v2005,
dbo.formatting([2005],2) as q2005,
dbo.formatting([2006],1) as v2006,
dbo.formating([2006],2) as q2006
from
(
select product,
market_year,
cast(sum(value) as binary(4)) +
cast(sum(quantity) as binary(4)) as value
from Products
group by product, market_year
)d
pivot (max(value) for market_year in ([2003],[2004],[2005],[2006])
)p
Reply
Naomi says:
August 22, 2010 at 1:33 pm
It would be nice to have the ability to PIVOT on multiple columns included into the T-SQL language
Reply
Randall Toepfer says:
February 18, 2011 at 8:59 pm
Example where 2nd pivot value is the first non-null value of the column?
For example, the second column is a list of prices per year instead of the quantity sold per year and we want to get the latest/most recent price.
I see T-SQL has no First aggregate function like MS Access. Most efficient method the multiple join method mentioned in one of the comments but with a second join in each column that joins on a set with only the latest record?
Thanks for the post very useful for a new T-SQL programmer like myself!
Reply
Plamen Ratchev says:
February 21, 2011 at 10:25 pm
Hi Randall,
You do not need to use a join. To solve the problem you can use a CTE or derived table where you can define an expression column based on the ranking functions (like ROW_NUMBER) that will indicate the latest price per year. Then use the first method to pivot with CASE expressions and pick for that column the values only where rank is 1. If you post a sample table and values I can write you a solution.
Reply
Naomi says:
October 20, 2011 at 3:42 pm
Here is a link for dynamic pivot on multiple columns using T-SQL approach
Dynamic Pivot on multiple columns
In case this functionality is really needed.
Reply
Anonymous says:
June 29, 2012 at 10:34 am
Hi Mr. Ratchev,
I wanted to take the time to thank you for your post.
It as forever benefited my ability to work with SQL.
Best regards,
Martin Provost
Reply
nitin says:
July 9, 2012 at 4:01 am
Hi Plamen,
I just want to say that you are a true SQL genius….please keep up the good work and keep posting solutions to commonly faced SQL scenarios and keep helping us!!
Thanks!
Reply
Anonymous says:
January 21, 2013 at 6:11 pm
Hi Plamen,
Lots of thanks for this great post, This has helped me a lot. THANK YOU 🙂
Aman
Reply
Anonymous says:
January 21, 2014 at 11:36 pm
Hi Plamen,
Thanks so much for this post!! You wrote this five years ago, and people are still benefiting from it 🙂
Thanks again!
Reply
THILLAI RAJA says:
October 4, 2014 at 11:36 am
Really you are Geniussssssssssssssss…
Reply
Wairimu Murigi says:
July 29, 2015 at 12:15 pm
I used the first approach but i keep getting the error invalid column name. I already have the table created hence I am using the query window to try and do the SQL query. I am a beginner at this. Followed the first query to the letter.
What can you suggest?
Reply
David says:
September 17, 2015 at 10:53 am
Thank you Plamen, for your posts, and also to all contributing with questions and answers here.
Here are my two cents, the dynamic versions of your CASE based pivot:
DECLARE @cols NVARCHAR(MAX),
@sql NVARCHAR(MAX);
SELECT @Cols = STUFF((select ',
SUM(CASE WHEN [market_year] = ' + CAST([market_year] as char(4)) + ' THEN [value] ELSE 0 END) AS [' + CAST([market_year] as char(4)) + N'-V],
SUM(CASE WHEN [market_year] = ' + CAST([market_year] as char(4)) + ' THEN [quantity] ELSE 0 END) AS [' + CAST([market_year] as char(4)) + N'-Q]'
FROM Products
ORDER BY [market_year] FOR XML PATH(''),type).value('.','varchar(max)'),1,2,'')
;
–PRINT @cols;
SET @sql = 'SELECT product, ' + @cols + ' FROM Products GROUP BY Product';
–PRINT @sql;
EXEC(@sql);
Reply
Thomas McShane says:
September 28, 2015 at 7:26 am
Plamen, this is truly a great example of your work. I was wondering how could it be taken one step more and add a division to the results of MAX([2003]) AS v2003 / MAX([3]) AS q2003,
Reply
Anonymous says:
November 11, 2015 at 9:12 am
thank you. very good explanation
Reply
Ross Worth says:
September 1, 2016 at 2:31 pm
Is there a way to achieve this dynamically when the columns are not known or subject to change?
Reply
ChrisD says:
May 29, 2018 at 8:44 am
Hi, looks like a great solution but my SQL skills are not great and I’d like to understand how this works with eg. 4 ‘value’ columns? I have a column for each of Net, Gross, Profit and Units but I can’t translate this logic.
As I get it, this uses % ie. MOD to work out the remainder to split the two columns back out – correct?
Also, won’t this method only work if [value] is greater than [quantity]?
Reply
Plamen Ratchev says:
May 29, 2018 at 8:52 pm
Chris,
The easiest approach in your scenario is to use the CASE expressions.
You are correct about the logic of the last example. It uses integer division and % to separate the numbers. It does not matter which value is greater, as long as it does not exceed the divisor (in this case 1000). You can use the same approach with more values but you have to be careful what numbers you pick to multiply/divide by. For example, you can use 100, 10000, 1000000, and 100000000. That will “shift” your values appropriately.
Reply

Want to join the discussion?
Feel free to contribute!

Pivoting on Multiple Columns

Leave a Reply

Leave a Reply Cancel reply

SITE MAP

SEARCH

You might also like

Leave a Reply

Leave a Reply Cancel reply

SITE MAP

SEARCH