1

I have an issue that is like a non-aggregating pivot, but I want my table to include multiple results. As an example I start out like this:

ID | OrganizationID | Group | Vendor | Product 
 1         3            G1      V1       P1
 2         3            G1      V2       P2
 3         3            G2      V3       P3
 4         4            G1      V1       P1
 5         4            G1      V1       P2

And need my data to end up like this:

ID | OrganizationID | G1_V1 | G1_V2 | G2_V3
 1         3            P1      P2      P3 
 2         4            P1     NULL    NULL
 3         4            P2     NULL    NULL

The data I am working with is around 7 million rows so self-joining with dynamic sql has proven extremely slow. Is there something easy I'm missing? Thanks in advance!

Craig Ringer
  • 307,061
  • 76
  • 688
  • 778
R Kelly
  • 45
  • 1
  • 6
  • Are there a number of fixed columns you want, or are you looking for a dynamic solution? The latter may not be workable on some RDBMS's. – Joachim Isaksson Mar 17 '14 at 16:45
  • I am currently considering both SQL Server 2012 and Postgres 9.3 as potential solutions. I have a fixed number of columns although it is something like 673, so I'm using dynamic sql to put the list together. – R Kelly Mar 17 '14 at 18:21

1 Answers1

3

You can use a common table expression and a set of CASE statements to get the pivot structure you need; this will work identically in most RDBMS's.

WITH cte AS (
  SELECT id, organizationid, 
         CONCAT([Group],'_',Vendor) col, 
         product,
         ROW_NUMBER() OVER (
           PARTITION BY organizationid,[group],vendor 
           ORDER BY product) rn
  FROM myTable
)
SELECT 
  organizationid,
  MAX(CASE WHEN col='G1_V1' THEN product ELSE NULL END) G1_V1,
  MAX(CASE WHEN col='G1_V2' THEN product ELSE NULL END) G1_V2,
  MAX(CASE WHEN col='G2_V3' THEN product ELSE NULL END) G2_V3
FROM cte
GROUP BY organizationid, rn;

...or, alternately, you can use TSQL's regular PIVOT statement to replace the CASE part, which will probably be faster but may not work in all RDBMS's;

WITH cte AS (
  SELECT id, organizationid, 
         CONCAT([Group],'_',Vendor) col, 
         product,
         ROW_NUMBER() OVER (
           PARTITION BY organizationid,[group],vendor 
           ORDER BY product) rn
  FROM myTable
)
SELECT organizationid as organizationid, 
[G1_V1], [G1_V2], [G2_V3]
FROM
(SELECT organizationid,rn,col, product FROM cte) AS SourceTable
PIVOT
(
  MAX(product) FOR col IN ([G1_V1], [G1_V2], [G2_V3])
) AS PivotTable;

An SQLfiddle to test both.

You can produce a similar pivot in PostgreSQL using crosstab, however I don't have a PostgreSQL database with the tablefunc module installed to test on, so can only link to the details.

Community
  • 1
  • 1
Joachim Isaksson
  • 176,943
  • 25
  • 281
  • 294
  • @CraigRinger Could you add a sample? Tried some basic sample code and I can't [load the extension](http://sqlfiddle.com/#!15/b341d/2) and it does not seem to be [loaded by default](http://sqlfiddle.com/#!15/b341d/2). – Joachim Isaksson Mar 18 '14 at 05:31
  • I just checked, and I was wrong, many extensions are installed in sqlfiddle but tablefunc doesn't seem to be one of them. – Craig Ringer Mar 18 '14 at 05:57
  • Sqlfiddle link in answer may dead or maybe just an issue on their website today. – interesting-name-here Feb 09 '17 at 16:58