3

I am trying to build a one to many query in PostgreSQL that looks at a customer table, a transaction table and the result is a table showing which products the customer has purchased.

table: customers
id
1
2
3
4
5

table: purchases
custid  product
1       software 1
1       software 2
3       software 2
3       software 3  
4       software 1
5       software 1
5       software 2
5       software 3

The result I want to get is:

custid  software1   software2   software3
1       TRUE        TRUE        NULL
3       NULL        TRUE        TRUE
4       TRUE        NULL        NULL
5       TRUE        TRUE        TRUE

From what I can tell I need to use crosstab (since postgreSQL doesn't support pivot), but I'm not sure on the snytax. I'd also be fine if the output included row:

custid  software1   software2   software3
2       NULL        NULL        NULL

If its easier to do it one way or the other, it doesn't really matter.

sagi
  • 40,026
  • 6
  • 59
  • 84
fauxgt4
  • 33
  • 5

2 Answers2

2

Ignoring customers that did not buy anything, because that's a bit shorter and faster:

SELECT * FROM crosstab(
    'SELECT custid, product, true FROM purchases ORDER BY 1, 2'    
  , $$VALUES ('software 1'::text), ('software 2'), ('software 3')$$)
AS ct (custid int, software1 bool, software2 bool, software3 bool);

Details:

One minor difficulty here: you need to add the boolean value in the query, since it's not in the table.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • are the values supposed to be `('software 1'::text), ('software 2'::text), ('software 3'::text)`? – fauxgt4 Feb 16 '16 at 16:55
  • 1
    @fauxgt4: You can do that, but it's not necessary. Data types in a `VALUES` expression are determined by the first row *only*. Further rows are coerced to the same type. – Erwin Brandstetter Feb 16 '16 at 16:59
1

I'm not familiar with crosstab but you can it with a group by clause and a case expression like this(only if there are only those 3 softwares, if its unlimited that this solution is no good):

SELECT t.id,
       max(case when s.product = 'software 1' then true end) as software1,
       max(case when s.product = 'software 2' then true end) as software2,
       max(case when s.product = 'software 3' then true end) as software3
FROM customers t
LEFT OUTER JOIN purchases s ON(t.id = s.custid)
GROUP BY t.id

This will also include a row for the missing ID's.

sagi
  • 40,026
  • 6
  • 59
  • 84
  • This worked (with the exception of max not working on a boolean, just switching to integer 1 instead though and it worked great). Downside on this was processing time. Even with some limits on the output it still was a bit long, but does get the result I need. For the full query I'll just schedule overnight. – fauxgt4 Feb 16 '16 at 16:44