0

I have a dataset that looks like this:

id | test_id
---+--------
1  | a
1  | b
1  | u 
2  | a 
2  | u 
3  | a 
3  | b 
3  | u

And I would like to roll it up into a new table such that the test_id's are the column names (along with id) and the rows are either 1 or 0s depending if the test was given to that id, like so

id | a | b | u
---+---+---+--
1  | 1 | 1 | 1
2  | 1 | 0 | 1
3  | 1 | 1 | 1

Is there a way in Postgres that I can rearrange the table such as this?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Is the number of different items in `test_id` fixed or can there be other items too? – jpw Mar 14 '15 at 22:10
  • the `test_id` items are fixed. There are actually about 30 test items. But every id may not have 30 associated `test_id`s – aubonphysics Mar 15 '15 at 11:47
  • If an id don't have an associated test_id they would get 0 in my answer. – jpw Mar 15 '15 at 12:18
  • yes. thank you and that works great. Is there a way that counts them? It turns out that some of the test_ids repeat for a given customer. Meaning a customer has tested twice on a particular `test_id`. If zero, add one. If one, add one again (2), etc. – aubonphysics Mar 15 '15 at 17:21
  • If you want to count the number of times a customer has taken each test just change the `max()` to `sum()`. That would assign 0 to those who haven't taken the test and N (the number of times a test was taken). See this example: http://www.sqlfiddle.com/#!15/59c3f/1 – jpw Mar 15 '15 at 17:32

2 Answers2

1

If the number of possible test_id is fixed and known the easiest way to do this is to use conditional expressions like this:

select 
    id, 
    max(case when test_id = 'a' then 1 else 0 end) as a,
    max(case when test_id = 'b' then 1 else 0 end) as b,
    max(case when test_id = 'u' then 1 else 0 end) as u
from your_table
group by id
order by id

Sample SQL Fiddle

If the test_id values are unknown and can vary then you need to use dynamic sql to generate the query.

jpw
  • 44,361
  • 6
  • 66
  • 86
1

The actual crosstab ("pivot") solution for a given set of test_id's:

SELECT id
     , COALESCE(a, 0) AS a
     , COALESCE(b, 0) AS b
     , COALESCE(u, 0) AS u
FROM   crosstab('SELECT id, test_id, 1 AS val FROM tbl ORDER BY 1,2'
              , $$VALUES ('a'), ('b'), ('u')$$
       ) AS t (id int, a int, b int, u int);

For a dynamic set of test_id's you need to build the statement in a first query and execute it in a second query. Or you return arrays instead.

Similar to this one:

Basics about crosstab() (read this if you are new to it!):

Alternatives:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228