69

I have the following table, which gives multiple email addresses for each user.

table sample

I need to flatten this out to columns on a user query. To give me the "newest" 3 email addresses based on the creation date. Like:

user_name user_id email1 email2 email3
Mary 123 mary@gmail.com mary@yahoo.co.uk mary@test.com
Joe 345 joe@gmail.com [NULL] [NULL]
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
dacology
  • 954
  • 2
  • 7
  • 13

1 Answers1

87

Use crosstab() from the tablefunc module.

SELECT * FROM crosstab(
   $$SELECT user_id, user_name, rn, email_address
     FROM  (
        SELECT u.user_id, u.user_name, e.email_address
             , row_number() OVER (PARTITION BY u.user_id
                            ORDER BY e.creation_date DESC NULLS LAST) AS rn
        FROM   usr u
        LEFT   JOIN email_tbl e USING (user_id)
        ) sub
     WHERE  rn < 4
     ORDER  BY user_id
   $$
  , 'VALUES (1),(2),(3)'
   ) AS t (user_id int, user_name text, email1 text, email2 text, email3 text);

I used dollar-quoting for the first parameter, which has no special meaning. It's just convenient to escape single quotes in the query string, which is a common case:

Detailed explanation and instructions:

And in particular, for "extra columns":

The special difficulties here are:

  • The lack of key names.
    → We substitute with row_number() in a subquery.

  • The varying number of emails.
    → We limit to a max. of three in the outer SELECT and use crosstab() with two parameters, providing a list of possible keys.

Pay attention to NULLS LAST in the ORDER BY.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • 2
    perfect, thankyou. Does exactly what I need. This query would have been a bit beyond me, but has helped me learn and understand much. – dacology Apr 14 '14 at 14:01
  • 6
    For those few wondering. No, this does not work on AWS Redshift. – Marcello Grechi Lins Jul 07 '16 at 19:27
  • 2
    @MarcelloGrechiLins: This Q & A are for Postgres, not Redshift. – Erwin Brandstetter Jul 07 '16 at 21:24
  • 10
    @ErwinBrandstetter I know, but since Redshift uses "Postgre Compliant" drivers, most Redshift users looking for answers to their questions search for "PostgreSQL" instead, since the community is bigger and the chances if finding the answer for your Redshift issue, on a Postgre post is higher :) Hopefully that makes sense – Marcello Grechi Lins Jul 13 '16 at 20:46
  • 11
    @MarcelloGrechiLins: Redshift is a fork based on Postgres 8.0.2 and the [list of unsupported Postgres features](http://docs.aws.amazon.com/redshift/latest/dg/c_unsupported-postgresql-features.html) has grown very long. BTW, [it's *PostgreSQL* or *Postgres*, never *"Postgre"*](https://wiki.postgresql.org/wiki/Identity_Guidelines). That's like saying *"Redshif"*. – Erwin Brandstetter Jul 14 '16 at 01:01
  • Hi! Landed on this here 8 years later & it's still helpful! Thank you – Rachit Kyte.One Sep 19 '22 at 15:38