0

Goal - update all records for each user with a value from the first record for that user

Background - I have a post-production database (i.e I can not change how data is captured) that is a log of all user interactions over 12 months. For analysis purposes, I want to tag each row with the month (yyyy-mm) of that user's very first appearance in the database. The new field (first_use) will come from existing field (month) in the first record for that user.

Selecting the first row in the database is the easy part:

SELECT month,user_email, MIN(month) as first_month
    FROM table
    GROUP BY user_email 

This gives me the first row (first transaction) for each user in the table.

My question is how to write the UPDATE statement. When I try something like this:

UPDATE table set first_use = (
SELECT MIN(month) as first_month
    FROM table
    GROUP BY user_email)

then all first_use fields contain the same date (the earliest date in the table). How do I get the update to apply to all records in a group, one group at a time with the right month for that group?

globalSchmidt
  • 1,329
  • 16
  • 28

1 Answers1

1

YOu need to narrow your group by:

UPDATE table set first_use = (
    SELECT MIN(month) as first_month
    FROM table T2
    where table.user_email = T2.user_email
    ) 

You don't need to explicitly group by in the subquery. But it is a slow update, since it will run a subquery for every tuple in table.

dmg
  • 4,231
  • 1
  • 18
  • 24
  • In my case "table" and "T2" are one and the same. I, therefore, fully expected `table.user_email = T2.user_email` to not work but it did! @dmg was right though: it is VERY slow. Accepted! – globalSchmidt Oct 31 '16 at 19:27