3

If I have a table with the following structure and data:

id | user_id | created_at
-------------------------
 1 |       7 |    0091942
 2 |       3 |    0000014
 3 |       6 |    0000890
 4 |       6 |    0029249
 5 |       7 |    0000049
 6 |       3 |    0005440
 7 |       9 |    0010108

What query would I use to get the following results (explanation to follow):

id | user_id | created_at
-------------------------
 1 |       7 |    0091942
 6 |       3 |    0005440
 4 |       6 |    0029249
 7 |       9 |    0010108

As you can see:

  • Only one row per user_id is returned.
  • The row with the highest created_at is the one returned.

Is there a way to accomplish this without using subqueries? Is there a name in relational algebra parlance that this procedure goes by?

Alexander Trauzzi
  • 7,277
  • 13
  • 68
  • 112
  • In SQL Server you can use the ranking function `row_number()`, but there doesn't seem to be an equivalent in MySql: http://stackoverflow.com/questions/1895110/row-number-in-mysql – Blorgbeard Jun 11 '13 at 00:49
  • Please feel free to post the subquery-based solution if you'd like so I can flag it correct. – Alexander Trauzzi Jun 11 '13 at 00:53
  • you should change created_at column into datetime so you can fetch teh latest date i think. – Jhonathan H. Jun 11 '13 at 00:57

2 Answers2

6

The query is known as a groupwise maximum, which (in MySQL, at least) can be implemented with a subquery. For example:

SELECT my_table.* FROM my_table NATURAL JOIN (
  SELECT   user_id, MAX(created_at) created_at
  FROM     my_table
  GROUP BY user_id
) t

See it on sqlfiddle.

eggyal
  • 122,705
  • 18
  • 212
  • 237
2

You can just get the max and group by the user_id:

select id,user_id,max(created_at)
from supportContacts
group by user_id
order by id;

Here is what it outputs:

ID  USER_ID MAX(CREATED_AT)
1   7   91942
2   3   5440
3   6   29249
7   9   10108

See the working demo here

Note that the example on the fiddle uses the created_at field as int, just use your format it should make no difference.

EDIT: I will leave this answer as a referece but note that his query will produce undesired results as Gordon stated, please do not use this in production.

isJustMe
  • 5,452
  • 2
  • 31
  • 47
  • Interesting, SQL Server wouldn't let you do that since `id` is not part of the `group by` and not aggregated either. – Blorgbeard Jun 11 '13 at 01:19
  • yeah, postgreSQL won't either, luckily mysql does. – isJustMe Jun 11 '13 at 01:21
  • 2
    @isJustMe . . . MySQL documentation is explicitly clear that the results are not defined ("The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate."). You can read the documentation here: http://dev.mysql.com/doc/refman/5.5/en/group-by-extensions.html. – Gordon Linoff Jun 11 '13 at 01:39
  • @GordonLinoff excelent catch, I would have never noted that... I left an edit warning that it should not be used, thanks for pointing this out. – isJustMe Jun 11 '13 at 01:56
  • More than just "*might produce undesired results*", the example actually *does* produce undesired results. Note that the `ID` values returned for `USER_ID` 3 and 6 are 2 and 3 respectively, whereas they should be 6 and 4. (Also be aware that if you delete this answer, you'll be rewarded with a shiny [Disciplined](http://stackoverflow.com/badges/37/disciplined) badge). – eggyal Jun 11 '13 at 07:24
  • @eggyal edited the "might" for a "will", however I think is useful to leave the answer as it might help other users to avoid this this type of issues. – isJustMe Jun 11 '13 at 14:17