1

I have a table with data like this:

+-----------+-------+------+----------+
| timestamp | event | data | moreData |
+-----------+-------+------+----------+
| 100000000 |     1 |   10 |       20 |
| 100000001 |     1 |   15 |       10 |
| 100000002 |     1 |   30 |       30 |
| 100000003 |     1 |    5 |       50 |
| 100000004 |     2 |  110 |      120 |
| 100000005 |     2 |  115 |      110 |
| 100000006 |     2 |  130 |      130 |
| 100000007 |     2 |   15 |      150 |
+-----------+-------+------+----------+

Now I want to select only the newest rows for each event. So in the end I want to have this result set:

+-----------+-------+------+----------+
| timestamp | event | data | moreData |
+-----------+-------+------+----------+
| 100000003 |     1 |    5 |       50 |
| 100000007 |     2 |   15 |      150 |
+-----------+-------+------+----------+

So far I was not able to do this. In MySQL I can use "GROUP BY event" but then I get some random row from the database, not the newest. ORDER BY doesn't help because the grouping is done before ordering. Using an aggregation like MAX(timestamp) while grouping by event also doesn't help because then the timestamp is the newest but "data" and "moreData" is still from some other random row.

I guess I have to do a sub select so I have to first get the latest timestamp like this:

SELECT MAX(timestamp), event FROM mytable GROUP BY event

and then use the result set to filter a second SELECT, but how? And maybe there is a clever way to do it without a sub select?

ypercubeᵀᴹ
  • 113,259
  • 19
  • 174
  • 235
kayahr
  • 20,913
  • 29
  • 99
  • 147
  • Seems pretty similar to this SO question: http://stackoverflow.com/questions/2657482/sql-find-the-max-record-per-group Might look at that one too. – itsmatt Sep 04 '11 at 19:20

4 Answers4

2

AFAIK, sub select is your best option, as follows:

SELECT * 
FROM mytable mt 
    JOIN ( SELECT MAX(timestamp) as max, event 
           FROM mytable 
           GROUP BY event) m_mt 
    ON (mt.timestamp = m_mt.max AND mt.event = m_mt.event);
Roman
  • 19,581
  • 6
  • 68
  • 84
Ben Mosher
  • 13,251
  • 7
  • 69
  • 80
2

You could use an inner join as a filter:

select  *
from    events e1
join    (
        select  event
        ,       max(timestamp) as maxtimestamp
        from    events
        group by
                event
        ) e2
on      e1.event = e2.event
        and e1.tiemstamp = e2.maxtimestamp 
Andomar
  • 232,371
  • 49
  • 380
  • 404
1
SELECT * FROM 
(SELECT * FROM mytable ORDER BY timestamp DESC) AS T1 
GROUP BY event;
Richard H
  • 38,037
  • 37
  • 111
  • 138
  • Hi Ben - It didn't :) I've updated my answer. You can't GROUP BY after a sort, so a sub-query is required. – Richard H Sep 04 '11 at 19:17
  • Works, too. But I wonder if this is slower than Ben's solution. Let's assume the table has millions of entries. The sub select in Ben's answer just selects the grouped max timestamps (Let's assume it's indexed correctly) while your sub select selects ALL rows of the database in the first select and then groups them in a second. Well, maybe the database optimizes it so there is no difference but I go with Ben's solution for now. – kayahr Sep 04 '11 at 19:51
  • @kayahr - yes it's probably slower, would be interesting to benchmark. However you must put an INDEX on timestamp, otherwise both will be slow if the table is large. Without an index the whole table will be read with either solution. – Richard H Sep 05 '11 at 09:04
0
SELECT   e2.*
FROM     events e
JOIN     events e2 ON e2.event = e.event AND e2.timestamp = MAX(e2.timestamp)
GROUP BY e.id
Arnaud Le Blanc
  • 98,321
  • 23
  • 206
  • 194