GROUP BY with MAX(DATE)

Question

I'm trying to list the latest destination (MAX departure time) for each train in a table, for example:

Train    Dest      Time
1        HK        10:00
1        SH        12:00
1        SZ        14:00
2        HK        13:00
2        SH        09:00
2        SZ        07:00

The desired result should be:

Train    Dest      Time
1        SZ        14:00
2        HK        13:00

I have tried using

SELECT Train, Dest, MAX(Time)
FROM TrainTable
GROUP BY Train

by I got a "ora-00979 not a GROUP BY expression" error saying that I must include 'Dest' in my group by statement. But surely that's not what I want...

Is it possible to do it in one line of SQL?

For those who are wondering, the cleanest "plain sql" solution is [the one by Joe](http://stackoverflow.com/a/14841015/632951). Second prize goes to [Claudio](http://stackoverflow.com/a/24370489/632951). — Pacerier, Apr 30 '15 at 06:26

score 224 · Answer 1 · answered Aug 16 '10 at 07:42

224

SELECT train, dest, time FROM ( 
  SELECT train, dest, time, 
    RANK() OVER (PARTITION BY train ORDER BY time DESC) dest_rank
    FROM traintable
  ) where dest_rank = 1

answered Aug 16 '10 at 07:42

Thilo

257,207
101
511
656

1

Thx Thilo :-) Indeed your answer is also correct. But as I can only accept 1 answer, I picked Oliver because I tried his answer first. – Aries Aug 16 '10 at 08:50
16

@Aries - Thilo's answer is superior to Oliver's, as Thilo's answer will tend to perform less I/O. The analytic function allows the SQL to process the the table in a single pass, whereas Oliver's solution requires multiple passes. – Adam Musch Aug 16 '10 at 14:33
1

Agreed, the GROUP BY causes an unnecessary performance hit. Using this method or even a Left Join will be far more efficient, especially with larger tables. – Joe Meyer Feb 13 '13 at 01:08
1

what's the difference of above code with the one below using row_number? can any one explain to me. SELECT train, dest, time FROM ( SELECT train, dest, time, ROW_NUMBER() OVER (PARTITION BY train ORDER BY time DESC) rn FROM traintable ) where rn = 1 – Bat_Programmer May 06 '13 at 06:11
@AdamMusch, Poor MySQL devs don't have luxury of `rank()` – Pacerier Apr 30 '15 at 04:17
@Pacerier, original question was regarding Oracle. – Tagar Jul 02 '15 at 14:19
3

@Ruslan, Yea I was pitying MySQL devs. – Pacerier Jul 04 '15 at 12:01
Help with similar question https://stackoverflow.com/questions/46062964/fetch-row-which-has-the-maximum-value-for-a-column-minimum-value-of-the-same-c – user104309 Sep 05 '17 at 20:53
I tried this with example employee database in PostgresQL and the original query performed better. Don't know why – iamareebjamal Jul 12 '18 at 09:29
This answer lead me in the correct direction, found my answer here [{coding} Sight](https://codingsight.com/methods-to-rank-rows-in-sql-server-rownumber-rank-denserank-and-ntile/) – zerodoc Aug 01 '19 at 14:46
1

This worked for me, but I had to give the table an alias and use it as a prefix to "dest_rank" before I could actually use it. Select From (blah) as t where t.dest_rank = 1 for example. Other than that, great answer! – Urk May 06 '20 at 13:16
Isn't that suboptimal to sort (aka ORDER BY) to get only the max value? Sorting is done in `O(n log (n))` while finding max is only `O(n)`. – hans Apr 27 '23 at 08:00

score 201 · Accepted Answer · edited Apr 30 '15 at 06:23

201

You cannot include non-aggregated columns in your result set which are not grouped. If a train has only one destination, then just add the destination column to your group by clause, otherwise you need to rethink your query.

Try:

SELECT t.Train, t.Dest, r.MaxTime
FROM (
      SELECT Train, MAX(Time) as MaxTime
      FROM TrainTable
      GROUP BY Train
) r
INNER JOIN TrainTable t
ON t.Train = r.Train AND t.Time = r.MaxTime

edited Apr 30 '15 at 06:23

Pacerier

86,231
106
366
634

answered Aug 16 '10 at 07:44

Oliver Hanappi

12,046
7
51
68

16

Careful, this will not work if there are "ties" for max(time) because you'll be getting multiple rows. You need to add `group by train,dest` right behind the whole query above. – Pacerier Apr 30 '15 at 06:13
Ok. But how does this work if the inner query is joined to another table? Let's pretend that Destination in the TrainTable has it's own table. So the inner query would look like: select t.train, d.dest, max(time) from TrainTable t join Destination d on t.destid = d.id group by t.train, d.dest – not_ur_avg_cookie Jul 12 '18 at 20:00

score 99 · Answer 3 · edited Oct 12 '15 at 11:43

99

Here's an example that only uses a Left join and I believe is more efficient than any group by method out there: ExchangeCore Blog

SELECT t1.*
FROM TrainTable t1 LEFT JOIN TrainTable t2
ON (t1.Train = t2.Train AND t1.Time < t2.Time)
WHERE t2.Time IS NULL;

edited Oct 12 '15 at 11:43

Community

1
1

answered Feb 12 '13 at 20:16

Joe Meyer

4,315
20
28

9

I like this approach because it uses just standard SQL and works really fine and fast. – GreenTurtle Apr 17 '13 at 13:05
Wonderful solution. Much better than the complicated group by clauses had in my mind. I just love cleverly elegent things like this, thanks! – Ran Sagy Jan 13 '15 at 18:41
18

It's eye-opening how many people keep saying this solution is "wonderful" and "the best" yet none had **tried** it. Simply put, **it doesn't work.** The correct query is: `select t1.* from TrainTable t1 left join TrainTable t2 on (t1.Train= t2.Train and t1.Time < t2.Time) where t2.Time is null`. – Pacerier Apr 30 '15 at 05:29
1

@Pacerier you're right, looking at the results the op was expecting that is the correct sql (although the concept was the same). I've updated my answer. – Joe Meyer Apr 30 '15 at 12:34
As was mentioned by others, there will be multiple records in the result if there are "ties" (not the case with the timetable probably, but possible in many other cases). – mp31415 Oct 16 '15 at 22:26
7

Interesting, but I've checked on my MS SQL Server server (21000 records) and this 3 times slower than MAX + GROUP BY – CoperNick Mar 18 '16 at 15:02
5

VERY slow! The currency converter table with fields date,from,to,rate. 203161 rows total. Joe Meyer method gives 362 rows in set (31,29 sec). Oliver Hanappi method gives 362 rows in set (0,04 sec) – TheRoSS Dec 05 '17 at 13:24
The method causes large rows should be examined. – John Huang Jan 30 '18 at 07:05
I was able to use this answer in Visual Studios and with a little tweaking... adding another left join for another table, and replacing the .* with the columns I really wanted, I was able to get the results I needed. I was unable to get either of the other 2 top rated answers to work at all in visual studios – user5419026 Sep 05 '19 at 03:02
1

This only works if there are exactly 2 rows with the same Train value. It selects the top N-1, where N is the number of rows with the same Train value. For example, if there are 3 rows with same Train value, it selects the top 2, if there are 4, it selects the top 3 – kane Sep 15 '20 at 04:28

score 16 · Answer 4 · answered Jun 23 '14 at 16:07

16

Another solution:

select * from traintable
where (train, time) in (select train, max(time) from traintable group by train);

answered Jun 23 '14 at 16:07

Claudio Negri

179
1
4

3

Careful, this will not work if there are "ties" for max(time) because you'll be getting multiple rows. Use this instead: `select * from traintable where (train, time) in (select train, max(time) from traintable group by train) group by train,dest; ` – Pacerier Apr 30 '15 at 06:03

score 9 · Answer 5 · answered Aug 16 '10 at 23:24

9

As long as there are no duplicates (and trains tend to only arrive at one station at a time)...

select Train, MAX(Time),
      max(Dest) keep (DENSE_RANK LAST ORDER BY Time) max_keep
from TrainTable
GROUP BY Train;

answered Aug 16 '10 at 23:24

Gary Myers

34,963
3
49
74

3

"and trains tend to only arrive at one station at a time"... This is not stated. – Pacerier Apr 30 '15 at 06:04

score 5 · Answer 6 · edited Jan 14 '16 at 13:29

5

I know I'm late to the party, but try this...

SELECT 
    `Train`, 
    `Dest`,
    SUBSTRING_INDEX(GROUP_CONCAT(`Time` ORDER BY `Time` DESC), ",", 1) AS `Time`
FROM TrainTable
GROUP BY Train;

Src: Group Concat Documentation

Edit: fixed sql syntax

edited Jan 14 '16 at 13:29

CoKe

639
5
13

answered Jul 02 '15 at 14:09

Gravy

12,264
26
124
193

Documentation link broken. [Try this](https://dev.mysql.com/doc/refman/8.0/en/aggregate-functions.html#function_group-concat) – f.llanquipichun May 02 '22 at 22:03

GROUP BY with MAX(DATE)

6 Answers6

Linked

Related