Fetch Max from a date column grouped by a particular field

Question

I have a table similar to this:

LogId  RefId   Entered
==================================
1      1       2010-12-01
2      1       2010-12-04
3      2       2010-12-01
4      2       2010-12-06
5      3       2010-12-01
6      1       2010-12-10
7      3       2010-12-05
8      4       2010-12-01

Here, LogId is unique; For each RefId, there are multiple entries with timestamp. What I want to extract is LogId for each latest RefId.

I tried solutions from this link:http://stackoverflow.com/questions/121387/sql-fetch-the-row-which-has-the-max-value-for-a-column. But, it returns multiple rows with same RefId. The LogId as well as RefId should be unique.

Can someone help me with this?

Thanks

Vamyip

If the solution given by @Dylan Beattie returns duplicate records, then you have same max dates for the same Refid, since his solution is correct — Lamak, Jan 17 '11 at 14:09
@Lamak: Exactly!!! This is what I was looking for. I was getting multiple records no matter what way I tried. You have pointed out the correct reason for duplicate entries. You are equally eligible as answerer. Thanks — vamyip, Jan 17 '11 at 14:41
Don't worry, glad my comment helped you identify your problem. — Lamak, Jan 17 '11 at 14:51
@Lamak: Just out of curiosity, why does the result contain duplicate RefIds even if we use distinct keyword against RefId column? — vamyip, Jan 18 '11 at 06:17
Well, the DISTINCT clause returns distinct values for the **complete row**, that means that no complete record is duplicated. In your case, you are not actually getting duplicate results, you get duplicate values for RefId and MaxDate, but since you are also selecting the LogId (wich is the primary key) its impossible that the whole row can have duplicate results. — Lamak, Jan 18 '11 at 12:17

Dylan Beattie · Accepted Answer · 2011-01-17T18:44:08.903

3

You need to use a subquery that extracts the latest Entered value for each RefId, and then join your source table to this on RefId, Entered:

SELECT DISTINCT MyTable.LogId, MyTable.Entered FROM MyTable
INNER JOIN (SELECT RefId, MAX(Entered) as Entered FROM MyTable GROUP BY RefId) Latest
ON MyTable.RefId = Latest.RefId AND MyTable.Entered = Latest.Entered

edited Jan 17 '11 at 18:44

answered Jan 17 '11 at 13:54

Dylan Beattie

53,688
35
128
197

It still returns duplicate records. Also you forgot the from clause in the right hand side of Join. Thanks for your prompt reply though. – vamyip Jan 17 '11 at 14:05
@Lamak is correct. The results had same max dates for same refid. Thanks a lot for the reply. – vamyip Jan 17 '11 at 14:39

score 0 · Answer 2 · answered Jan 17 '11 at 14:39

Since it appears auto-increment log ID, they would be date/time stamped in sequential order. So, by grabbing the last LogID per Reference ID, you'll have the "most recent" one in the "PreQuery" below, then join based on that single ID to the original table to get the actual date stamp info (or other details) you need from the actual log.

select PreQuery.RefID,
       PreQuery.LastLogEntry,
       L.Entered
   from 
      ( select RefID,
               Max( LogID ) LastLogEntry
           from 
               YourLog
           group by
               RefID ) PreQuery,
      YourLog L
   where
      PreQuery.LastLogEntry = L.LogID

score 0 · Answer 3 · answered Jan 17 '11 at 18:49

To handle the duplicates correctly:

SELECT  m.*
FROM    (
        SELECT  DISTINCT refid
        FROM    mytable
        ) md
JOIN    mytable m
ON      m.LogID = 
        (
        SELECT  LogID
        FROM    mytable mi
        WHERE   mi.refid = md.refid
        ORDER BY
                mi.refid DESC, mi.entered DESC, mi.logid DESC
        LIMIT 1
        )

Create an index on mytable (refid, entered, logid) for this to work fast.

Fetch Max from a date column grouped by a particular field

3 Answers3

Linked