1

I have a basic table:

+-----+--------+------+------+
| id, | name,  | cat, | time |
+-----+--------+------+------+
|   1 | jamie  |    1 |  100 |
|   2 | jamie  |    2 |  100 |
|   3 | jamie  |    1 |   50 |
|   4 | jamie  |    2 |  150 |
|   5 | bob    |    1 |  100 |
|   6 | tim    |    1 |  300 |
|   7 | alice  |    4 |  100 |
+-----+--------+------+------+

I tried using the "Left Joining with self, tweaking join conditions and filters" part of this answer: SQL Select only rows with Max Value on a Column but some reason when there are records with a value of 0 it breaks, and it also doesn't return every unique answer for some reason.

When doing the query on this table I'd like to receive the following values:

+-----+--------+------+------+
| id, | name,  | cat, | time |
+-----+--------+------+------+
|   1 | jamie  |    1 |  100 |
|   4 | jamie  |    2 |  150 |
|   5 | bob    |    1 |  100 |
|   6 | tim    |    1 |  300 |
|   7 | alice  |    4 |  100 |
+-----+--------+------+------+

Because they are unique on name and cat and have the highest time value.

The query I adapted from the answer above is:

SELECT a.name, a.cat, a.id, a.time
FROM data A
INNER JOIN (
    SELECT name, cat, id, MAX(time) as time
    FROM data
    WHERE extra_column = 1
    GROUP BY name, cat
) b ON a.id = b.id AND a.time = b.time
Community
  • 1
  • 1
Jamie
  • 674
  • 1
  • 10
  • 30

3 Answers3

3

The issue here is that ID is unique per row you can't get the unique value when getting the max; you have to join on the grouped values instead.

SELECT a.name, a.cat, a.id, a.time
FROM data A
INNER JOIN (
    SELECT name, cat, MAX(time) as time
    FROM data
    WHERE extra_column = 1
    GROUP BY name, cat
) b ON A.Cat = B.cat and A.Name = B.Name AND a.time = b.time

Think about it... So what ID is mySQL returning form the Inline view? It could be 1 or 3 and 2 or 4 for jamie. Hows does the engine know to pick the one with the max ID? it is "free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. " it could pick the wrong one resulting in incorrect results. So you can't use it to join on.

https://dev.mysql.com/doc/refman/5.0/en/group-by-handling.html

xQbert
  • 34,733
  • 2
  • 41
  • 62
  • This looks correct to me. For some understanding of where you went wrong, read the link below. Basically, MySQL should never have let your inner query run, but it did, and its by design: https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html --Edit: I see xQbert thought to add the same note :) – Cargo23 Nov 25 '15 at 15:34
  • Resolve incident first, then improve upon the answer :P as to Why it's correct :P – xQbert Nov 25 '15 at 15:35
  • Thank you very much - that's brilliant. All working now :) (gotta wait 3 mins before I can mark it correct) – Jamie Nov 25 '15 at 15:37
  • @JamieH be sure to check out fthiella's as well. It's a different approach but yields the same results and may be better performance depending on data volume and indexes! – xQbert Nov 25 '15 at 15:40
  • Cheers - If I hit any performance problems with this solution then I'll give that one ago :) – Jamie Nov 25 '15 at 15:44
3

If you want to use a self join, you could use this query:

SELECT
  d1.*
FROM
  date d1 LEFT JOIN date d2
  ON d1.name=d2.name
     AND d1.cat=d2.cat
     AND d1.time<d2.time
WHERE
  d2.time IS NULL
fthiella
  • 48,073
  • 15
  • 90
  • 106
0

It is very simple

SELECT MAX(TIME),name,cat FROM table name group by cat

hoda
  • 11
  • 1
  • 3