There are several ways to get the desired result.
Correlated scalar subquery...
SELECT vt1.*
FROM vehicle_tire vt1
WHERE vt1.recordId = (SELECT vt2.recordId
FROM vehicle_tire vt2
WHERE vt2.vehicleId = vt1.vehicleId
ORDER BY vt2.dateOfTireChange DESC limit 1);
...or derived table...
SELECT vt2.*
FROM vehicle_tire vt2
JOIN (SELECT vt1.vehicleId as vehicleId,
MAX(vt1.dateOfTireChange) as maxDateOfTireChange
FROM vehicle_tire vt1
GROUP BY vt1.vehicleId) dt ON vt2.vehicleId = dt.vehicleId
AND vt2.dateOfTireChange = dt.dateOfTireChange;
...are two that come to mind.
The reason GROUP BY
is not correct when applied to the whole table is that any columns you do not GROUP BY
and that are also not the subject of aggregate functions MIN()
MAX()
AVG()
COUNT()
, etc., are assumed by the server to be columns that you know to be identical in every row of the groups established by the GROUP BY
clause.
If, for example, I'm doing a query like this...
SELECT p.id,
p.full_name,
p.date_of_birth,
COUNT(c.id) AS number_of_children
FROM parent p LEFT JOIN child c ON c.parent_id = p.id
GROUP BY p.id;
The correct way to write this query would be GROUP BY p.id, p.full_name, p.date_of_birth, because none of those columns are part of the aggregate function COUNT()
.
The MySQL optimization allows you to exclude those columns that you know have to, by definition, be the same on each group from the GROUP BY
, and the server will fill those columns with data from any row in the group. Which row is not defined. As you can see, in the example, the parent's full_name would be the same in all rows within a group-by parent.id, and that is a case when this optimization is legitimate. The justification is that it allows the server to have to handle smaller values (fewer bytes) when executing the grouping... but in a query like yours where the ungrouped columns have different values within each group, you get an invalid result, by design.
The SQL_MODE
ONLY_FULL_GROUP_BY
disables this optimization.