0

I have a database and one of tables has the following structure:

recordId, vehicleId, dateOfTireChange, expectedKmBeforeNextChange, tireType

I want to make such a selection from the table that i only get thouse rows that contain the most recent date for each vehicleId.

I tried this approach

SELECT vehicleid, 
       Max(dateoftirechange) AS lastChange, 
       expectedkmbeforenextchange, 
       tiretype 
FROM   vehicle_tires 
GROUP  BY vehicleid 

but it doesn't select the kilometers associated with the most recent date so it does not work.

Any idea how to make this selection?

Kermit
  • 33,827
  • 13
  • 85
  • 121
Joro Seksa
  • 1,525
  • 3
  • 18
  • 44
  • 2
    Please stop using MySQL's `GROUP BY` extension. It's not ANSI standard and will hurt you in the long run. – Kermit Oct 17 '13 at 15:13
  • ok. i will think about it. – Joro Seksa Oct 17 '13 at 15:14
  • 3
    There's nothing wrong with using `GROUP BY` -- the problem is that MySQL [allows you to write queries](https://dev.mysql.com/doc/refman/5.6/en/group-by-extensions.html) that include columns that are not in the group-by and not part of aggregate functions. This is intended to be an optimization (and works quite well as such) but you have to understand it or it allows you to write queries that do not return deterministic results... as you are seeing here. You [can configure MySQL not to let you do this](https://dev.mysql.com/doc/refman/5.6/en/server-sql-mode.html#sqlmode_only_full_group_by). – Michael - sqlbot Oct 17 '13 at 18:08

1 Answers1

3

There are several ways to get the desired result.

Correlated scalar subquery...

SELECT vt1.*
  FROM vehicle_tire vt1
 WHERE vt1.recordId = (SELECT vt2.recordId
                         FROM vehicle_tire vt2
                        WHERE vt2.vehicleId = vt1.vehicleId
                        ORDER BY vt2.dateOfTireChange DESC limit 1);

...or derived table...

 SELECT vt2.*
   FROM vehicle_tire vt2
   JOIN (SELECT vt1.vehicleId as vehicleId,
                MAX(vt1.dateOfTireChange) as maxDateOfTireChange
           FROM vehicle_tire vt1
          GROUP BY vt1.vehicleId) dt ON vt2.vehicleId = dt.vehicleId
                                    AND vt2.dateOfTireChange = dt.dateOfTireChange;

...are two that come to mind.

The reason GROUP BY is not correct when applied to the whole table is that any columns you do not GROUP BY and that are also not the subject of aggregate functions MIN() MAX() AVG() COUNT(), etc., are assumed by the server to be columns that you know to be identical in every row of the groups established by the GROUP BY clause.

If, for example, I'm doing a query like this...

SELECT p.id,
       p.full_name,
       p.date_of_birth,
       COUNT(c.id) AS number_of_children
  FROM parent p LEFT JOIN child c ON c.parent_id = p.id
 GROUP BY p.id;

The correct way to write this query would be GROUP BY p.id, p.full_name, p.date_of_birth, because none of those columns are part of the aggregate function COUNT().

The MySQL optimization allows you to exclude those columns that you know have to, by definition, be the same on each group from the GROUP BY, and the server will fill those columns with data from any row in the group. Which row is not defined. As you can see, in the example, the parent's full_name would be the same in all rows within a group-by parent.id, and that is a case when this optimization is legitimate. The justification is that it allows the server to have to handle smaller values (fewer bytes) when executing the grouping... but in a query like yours where the ungrouped columns have different values within each group, you get an invalid result, by design.

The SQL_MODE ONLY_FULL_GROUP_BY disables this optimization.

Michael - sqlbot
  • 169,571
  • 25
  • 353
  • 427