0

I have a table with multiple rows per "website_id"

CREATE TABLE `MyTable` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `tagCheckResult` int(11) DEFAULT NULL,
  `website_id` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `IX_website_id` (`website_id`),
) ENGINE=InnoDB;

I am trying to select the latest entry per website_id

-- This creates a temporary table with the last entry per website_id, and joins it
-- to get the entire row
SELECT *
FROM `WebsiteStatus` ws1 
JOIN (
    SELECT MAX(id) max_id, website_id FROM `WebsiteStatus` 
    GROUP BY website_id) ws2
ON ws1.id = ws2.max_id

Now, I know the correct way to get the last row per website_id is as above. My qusetion is - I also tried the following simpler query, at it seemed to return the exact same results as above:

SELECT * FROM `WebsiteStatus` 
GROUP BY website_id
ORDER BY website_id DESC

I know that in principle GROUP BY without operators (e.g. MAX), like I do in my 2nd query can return any of the relevant rows ... but in practice it returns the last one. Is there an implementation detail in mysql that guarantees this is always the case?

(Just asking for academic curiosity, I know the 1st query is "more correct").

ripper234
  • 222,824
  • 274
  • 634
  • 905
  • Related: http://stackoverflow.com/questions/537223/mysql-control-which-row-is-returned-by-a-group-by – ripper234 Dec 04 '12 at 10:11
  • There is no guarantee. It may appear to be consistent in this case, maybe because the clustered index is `id`, you want the maximum of it, and you order by `DESC`. Try a different combination, say you want the `MAX(tagCheckResult)`, how would you get it? – ypercubeᵀᴹ Dec 04 '12 at 10:23
  • I believe the specific returned row is not defined. Might be the right one but not always (and most flavours of SQL will just error if you try to return a non aggregate row not specified in the GROUP BY clause) – Kickstart Dec 04 '12 at 10:25
  • 2
    See also the answer by Quassnoi in this question: **[Why does MySQL add a feature that conflicts with SQL standards?](http://stackoverflow.com/questions/7594865/why-does-mysql-add-a-feature-that-conflicts-with-sql-standards/)**, where he mentions *"The only thing guaranteed is that the column values belong to some of the records sharing the grouping expression (**not even to one record!**)."* – ypercubeᵀᴹ Dec 04 '12 at 10:26
  • 1
    What values are chosen depend on the execution plan, that's why you get different results with different orderings. – ypercubeᵀᴹ Dec 04 '12 at 10:28

0 Answers0