I have a table that looks like this
ID | identifier | data | created_at
------------------------------------
1 | 500 | test1 | 2011-08-30 15:27:29
2 | 501 | test1 | 2011-08-30 15:27:29
3 | 500 | test2 | 2011-08-30 15:28:29
4 | 865 | test3 | 2011-08-30 15:29:29
5 | 501 | test2 | 2011-08-30 15:31:29
6 | 500 | test3 | 2011-08-30 15:31:29
What I need is the most up to date entry for each identifier, that could be decided by either the ID or the date in created_at. I assumed ID is the better choice due to the indexing.
I would expect this result set:
4 | 865 | test3 | 2011-08-30 15:29:29
5 | 501 | test2 | 2011-08-30 15:31:29
6 | 500 | test3 | 2011-08-30 15:31:29
The result should be ordered by either date or ID in ascending order.
It's important that this is a table that contains ~ 8 millions of rows.
I tried quite some approaches now with self joining and sub queries. Unfortunately all of those came out with either wrong results or half a decade of run time.
To provide an example:
SELECT lo1.*
FROM table lo1
INNER JOIN
(
SELECT MAX(id) MaxID, identifier, id
FROM table
GROUP BY identifier
) lo2
ON lo1.identifier= lo2.identifier
AND lo1.id = lo2.MaxID
ORDER BY lo1.id DESC
LIMIT 10
The above query takes very long and does sometimes not return the latest result for an identifier, not quite sure why though.
Does anyone have an approach that is able to fetch the required result sets and preferably does not take a decade?
As asked, here is the create code:
CREATE TABLE `table` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`identifier` int(11) NOT NULL,
`data` varchar(200) COLLATE latin1_bin NOT NULL,
`created_at` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `identifier` (`identifier`),
KEY `created_at` (`created_at`),
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_bin