18

The Table:

(`post_id`, `forum_id`, `topic_id`, `post_time`) 
(79, 8, 4, '2012-11-19 06:58:08');
(80, 3, 3, '2012-11-19 06:58:42'),
(81, 9, 9, '2012-11-19 06:59:04'),
(82, 11, 6, '2012-11-19 16:05:39'),
(83, 9, 9, '2012-11-19 16:07:46'),
(84, 9, 11, '2012-11-19 16:09:33'),

The Query:

SELECT  post_id, forum_id, topic_id FROM posts 
GROUP BY topic_id 
ORDER BY post_time DESC
LIMIT 5

The Results:

[0] => [post_id] => 84 [forum_id] => 9 [topic_id] => 11  
[1] => [post_id] => 82 [forum_id] => 11 [topic_id] => 6  
[2] => [post_id] => 81 [forum_id] => 9 [topic_id] => 9  
[3] => [post_id] => 80 [forum_id] => 3 [topic_id] => 3  
[4] => [post_id] => 79 [forum_id] => 8 [topic_id] => 4

The Problem:

How to rewrite the query so that it returns post_id -> 83 instead of post_id -> 81 ?

They both have the same forum and topic ids, but post_id -> 81 has an older date than post_id -> 83.

But it seems that Group By gets the 'first' record and not the 'newest' one.

I tried changing the query to

SELECT  post_id, forum_id, topic_id, MAX(post_time)

but that returns both post_id 81 and 83

shanebp
  • 1,904
  • 3
  • 17
  • 28

5 Answers5

27

If you select attributes that are not used in the group clause, and are not aggregates, the result is unspecified. I.e you don't know which rows the other attributes are selected from. (The sql standard does not allow such queries, but MySQL is more relaxed).

The query should then be written e.g. as

SELECT post_id, forum_id, topic_id
FROM posts p
WHERE post_time =
  (SELECT max(post_time) FROM posts p2
   WHERE p2.topic_id = p.topic_id
   AND p2.forum_id = p.forum_id)
GROUP BY forum_id, topic_id, post_id
ORDER BY post_time DESC
LIMIT 5;

or

SELECT post_id, forum_id, topic_id FROM posts
NATURAL JOIN
(SELECT forum_id, topic_id, max(post_time) AS post_time
 FROM posts
 GROUP BY forum_id, topic_id) p
ORDER BY post_time
LIMIT 5;
Chnossos
  • 9,971
  • 4
  • 28
  • 40
Terje D.
  • 6,250
  • 1
  • 22
  • 30
  • I used you first query, but it wond not work before i changed ... WHERE post_time = ... TO ... WHERE post_time IN ... as i got more that one result back. – Benjamin Karlog Sep 02 '15 at 08:51
  • Why is this such a hassle? using SQL it would be just `SELECT * FROM [table] GROUP BY [column]` which select the newest rather than the oldest, seems logical right? – Jean-Paul May 22 '17 at 10:46
  • Also add how to add count. – shreyas d Nov 15 '18 at 12:21
12

It's not very beautiful , but it works:

SELECT * FROM (SELECT  post_id, forum_id, topic_id FROM posts
ORDER BY post_time DESC) as temp
GROUP BY topic_id
4

try something like

SELECT post_id, forum_id, topic_id 
FROM   (SELECT post_id, forum_id, topic_id
        FROM posts
        ORDER BY post_time DESC) 
GROUP BY topic_id 
ORDER BY topic_id desc
LIMIT 0,5

change the order by and limit as needed.

NappingRabbit
  • 1,888
  • 1
  • 13
  • 18
  • The result is an empty array. – shanebp Nov 19 '12 at 18:40
  • I edited the query slightly. I do not understand why it would not work. I did not try it, but it is a common technique. nesting queries that is. assuming that the inner query returns the right result, the outer query should give a result from that. – NappingRabbit Nov 19 '12 at 18:44
  • if you wish to debug, first run the inner query. – NappingRabbit Nov 19 '12 at 18:45
  • Tried your edit, still returns an empty array. The inner query returns a single result, the most recent post. – shanebp Nov 19 '12 at 18:51
  • well there is your problem then. I just took the query from your question. one second I will rewrite it – NappingRabbit Nov 19 '12 at 19:04
  • 1
    This seems like a better option than the accepted option in terms of performance. We had a big table, using subquery in condition took around 45 secs while this method ran in less one second. – Selvam Oct 31 '14 at 09:29
  • Same queries have to be run, I thought it was best practice to put subqueries as 'far up as possible' and save the conditional filtering for programatic solutions... it is almost always much better performance. Everyone likes to do the fancy joins, but the elegant solution must give way to the practical. Disclaimer: This comment is not helpful in any way, but I do not believe it detracts from the original thread. – NappingRabbit Nov 19 '14 at 00:16
0

Maybe not the best way of doing it but sometimes the function group_concat() can be userfull, it will return a string of all aggregated values sorted like you want and seperated by comma (coupled value are separated by space). I then use the function SPLIT_STRING() to cut the first id in the string.

SELECT  
post_id, 
SPLIT_STRING( group_concat( forum_id, post_time ORDER BY post_time DESC ) ,' ',1 )as forum_id, 
SPLIT_STRING( group_concat( topic_id, post_time ORDER BY post_time DESC ) ,' ',1 )as topic_id ,
FROM posts 
GROUP BY topic_id 
ORDER BY post_time DESC
LIMIT 5

So the aggregated forum_id, post_time will be like this :

81 2012-11-19 06:59:04,83 2012-11-19 16:07:46

So you need to work with a string representation of integers and datetime couples, each couples separated by comma so I used this function to get the first INT :

CREATE FUNCTION SPLIT_STRING(str VARCHAR(255), delim VARCHAR(12), pos INT)
RETURNS VARCHAR(255)
RETURN REPLACE(SUBSTRING(SUBSTRING_INDEX(str, delim, pos),
       LENGTH(SUBSTRING_INDEX(str, delim, pos-1)) + 1),
       delim, '');

Note : function SPLIT_STRING(str, delim, pos) was found here : Equivalent of explode() to work with strings in MySQL

Community
  • 1
  • 1
0

This will also work fine for you.

SELECT *
FROM (
  SELECT post_id, forum_id, topic_id FROM posts
  ORDER BY post_time DESC
  LIMIT 5
) customeTable
GROUP BY topic_id
josemmo
  • 6,523
  • 3
  • 35
  • 49
Vikash
  • 643
  • 7
  • 13