5

I have this query in a table with about 100k records, it runs quite slow (3-4s), when I take out the group it's much faster (less than 0.5s). I'm quite at loss what to do to fix this:

SELECT msg.id,
       msg.thread_id,
       msg.senderid,
       msg.recipientid, 
       from_user.username AS from_name,
       to_user.username AS to_name
FROM msgtable AS msg
LEFT JOIN usertable AS from_user ON msg.senderid = from_user.id
LEFT JOIN usertabe AS to_user ON msg.recipientid = to_user.id
GROUP BY msg.thread_id
ORDER BY msg.id desc

msgtable has indexes on thread_id, id, senderid and recipientid.

explain returns:

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   SIMPLE  msg ALL NULL    NULL    NULL    NULL    162346  Using temporary; Using filesort
1   SIMPLE  from_user   eq_ref  PRIMARY PRIMARY 4   db.msg.senderid 1    
1   SIMPLE  to_user eq_ref  PRIMARY PRIMARY 4   db.msg.recipientid  1

Any ideas how to speed this up while returning the same result (there are multiple messages per thread, i want to return only one message per thread in this query).

thanks in advance.

Sherif Buzz
  • 1,218
  • 5
  • 21
  • 38
  • 1
    What about `usertable` indexes? Can you please run `EXPLAIN ` and post the results? – Frankie Oct 26 '10 at 16:25
  • 1
    Typically, you have to declare all columns mentioned in the SELECT that are not encapsulated by aggregate functions (COUNT, SUM, MIN, MAX, etc) in the GROUP BY. Would `DISTINCT` serve you better in this situation? – OMG Ponies Oct 26 '10 at 16:26
  • Why the left join? Wouldn't each message require a recipient and sender? – John Hartsock Oct 26 '10 at 16:29
  • Yes a message would require a recipient and sender, hence left join not required. – Sherif Buzz Oct 26 '10 at 16:41
  • 1
    Also, paste the output of `show create table msgtable`. – Glen Solsberry Oct 26 '10 at 17:26
  • @Sherif Buzz when you reply to comments, like the one where on added `EXPLAIN` you can tag the people that asked using @Frankie @OMG Ponies so that they know something new is up. Thanks! – Frankie Oct 26 '10 at 17:44

2 Answers2

1

try this:

select m.thread_id, m.id, m.senderid, m.recipientid, 
       f.username as from_name, t.username as to_name
from msgtable m
join usertable f on m.senderid = f.id
join usertable t on m.recipientid = t.id
where m.id = (select MAX(id) from msgtable where thread_id = m.thread_id)

Or this:

select m.thread_id, m.id, m.senderid, m.recipientid, 
       (select username from usertable where id = m.senderid) as from_name,
       (select username from usertable where id = m.recipientid) as to_name
from msgtable m
where m.id = (select MAX(id) from msgtable where thread_id = m.thread_id)

Why were the user tables left joined? Can a message be missing a from or to?..

Fosco
  • 38,138
  • 7
  • 87
  • 101
  • Thanks a million, I tried both options - first option about 1.5s, second option about 2s. anything else i can do to bring it down more ? – Sherif Buzz Oct 26 '10 at 16:56
  • @Sherif well, do you really need ALL of the threads at once?... Is there a datetime column that could be used to cut down on the required data? – Fosco Oct 26 '10 at 16:59
  • @Forsco, actually this query is translated into is a select count(*) of query by a paging class - yes i need all threads as this is for an admin function... – Sherif Buzz Oct 26 '10 at 17:17
  • @Sherif so do you need the real data or just the count? Or is the result cached and then paginated? – Fosco Oct 26 '10 at 17:21
  • @Sherif Ok, just checking. Glad we were able to cut the time down, not sure I could do much more for you. – Fosco Oct 26 '10 at 18:09
0

The biggest problem is that you have no usable indexes on msgtable. Create an index on at least senderid and recipientid, and it should help the speed of your query, as it will limit the number of results needing to be scanned.

Glen Solsberry
  • 11,960
  • 15
  • 69
  • 94