15

The following query is intended to receive a list of unread messages by user. It involves 3 tables: recipients contains a relation of users to message IDs, messages contains the messages themselves, and message_readers contains a list of which users have read which messages.

The query reliably takes 4.9 seconds - this is seriously hurting our performance, and is especially worrisome since we hope the database will eventually be several orders of magnitude larger. Granted, it's an inherently heavy query, but the data set is tiny, and intuitively it seems that it should be much faster. The server has enough memory (32gb) that the entire database should be loaded in RAM at all times, and there's nothing else running on the box.

The tables are all tiny:

recipients: 23581
messages: 9679
message_readers: 2685

The query itself:

SELECT 
    m.*
FROM 
    messages m
INNER JOIN recipients r ON r.message_id = m.id
LEFT JOIN message_readers mr ON mr.message_id = m.id
WHERE
    r.id = $user_id
    AND (mr.read_by_id IS NULL OR mr.read_by_id <> $user_id)

The explain plan is pretty straightforward:

+----+-------------+-------+--------+-----------------------------------+-----------------------------------+---------+--------------------------------+-------+-------------+
| id | select_type | table | type   | possible_keys                     | key                               | key_len | ref                            | rows  | Extra       |
+----+-------------+-------+--------+-----------------------------------+-----------------------------------+---------+--------------------------------+-------+-------------+
|  1 | SIMPLE      | r     | ref    | index_recipients_on_id            | index_recipients_on_id            | 768     | const                          | 11908 | Using where |
|  1 | SIMPLE      | m     | eq_ref | PRIMARY                           | PRIMARY                           | 4       | db.r.message_id                |     1 | Using index |
|  1 | SIMPLE      | mr    | ALL    | NULL                              | NULL                              | NULL    | NULL                           |  2498 | Using where |
+----+-------------+-------+--------+-----------------------------------+-----------------------------------+---------+--------------------------------+-------+-------------+

There IS an index on message_readers.read_by_id, but I guess it can't really use it because of the IS NULL condition.

I'm using all default settings except for the following:

key_buffer=4G
query_cache_limit = 256M
query_cache_size = 1G
innodb_buffer_pool_size=12G

Thanks!

levand
  • 8,440
  • 3
  • 41
  • 54
  • Good question, almost all available information given :) I could be mistaken, but how are the indexes on the mr table? Probably nothing, but still :) – Nanne Jun 27 '11 at 18:58
  • 2
    If you use WHERE with count() the query will go through all the records in the table even though you need just the count of IDs. http://www.mysqlperformanceblog.com/2006/12/01/count-for-innodb-tables/ There are a lot of similar articles out there. – AR. Jun 27 '11 at 18:59
  • How big is each table involved in this? Left Joins include all rows from a result set and can be slow no matter how you configure things. – S.Lott Jun 27 '11 at 18:59
  • Your query doesn't retrieve list of messages, but rather the number of messages. Is it definitely the right query? – Tom Chantler Jun 27 '11 at 18:59
  • @Dommer, @AR - yes, the actual query gets the data, not just the count. I pasted in this form because it's easier for me to play around with. The query time is almost identical to the SELECT * form that the app is actually using. – levand Jun 27 '11 at 19:11
  • @Nanne, message_readers has a valid index on the read_by_id column, and no other indexes (beyond it's primary key, an auto-incrementing column set up by Rails. – levand Jun 27 '11 at 19:12
  • What's the query time if you exclude message reader table entirely? Fast? Then reform your join as a subquery – Bob Probst Jun 27 '11 at 19:19
  • Have you tried each of our queries with retrieving the actual IDs rather than doing `COUNT(*)`, as per AR's advice above? – Tom Chantler Jun 27 '11 at 19:43
  • 1
    @TrickyNixon, if I exclude the message_reader table it blazes at 0.15 seconds. But using a AND NOT EXISTS with a subquery like the one in user569090's answer slows it down to 17 seconds. – levand Jun 27 '11 at 19:45
  • @Domner, for each query I've also tried `SELECT m.id` as well as `SELECT count(m.id` and the perf is approximately the same. – levand Jun 27 '11 at 19:46
  • Have you tried my final version which gets rid of the `OR` clause? – Tom Chantler Jun 27 '11 at 19:49

6 Answers6

4

Assuming that message_readers is a subset of recipients, I recommend making the following changes:

  1. Get rid of the message_readers table and replace it with a flag on the recipients table. This will eliminiate the null check and remove a join.

  2. It probably already is, but make sure your clustered index for recipients is id, message_id rather than message_id, id, since nearly all searches for messages will be based on the recipients.

Here is the SELECT that results:

SELECT
    r.whatever,
    m.whatever,
    -- ...
FROM
    recipients r
    INNER JOIN messages m ON m.id = r.message_id
WHERE
    r.id = $user_id
    AND r.read_flag = 'N'

UPDATE

Here is the correct version of your query using the existing scheme:

SELECT
    r.whatever,
    m.whatever,
    -- ...
FROM
    recipients r
    INNER JOIN messages m ON r.message_id = m.id
    LEFT JOIN message_readers mr ON mr.read_by_id = r.id 
                                 AND mr.message_id = m.id
WHERE
    r.id = $user_id
    AND mr.read_by_id IS NULL

This assumes that your clustered indexes are what would be expected:

recipients: id, message_id
messages: id
message_readers: read_by_id, message_id
Jeffrey L Whitledge
  • 58,241
  • 9
  • 71
  • 99
  • Thanks... denormalization is definitely the next step if I can't get it faster. But I'm anxious to see if I'm doing something wrong first, since my intuition tells me that NO query should be that slow with that little data. – levand Jun 27 '11 at 19:51
  • 1
    @levand - Your query is slow, because it is an inversion of the normal join. You are selecting everything in message_readers that does NOT match the read_by_id, which should be the first column in the primary index. As it is, no index is being used on mr and the search for message_id (which is not independantly indexed (as it shouldn't be)) is creating a giant table-scanning cartisian mess! – Jeffrey L Whitledge Jun 27 '11 at 20:14
  • You, sir, win the prize. Adding those indexes and using the query form you mention brings the query time down to 0.15 seconds. Well done! Thank you! – levand Jun 27 '11 at 22:03
  • So further explication of this for my own reference: you switch which table you start with and which you bring in with the INNER JOIN, but that doesn't matter since inner joins are actually commutative. The key is that instead of making negative assertions about message_readers.read_by_id in the where clause, you make positive assertions about it in the join conditions. This means that I can utilize my indexes on message_readers. – levand Jun 27 '11 at 22:11
  • @levand - That is correct. I just rearranged the table join order so I could reason about it more clearly. It's the join columns that makes the difference. – Jeffrey L Whitledge Jun 28 '11 at 02:02
1

Assuming you just want the count as shown in your query), what happens if you change the joins like so?

I use MSSQL and this has the potential to speed it up. I've never used MySQL, but it should work, shouldn't it?

SELECT     count(m.id)
FROM       messages m
INNER JOIN recipients r ON r.message_id = m.id AND r.id = $user_id
LEFT JOIN  message_readers mr ON mr.message_id = m.id AND (mr.read_by_id IS NULL OR mr.read_by_id <> $user_id)

EDIT: What about this for a mad idea? I thought you could split out the OR into two separate left joins and then take the record if either of those returns something.

SELECT     count(m.id)
FROM       messages m
LEFT JOIN  recipients r ON r.message_id = m.id AND r.id = $user_id
LEFT JOIN  message_readers mr ON mr.message_id = m.id AND mr.read_by_id IS NULL
LEFT JOIN  message_readers mr2 ON mr2.message_id = m.id AND mr2.read_by_id <> $user_id
WHERE      COALESCE(mr.message_id, mr2.message_id) IS NOT NULL
Tom Chantler
  • 14,753
  • 4
  • 48
  • 53
  • Unfortunately I was only using the count(*) form so it didn't spit back tons of data to my console when experimenting with it. I will in fact need the actual data as well, not just the count. – levand Jun 27 '11 at 19:04
  • Remove `mr.read_by_id` from the JOIN criterion, as that asks the query to return *rows which do not, by definition, exist*. You only need that to filter post-join results in the WHERE clause. You just need to incorporate `read_by_id <> user_id` in there. – Dan J Jun 27 '11 at 19:07
  • @djacobson, that's the point... I want to find messages that have not been read, i.e, are not present in the message_readers table (or present with a different reader id). Using a left join ensures I get all relevant rows in recipients, and for each one, can tell if it was read or not. – levand Jun 27 '11 at 19:09
  • I had a case in SQL recently where the `INNER JOIN` was slowing it down loads. Sounds odd, but bear with me on this one... What happens if you change the `INNER JOIN` to a `LEFT JOIN` and then add `WHERE r.message_id IS NOT NULL` ? – Tom Chantler Jun 27 '11 at 19:18
  • @Dommer. If that were the case, I'd have to assume that your statistics were out of date and the optimizer was choosing a bad plan. That could be the case here too. Check statistics. – Bob Probst Jun 27 '11 at 19:23
  • @Dommer, that performs about the same as with the INNER JOIN. – levand Jun 27 '11 at 19:29
  • Okay, I think this is my last attempt... see the edit above! :-) – Tom Chantler Jun 27 '11 at 19:41
  • That last query has been running for over a minute now so... not faster. ;) Thanks for all your suggestions... even though I didn't figure out the problem, they helped me wrap my head around the query space. I think the problem's probably somewhat deeper in the database configuration, rather than the query itself... – levand Jun 27 '11 at 19:59
1

You can get rid of the IS NULL-condition when you rewrite your query like this:

SELECT 
    count(m.id)
FROM 
    messages m
INNER JOIN recipients r ON re.message_id = m.id
WHERE r.id = $user_id
  AND NOT EXISTS
         (SELECT mr.id 
            FROM message_readers mr 
           WHERE mr.message_id = m.id
             AND mr.read_by_id = $user_id)

Basically this reads like: get all messages for recipient where not in message_readers and describes the problem simpeler.

Jeff
  • 736
  • 5
  • 14
  • Curious know if this improves things. A subquery looks like it could help somewhere. The inequalities combined with an OR can slow things down. 5 seconds is still really long for this number of records no matter how it's written though. I suspect another culprit. – Bob Probst Jun 27 '11 at 19:16
  • This version of the query reliably takes 17 seconds... so, not an improvement. Thanks, though! – levand Jun 27 '11 at 19:25
  • Just an FYI for anyone who stumbles across this: The reason this didn't work is because using a subquery in the WHERE on MySQL causes it to re-run that subquery for every row. Since the original tables have 20,000 rows that's the equivalent of running a separate query 20,000 times, rather than reusing a data set like a JOIN would. – Zimzat May 16 '13 at 18:39
1

What's the query time for

select distinct message_id
  from message_readers
 where read_by_id <> $user_id

Note: The "is null" logic should be caught by this since null isn't equal to anything

If this is fast then try this:

SELECT count(m.id)
FROM messages m
INNER JOIN recipients r ON r.message_id = m.id
where r.id = $user_id
and m.id in (
    select distinct message_id
      from message_readers
     where read_by_id <> $user_id)

Original answer didn't work: Try including message_id and id in a covering index on recipients and see what happens.

Bob Probst
  • 9,533
  • 8
  • 32
  • 41
  • Created an index using `CREATE UNIQUE INDEX recipients_test_index ON recipients (message_id, id)`. That should do it, right? Query time went _up_ to 5.8 seconds. – levand Jun 27 '11 at 19:39
  • Also, I asked above about query performance if you omit the message_readers table entirely. What's the speed for joining just Recipient and Messages – Bob Probst Jun 27 '11 at 19:45
  • Yes, it showed up in the explain plan. Also, as I mentioned above, omitting the message_readers table entirely makes it very fast at 0.15 seconds. – levand Jun 27 '11 at 19:49
  • The query time for the first query you mention is blazing fast. The second query is fast but returns zero rows because it only returns messages that were address to my user that have been read by someone OTHER than my user. The most common case for an unread messages is that there is NO corresponding row in message_readers, and this query won't catch that. – levand Jun 27 '11 at 21:52
1

Unless I am missing something, you don't appear to need the messages table at all. What you really want is the number of message ids that appear for this user in recipients, and do not appear for this user in message_readers.

If I'm right above, you can accomplish what you want with a MINUS:

SELECT count(message_id)
  FROM (
        SELECT r.message_id  
          FROM recipients r 
         WHERE r.id = $user_id
        MINUS
        SELECT mr.message_id
          FROM message_readers mr
         WHERE mr.read_by_id = $user_id
       )

This avoids joins entirely. Now if you do indeed need data from the messages table for your production query, you can join the messages table to this subquery (or stick it in an IN clause).

It's possible that I'm off base here as my experience is in Oracle-land but MySQL supports MINUS so this is probably worth a shot.

Jon Riegel
  • 21
  • 2
  • Thanks for the tip... Unfortunately I only pasted in the count() version of the query, but in practice I actually need the full message row, m.*. Query time is the same for both forms, however, with my original query. I've edited the question to reflect this. I am interested in your MINUS approach, though... going to give that a shot. – levand Jun 27 '11 at 20:07
1

an comment count(m.id) means count not null values but m.id is never null so its extra. well try with that

SELECT count(*)
FROM 
messages m
INNER JOIN recipients r ON r.message_id = m.id  
left join 
(
    select m.id
    messages m
    INNER JOIN message_readers mr 
    ON mr.message_id = m.id     
    and (mr.read_by_id <> $user_id or mr.read_by_id IS NULL)        
)as sub 
on sub.id = m.id        
WHERE r.id = $user_id

one doubt maybe is correct in you business logic why all user can read incomming messages (mr.read_by_is null ) and why an message can be read for the others or do not specific receiver (mr.read_by_id <> $user_id)

its a pool, I guess

one better approach is change the inner in subquery by an exists. see that "mr.read_by_id IS NULL" is not neccesary that is if mr_read_by_id is null "so means what " mr.read_by_id = $user_id " is false"

SELECT count(*)
FROM 
messages m
INNER JOIN recipients r ON r.message_id = m.id  
left join 
(
    select m.id
    messages m
            where not exists(select * from message_readers mr 
    where mr.message_id = m.id      
    and mr.read_by_id = $user_id)
)as sub 
on sub.id = m.id        
WHERE r.id = $user_id
Carlos Cocom
  • 937
  • 1
  • 8
  • 23