MySQL - WHERE IN causes 100% cpu and no results

Question

This query takes a split second

select * from tbl1 limit 0,1

This query takes a second

SELECT Distinct(col2) FROM tbl2 WHERE col3 = 2

This query eats 100% cpu and takes 190 seconds duration (0s fetch) to return the result

select * from tbl1 WHERE ID IN (SELECT Distinct(col2) FROM tbl2 WHERE col3 = 2) limit 0,1

I am trying to run this query on the full data set (not just limiting it to one record)

What could be causing the performance problem, my table structures?

[The `IN` gets repeatedly re-evaluated for each row in the outer query.](http://stackoverflow.com/questions/3417074/why-would-an-in-condition-be-slower-than-in-sql/3417190#3417190) — Martin Smith, Oct 24 '11 at 14:39
I thought it might get repeatedly get re-evaluated, is there a quicker way to achieve this without 'where in'? - I need to change the select to an update when I correct the performance issue — Chris, Oct 24 '11 at 14:41

Sean Bright · Accepted Answer · 2011-10-24T14:50:31.350

3

Subqueries in MySQL are notoriously slow. You can speed this up with a join:

SELECT A.*
FROM
    tbl1 A
    INNER JOIN (
        SELECT DISTINCT col2
        FROM
            tbl2
        WHERE
            col3 = 2
    ) X ON X.col2 = A.ID
LIMIT 0, 1

To update tbl1:

UPDATE
    tbl1 A
    INNER JOIN (
        SELECT DISTINCT col2
        FROM
            tbl2
        WHERE
            col3 = 2
    ) X ON X.col2 = A.ID
SET
    A.SomeCol = 'value'

edited Oct 24 '11 at 14:50

answered Oct 24 '11 at 14:39

Sean Bright

118,630
17
138
146

Thanks Sean, I have been using the select to try and figure out how long it takes just to select one row using my method, but really i need to do a mass update, is it possible to use the join method within an update? – Chris Oct 24 '11 at 14:44
With the assumption that `tbl1.ID` is unique or PK, then this is correct. – ypercubeᵀᴹ Oct 24 '11 at 14:46
Chris, assuming you want to update a column on tbl1, sure. I'll update my example. – Sean Bright Oct 24 '11 at 14:49
I will try this and see if it works; UPDATE FROM tbl1 AS a INNER JOIN ( select distinct(col2) from tbl2 where col3 = 2 ) x on x.col2 = a.id SET a.processed = 0 – Chris Oct 24 '11 at 14:50
Thanks Sean, youve been a great help, I am ever so glad that there is a better way of doing this :) – Chris Oct 24 '11 at 14:52
The join method takes a split second and appears to have worked perfectly :D – Chris Oct 24 '11 at 14:55

score 0 · Answer 2 · answered Oct 24 '11 at 14:44

Add an index on tbl2, on col3 or (even better) on (col3,col2).

Then use this (no need for the DISTINCThere):

SELECT * 
FROM tbl1 
WHERE ID IN (SELECT col2 FROM tbl2 WHERE col3 = 2) 
LIMIT  0,1

or even better, to avoid the IN (SELECT ...), this version:

SELECT * 
FROM tbl1 t
WHERE EXISTS 
      ( SELECT * 
        FROM tbl2 t2
        WHERE t2.col3 = 2 
          AND t2.col2 = t.ID
      ) 
LIMIT  0,1

MySQL - WHERE IN causes 100% cpu and no results

2 Answers2