3

I'm looking for a way to optimize one SQL query that I have. I'm trying to get how many poems with a certain genre.

Query looks like this:

SELECT
    COUNT(*)  
FROM
    `poems`
WHERE `id` IN (    
                  SELECT `poem_id`
                  FROM `poems_genres`  
                  WHERE `genre_title` = 'derision'
              )
       AND `status` = 'finished';

It takes too long (about 6-10 seconds), because it can't use indexes (because of IN() I think?). Is there a way to rewrite this query in different way to get the same result faster?

Silver Light
  • 44,202
  • 36
  • 123
  • 164

2 Answers2

11

MySQL has a problem with in where it repeatedly re-evaluates uncorrelated sub queries as though they were correlated. Does rewriting as a join improve things?

SELECT
    COUNT(distinct p.`id`)  
FROM `poems` p
JOIN `poems_genres` pg
ON  p.`id` = pg.`poem_id`  
WHERE pg.`genre_title` = 'derision' AND p.`status` = 'finished';

If not then according to this article (see the section "How to force the inner query to execute first") wrapping it up in a derived table might help.

SELECT
    COUNT(*)  
FROM
    `poems`
WHERE `id` IN
(
 select  `poem_id` from ( SELECT `poem_id`
                  FROM `poems_genres`  
                  WHERE `genre_title` = 'derision') x

) AND `status` = 'finished';
Community
  • 1
  • 1
Martin Smith
  • 438,706
  • 87
  • 741
  • 845
1

You could also use a EXISTS clause and correlate on the id and poem_id fields

SELECT
    COUNT(*)  
FROM
    `poems` p1
WHERE EXISTS
(
 SELECT `poem_id`
                  FROM `poems_genres`  pg
                  WHERE `genre_title` = 'derision'  and p1.id = pg.poem_id

) AND `status` = 'finished';

Difference between this and IN is it will try to use Indexes instead doing full table scan.

Greg
  • 1,671
  • 2
  • 15
  • 30