1

I'm still having problems understanding how to read, understand and optimize MySQL explain. I know to create indices on orderby columns but that's about it. Therefore I am hoping you can help me tune this query:

EXPLAIN
SELECT specie.id, specie.commonname, specie.block_description, maximage.title,
       maximage.karma, imagefile.file_name, imagefile.width, imagefile.height,
       imagefile.transferred
FROM specie
INNER JOIN specie_map ON specie_map.specie_id = specie.id
INNER JOIN (
    SELECT *
    FROM image
    ORDER BY karma DESC
) AS maximage ON specie_map.image_id = maximage.id
INNER JOIN imagefile ON     imagefile.image_id = maximage.id
                        AND imagefile.type = 'small'
GROUP BY specie.commonname
ORDER BY commonname ASC
LIMIT 0 , 24 

What this query does is to find the photo with the most karma for a specie. You can see the result of this live:

http://www.jungledragon.com/species

I have a table of species, a table of images, a mapping table in between and an imagefile table, since there are multiple image files (formats) per image.

Explain output:

enter image description here

For the specie table, I have indices on its primary id and the field commonname. For the image table, I have indices on its id and karma field, and a few others not relevant to this question.

This query currently takes 0.8 to 1.1s which is too slow in my opinion. I have a suspicion that the right index will speed this up many times, but I don't know which one.

Peter Lang
  • 54,264
  • 27
  • 148
  • 161
Fer
  • 4,116
  • 16
  • 59
  • 102
  • Are you looking for only 1 entry per species? and that would be the one with the max karma value? I would start with an index on species, karma but don't see a reference to the species to the image file. – DRapp Mar 25 '12 at 12:10
  • Try putting an index on every column that's involved in a `JOIN`. From what I see there, putting an `INDEX` on `specie_map.specie_id` and `imagefile.image_id` should speed it up – Will Warren Mar 25 '12 at 13:40
  • @Drapp: this concerns a LIST of species, where for each specie the best image is selected. – Fer Mar 25 '12 at 14:01
  • @Ferdy, per my comment, I didn't see an association to species to karma. If you could list the basic structure of each tale, and show some sample data would help get this resolved for you. – DRapp Mar 25 '12 at 14:10
  • Also, you have a group by on common name of species. But if you have many "IDs" per a single common name species, do you ONLY want ONE for that overall species? You'll need to clarify as they are two different queries. – DRapp Mar 25 '12 at 14:57

3 Answers3

1

It would be better if you could provide the table structures and indexes. I came up with this alternative, it would be nice if you could try this and tell me what happens (I am curious!):

SELECT t.*, imf.* FROM (
  SELECT s.*, (SELECT id FROM image WHERE karma = MAX(i.karma) LIMIT 1) AS max_image_id 
  FROM image i 
  INNER JOIN specie_map smap ON smap.image_id = i.id
  INNER JOIN specie s ON s.id = smap.specie_id
  GROUP BY s.commonname 
  ORDER BY s.commonname ASC
  LIMIT 24
) t INNER JOIN imagefile imf
ON t.max_image_id = imf.image_id AND imf.type = 'small' 
georgepsarakis
  • 1,927
  • 3
  • 20
  • 24
  • Thanks for thinking along. This is the result of the above query: #1242 - Subquery returns more than 1 row – Fer Mar 26 '12 at 07:18
  • Now it does produce results, thanks. I think though that the above answer that is without a subquery is a lot faster. – Fer Mar 26 '12 at 15:33
1

The real problem is that there is no need to optimize MySQL explain. There is usually a query (or several queries) that you want to be efficient and EXPLAIN is a way to see if the execution of the query is going to happen as you expect it to happen.

That is you need to understand how the execution plan should look like and why and compare it with results of the EXPLAIN command. To understand how the plan is going to look like you should understand how indexes in MySQL work.

In the meantime, your query is a tricky one, since for efficient index using it has some limitations: a) simultaneous ordering and by a field from one table, and b) finding the last element in each group from another (the latter is a tricky task as itself). Since your database is rather small, you are lucky that you current query is rather fast (though you consider it slow).

I would rewrite the query in a bit hacky manner (I assume that there is at least one foto for each specie):

SELECT
   specie.id, specie.commonname, specie.block_description,
   maximage.title, maximage.karma,
   imagefile.file_name, imagefile.width, imagefile.height, imagefile.transferred
FROM (
    SELECT s.id,
           (SELECT i.id
            FROM specie_map sm
            JOIN image i ON sm.image_id = i.id
            WHERE sm.specie_id = s.id
            ORDER BY i.karma DESC
            LIMIT 1) as image_id
    FROM specie s
    ORDER BY s.commonname
    LIMIT 0, 24
) as ids
JOIN specie
  ON ids.id = specie.id
JOIN image as maximage
  ON maximage.id = ids.image_id
JOIN imagefile
  ON imagefile.image_id = ids.image_id AND imagefile.type = 'small';

You will need the following indexes:

  • (commonname) on specie
  • a composite (specie_id, image_id) on specie_map
  • a composite (id, karma) on image
  • a composite (image_id, type) on imagefile

Paging now should happen within the subquery.

The idea is to make complex computations within a subquery that operates with ids only and join for the rest of the data at the top. The data would be ordered in the order of the results of the subquery.

Community
  • 1
  • 1
newtover
  • 31,286
  • 11
  • 84
  • 89
1

I think you'd go a great way by getting rid of the subquery. Look at the first and last rows of the "explain" result - it's copying the entire "image" table to a temporary table. You could obtain the same result by replacing the subquery with INNER JOIN image and moving ORDER BY karma DESC to the final ORDER BY clause:

SELECT specie.id, specie.commonname, specie.block_description, maximage.title,
       maximage.karma, imagefile.file_name, imagefile.width, imagefile.height,
       imagefile.transferred
FROM specie
INNER JOIN specie_map ON specie_map.specie_id = specie.id
INNER JOIN image AS maximage ON specie_map.image_id = maximage.id
INNER JOIN imagefile ON     imagefile.image_id = maximage.id
                        AND imagefile.type = 'small'
GROUP BY specie.commonname
ORDER BY commonname ASC, karma DESC
LIMIT 0 , 24 
gcbenison
  • 11,723
  • 4
  • 44
  • 82
  • Perfect. This speeds up the query by a factor 4! I still struggly to understand why though, as in writing similar questions I often had problems grouping by a maximum result, somehow orderby did not work for me then, which is why I went for the subquery. – Fer Mar 26 '12 at 07:17