1

I have the following two (simplified for the sake of example) tables in my MySQL db:

DESCRIBE appname_item;

-----------------+---------------+------+-----+---------+----------------+
| Field           | Type          | Null | Key | Default | Extra          |
+-----------------+---------------+------+-----+---------+----------------+
| id              | int(11)       | NO   | PRI | NULL    | auto_increment |
| name            | varchar(200)  | NO   |     | NULL    |                |
+-----------------+---------------+------+-----+---------+----------------+

DESCRIBE appname_favorite;

+---------------+----------+------+-----+---------+----------------+
| Field         | Type     | Null | Key | Default | Extra          |
+---------------+----------+------+-----+---------+----------------+
| id            | int(11)  | NO   | PRI | NULL    | auto_increment |
| user_id       | int(11)  | NO   | MUL | NULL    |                |
| item_id       | int(11)  | NO   | MUL | NULL    |                |
+---------------+----------+------+-----+---------+----------------+

I'm trying to get a list of items ordered by the number of favorites. The query below works, however there are thousands of records in the Item table, and the query is taking up to a couple of minutes to complete.

SELECT `appname_item`.`id`, `appname_item`.`name`, COUNT(`appname_favorite`.`id`) AS `num_favorites` 
FROM `appname_item` 
LEFT OUTER JOIN `appname_favorite` ON (`appname_item`.`id` = `appname_favorite`.`item_id`) 
GROUP BY `appname_item`.`id`, `appname_item`.`name` 
ORDER BY `num_favorites` DESC;

Here are the results of EXPLAIN, which provides some insight as to why the query is so slow (type "ALL", "using temporary", and "using filesort" should all be avoided if possible.)

+----+-------------+--------------------+------+-----------------------------+-----------------------------+---------+-------------------------------+------+---------------------------------+
| id | select_type | table              | type | possible_keys               | key                         | key_len | ref                           | rows | Extra                           |
+----+-------------+--------------------+------+-----------------------------+-----------------------------+---------+-------------------------------+------+---------------------------------+
|  1 | SIMPLE      | appname_item       | ALL  | NULL                        | NULL                        | NULL    | NULL                          |  574 | Using temporary; Using filesort |
|  1 | SIMPLE      | appname_favorite   | ref  | appname_favorite_67b70d25   | appname_favorite_67b70d25   | 4       | appname.appname_item.id       |    1 |                                 |
+----+-------------+--------------------+------+-----------------------------+-----------------------------+---------+-------------------------------+------+---------------------------------+

I know that the easiest way to optimize the query is to add an Index, but I can't seem to figure out how to add an Index for a Count() query that involves a JOIN and an order_by. I should also mention that I am running this through the Django ORM, so would prefer to not change the sql query and just work on fixing and fine tuning the db to run the query in the most effective way.

I've been trying to figure this out for a while, so any help would be much appreciated!

UPDATE

Here are the indexes that are already in the db:

+--------------------+------------+-----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table              | Non_unique | Key_name                    | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------------------+------------+-----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| appname_favorite   |          0 | PRIMARY                     |            1 | id          | A         |         594 |     NULL | NULL   |      | BTREE      |         |
| appname_favorite   |          1 | appname_favorite_fbfc09f1   |            1 | user_id     | A         |          12 |     NULL | NULL   |      | BTREE      |         |
| appname_favorite   |          1 | appname_favorite_67b70d25   |            1 | item_id     | A         |         594 |     NULL | NULL   |      | BTREE      |         |
+--------------------+------------+-----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
rolling stone
  • 12,668
  • 9
  • 45
  • 63

3 Answers3

2

Actually you can't avoid filesort because the count is determined at the calculation time and is unknown in the index. The only solution I can imagine is to create a composite index for table appname_item, which may help a little or not, depending on your particular data:

ALTER TABLE appname_item ADD UNIQUE INDEX `item_id_name` (`id` ASC, `name` ASC);
Karolis
  • 9,396
  • 29
  • 38
  • thanks @Karolis. i tried adding the index above, but it didn't seem to have any effect. to answer your question, the id's in the `appname_favorite` table do have gaps (users can unfavorite items they had previously favorited). – rolling stone Aug 23 '11 at 05:34
1

There is nothing wrong with your query - it looks good.

It could be the the optimizer has out-of-date info about the table. Try running this:

ANALYZE TABLE <tableaname>;

for all tables involved.

Bohemian
  • 412,405
  • 93
  • 575
  • 722
0

Firstly, for the count() function, you can check this answer to know more detail: https://stackoverflow.com/a/2710630/1020600

For example, using MySQL, count(*) will be fast under a MyISAM table but slow under an InnoDB. Under InnoDB you should use count(1) or count(pk)

If your storage engines is MYISAM and if you want to count on row (i guess so), use count(*) is enough.

From your EXPLAIN, I found there's no Key for appname_item, if i try to add a condition

where `appname_item`.`id` = `appname_favorite`.`item_id`

then the "key" appears. so funny but it's work.

The final sql like this

explain SELECT `appname_item`.`id`, `appname_item`.`name`, COUNT(*) AS `num_favorites` 
FROM `appname_item` 
LEFT OUTER JOIN `appname_favorite` ON (`appname_item`.`id` = `appname_favorite`.`item_id`) 
where `appname_item`.`id` = `appname_favorite`.`item_id`
GROUP BY `appname_item`.`id`, `appname_item`.`name` 
ORDER BY `num_favorites` DESC;

+----+-------------+------------------+--------+---------------+---------+---------+-------------------------------+------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key
| key_len | ref | rows | Extra
| +----+-------------+------------------+--------+---------------+---------+---------+-------------------------------+------+----------------------------------------------+ | 1 | SIMPLE | appname_favorite | index | item_id | item_id | 5 | NULL | 2312 | Using index; Using temporary; Using filesort | | 1 | SIMPLE | appname_item | eq_ref | PRIMARY | PRIMARY | 4 | test.appname_favorite.item_id | 1 | Using where
| +----+-------------+------------------+--------+---------------+---------+---------+-------------------------------+------+----------------------------------------------+

On my computer, table appname_item has 1686 rows and appname_favorite has 2312 rows, old sql takes from 15 to 23ms. new sql takes 3.7 to 5.3ms

Community
  • 1
  • 1
Hắc Huyền Minh
  • 1,025
  • 10
  • 13