4

I want to do a count like this (as an example, not really counting dogs):

SELECT COUNT(*)
FROM dogs AS d INNER JOIN races AS r ON d.race_id = r.race_id
    LEFT INNER colors AS c ON c.color_id = r.color_id
WHERE d.deceased = 'N'

I have 130,000 dogs in a MyISAM table. Races has 1,500 records and is an InnoDB table with 9 columns, colors has 83 records and is also InnoDB and has two columns (id, name). The *_id columns are all primary keys, I have indices on the 'foreign' keys dogs.race_id and races.color_id and I have an index on dogs.deceased. None of the mentioned columns can be NULL.

# mysql --version
mysql  Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486) using readline 5.2

Now the thing is: In my PhpMyAdmin this query takes 1.8 secs (with SQL_NO_CACHE) with a count result of 64,315. Changing COUNT(*) to COUNT(d.dog_id) or COUNT(d.deceased) also takes the query to run for 1.8 secs with the same result.

But when I remove the COUNT() and just do SELECT * or SELECT dog_id, it takes about 0.004 secs to run (and then counting the result with something like mysql_num_rows()).

How can this be? And how can I make the COUNT() work faster?

Edit: Added an EXPLAIN below

EXPLAIN SELECT COUNT(*)
FROM dogs AS d INNER JOIN races AS r ON d.race_id = r.race_id
    INNER JOIN colors AS c ON c.color_id = r.color_id
WHERE d.deceased = 'N'

Gives me:

+----+-------------+-------+-------+------------------+----------+---------+----------------------+------+-------------+
| id | select_type | table | type  | possible_keys    | key      | key_len | ref                  | rows | Extra       |
+----+-------------+-------+-------+------------------+----------+---------+----------------------+------+-------------+
|  1 | SIMPLE      | c     | index | color_id         | color_id | 4       | NULL                 |   83 | Using index | 
|  1 | SIMPLE      | r     | ref   | PRIMARY,color_id | color_id | 4       | database.c.color_id  |   14 | Using index | 
|  1 | SIMPLE      | d     | ref   | race_id,deceased | race_id  | 4       | database.r.race_id   |  123 | Using where | 
+----+-------------+-------+-------+------------------+----------+---------+----------------------+------+-------------+
7ochem
  • 2,183
  • 1
  • 34
  • 42
  • possible duplicate of [COUNT(\*) vs. COUNT(1) vs. COUNT(pk): which is better?](http://stackoverflow.com/questions/2710621/count-vs-count1-vs-countpk-which-is-better) – John Woo Apr 09 '13 at 09:51
  • Yes, I've read that post. This question is more like 'SELECT COUNT(*) vs. SELECT *' – 7ochem Apr 09 '13 at 09:56

3 Answers3

1

The MySQL Optimizer does a full table scan only if it is needed because a column can be NULL which means if the column is not defined as NOT NULL there can be some NULL values in it and so MySQL have to perform table scan to find out. If your column d.dog_id nullable? try to run the count on another column which is not nullable, this should provide you a better performance than count(*).

CloudyMarble
  • 36,908
  • 70
  • 97
  • 130
0

Try to set an index on dogs.deceased and use SELECT COUNT(*) ... USE INDEX my_index_name.

mrks
  • 8,033
  • 1
  • 33
  • 62
  • I already had an index on dogs.deceased. I changed the query to `SELECT COUNT(*) FROM dogs AS d USE INDEX (idx_deceased) LEFT JOIN ...` but again: "Query took 2.9 sec". Thanks for the suggestion anyway! – 7ochem Apr 09 '13 at 10:01
0

Create indexes to make your counting faster:

CREATE INDEX ix_temp ON dogs (d.race_id)
INCLUDE (columns needed for the query)
Nik Drosakis
  • 2,258
  • 21
  • 30
  • Are you sure you can create an index over multiple tables? `... ON dogs (d.race_id,r.race_id) ...` d.race_id is in the dogs table, but r.race_id is in the races table. – 7ochem Apr 09 '13 at 10:38