I have a table of about 360,000 records and here's performing a query on two indexed fields:
SELECT COUNT(*)
FROM emails
WHERE
department_id IN(1,2,3,4)
AND category_id IN (5,6,7,8)
(Time: 0.9624802)
id: 1
select_type: SIMPLE
table: emails
type: range
possible_keys: emails_department_id_idx,emails_category_id_idx
key: emails_category_id_idx
key_len: 5
ref: NULL
rows: 54018
Extra: Using where
So only one index is being used there. (I can get an index merge to work when using simpler comparisons or range criteria, but I need to do checks like this against a list of IDs).
Here I created two new tables to map this relationship, and using JOIN's I replicated the same results:
SELECT COUNT(*)
FROM emails
LEFT JOIN email_to_department ON (email_to_department.email_id = emails.id AND email_to_department.department_id IN (1,2,3,4))
LEFT JOIN email_to_category ON (email_to_category.email_id = emails.id AND email_to_category.category_id IN (5,6,7,8))
WHERE
email_to_department.department_id IS NOT NULL
AND email_to_category.category_id IS NOT NULL
(Time: 0.5217777)
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: email_to_category
type: range
possible_keys: PRIMARY,category_id
key: category_id
key_len: 4
ref: NULL
rows: 61282
Extra: Using where; Using index
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: email_to_department
type: ref
possible_keys: PRIMARY,department_id
key: PRIMARY
key_len: 4
ref: testdb.email_to_category.email_id
rows: 1
Extra: Using where; Using index
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: emails
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: testdb.email_to_category.email_id
rows: 1
Extra: Using index
3 rows in set (0.38 sec)
So each query now uses an index, and trims almost half the time off. Is this bad design? Should I write the rest of these relationships in the same way?
If I add more criteria in the same way, the JOIN'd version of the query seems to get faster and the other remains more or less the same.
Doing a simple query on just a single indexed field is very very fast of course:
SELECT COUNT(*)
FROM emails
WHERE department_id IN(1,2,3,4)
Is there another strategy I might use to make these kinds of queries faster still? There are other properties that need to be filtered on as well, and in different combinations, so creating multi-column indexes won't really help.