I have two tables
- Export
Name Type Collation Attributes Null Default Extra
id int(10) utf8_unicode_ci UNSIGNED No None AUTO_INCREMENT
email varchar(150) utf8_unicode_ci No None
city_id int(11) utf8_unicode_ci Yes NULL
Indexes
Type Unique Packed Column Cardinality Collation Null
id BTREE Yes No id 769169 A No
email_index BTREE Yes No email 769169 A No
city_id_index BTREE No No city_id. 6356 A Yes
- Export history
Name Type Collation Attributes Null Default Extra
id int(10) utf8_unicode_ci UNSIGNED No None AUTO_INCREMENT
email varchar(255) utf8_unicode_ci No None
Indexes
Type Unique Packed Column Cardinality Collation Null
id BTREE Yes No id 113887 A No
email_index BTREE No No email 113887 A No
I need to get a top city ids which have the most emails (users). Also there is export_history table. I need to exclude emails from the results.
The end query looks like
Main query
SELECT COUNT(city_id) as city_count, city_id
FROM export e
WHERE NOT EXISTS (
SELECT * FROM export_history ehistory
WHERE e.email = ehistory.email
)
GROUP BY city_id
ORDER BY city_count DESC
LIMIT 5
Exec time is ~7 sec. The problem is that it takes about that much to execute.
Explain shows:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY e index NULL city_id_index 5 NULL 769169 Using where; Using temporary; Using filesort
2 DEPENDENT SUBQUERY ehistory ref email_index email_index 767 e.email 1 Using where; Using index
Please take into account that these two queries work very fast > 0.01 sec
Query 1
SELECT COUNT(city_id) as city_count, city_id
FROM export
GROUP BY city_id
ORDER BY city_count DESC
LIMIT 5
Exec time is ~0.1 sec
Query 2
SELECT *
FROM export e
WHERE NOT EXISTS (
SELECT * FROM export_history ehistory
WHERE e.email = ehistory.email
)
Exec time is ~0.02 sec
Could you please recommend any suggestions to improve performance of the main query?