0

I have a fairly simple table, containing 500 million+ rows. Very simple queries are taking over 2-3.5 minutes. I have an index on the field in the WHERE statement.

I am wondering what I can do to optimize this table and/or query?

THE QUERY & RESULTS

mysql> SELECT COUNT(emails_id) AS count FROM person_deliveries WHERE DATE(date) = '2021-08-23' ;
+--------+
| count  |
+--------+
| 539438 |
+--------+
1 row in set (2 min 20.05 sec)

EXPLAIN QUERY

mysql> EXPLAIN SELECT COUNT(emails_id) AS count FROM person_deliveries WHERE DATE(date) = '2021-08-23' ;
+----+-------------+-------------------+------------+-------+---------------+----------------------+---------+------+-----------+----------+--------------------------+
| id | select_type | table             | partitions | type  | possible_keys | key                  | key_len | ref  | rows      | filtered | Extra                    |
+----+-------------+-------------------+------------+-------+---------------+----------------------+---------+------+-----------+----------+--------------------------+
|  1 | SIMPLE      | person_deliveries | NULL       | index | NULL          | campaigns_id | 4       | NULL | 454956815 |   100.00 | Using where; Using index |
+----+-------------+-------------------+------------+-------+---------------+----------------------+---------+------+-----------+----------+--------------------------+

SHOW CREATE TABLE

-------------------------------------------------------------------------------------------------------+
| Table             | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| person_deliveries | CREATE TABLE `person_deliveries` (
  `emails_id` int unsigned NOT NULL,
  `campaigns_id` int NOT NULL,
  `date` datetime NOT NULL,
  `vmta` varchar(255) DEFAULT NULL,
  `ip_address` varchar(15) DEFAULT NULL,
  `domain` varchar(255) DEFAULT NULL,
  UNIQUE KEY `person_campaign_date` (`emails_id`,`campaigns_id`,`date`),
  KEY `ip_address` (`ip_address`),
  KEY `domain` (`domain`),
  KEY `campaigns_id` (`campaigns_id`),
  KEY `date` (`date`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Thank you in advance!

syntax
  • 21
  • 3
  • While I believe this is certainly on-topic here, at this scale, you may find it more beneficial to delete this post and re-ask on Stack Overflow’s sister site, [Database Administrators](https://dba.stackexchange.com) – esqew Aug 27 '21 at 20:06
  • there are already some about that theme but see https://dev.mysql.com/doc/refman/8.0/en/aggregate-functions.html#function_count – nbk Aug 27 '21 at 20:15
  • Count operations can be unexpectedly slow on large sets. Possibly relevant/related: https://stackoverflow.com/q/511820/1370000 – Kevin Driessen Aug 27 '21 at 20:31
  • 3
    I think the use of `DATE(date) = '2021-08-23' ` is the problem, it reduces the benefits of the index. Try `date >= '2021-08-23 00:00:00' AND date <= '2021-08-23 59:59:59'` and let us know. – Silvanu Aug 27 '21 at 21:02
  • https://en.wikipedia.org/wiki/Sargable - as commented by Silvanu, anytime you use a function against a value, you loose optimization of the index. – DRapp Aug 27 '21 at 22:56
  • @Silvanu is absolutely correct, when i modified and used `date >= '2021-08-23 00:00:00' AND date <= '2021-08-23 23:59:59' ` the query took less than 1 second :) – syntax Aug 28 '21 at 02:36

1 Answers1

2

As Silvanu & DRapp commented, my use of the date() function was slowing the query down.

mysql> SELECT COUNT(emails_id) AS count FROM person_deliveries WHERE date >= '2021-08-23 00:00:00' AND date <= '2021-08-23 23:59:59' ;
+--------+
| count  |
+--------+
| 539438 |
+--------+
1 row in set (***0.47 sec***)
Paul T.
  • 4,703
  • 11
  • 25
  • 29
syntax
  • 21
  • 3
  • This answer is correct, but it's always safer to change the last predicate to: `AND date < '2021-08-24 00:00:00'`. – The Impaler Aug 28 '21 at 20:34