75

How do I create an index on the date part of DATETIME field?

mysql> SHOW COLUMNS FROM transactionlist;
+-------------------+------------------+------+-----+---------+----------------+
| Field             | Type             | Null | Key | Default | Extra          |
+-------------------+------------------+------+-----+---------+----------------+
| TransactionNumber | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| WagerId           | int(11)          | YES  | MUL | 0       |                |
| TranNum           | int(11)          | YES  | MUL | 0       |                |
| TranDateTime      | datetime         | NO   |     | NULL    |                |
| Amount            | double           | YES  |     | 0       |                |
| Action            | smallint(6)      | YES  |     | 0       |                |
| Uid               | int(11)          | YES  |     | 1       |                |
| AuthId            | int(11)          | YES  |     | 1       |                |
+-------------------+------------------+------+-----+---------+----------------+
8 rows in set (0.00 sec)

TranDateTime is used to save the date and time of a transaction as it happens

My Table has over 1,000,000 records in it and the statement

SELECT * FROM transactionlist where date(TranDateTime) = '2008-08-17' 

takes a long time.

EDIT:

Have a look at this blog post on "Why MySQL’s DATETIME can and should be avoided"

fancyPants
  • 50,732
  • 33
  • 89
  • 96
Charles Faiga
  • 11,665
  • 25
  • 102
  • 139
  • 7
    warning comment for the link you suggested a look: The post is written with such an excitement and rage that it almost borders on the point of childishness. And the writer is not beating back any criticism , while still mentioning that he stands behind what he said , yet his point is falling slender with each. But still , not a waste of time, if you read the comments. – kommradHomer Jan 24 '13 at 10:05

13 Answers13

72

If I remember correctly, that will run a whole table scan because you're passing the column through a function. MySQL will obediently run the function for each and every column, bypassing the index since the query optimizer can't really know the results of the function.

What I would do is something like:

SELECT * FROM transactionlist 
WHERE TranDateTime BETWEEN '2008-08-17' AND '2008-08-17 23:59:59.999999';

That should give you everything that happened on 2008-08-17.

Michel de Ruiter
  • 7,131
  • 5
  • 49
  • 74
Michael Johnson
  • 2,287
  • 16
  • 21
  • 1
    i used to think of this usage as just for a short cut for 'YYYY-MM-DD 00:00:00' – kommradHomer Jan 24 '13 at 10:06
  • 3
    I know this is an old answer, but I feel compelled to point out that as MySQL uses string comparison for `DATETIME`; your query returns the correct results and does not include rows with `TranDateTime=2008-08-18 00:00:00`. – Arth Jul 30 '14 at 09:17
  • 1
    Arth, do you have a source that MySQL uses string comparison? Was this the case in older versions? This definitely isn't true for MySQL 5.7. Try: create table foobar ( mytime timestamp ); insert into foobar (mytime) values ('2008-08-18 00:00:00'); select * from foobar where mytime between '2008-08-17 00:00:00' and '2008-08-18 23:59:59'; – andreas Mar 08 '18 at 09:10
  • Is `BETWEEN` faster than using `where TranDateTime >= '2008-08-17' and TranDateTime < '2008-08-18'`? – Chloe Feb 19 '19 at 20:45
  • 1
    This is not the correct answer, the question was about indexing, not selecting. See the answer with a [generated column](https://stackoverflow.com/questions/95183/how-does-one-create-an-index-on-the-date-part-of-datetime-field-in-mysql/63677615#44548610) instead. – ΔO 'delta zero' Dec 05 '20 at 22:03
  • Sometimes you have to look past the words of the question and provide what the op really wants. This answer is excellent because it makes use of a BTREE index on the existing attribute & stops full table scans, and accomplishes exactly what the op actually wants very easily. – Mark B Sep 15 '22 at 10:17
17

Another option (relevant for version 5.7.3 and above) is to create a generated/virtual column based on the datetime column, then index it.

CREATE TABLE `table` (
`my_datetime` datetime NOT NULL,
`my_date` varchar(12) GENERATED ALWAYS AS (DATE(`my_datetime`)) STORED,
KEY `my_idx` (`my_date`)
) ENGINE=InnoDB;
Martin Burch
  • 2,726
  • 4
  • 31
  • 59
Liran Brimer
  • 3,418
  • 1
  • 28
  • 23
  • 1
    why is stored and not virtual ? –  Jul 24 '17 at 16:34
  • 1
    If you want to index it must be stored.. Without index it can be virtual – Liran Brimer Jul 25 '17 at 05:59
  • 1
    thx, i imagined, tho i got confused with this article https://www.percona.com/blog/2016/03/04/virtual-columns-in-mysql-and-mariadb/ –  Jul 25 '17 at 10:38
  • This should be the correct answer, I found [date index outperforms a datetime index](https://stackoverflow.com/questions/65162398/how-to-make-use-of-mysqls-functional-index-on-a-datetime-column), even with BTREE. – ΔO 'delta zero' Dec 05 '20 at 21:58
  • By the way, nowadays InnoDB supports indexes on VIRTUAL columns too. – ΔO 'delta zero' Dec 05 '20 at 21:58
17

I don't mean to sound cute, but a simple way would be to add a new column that only contained the date part and index on that.

Mike Tunnicliffe
  • 10,674
  • 3
  • 31
  • 46
  • Yup -- and add a column with just the time part, and eliminate the DATETIME altogether. – JBB Sep 18 '08 at 18:25
  • my current solution is add another field call ‘date’ and when I update the TranDateTime the date is also update. I now have an index on the ‘date’ and query is a lot faster by my table increased in size by +-5% – Charles Faiga Sep 18 '08 at 18:34
10

You can't create an index on just the date part. Is there a reason you have to?

Even if you could create an index on just the date part, the optimiser would probably still not use it for the above query.

I think you'll find that

SELECT * FROM transactionlist WHERE TranDateTime BETWEEN '2008-08-17' AND '2008-08-18'

Is efficient and does what you want.

MarkR
  • 62,604
  • 14
  • 116
  • 151
5

I don't know about the specifics of mySql, but what's the harm in just indexing the date field in its entirety?

Then just search:

 select * from translist 
     where TranDateTime > '2008-08-16 23:59:59'
        and TranDateTime < '2008-08-18 00:00:00'

If the indexes are b-trees or something else that's reasonable, these should get found quickly.

Clinton Pierce
  • 12,859
  • 15
  • 62
  • 90
  • You can use `>= '2008-08-16' and ... < '2008-08-18'`. The time is assumed to be `00:00:00`. – Chloe Feb 19 '19 at 20:54
  • You mean: >= '2008-08-17' and ... < '2008-08-18'. The time is assumed to be 00:00:00 – A K Feb 20 '19 at 14:22
2

The one and good solution that is pretty good working is to use timestamp as time, rather than datetime. It is stored as INT and being indexed good enough. Personally i encountered such problem on transactions table, that has about million records and slowed down hard, finally i pointed out that this caused by bad indexed field (datetime). Now it runs very quick.

Valentin Rusk
  • 630
  • 5
  • 13
2

Valeriy Kravchuk on a feature request for this very issue on the MySQL site said to use this method.

"In the meantime you can use character columns for storing DATETIME values as strings, with only first N characters being indexed. With some careful usage of triggers in MySQL 5 you can create a reasonably robust solution based on this idea."

You could write a routine pretty easy to add this column, and then with triggers keep this column synced up. The index on this string column should be pretty quick.

Ray Jenkins
  • 123
  • 3
  • 10
1

I don't know about the specifics of mySQL, but what's the harm in just indexing the date field in its entirety?

If you use functional magic for * trees, hashes, ... is gone, because for obtaining values you must call the function. But, because you do not know the results ahead, you have to do a full scan of the table.

There is nothing to add.

Maybe you mean something like computed (calculated?) indexes... but to date, I have only seen this in Intersystems Caché. I don't think there's a case in relational databases (AFAIK).

A good solution, in my opinion, is the following (updated clintp example):

SELECT * FROM translist 
WHERE TranDateTime >= '2008-08-17 00:00:00.0000'
  AND TranDateTime < '2008-08-18 00:00:00.0000'

Whether you use 00:00:00.0000 or 00:00 in my opinion makes no difference (I've generally used it in this format).

Troy Alford
  • 26,660
  • 10
  • 64
  • 82
antonia007
  • 11
  • 1
1

datetime LIKE something% will not catch the index either.

Use this: WHERE datetime_field >= curdate();
That will catch the index,
and cover today:00:00:00 up to today:23:59:59
Done.

Dr. Tyrell
  • 2,829
  • 1
  • 16
  • 9
0

If modifying the table is an option, or you're writing a new one, consider storing date and time in separate columns with respective types. You get performance by having a much smaller key space, and reduced storage (compared to a date-only column derived from a datetime). This also makes it feasible to use in compound keys, even before other columns.

In OP's case:

+-------------------+------------------+------+-----+---------+----------------+
| Field             | Type             | Null | Key | Default | Extra          |
+-------------------+------------------+------+-----+---------+----------------+
| TransactionNumber | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| WagerId           | int(11)          | YES  | MUL | 0       |                |
| TranNum           | int(11)          | YES  | MUL | 0       |                |
| TranDate          | date             | NO   |     | NULL    |                |
| TranTime          | time             | NO   |     | NULL    |                |
| Amount            | double           | YES  |     | 0       |                |
| Action            | smallint(6)      | YES  |     | 0       |                |
| Uid               | int(11)          | YES  |     | 1       |                |
| AuthId            | int(11)          | YES  |     | 1       |                |
+-------------------+------------------+------+-----+---------+----------------+
Walf
  • 8,535
  • 2
  • 44
  • 59
0

What does 'explain' say? (run EXPLAIN SELECT * FROM transactionlist where date(TranDateTime) = '2008-08-17')

If it's not using your index because of the date() function, a range query should run fast:

SELECT * FROM transactionlist where TranDateTime >= '2008-08-17' AND TranDateTime < '2008-08-18'

nathan
  • 4,571
  • 2
  • 27
  • 28
  • 1
    If you use date() you won't hit the index. Mysql can't use indexes inside function calls like that. – JBB Sep 18 '08 at 18:26
0

Rather than making an index based on a function (if that is even possible in mysql) make your where clause do a range comparison. Something like:

Where TranDateTime > '2008-08-17 00:00:00' and TranDateTime < '2008-08-17 11:59:59')

This lets the DB use the index on TranDateTime (there is one, right?) to do the select.

Justsalt
  • 1,886
  • 3
  • 17
  • 24
-1

Create a new fields with just the dates convert(datetime, left(date_field,10)) and then index that.

McGarnagle
  • 101,349
  • 31
  • 229
  • 260
Mari
  • 1