-2

I´ve got a table with 41,000,000 rows with the following structure: enter image description here

What is the best index for date to enable a quick filtering like this

select count(*) from market_db_candle where date >= 2018-09-05 and date <= 2018-09-09;

gies0r
  • 4,723
  • 4
  • 39
  • 50
  • There is an excellent Community wiki describing how indexes work at https://stackoverflow.com/questions/1108/how-does-database-indexing-work?rq=1. Understanding how they work will help you determine which indexes to best meet your need. In the end though, there are so many variables, that the best thing to do is implement an index, update statistics, and check your query performance and explain plans. – Sam M Sep 09 '18 at 21:49
  • 1
    Another tip: Get rid of `id` and make `PRIMARY KEY(symbol_id, date, time)`. What is `tf`? Do you really need 8-byte `DOUBLEs`? – Rick James Sep 30 '18 at 01:08

1 Answers1

1

For the query you show, you can't do better than the index on date which you already have. The query will examine all rows that match the date range, but at least it won't examine any rows outside the date range. If that makes the query examine thousands or millions of rows, that can't be helped. It has to examine the rows to count them.

If you need a query that counts the rows in less time, one strategy is to create a second table stores one row per date, and the count associated with that date.

CREATE TABLE market_db_candle_count_by_day (
  date DATE PRIMARY KEY,
  count INT UNSIGNED NOT NULL
);

Populate it with the counts by day:

INSERT INTO market_db_candle_count_by_day
SELECT date, COUNT(*)
FROM market_db_candle
GROUP BY date;

Then you can query the SUM of counts:

select sum(count) as count from market_db_candle_count_by_day 
where date >= '2018-09-05' and date <= '2018-09-09';

It's up to you to keep the latter table in sync, and update it when necessary.

PS: Put date literals inside single-quotes.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828