Comparison of TIMESTAMP column for large table

Question

I have a MySQL (InnoDB) table with a large number of rows (a couple of million). I'm running queries like this:

SELECT  SQL_CALC_FOUND_ROWS `a` 
FROM  `logs`  
WHERE  `connect_timestamp`  > 10000 
ORDER  BY  `connect_timestamp`  DESC 
LIMIT 1

I have a normal index on the column, but a query like this takes up to 20 seconds. Is there a better way?

you want to have an index on the `connect_timestamp` column. — Martin, Mar 31 '16 at 14:18
Martin, when I say normal, I mean nothing fancy, ADD KEY `connect_timestamp` (`connect_timestamp`) — Saulius Antanavicius, Mar 31 '16 at 14:21
How many results do these queries generally return? are you selecting a few 100 rows or a few million? — Jester, Mar 31 '16 at 14:26
@SauliusAntanavicius: And what would an "abnormal" index look like? — spencer7593, Mar 31 '16 at 14:43
@SauliusAntanavicius can you output the structure of the table for us, and edit and insert this into your question, that would be very helpful. cheers — Martin, Mar 31 '16 at 14:47
Also in addition to what martin asked, what is your goal with these rows? what are you retrieving from them? — Jester, Mar 31 '16 at 14:50
As illustrated here - https://www.percona.com/blog/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/ - `SQL_CALC_FOUND_ROWS` is quite taxing on system performance; it would probably be faster for you to use a separate query for counting rows. — Martin, Mar 31 '16 at 14:54

score 1 · Answer 1 · edited May 23 '17 at 11:50

Edit based on comments from spencer7593 and Martin:

A simple COUNT(*) query plus a separate SELECT might be much faster than a single SQL_CALC_FOUND_ROWS query. See: "Which is fastest? SELECT SQL_CALC_FOUND_ROWS FROM `table`, or SELECT COUNT(*)"

I suggest running both your original query and:

SELECT  count(*)
FROM  `logs`  
WHERE  `connect_timestamp`  > 10000

plus:

SELECT  `a`
FROM  `logs`  
WHERE  `connect_timestamp`  > 10000 
ORDER  BY  `connect_timestamp`  DESC 
LIMIT 1

Ideally, time all three queries and compare the runtimes. Prefixing each with EXPLAIN shows the execution plan (for example, whether the index is used), and you can add SQL_NO_CACHE to simulate a first run. See: https://www.percona.com/blog/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/
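A rough sketch of that comparison, in MySQL syntax and using the table and column names from the question. SQL_NO_CACHE only matters on servers that still have the query cache; as far as I know, MySQL 8.0 removed the query cache, so the hint has no effect there.

```sql
-- Execution plan: check the `key` and `rows` columns to see whether the
-- connect_timestamp index is used and how many rows MySQL expects to read.
EXPLAIN
SELECT `a`
FROM `logs`
WHERE `connect_timestamp` > 10000
ORDER BY `connect_timestamp` DESC
LIMIT 1;

-- Variant 1: the original single-query approach, bypassing the query cache.
SELECT SQL_NO_CACHE SQL_CALC_FOUND_ROWS `a`
FROM `logs`
WHERE `connect_timestamp` > 10000
ORDER BY `connect_timestamp` DESC
LIMIT 1;

-- Total number of matching rows, ignoring the LIMIT above.
SELECT FOUND_ROWS();

-- Variant 2: one query for the count, one for the row.
SELECT SQL_NO_CACHE COUNT(*)
FROM `logs`
WHERE `connect_timestamp` > 10000;

SELECT SQL_NO_CACHE `a`
FROM `logs`
WHERE `connect_timestamp` > 10000
ORDER BY `connect_timestamp` DESC
LIMIT 1;
```

Run each variant a few times and compare the timings reported by the client. A single run can be skewed by whether the data is already in the buffer pool.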

If that doesn't help at all, I suggest looking into the following:

Things you can try:

Index the column which is used for searching (you seem to have already done so)
Make a view for specific queries that are to be executed often.
Try caching the specific table if the server has memory for it.
Also, as Martin said in the comments, put EXPLAIN in front of the query to see how MySQL executes it (which index it uses and roughly how many rows it examines). Maybe there is something you can change about it.

Those are the things I can come up with.

edited May 23 '17 at 11:50 by Community · answered Mar 31 '16 at 14:31 by Jester

isn't the `a` a column that is being called by the `select`? – Martin Mar 31 '16 at 14:45
if not, then surely a `COUNT` SQL would be better than OPs current SQL? – Martin Mar 31 '16 at 14:46
Oh, my bad :) I was renaming the selected variable in this case, but renaming the result definitely won't slow the query down, I think. I'll remove it, thanks! – Jester Mar 31 '16 at 14:48
This answer might also mention the potential performance impact of **`SQL_CALC_FOUND_ROWS`**, and suggest that running two separate queries... 1) "`SELECT a ... ORDER BY ... LIMIT 1`" and 2) "`SELECT COUNT(*)`" may be significantly faster than a single query. – spencer7593 Mar 31 '16 at 14:54
Oh, I didn't even delve too deep into that (never used it myself). I think you're right, a simple count might perform much better. – Jester Mar 31 '16 at 14:57
I remain unconvinced. I tried 5 cases in my code of `SQL_CALC_FOUND_ROWS`. All were at least as fast as the 2-query approach; most were faster. I am using InnoDB, not MyISAM (as the old Percona blog was). – Rick James Mar 31 '16 at 23:51

score 0 · Answer 2 · answered Mar 31 '16 at 23:12

INDEX(connect_timestamp, a)

This will be a "covering" index, thereby speeding up the SQL_CALC_FOUND_ROWS as well as the SELECT ... LIMIT 1.
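As a minimal sketch, the index could be added and checked like this. The index name `idx_connect_timestamp_a` is made up for illustration, and the sketch assumes `a` is a column type that can be indexed in full (not a large TEXT or BLOB).

```sql
-- Composite index: connect_timestamp first (used for the range and the sort),
-- then `a`, so the query can be answered from the index alone.
-- `idx_connect_timestamp_a` is a hypothetical name.
ALTER TABLE `logs`
  ADD INDEX `idx_connect_timestamp_a` (`connect_timestamp`, `a`);

-- If the index covers the query, the Extra column of the EXPLAIN output
-- should include "Using index".
EXPLAIN
SELECT `a`
FROM `logs`
WHERE `connect_timestamp` > 10000
ORDER BY `connect_timestamp` DESC
LIMIT 1;
```

Once this index exists, the original single-column index on `connect_timestamp` is likely redundant, since the new index starts with the same column.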

It will have to scan from the end of the index all the way back to 10000. How many rows is that?

If connect_timestamp is some type of CHAR instead of some type of INT, then you have another problem. Please provide SHOW CREATE TABLE.
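The requested details could be gathered with something like the following. The column type matters because, as far as I know, MySQL cannot use an index when a string column is compared with a numeric constant such as 10000.

```sql
-- Full table definition, including column types and all indexes.
SHOW CREATE TABLE `logs`;

-- Index details only (columns, order, cardinality estimates).
SHOW INDEX FROM `logs`;
```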
