
I have an ever-growing database table of ~8 million rows which my application regularly fetches data from. However, the query has suddenly started locking up the entire system: there are tons of mysqld processes clogging up all CPU cores.

Could it be the ever-increasing size of the table? Or is there something within the query below that could cause it to run for so long? The UNIX_TIMESTAMP call, for example? Below is an excerpt from the slow query log. The query is executed every minute and has always had a query time of around 7 seconds.

# Query_time: 6.839524  Lock_time: 0.000170 Rows_sent: 277  Rows_examined: 7989334
SELECT `timestamp`, `price`
FROM (
    SELECT @row := @row + 1 AS `row`, `timestamp`, `price`
    FROM (
        SELECT @row := 0
    ) `derived_1`, `items`
    WHERE `price` IS NOT NULL
        AND `timestamp` >= UNIX_TIMESTAMP(NOW() - INTERVAL 1 DAY)
        AND `currency` = 'EUR'
        AND `type` = 'icon'
    ORDER BY `timestamp` ASC
) `derived_2`
WHERE `row` % 8 = 0;

It's a bit hard to just try things out, as it's a production environment. I can't reproduce the issue in my development environment either.

If you need any additional information, please let me know!

Thanks a bunch in advance!

Thierry

3 Answers


Create an index so the DBMS can find the matching records quickly. The index should contain the columns of the WHERE clause, starting with those you compare with =.

CREATE INDEX idx ON items (type, currency, price, timestamp);

This even happens to be a covering index, i.e. it contains all columns used in the query. So the DBMS won't even have to read the table, as it can get all the data from the index itself.
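
To check that the optimizer actually treats it as covering, you can EXPLAIN the query; MySQL shows "Using index" in the Extra column when it can answer from the index alone (the SELECT below simply repeats the inner query from the question):

-- "Using index" under Extra means no table rows are read at all.
EXPLAIN
SELECT `timestamp`, `price`
FROM `items`
WHERE `price` IS NOT NULL
  AND `timestamp` >= UNIX_TIMESTAMP(NOW() - INTERVAL 1 DAY)
  AND `currency` = 'EUR'
  AND `type` = 'icon';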

Thorsten Kettner
  • That is my favourite answer! I didn't think of covering indexes at first! – nCessity Jun 23 '17 at 10:10
  • @nCessity: Thanks. Your remark as to selectivity is valid, though, too. So it may be worth trying two covering indexes, one starting with the timestamp, one not, and then looking which index the DBMS picks. – Thorsten Kettner Jun 23 '17 at 10:15

The best index is this:

INDEX(type, currency,  -- The two columns tested via '=' (in either order)
      timestamp,       -- then this, for a range test
      price)           -- finally, the rest of the columns used

That will be a "covering" index, so it won't have to bounce between the index BTree and the data BTree. (The other suggested indexes will be slower because of this.)

Hopefully, that will make the query run faster than 7 seconds.

It is unsafe to run something "every minute". What will happen if, for whatever reason, one instance fails to finish within a minute? The multiple invocations will probably stumble over each other, and possibly get slower and slower. This may have been the clogging you experienced.

To avoid that, have a job running continually -- run the query, then sleep, say, 53 seconds. That will (based on current timings) very closely approximate once a minute. Or it could pause just enough time to reach the next minute (but never less than 0 seconds); see the sketch below.
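
As a rough illustration, here is what the "pause until the next minute" variant could look like as a MySQL stored procedure (a minimal sketch; the procedure name is made up, and you would CALL it from one dedicated connection rather than from cron):

DELIMITER $$
CREATE PROCEDURE sample_items_forever()
BEGIN
  WHILE TRUE DO
    -- ... run the sampling SELECT from the question here ...
    -- Sleep until the next whole minute; the argument is always between
    -- 1 and 60, and a single connection means runs can never overlap.
    DO SLEEP(60 - UNIX_TIMESTAMP() % 60);
  END WHILE;
END$$
DELIMITER ;

If the query ever runs past a minute boundary, the next sleep simply waits for the following one, so invocations can never stack up the way a cron-style every-minute schedule can.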

Rick James

An index on timestamp should bring you better performance, especially because the timestamp column's values are probably well distributed. Indexes on currency and type will possibly give another performance boost, but maybe you won't even notice it, because there are just a few distinct currencies and types (compared to 8 million rows):

CREATE INDEX idx_timestamp ON items (timestamp);

And yes, that will take some time!

The UNIX_TIMESTAMP(NOW() - INTERVAL 1 DAY) is not the problem as it is calculated only once.

nCessity
  • Thanks a bunch! I'll definitely try that out. Out of curiosity: Why is only `timestamp` useful to index, and not `currency` and `type` as well? Does it have to do with their invariability? – Thierry Jun 23 '17 at 09:58
  • Also, does indexing a table of this size cause issues with inserting/updating/... rows? Should I account for such downtime? – Thierry Jun 23 '17 at 10:03
  • 1
    @Thierry: Don't worry about this. Yes, inserts, updates and deletes take a little longer, but we are talking about micro seconds here, I guess. Mass updates should usually not be necessary, mass inserts usually don't happen, so it's only mass deletes that may occur and suffer a little, but that's the price to pay :-) – Thorsten Kettner Jun 23 '17 at 10:07
  • @Thierry: I collected some links that I find very helpful with all these questions: https://stackoverflow.com/questions/1108/how-does-database-indexing-work https://planet.mysql.com/entry/?id=661727 https://www.simple-talk.com/sql/learn-sql-server/sql-server-index-basics/ – nCessity Jun 23 '17 at 10:08
  • Thanks a bunch, @nCessity! – Thierry Jun 27 '17 at 12:28