
Brief: Is there any way to improve the performance of table scans on InnoDB tables?

Please, do not suggest adding indexes to avoid table scans. (see below)

innodb_buffer_pool_size sits at 75% of server memory (48 GB of 64 GB). I'm using the latest version of Percona Server (5.7.19), if that changes anything.
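For what it's worth, a quick way to confirm the scans really are I/O-bound (standard MySQL 5.7 status variables, nothing Percona-specific):

```sql
-- If Innodb_buffer_pool_reads (reads that had to go to disk) keeps climbing
-- relative to Innodb_buffer_pool_read_requests (logical reads), the scans are
-- disk-bound rather than CPU-bound.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';

-- Confirm the configured buffer pool size (in bytes).
SELECT @@innodb_buffer_pool_size;
```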

Longer: We have 600 GB of recent time-series data (we aggregate and delete older data) spread over 50-60 tables, so most of it is "active" data that is regularly queried. These tables are very wide (400+ numeric columns), and many queries (alarming) run against different subsets of those columns, which is why it is impractical to add indexes (we would have to add a few dozen). The largest tables are partitioned per day.

I am fully aware that this is an application/table design problem and not a "server tuning" problem. We are currently working to significantly change the way these tables are designed and queried, but have to maintain the existing system until this happens so I'm looking for a way to improve things a bit to buy us a little time.

We recently split this system and moved part of it to a new server. It previously used MyISAM, and we tried moving to TokuDB, which seemed appropriate, but ran into some weird problems. We switched to InnoDB, but performance is really bad. I get the impression that MyISAM is better at table scans, which is why, barring any better option, we'll go back to it until the new system is in place.

Update

All tables have pretty much the same structure:

- a timestamp
- a primary key (a varchar(20) field)
- about 15 fields of various types representing other secondary attributes that can be filtered on (along with an appropriately indexed criterion first)
- and then a few hundred measures (floats), between 200 and 400
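Roughly, each table looks something like this (names and the exact column list are placeholders, but the shape is representative):

```sql
CREATE TABLE measurements (            -- hypothetical name
  device_id VARCHAR(20)  NOT NULL,     -- the varchar primary key field
  ts        TIMESTAMP    NOT NULL,
  site_id   SMALLINT UNSIGNED,         -- ~15 secondary attributes like these
  status    TINYINT UNSIGNED,
  m001      FLOAT,                     -- 200-400 measure columns
  m002      FLOAT,
  m003      FLOAT,
  PRIMARY KEY (device_id, ts)
) ENGINE=InnoDB
PARTITION BY RANGE (UNIX_TIMESTAMP(ts)) (
  PARTITION p20180515 VALUES LESS THAN (UNIX_TIMESTAMP('2018-05-16')),
  PARTITION p20180516 VALUES LESS THAN (UNIX_TIMESTAMP('2018-05-17'))
);
```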

I already trimmed the row length as much as I could without changing the structure itself: the primary key used to be a varchar(100), all measures used to be doubles, and many of the secondary attributes had their data types changed.

Upgrading hardware is not really an option.

Creating small tables with just the set of columns I need would help some processes run faster, but at the cost of building each one with a table scan first and duplicating data. Maybe I could create them as MEMORY tables, though by my estimate that would take a couple of GB away from the buffer pool. Also, there are aggregation processes that read about as much data from the main tables on a regular basis, and they need all the columns.
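Something like this is what I have in mind (hypothetical names, reusing the placeholder table above):

```sql
-- Narrow working copy with only the columns one alarming process needs.
CREATE TABLE alarm_work (
  device_id VARCHAR(20) NOT NULL,
  ts        TIMESTAMP   NOT NULL,
  m001      FLOAT,
  m002      FLOAT,
  PRIMARY KEY (device_id, ts)
) ENGINE=MEMORY;   -- MEMORY avoids disk I/O but takes RAM outside the buffer pool
                   -- (bounded by max_heap_table_size); ENGINE=InnoDB also works

-- Still costs one scan of the day's partition to populate.
INSERT INTO alarm_work
SELECT device_id, ts, m001, m002
FROM   measurements PARTITION (p20180516);
```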

There is unfortunately a lot of duplication of effort in those queries, which I plan to address in the next version. The alarming and aggregation processes basically reprocess the entire day's worth of data every time some rows are inserted (every half hour) instead of just dealing with new/changed data.
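For the record, "dealing with new data only" would look roughly like this. It's a sketch only: it assumes a small bookkeeping table and that rows never arrive with timestamps older than what has already been processed, which isn't guaranteed in the current system.

```sql
-- Hypothetical progress table: one row per process with a high-water mark.
SELECT last_ts INTO @from_ts
FROM   alarm_progress
WHERE  job = 'alarming';

-- Alarming logic looks only at rows newer than the last run.
SELECT device_id, ts, m001
FROM   measurements
WHERE  ts > @from_ts
  AND  m001 > 100;          -- placeholder alarm condition

UPDATE alarm_progress
SET    last_ts = (SELECT MAX(ts) FROM measurements)
WHERE  job = 'alarming';
```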

Like I said, the bigger tables are partitioned, so it's usually a scan over a daily partition rather than the entire table, which is a small consolation.
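For what it's worth, partition pruning can be verified like this (placeholder names again):

```sql
-- In 5.7, EXPLAIN includes a "partitions" column showing which partitions
-- will actually be read for the query.
EXPLAIN
SELECT device_id, ts, m001
FROM   measurements
WHERE  ts >= '2018-05-16' AND ts < '2018-05-17';

-- Or name the daily partition explicitly.
SELECT device_id, ts, m001
FROM   measurements PARTITION (p20180516);
```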

Implementing a system to hold this in memory outside of the DB could work, but that would entail a lot of changes on the legacy system and development work. Might as well spend that time on the better design.

The fact that InnoDB tables are so much bigger than MyISAM for the same data (2-3x as big in my case) really hinders performance.

Carl
  • Faster drives. Better IO. Better disk cache. Reduced row size. Jettison any and all columns not strictly necessary. Make a copy of the table that's more optimized for querying if you must. The answer is, as you know, indexes. Table scans will never, ever be fast. They'll just be less brutally slow if you try really hard. – tadman May 15 '18 at 00:43
  • How much of that data is part of the query process? Can you be more specific about the structure? You could have table A which has *only* the columns used when querying, and table B which has all the data. Querying table A will be faster, and then you can get the ID values to fetch from table B, something that a `JOIN` could do for you. – tadman May 15 '18 at 00:45
  • You could also load these columns into memory in some kind of persistent process that updates once in a while and query against that. Scanning through a few million items in an in-memory array is extremely quick, almost zero time, if you're doing simple filtering. Doing this in the database is a lot slower, necessarily, because of the relational overhead, MVCC concerns, and so on. – tadman May 15 '18 at 00:47
  • Thanks for your suggestions; even if I'm not sure I can immediately use them, it always helps to bounce ideas around. I edited my question with additional info. – Carl May 16 '18 at 02:29

2 Answers


MyISAM is a little bit better at table-scans, because it stores data more compactly than InnoDB. If your queries are I/O-bound, scanning through less data on disk is faster. But this is a pretty weak solution.

You might try using InnoDB compression to reduce the size of data. That might get you closer to MyISAM size, but you're still I/O-bound so it's going to suck.
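For example (table name is just a placeholder; the right KEY_BLOCK_SIZE is something you'd have to test against your own data):

```sql
-- Requires innodb_file_per_table=ON and innodb_file_format=Barracuda
-- (both are defaults in 5.7). Note this rebuilds the whole table, so plan
-- for the time and the temporary disk space.
ALTER TABLE measurements
  ROW_FORMAT=COMPRESSED
  KEY_BLOCK_SIZE=8;   -- try 8 or 4; smaller pages compress more but cost more CPU
```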

Ultimately, it sounds like you need a database that is designed for an OLAP workload, like a data warehouse. InnoDB and TokuDB are both designed for OLTP workloads.

Bill Karwin
  • Can't say I agree with your answer. [For me, reading MyISAM data off the SSD is almost 10 times faster than reading the same data in InnoDB from RAM (buffer pool).](https://stackoverflow.com/questions/68378496/why-is-innodb-so-painfully-slow-on-full-table-scans-even-though-entire-data-is-i) This is on a highend system. I just don't get it. – Alex K. Jul 14 '21 at 14:18
  • I posted a comment on your question you linked to. – Bill Karwin Jul 14 '21 at 14:45

It smells like a Data Warehouse with "Reports". By judicious picking of what to aggregate (a selection of your Floats) over what time period (hour or day is typical), you can build and maintain Summary Tables that work much more efficiently for the Reports. This has the effect of scanning the data only once (to build the Summaries), not repeatedly. The Summary tables are much smaller, so the reports are much faster -- 10x is perhaps typical.
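A sketch of what I mean (table and column names are hypothetical):

```sql
-- One row per device per hour, covering only the measures the reports need.
CREATE TABLE summary_hourly (
  device_id VARCHAR(20)  NOT NULL,
  hr        DATETIME     NOT NULL,      -- hour bucket
  cnt       INT UNSIGNED NOT NULL,
  sum_m001  DOUBLE       NOT NULL,
  min_m001  FLOAT        NOT NULL,
  max_m001  FLOAT        NOT NULL,
  PRIMARY KEY (device_id, hr)
) ENGINE=InnoDB;

-- Built with one scan per period; the reports then read only this table.
INSERT INTO summary_hourly
SELECT device_id,
       DATE_FORMAT(ts, '%Y-%m-%d %H:00:00') AS hr,
       COUNT(*), SUM(m001), MIN(m001), MAX(m001)
FROM   measurements
WHERE  ts >= '2018-05-16' AND ts < '2018-05-17'
GROUP  BY device_id, hr;
```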

It may also be possible to augment the Summary tables as the raw data is being Inserted. (See INSERT .. ON DUPLICATE KEY UPDATE ..)
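For example, folding each new reading into its hourly summary row as it arrives (same hypothetical names):

```sql
-- Insert a new hourly row, or update the existing one in place.
INSERT INTO summary_hourly (device_id, hr, cnt, sum_m001, min_m001, max_m001)
VALUES ('DEV-0001', '2018-05-16 14:00:00', 1, 42.5, 42.5, 42.5)
ON DUPLICATE KEY UPDATE
  cnt      = cnt + 1,
  sum_m001 = sum_m001 + VALUES(sum_m001),
  min_m001 = LEAST(min_m001, VALUES(min_m001)),
  max_m001 = GREATEST(max_m001, VALUES(max_m001));
```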

And use Partitioning by date to allow for efficient DROP PARTITION instead of DELETE. Don't have more than about 50 partitions.
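E.g. (placeholder names):

```sql
-- Both of these are metadata operations and nearly instant,
-- unlike DELETEing millions of rows.
ALTER TABLE measurements ADD PARTITION
  (PARTITION p20180517 VALUES LESS THAN (UNIX_TIMESTAMP('2018-05-18')));

ALTER TABLE measurements DROP PARTITION p20180515;
```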

Summary Tables

Time series Partitioning

If you would like to discuss in more detail, let's start with one of the queries that is scanning so much now.

In the various projects I have worked on, there were between 2 and 7 Summary tables.

With 600GB of data, you may be pushing the limits on 'ingestion'. If so, we can discuss that, too.

Rick James
  • I found your website a few months ago and we have based a lot of our new design on it. Thank you for writing it. We have been using partitions for a while to ease maintenance, simplify queries, and also improve ingestion speed. Other planned improvements: use MySQL 8 and JSON for our data (the measures we receive change regularly, so we have to ALTER TABLE every few weeks); keep SUM and COUNT for aggregations (summary), but also MIN, MAX, and SUM_OF_SQUARES for incremental variance/stddev. – Carl May 25 '18 at 09:55
  • Also, normalize the h... out of it. This is a big gap in our current design. This will greatly reduce the size of our tables. I'm just not sure how I can apply these recommendations to our current design without doing a ton of work that would be put to better use on making the future design a reality. I was just looking for quick wins that I might have overlooked to buy time... I have managed to reduce the frequency of queries to help the server for now. – Carl May 25 '18 at 10:07
  • @Carl - "partitions .. improve ingestion speed" -- Please elaborate. That may be a benefit that I have yet to discover. – Rick James May 25 '18 at 16:15
  • @Carl - beware of "over-normalization": don't normalize narrow columns or columns with 'continuous' values (dates, floats, etc.). Generally avoid normalizing things that you will be filtering on or sorting on. – Rick James May 25 '18 at 16:18
  • @Carl - I'm happy to hear that my website is appreciated. (You are the first to _mention_ using sum_of_squares.) – Rick James May 25 '18 at 16:19
  • @Carl - Are you familiar with `pt-online-schema-change`? It may be the 'best' way to retrofit the schema changes for normalization. There can be some tricky things if the columns are being _updated_ during the schema change. (But that's another discussion.) – Rick James May 25 '18 at 16:23
  • Ingestion speed: We have found that when our tables get big, ingestion speed slows down. My guess is that this is because our primary key is a varchar and a timestamp (in that order), so inserts are not sequential. When the tables are partitioned, the engine deals with smaller data/indexes than with one big monolithic table. It would probably not happen if the primary key allowed for insertions at the end of the table, but it might still have some impact if other indexes are present... But I have yet to test. – Carl May 26 '18 at 16:30
  • "over-normalization": I only intend to normalize character fields that are repeated. I've yet to perform exhaustive tests of the new design with a large amount of data. My colleague was of the mind to normalize every single repeated which could lead to a bunch of tables and complicated join queries. I think there is a balance to be struck, but there will not be any "strings" in my main records, only timestamps, integers, and the json array of numbers. Because my dataset will definitely not fit into memory and I'm stuck on spinning disks, there is a significant gain to having small rows. – Carl May 26 '18 at 16:39
  • I tried pt-online-schema-change and liked how it worked, but it was never able to complete due to the load on the server and the safeguards that pause when there are too many processes. In the end, we simply ALTER TABLE and live with the consequences. For all its faults, the legacy system is designed to handle timeouts, failures, and errors of all kinds. – Carl May 26 '18 at 16:50
  • `PK(varchar, timestamp)` -- If there are only 'thousands' of distinct varchars, there will be thousands of "hot spots"; this should not slow things down. But with 'millions', there could be I/O to read-modify-write random blocks. – Rick James May 26 '18 at 17:46
  • In many (not all) cases, partitioning is just a tradeoff -- picking a partition versus 'smaller index'. – Rick James May 26 '18 at 17:46
  • @Carl - non-unique secondary indexes have only limited impact. See "Change buffering". – Rick James May 26 '18 at 17:48
  • There are around 65000 distinct VARCHARs in the worst table. The timestamps are also not UTC (another thing we will change) and come from 6 different time zones. INSERTs became progressively slower as the size of the table increased; after a couple of days, the ingestion rate was about halved. By partitioning by day, the ingestion rate would reset every morning when all insertions occurred in the fresh partition. – Carl May 29 '18 at 21:11
  • 65K is not bad. Let's see `SHOW CREATE TABLE`. – Rick James May 29 '18 at 23:15