
So I have a table that's being used basically like a NoSQL setup. The structure is:

id        bigint       primary key
data      mediumblob
modified  timestamp

It has around 350k rows. The queries that run on it are all structured as follows:

select data from table where id=XXX;

The table engine is InnoDB. I'm noticing that queries against this table are sometimes rather slow, occasionally taking 3 seconds to run. The table is 3 GB on disk and I set innodb_buffer_pool_size to 4G.

Is there anything I'm missing here? Are there any settings I can tweak to improve performance?
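For reference, one quick way to see where that 3 GB actually goes is SHOW TABLE STATUS (a sketch; an Avg_row_length of roughly 3 GB / 350k ≈ 9 KB per row would suggest InnoDB is pushing part of each blob onto overflow pages):

-- Avg_row_length and Data_length show how large the rows really are.
SHOW TABLE STATUS LIKE 'cache';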

Edit: As requested, the EXPLAIN output:

+----+-------------+----------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table    | type  | possible_keys | key     | key_len | ref   | rows | Extra |
+----+-------------+----------+-------+---------------+---------+---------+-------+------+-------+
|  1 | SIMPLE      | cache    | const | PRIMARY       | PRIMARY | 8       | const |    1 |       |
+----+-------------+----------+-------+---------------+---------+---------+-------+------+-------+

And the CREATE TABLE:

CREATE TABLE `cache` (
  `id` bigint(20) unsigned NOT NULL DEFAULT '0',
  `data` mediumblob,
  `modified` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Jon

4 Answers


There are two issues I see here initially. First, your query retrieves a blob column, and pulling large blob values is inherently slower than fetching small fixed-width columns. Second, you are using InnoDB, which is optimized for writing; it is probably still the best choice overall, but in extremely read-heavy situations it can be less performant than MyISAM. Neither of these issues is necessarily a deal-killer, but each adds a performance hit. Beyond that, though, I can't give you a good answer on what to optimize without profiling. That is what I would recommend you do first: profile the query to figure out what the execution plan is, and then identify why that plan is slow.
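A minimal sketch of that, using MySQL's built-in session profiler (available since 5.0.37; the id value is just a placeholder):

SET profiling = 1;
SELECT data FROM cache WHERE id = 12345;   -- placeholder id
SHOW PROFILES;               -- lists profiled statements with durations and Query_IDs
SHOW PROFILE FOR QUERY 1;    -- per-stage timing; use the Query_ID from SHOW PROFILES

If most of the time lands in a stage like "Sending data", the server is spending its time reading and shipping the blob rather than locating the row.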

Here is a good "Top 10" list of MySQL optimizations. At least a couple of them apply directly to your situation:

http://20bits.com/articles/10-tips-for-optimizing-mysql-queries-that-dont-suck/

Here is another good optimization article that goes into server settings as well (for InnoDB specifically):

http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/

Based on the CREATE TABLE statement you provided, I thought of another thing you should address (again, not a deal-killer, but another performance hit). Unless there is a business case for using a bigint for your ID field, choose an int instead. A signed int allows about 2.1 billion values, so you shouldn't run out of numbers. Making this switch will save you disk space and improve query performance. Here is an article about it:

http://ronaldbradford.com/blog/bigint-v-int-is-there-a-big-deal-2008-07-18/
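If you do make the switch, it's a one-line ALTER, though note that it rebuilds the table (a sketch, assuming your ids fit in an unsigned int):

-- Shrinks the primary key from 8 bytes to 4; only safe if ids stay
-- below ~4.29 billion (the unsigned int range). Rebuilds the table.
ALTER TABLE cache MODIFY id INT UNSIGNED NOT NULL DEFAULT 0;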

IAmTimCorey
  • InnoDB isn't optimised just for writing. A well-designed InnoDB table that takes advantage of its clustered index will outperform any other engine: http://stackoverflow.com/questions/4419499/mysql-nosql-help-me-to-choose-the-right-one-on-a/4421601#4421601 – Jon Black Jun 02 '11 at 02:54
  • I'm not sure I agree about InnoDB. I have my settings fairly close to what is suggested by the second link. – Jon Jun 02 '11 at 03:03
  • Sorry, I think I was a bit unclear. InnoDB is better optimized for writing as compared to MyISAM, which is better optimized for reading. InnoDB is probably your best choice; I was just trying to say that, from a pure read perspective, it isn't the best choice. – IAmTimCorey Jun 02 '11 at 03:46
  • 1
    Yeah I'm not a fan of the bigint either. It's not even possible to use up an int for what we're doing. It's not really my decision though. – Jon Jun 02 '11 at 13:00
  • @Jon - Understood. We all deal with issues like this. Quick question - you say you are treating this like a NoSQL file. Does this mean you are calling the query from an application? If so, is it local or across the network? Could the bottleneck be the transmission or processing as compared to the actual call itself? – IAmTimCorey Jun 02 '11 at 13:50
  • I think the issue has more to do with the percentage of the database that's in memory at any given point in time. I'm noticing that as the database has been running for a few days the slow queries drop off more and more. My threshold is now 1 second and I haven't seen one in over an hour. – Jon Jun 02 '11 at 19:37
  • It's wrong to say that InnoDB is not better than MyISAM from a pure read perspective. Consider the scenario where the table fits almost entirely in the buffer pool: reads in InnoDB are going to be far quicker than in MyISAM, because MyISAM doesn't cache data (only indexes), while InnoDB caches both data and indexes. – ovais.tariq Jun 03 '11 at 13:06
  • And there is another reason why queries might be slow "sometimes": checkpointing. Do you have lots of insert/update/delete operations happening on this table? – ovais.tariq Jun 03 '11 at 13:07

Could you post your CREATE TABLE statement, as well as the output of EXPLAIN SELECT data FROM table WHERE id=XXX? How is the I/O wait on the system?

My best guess is that you're I/O-bound: because the rows aren't all the same size, the server has to search through the data on disk to find a given row. You have enough memory that it should be able to keep the data cached. This link describes some low-level profiling in MySQL that might be helpful:

http://dev.mysql.com/tech-resources/articles/using-new-query-profiler.html
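One quick way to confirm or rule out the I/O-bound guess is to compare InnoDB's logical reads against the reads that actually went to disk (standard status counters):

-- Innodb_buffer_pool_read_requests = logical reads (served from memory);
-- Innodb_buffer_pool_reads         = reads that had to go to disk.
-- If the second counter keeps climbing, the working set isn't staying cached.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';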

Joshua Martell

Try using the smallest id type you can. If it's a numeric key that you know will never be larger than a few million, you could use a MEDIUMINT UNSIGNED and save yourself a byte per record compared to an INT, which might speed up searches a little. Still, 3 GB is an awful lot for just 350,000 rows.

It sounds like you might also get some bang for your buck by using the partitioning feature to split your table up into logical units. You might want to Google "mysql vertical partitioning" in particular; if there are large columns that you don't access frequently, it would be much more efficient to move them out into a separate table and only query it when you need it.
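As a sketch of what that split could look like here (table and column layout hypothetical), you would keep the columns you touch on every request separate from the blob:

-- Hypothetical split: hot lookup columns in one table, the blob in another.
CREATE TABLE cache_meta (
  id bigint unsigned NOT NULL,
  modified timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE cache_blob (
  id bigint unsigned NOT NULL,
  data mediumblob,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Touch the blob table only when the payload is actually needed:
SELECT data FROM cache_blob WHERE id = 12345;  -- placeholder id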

King Skippus
  • Well the whole idea here was to make just one query to fetch data. It's just turning out to be much slower than I had hoped. Not sure the int size is going to do much for me. – Jon Jun 02 '11 at 02:58
  • Honestly? Probably not, but it might make a marginal improvement to your query speed. I really think that vertical partitioning is the way to go, since it sounds like your rows are relatively large. Unfortunately, MySQL doesn't support vertical partitioning natively, so you're stuck emulating it with multiple tables. I wish I could be more help. – King Skippus Jun 02 '11 at 03:16

Things I would look for:

  • when are the slow queries appearing?

    • is it after a fresh start of the DB? Then it might just be a temporary problem: queries hitting a cold cache
    • is it during a DB dump/load? Then change your backup policies: use replication, for example, or add more disk I/O (more disks in RAID, switching to SSDs, repartitioning the system across multiple disks, etc.)
    • is it during peak read/write times? Replication might also help here: write to the master and load-balance the reads between the master and the slaves
  • Also, is that mediumblob really necessary there? (a quick size check is sketched below)
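For that last point, a quick sanity check of how big the blobs actually are (note this scans the whole table, so run it off-peak):

SELECT COUNT(*)          AS total_rows,
       AVG(LENGTH(data)) AS avg_bytes,
       MAX(LENGTH(data)) AS max_bytes
FROM   cache;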
Tudor Constantin
  • They happen sporadically throughout the day. The DB has been up for a while. We don't back up this data, as it's just a cache that can be rebuilt. The mediumblob *is* the table; the data in it is the whole purpose of having the table. – Jon Jun 02 '11 at 13:07