
I have MySQL running on an m2.2xlarge instance on AWS. The MySQL data dir resides on the root EBS volume (/), a single EBS volume, not RAID. We have three main tables. One of them, Table C, is the largest in content, but only the last day's worth of its data is used. The insert rate across these tables is around 80,000 rows a day. The 3 tables have around 42 million rows in total. innodb_buffer_pool_size is set to ~30GB of the instance's RAM.

Table A is the most important; its data length is ~33GB and its index ~11GB. Table B has a data length of ~8GB and an index of ~5GB.

In our website, the two main queries (latency-wise) are like this:

SELECT * FROM TableA WHERE id in (.....)

SELECT * FROM TableB JOIN .... WHERE id in (.....)

In most pages the (...) will be some ~50 recent ids, with these queries taking < 50 ms each. But on some other pages we hit older ids and the latency for these queries skyrockets to 500ms, 800ms, up to 1.5 secs.
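
For reference, this is roughly how I check that the lookups really are resolved via the primary key (the id values below are just placeholders, not real ids from our data):

-- expect key: PRIMARY, type: range, rows roughly equal to the number of ids
EXPLAIN SELECT * FROM TableA WHERE id IN (101, 102, 103);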

I've done a test where, after a MySQL restart, I ran a SELECT id FROM TableB to force the index into cache/memory. The Table B query was still slow. Then I ran a SELECT * FROM TableB, and with the whole table in cache/memory the queries became really fast (<50ms).
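
In rough SQL terms, the warm-up sequence was (table name as above; the comments describe what each statement was meant to achieve):

-- run right after the MySQL restart
SELECT id FROM TableB;   -- intended to pull index pages into the buffer pool; queries still slow afterwards
SELECT * FROM TableB;    -- pulls the full rows into the buffer pool; only after this did the queries get fast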

My question: is > 500 ms, > 1000 ms a reasonable latency for a query that just retrieves rows by PRIMARY KEY? Even in a 42M-row table? Even when all rows are on disk? It seems too much to me.

Would moving MySQL data to ephemeral storage (/mnt) improve this? Would using Provisioned IOPS help?

Felipe Hummel
  • Are you using a single EBS volume or a RAID'd set? Using /mnt can be somewhat risky without solid replication. You can test provisioned IOPS without that much cost to yourself. – datasage Feb 12 '13 at 18:48
  • Single EBS. I guess I can't move to Provisioned IOPS without dumping the whole dataset to a new EBS volume. So I'd first like to know whether it is worth the work of moving. – Felipe Hummel Feb 12 '13 at 20:04
  • I think you can snapshot your volume and create a new one from the snapshot with Provisioned IOPS. Then attach it to a new instance to test it. The nice thing about EC2 is the flexibility to provision new instances to test performance changes without having to commit to them. – datasage Feb 12 '13 at 20:20

2 Answers


Disclaimer: I'm no expert on (My)SQL performance at all, just commenting on the AWS aspects of your use case.

With that out of the way, there are several questions to address, first and foremost:

Would moving MySQL data to ephemeral storage (/mnt) improve this?

I've provided an answer for the identical question Will moving data from EBS to ephemeral storage improve MySQL query performance? - please check it out for some important details. TL;DR: you most definitely don't want to do that if you have any durability needs (unless you know exactly what you are doing), and the performance gains via ephemeral storage claimed in the past are dubious at best, if not plain wrong from today's perspective.

Would using Provisioned IOPS help?

Absolutely - Provisioned IOPS volumes are specifically designed to meet the needs of I/O-intensive workloads, particularly database workloads, that are sensitive to storage performance and consistency in random access I/O throughput; see the post Fast Forward - Provisioned IOPS for EBS Volumes for a general introduction.

  • Please note that these ideally (but not necessarily) go hand in hand with EBS-Optimized Instances, which use an optimized configuration stack and provide additional, dedicated capacity for EBS I/O. This optimization provides the best performance for your EBS volumes by minimizing contention between EBS I/O and other traffic from your Amazon EC2 instance.

  • Specifically you'll want to read through the dedicated section Increasing EBS Performance, which addresses how to look at the I/O performance you require and your options for increasing EBS performance to meet those requirements with RAID and/or Provisioned IOPS depending on your use case.

My question: is > 500 ms, > 1000 ms a reasonable latency for a query that just retrieves rows by PRIMARY KEY? Even in a 42M-row table? Even when all rows are on disk? It seems too much to me.

As mentioned, I can't judge the values as such; however, given your specification you seem to have memory contention: the m2.2xlarge instance features 'only' 34.2 GiB of memory and you are already allocating ~30GB for innodb_buffer_pool_size. That seems a bit high to me given the other memory requirements of the OS and/or MySQL, so there might already be swapping involved, which would perfectly explain the cache/memory warming behavior you are experiencing.
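
A quick way to check this from the MySQL side (standard InnoDB status and variable names, nothing specific to your setup) is to compare how often reads have to go to disk versus being served from the buffer pool:

-- configured buffer pool size in bytes
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
-- Innodb_buffer_pool_reads = reads that had to go to disk,
-- Innodb_buffer_pool_read_requests = logical read requests
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
-- free vs. total pages indicates how full the pool actually is
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages%';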

  • As a general recommendation for database workloads, the biggest bang for the buck by far these days is simply to ensure your dataset fits entirely into RAM (if at all feasible in the first place), which is easier than ever with the plethora of instance types - a rough sizing query is sketched below.
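
For a rough estimate of the RAM that would require, you could sum data and index sizes from the data dictionary ('your_db' is a placeholder for your schema name):

SELECT table_schema,
       ROUND(SUM(data_length + index_length) / 1024 / 1024 / 1024, 1) AS total_gb
FROM information_schema.tables
WHERE table_schema = 'your_db'
GROUP BY table_schema;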

Finally, I recommend reading the very recent post Improving PostgreSQL performance on AWS EC2 - the recommendations there primarily address the AWS side of things as well and apply to MySQL accordingly; the section Durable databases pretty much summarizes my suggestions above:

For a durable database where you care about your data, what you want instead of a high I/O instance is an EBS Optimized instance, which has guaranteed network bandwidth to the EBS storage servers. Use EBS volumes with provisioned IOPs and, for best results, stripe a group of EBS volumes into a RAID10 array. See increasing EBS performance.

Steffen Opel

If your IN statement contains an SQL subquery, then the EC2 instance might be very slow, as by default it uses MySQL 5.5 (for more detail see MySQL is extremely slow on EC2).
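
For illustration, a pattern like the first query below is often slow on MySQL 5.5 because the subquery may be re-evaluated as a dependent subquery for every outer row; rewriting it as a JOIN usually avoids this (table and column names here are only examples, not taken from the question):

-- slow pattern on 5.5: IN with a subquery
SELECT * FROM TableA
WHERE id IN (SELECT a_id FROM TableB WHERE created_at > '2013-02-01');

-- JOIN rewrite (add DISTINCT if TableB can match the same id more than once)
SELECT a.* FROM TableA a
JOIN TableB b ON b.a_id = a.id
WHERE b.created_at > '2013-02-01';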

Roman Kh