2

I'm looking for some general advice on the best way to perform heavy crunching/data-mining on a database table, without affecting the performance of regular site queries on the table. Some of the calculations may involve joining several tables, and involve complex sorting and ordering. So "use better indexes" isn't always the solution.

This question isn't really specific. I'm looking for a general way to solve a problem that's come up many times over the years. So I don't have a specific table schema to show, a specific query to show. I've considered dumping the table first using mysqldump, and then re-importing the table under a different name, and then performing my heavy crunching on that temp table. My sysadmin hates the idea, so I'm looking for any other solutions people have come up with to deal with this type of problem.

mellowsoon
  • 22,273
  • 19
  • 57
  • 75
  • perhaps a view thats a duplicate of the table? – Lyuben Todorov Mar 28 '12 at 03:36
  • Replication. Replicate the table to another server and do the crunching there. It's basically the same as your idea with mysqldump, only you make MySQL do all the work for you. – N.B. Mar 28 '12 at 10:35
  • @N.B. Would you say it's okay if the replicated database I'm doing the crunching on is a slave that's being used in production? Or should it be an entirely different database that's only used for big crunching? – mellowsoon Mar 29 '12 at 15:55
  • It depends on what your needs really are. Nothing prevents you from replicating to multiple slaves and use them however you find suitable. Rule is, the less you do, the faster the system is. So if you run your data crunching on a separate physical computer, your overall system won't be affected. – N.B. Mar 29 '12 at 16:32

2 Answers2

2

If your "heavy crunching" is all read only and you are not doing anything that needs to be written back into your production data, use a Master/Slave replication and use the Slave for all your reporting and data analysis needs. The replication link will keep the values up to date on the Slave, and you can hit the Slave with as much load as you want without slowing down the Master which is serving your production system.

D Mac
  • 3,727
  • 1
  • 25
  • 32
0

If you want to avoid affecting performance of your production database, the only solution I have used previously is to run your queries on another database server.

I would take a backup of the entire database and then restore it on a separate server.

Obviously, you cannot do this if you want to analyze real-time data. But for most analysis, a snapshot from the previous day is sufficient.

Craig T
  • 2,761
  • 5
  • 25
  • 33
  • 1
    I'm curious how you might go about moving your production database to another database server. mysqldump following my importing into mysql is slooooooow. :) – mellowsoon Mar 28 '12 at 17:49
  • Using SQL Server it is relatively painless. Run a full backup. Copy database backup file to new server. Restore database from backup. The slowest part is the copy of the backup file. This is all dependent on the size of your database. – Craig T Mar 28 '12 at 21:44