2

I am helping a customer migrate a PHP/MySQL application to AWS. One issue we have encountered is that they have architected this app to use a huge number of databases. They create a new DB (with identical schema) for each user. They expect to have tens of thousands of users.

I don't know MySQL very well, but this setup does not seem at all good to me. My only guess is that the developers did this so they could avoid having tables with huge amounts of data. However I can only think of drawbacks (maintaining this system will be a nightmare, very difficult to extend, difficult to scale, etc..).

Anyhow, is this type of pattern commonly used within the MySQL community? What are the benefits, if any?

I am trying to convince them that they should re-architect the DB schema.

* [EDIT] *

In the meantime we know another drawback of this approach. We had originally intended to use Amazon RDS for data storage. However, RDS currently supports up to 30 databases per instance. So unfortunately RDS is now ruled out. The fact that RDS has this limit in place is already very telling, my interpretation is that having such a huge number of databases is not a common practice with MySQL.

Thanks!

santiagoIT
  • 9,411
  • 6
  • 46
  • 57
  • [Tangently related question on performance of many databases vs many tables.](http://stackoverflow.com/questions/696682/mysql-many-tables-or-many-databases) – Mike Jul 03 '13 at 22:07
  • @Mike: I imagine that the normalised alternative is not to create many tables within the same database, but rather to include additional columns within the existing tables to indicate with which user each record is associated. Therefore, I'm not convinced that that link is 100% relevant... – eggyal Jul 03 '13 at 22:09

2 Answers2

3

This is one of most horrible ideas I've ever read (and I've read many). For once the amounts of databases do not scale as well as tables in databases and on the other it would be impossible to connect users to each other or at least share common attributes and options. It essentially defeats the purpose of the database itself.

My advise here is rather outside of original scope: Your intuition knows more than you think, listen to it more!

havarc
  • 934
  • 5
  • 11
  • 1
    Do you have any experience or data to go with the idea of a "database not scaling as well as tables in the database"? [You can do cross-database joins and such just fine.](http://stackoverflow.com/questions/2132654/querying-multiple-databases-at-once) – Mike Jul 03 '13 at 22:06
  • 1
    Completely agree. Even separate tables per customer is bad and you seriously need to reevaluate your schema if you think this is a good idea. – Alvin Thompson Jul 03 '13 at 22:20
  • Granted, my SQL is rusted but even then it's still a bad idea as databases are designed to separate different branches of a problem, not different datasets. Also you'll have to manage user_ids as string within the the database-titles in opposite to faster integers and you may run into problems when deleting users. If large unique data per user comes in that threatens to overload the database then consider outsourcing these into separate data-files linked to by the DB. – havarc Jul 03 '13 at 22:20
0

This idea seems quite strange to me also! Databases are designed to handle large data sets after all! If there is genuine concern about the volume of data it is usually better practice to separate tables onto different databases - hosted on different physical servers as this allows you to spread the database level processes across hardware to boost performance

Also I don't know how they plan to host this application but many hosting providers are going to charge you per database instance!

Another problem this will give you is that it will make reporting more difficult - I wouldn't like to try including tables from 10,000 databases in a query!!

Lee Dyche
  • 129
  • 2
  • 12