0

In my company we have multiple database structure hosted in SQL Server. for e.g., whenever a new customer sign up with us, we create a new DB in SQL Server to maintain their data.

Right now we already have 2000+ DBs in our database server. We expect more customers to sign up in near future, which might even cross 5000+ count. Having DBs of 5000+ and increasing count of DBs might not be an advisable one, sometimes we run some task which will run across the DBs, and if we are going to run tasks across 5000+ DBs we will surely end up in performance issues.

What would be the alternative solution to avoid creating multiple DBs for each and every customer and also at the same time maintaining their data separately?

I am hearing about BigData and other DataBase solutions but could not get clear picture.

Can someone share some light on this?

Community
  • 1
  • 1
Matt
  • 79
  • 1
  • 4
  • 1
    Bigdata is a nice option but creating a database for each customer it can be difficult to handle all those dbase even with bigdata in future .. – Ankur Jyoti Phukan Dec 16 '15 at 19:02
  • For one customer database,on an average how many tables/records do you have? – nobody Dec 16 '15 at 21:39
  • "we will surely end up in performance issues" - try it and find out. I saw an article about testing pg with 1M databases, and it ran. Separating tenants into their own databases is good for security and easier to backup – Neil McGuigan Dec 16 '15 at 22:16

3 Answers3

0

If the databases have an identical schema you could combine them into one. That way each customer's table will now become a set of rows in the new database. A new customer will probably be a few new rows in the tables that store customer's profile.

You can use row level security for restricting access to customer's data:- https://msdn.microsoft.com/en-us/library/dn765131.aspxpx

For pros and cons of using this approach over your existing see: Pros/Cons Using multiple databases vs using single database and Single or multiple databases

Using other options provide great learning opportunity but may have a significant transition cost even if there were some that were indeed better.

Community
  • 1
  • 1
Himanshu
  • 2,384
  • 2
  • 24
  • 42
0

one solution I would suggest is to use prefix on the table name for each customer. you can then solve the security issue by limit per customer per set of tables.

the con is you will have to rewrite your application to use prefix to each table whenever it want to access it. If you have a lot of tables , that will be a problem.

I think this is how some multi Wordpress hosting site handle it database issue.

hzhsun
  • 114
  • 1
  • 5
0

you should consider if you just store the data and access it with simple querys or if you usually do complex query's, if you just store the data and access it with simple querys and your need are not 100% relational maybe you should consider to move part of your data to HDFS file system: https://en.wikipedia.org/wiki/Apache_Hadoop#HDFS . To process the data in hadoop there are many tools but the raising one for sure is spark: https://en.wikipedia.org/wiki/Apache_Spark

probably the best solution is to start move your historic data in HDFS just for storage and keep the rest as it is until you take confidence with the hadoop and spark paradigm

hadoop is a distributed , fault tollerant file system and spark is an engine for batch processing huge amount of unstructured or structured data, consider that data in hadoop are not structure usually so you have to change the way you process your data, if you want to still use sql I suggest to check Impala and Hive as well:

http://impala.io/

https://hive.apache.org/

Take a look at cloudera web site for a more structure IT solution instead of a lot of single tool that you will need to organize http://www.cloudera.com/content/www/en-us/solutions.html

They have a quick start VM to try all the hadoop ecosystem tools , probably thats the best way to start experimenting:

http://www.cloudera.com/content/www/en-us/downloads/quickstart_vms/5-4.html