
We are a SaaS company that pulls our clients' customer data from their storage solution of choice, processes it on our server, and gives each client a login to view their customer data with our value-add.

Multiple Clients

As we are pulling via various APIs and sources, we store a local copy of the data on our MySQL server. Our current (admittedly flawed) architecture is to store the retrieved data in individual client tables that are identical in everything but name. This is a result of a) unexpected growth, and b) an initial attempt to completely segregate client data so that there is exactly 0% chance one client sees another's data from a shared client table (our clients are typically competitors). So we have:

table client_ABC
 - field A
 - field B
 - ...
 - field N

table client_XYZ
 - field A
 - field B
 - ...
 - field N

As we scale, the above is predictably falling on its face: we are adding dozens of identical client tables, and any change to the schema is a nightmare. I am leaning towards combining the data into a single table and adding a 'client' column (a rough sketch of this follows below), but part two of this question potentially complicates things, in that...
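For reference, a minimal sketch of what that combined table might look like (the table and column names here are hypothetical, mirroring the field A..N placeholders above):

    -- One shared table for all clients; rows are segregated by client_id.
    CREATE TABLE customer (
        id        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        client_id INT UNSIGNED    NOT NULL,
        field_a   VARCHAR(255),
        field_b   VARCHAR(255),
        -- ...
        field_n   VARCHAR(255),
        KEY idx_client_id (client_id)
    );

    -- Every query in the shared business logic then filters by client:
    SELECT field_a, field_b FROM customer WHERE client_id = 42;

Segregation would then rest on the WHERE client_id clause (and/or per-client views) rather than on physically separate tables.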

Inconsistent/Unique Client Data

Our second issue is that the data we are pulling has little in common on a client-to-client basis. Each client's data has a few common elements (name, email), but the rest differs: some clients' data has associated address info, some has a detailed purchase record, etc.

Our current solution is to have each of the client tables above contain a number of generic 'meta' fields, which we then map on a client-by-client basis (see the sketch after these examples), so that when our business logic is displaying a customer for client ABC we have:

customer ABC
 - name -> name
 - email -> email
 - street -> meta_1
 - city -> meta_2

For client XYZ, we might have:

customer XYZ
 - name -> name
 - email -> email
 - last purchase -> meta_1

As we add clients, we are finding that for those with non-flat customer data (e.g. a complete sales record), the above solution fails, as we now need to add custom secondary client tables to store the additional data.

Keep in mind that all of this gets exposed through shared code/business logic to all clients.

One thought would be to store individual customer data in a secondary structure, like JSON, in a generic 'data' column, so that we have something like:

table client
 - name = "Bill Smith"
 - email = "bsmith@example.com"
 - data = { "street": "123 Fake St", "city": "Big City"}

The issue here would be how we do full-text searches, indexing, etc.
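One partial answer, assuming a MySQL version with the native JSON type and generated columns (5.7+), would be to promote the specific JSON fields we need to search into indexed generated columns; this is just a rough sketch, not a tested design:

    CREATE TABLE customer (
        id        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        client_id INT UNSIGNED    NOT NULL,
        name      VARCHAR(255)    NOT NULL,
        email     VARCHAR(255)    NOT NULL,
        data      JSON            NOT NULL,
        -- expose one JSON field as a stored generated column so it can be indexed
        city      VARCHAR(100) GENERATED ALWAYS AS
                      (JSON_UNQUOTE(JSON_EXTRACT(data, '$.city'))) STORED,
        KEY idx_client_id (client_id),
        KEY idx_city (city)
    );

That only helps for fields we know about up front, though; truly ad-hoc or full-text search across arbitrary JSON would still point towards an external search engine (e.g. Elasticsearch or Solr) or a document store.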

Any suggestions on how to start tackling these two related issues are appreciated!

  • Are you sold on MySQL? Open to moving it all to another technology? – ryan1234 Jan 19 '13 at 23:43
  • Sure, only thing keeping us on MySQL is familiarity, but due to some mapping-related stuff we need to do, was actually looking at Postgres as a possible option. – Unpossible Jan 20 '13 at 03:43
  • I would check out Mongo and do a proof of concept against it. Especially if you have schemas that are going to be very different (your example of meta data fields). For indexing and searching, both Elastic Search and Solr support having JSON documents put in for searching, which would work well with a document database. – ryan1234 Jan 20 '13 at 04:04
  • Excellent idea. Will blueprint this out, thanks! – Unpossible Jan 20 '13 at 05:05
  • Sounds like a good fit for a document storage solution like Mongo, which ryan1234 suggested. CouchDB is another. Could also consider using a graph storage solution such as Neo4j. – djjolicoeur Jan 20 '13 at 05:40

1 Answer


Per the suggestion from jsoft, we are going with Mongo, which is a perfect solution for our variable client architecture requirements.

Some reasons for using Mongo include:
