
I am designing a global ERP / scheduling system that has to be hosted on Google Cloud SQL and Google Datastore.

For the most part the data is strongly relational, not large in volume, and not especially volatile, so it is a natural fit for a relational database.

The current design has addresses stored locally in the customer, contract, employee, site, workorder, and contractedSites tables, about 12 different places in total. I asked the dev team to modify the design to create an ADDRESS table that can be referenced by all other entities (with linking tables if necessary to provide flexibility for address history, multiple addresses, etc.).
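For illustration, here is a minimal sketch of that normalized layout. Table and column names are hypothetical, and an in-memory SQLite database stands in for Cloud SQL (MySQL), but the linking-table pattern is the same:

```python
import sqlite3

# In-memory SQLite stands in for Cloud SQL (MySQL); all names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE address (
    id          INTEGER PRIMARY KEY,
    line1       TEXT NOT NULL,
    city        TEXT,
    postal_code TEXT,
    country     TEXT NOT NULL
);
-- Linking table: lets a customer carry many addresses over time
-- (address history, billing vs. site addresses, etc.).
CREATE TABLE customer_address (
    customer_id  INTEGER NOT NULL,
    address_id   INTEGER NOT NULL REFERENCES address(id),
    address_type TEXT NOT NULL,   -- e.g. 'billing', 'site'
    valid_from   DATE NOT NULL,
    valid_to     DATE,            -- NULL = current address
    PRIMARY KEY (customer_id, address_id, valid_from)
);
""")

conn.execute("INSERT INTO address VALUES (1, '1 High St', 'London', 'SW1A 1AA', 'GB')")
conn.execute("INSERT INTO customer_address VALUES (42, 1, 'billing', '2016-01-01', NULL)")

# Fetch customer 42's current billing address via the linking table.
current = conn.execute(
    "SELECT a.line1, a.country FROM address a "
    "JOIN customer_address ca ON ca.address_id = a.id "
    "WHERE ca.customer_id = 42 AND ca.valid_to IS NULL"
).fetchone()
print(current)  # ('1 High St', 'GB')
```

The same `address` table can be linked from the other eleven entities with analogous linking tables.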

THE QUESTION: As address and contact-detail formats differ wildly across these scenarios (different countries, multiple phone numbers, etc.), I was thinking that ADDRESS and CONTACTDETAILS should be moved to the Datastore and referenced by IDs from the Cloud SQL side, so the structure is completely flexible and can accommodate the address formats of the world.
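A rough sketch of what that split could look like, with plain Python dicts standing in for schemaless Datastore entities (the real code would use the Cloud Datastore client library; the keys and field names here are made up):

```python
# Each "entity" is keyed by the ID stored in a Cloud SQL column, and
# different addresses can carry entirely different country-specific fields.
addresses = {
    "addr-uk-1": {
        "line1": "1 High St",
        "post_town": "LONDON",
        "postcode": "SW1A 1AA",
        "country": "GB",
    },
    "addr-jp-1": {
        "prefecture": "Tokyo",
        "ward": "Chiyoda",
        "block": "1-1",
        "country": "JP",
    },
}

def lookup(address_id):
    """Resolve an address ID held in a Cloud SQL row to its flexible record."""
    return addresses[address_id]

# The two records share no fields except 'country' -- no schema change needed.
assert lookup("addr-uk-1")["postcode"] == "SW1A 1AA"
assert "postcode" not in lookup("addr-jp-1")
```

The cost of this flexibility is that every address read becomes a second round trip to a second datastore, which matters for the transaction point raised in the answer below.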

Does this sound sensible, or is it over-engineering the solution, and should I go with a generic address table in Cloud SQL only?

I have a concern, as the bottleneck on the Google platform appears to be the single MySQL master. This is a managed service limited to 16 vCPUs, so I am attempting to move more functional areas into the Datastore.

Hope that makes sense!

  • Update: we are allowing this. I had a scenario with a complex set of tables that housed rules reference data used to perform a number of calculations, so we kept these in Cloud SQL, but the output was line-by-line payroll information that would be generating millions of rows a week, so this was moved to the Datastore. – Chris Wood Jul 24 '16 at 06:53

1 Answer


You certainly could do that, and you wouldn't be the first person to run a mixed database system. You will be adding complexity, though, and there are several things you should weigh before proceeding down that path.

  • You mention the 16 vCPU limit - it's unlikely that addresses would be the reason to be worried about this, although Cloud Datastore would scale more easily (as in, we do it for you)

  • For data with more diverse schema requirements, since you are already using Cloud SQL, you could simply use the JSON type to store it, as long as you are on Cloud SQL Second Generation.

  • Transactions: If you need operations that involve data in both Cloud SQL and Cloud Datastore to be transactional, you'll have to implement that yourself. If you can live with potential inconsistencies and/or clean-up, no problem.

  • Cloud Datastore would definitely allow you a lot more flexibility to store this type of data and query it efficiently at scale. It's the sort of use case it excels in.
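To make the JSON-column option concrete, here is a small sketch. SQLite stands in for Cloud SQL; on Cloud SQL Second Generation (MySQL 5.7+) the column would be declared with the native JSON type and could be queried server-side with MySQL's JSON functions. Table and field names are illustrative:

```python
import json
import sqlite3

# SQLite stands in for Cloud SQL (MySQL 5.7+) here; the address column
# holds serialized JSON so each row can have its own country-specific shape
# while staying joinable with the rest of the relational model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, address TEXT)")

uk = {"line1": "1 High St", "postcode": "SW1A 1AA", "country": "GB"}
jp = {"prefecture": "Tokyo", "block": "1-1", "country": "JP"}
conn.execute("INSERT INTO customer VALUES (1, ?)", (json.dumps(uk),))
conn.execute("INSERT INTO customer VALUES (2, ?)", (json.dumps(jp),))

# Read one customer's address back and pick out a field client-side.
row = conn.execute("SELECT address FROM customer WHERE id = 2").fetchone()
country = json.loads(row[0])["country"]
print(country)  # JP
```

This keeps the flexible address data in the same transactional store as the rest of the relational model, avoiding the cross-database consistency problem mentioned above.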

– Dan McGrath
  • Thanks Dan. The reference to 16 vCPUs was about the fact that this is going to be a truly global system by the time it is fully rolled out, not especially about addresses. I have some experience in the past with AWS managed databases hitting a wall. We are already using both Cloud SQL and the Datastore in anger, pushing images, free text, and highly volatile data to the Datastore. I'm just trying to move some of the key-value-pair type data that fluctuates a lot there too: addresses, contact details, job descriptions and roles, that kind of stuff, protecting my 1 MySQL master :) – Chris Wood Jun 10 '16 at 08:32