I'm currently working on a system that, in some cases, will need to run against a local database during the day and then be replicated to a central server overnight. It can't all run from one central database because the local sites are periodically out of contact with it. The data at the central server is for viewing and reporting at head office only, so nothing needs to be replicated back to the sites.
Each "site" is given a text-based unique key (human generated). However, the thought of making every table in the database design reference the site key is not appealing.
Here's an example of a very cut-back version of the schema, ignoring remote replication (which will work fine for the majority of clients):
(I'll only show the history table for the Area table, to keep things short):
[Site]
SiteKey [PK] (Guaranteed 100% unique across all sites, text-based key)
[User]
SiteKey [FK -> Site]
UserID [PK]
[Area]
SiteKey [FK -> Site]
AreaID [PK]
Description
UpdatedDtm
UpdatedUserID [FK -> User]
[AreaHistory]
SiteKey [FK -> Site]
AreaID [FK -> Area]
Description
UpdatedDtm
UpdatedUserID [FK -> User]
AuditedDtm
[Location]
AreaID [FK -> Area]
LocationID [PK]
Description
UpdatedDtm
UpdatedUserID [FK -> User]
[Sensor]
LocationID [PK / FK -> Location]
SensorNo [PK]
UpdatedDtm
UpdatedUserID [FK -> User]
[Reading]
LocationID [PK / FK -> Sensor]
SensorNo [PK / FK -> Sensor]
ReadingDtm [PK]
This is fine until I come to "merge" a local database into the database at the central server. I'm obviously going to get clashes in the Location table, because I'm mixing in data with IDs generated at other sites.
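To make the clash concrete, here's a hedged sketch of what the nightly merge would hit; site_a and site_b are hypothetical staging schemas holding the uploaded copies of each site's Location table:

    -- Site A's rows import cleanly; the central table now contains LocationID = 1.
    INSERT INTO Location (AreaID, LocationID, Description, UpdatedDtm, UpdatedUserID)
    SELECT AreaID, LocationID, Description, UpdatedDtm, UpdatedUserID
    FROM site_a.Location;

    -- Site B independently generated LocationID = 1 as well, so this fails:
    -- ERROR: duplicate key value violates unique constraint "location_pkey"
    INSERT INTO Location (AreaID, LocationID, Description, UpdatedDtm, UpdatedUserID)
    SELECT AreaID, LocationID, Description, UpdatedDtm, UpdatedUserID
    FROM site_b.Location;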
The first way I thought around this problem was to do this:
(Again, cut back to keep things short):
[Location]
SiteKey [PK / FK -> Site; part of composite FKs -> Area, User] ** ADDED THIS
AreaID [FK -> Area]
LocationID [PK]
Description
UpdatedDtm
UpdatedUserID [FK -> User]
[Sensor]
SiteKey [PK / FK -> Site; part of composite FKs -> Location, User] ** ADDED THIS
LocationID [PK / FK -> Location]
SensorNo [PK]
UpdatedDtm
UpdatedUserID [FK -> User]
[Reading]
SiteKey [PK / FK -> Site; part of composite FK -> Sensor] ** ADDED THIS
LocationID [PK / FK -> Sensor]
SensorNo [PK / FK -> Sensor]
ReadingDtm [PK]
Basically, every table gets a SiteKey as part of its primary key, making each row unique to its site.
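In PostgreSQL, that first approach would translate to something like the sketch below, showing just Location and Sensor. The column types are my assumption, and it presumes Area's and User's primary keys have been widened to (SiteKey, AreaID) and (SiteKey, UserID) so the composite foreign keys have something to reference:

    -- A minimal sketch, assuming integer IDs; "User" needs quoting in
    -- PostgreSQL because user is a reserved word.
    CREATE TABLE Location (
        SiteKey       text        NOT NULL REFERENCES Site (SiteKey),
        AreaID        integer     NOT NULL,
        LocationID    integer     NOT NULL,
        Description   text,
        UpdatedDtm    timestamptz NOT NULL,
        UpdatedUserID integer     NOT NULL,
        PRIMARY KEY (SiteKey, LocationID),
        FOREIGN KEY (SiteKey, AreaID)        REFERENCES Area   (SiteKey, AreaID),
        FOREIGN KEY (SiteKey, UpdatedUserID) REFERENCES "User" (SiteKey, UserID)
    );

    CREATE TABLE Sensor (
        SiteKey       text        NOT NULL,
        LocationID    integer     NOT NULL,
        SensorNo      integer     NOT NULL,
        UpdatedDtm    timestamptz NOT NULL,
        UpdatedUserID integer     NOT NULL,
        PRIMARY KEY (SiteKey, LocationID, SensorNo),
        FOREIGN KEY (SiteKey, LocationID)    REFERENCES Location (SiteKey, LocationID),
        FOREIGN KEY (SiteKey, UpdatedUserID) REFERENCES "User"   (SiteKey, UserID)
    );

With this in place, the nightly merge becomes a plain INSERT ... SELECT per table, since the SiteKey prefix keeps rows from different sites disjoint.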
An alternative is this (using UUIDs in some places):
[User]
SiteKey [FK -> Site]
UserUUID [PK]
[Area]
SiteKey [FK -> Site]
AreaUUID [PK]
Description
UpdatedDtm
UpdatedUserUUID [FK -> User]
[AreaHistory]
SiteKey [FK -> Site]
AreaUUID [FK -> Area]
Description
UpdatedDtm
UpdatedUserUUID [FK -> User]
AuditedDtm
[Location]
AreaUUID [FK -> Area]
LocationUUID [PK]
Description
UpdatedDtm
UpdatedUserUUID [FK -> User]
[Sensor]
LocationUUID [PK / FK -> Location]
SensorNo [PK]
UpdatedDtm
UpdatedUserUUID [FK -> User]
[Reading]
LocationUUID [PK / FK -> Sensor]
SensorNo [PK / FK -> Sensor]
ReadingDtm [PK]
Remember this is cut down, but it illustrates the problem.
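As a PostgreSQL sketch (again just Location and Sensor, column types assumed as before), the UUID variant needs no SiteKey below the User level because the keys are globally unique:

    -- A minimal sketch of the UUID variant. gen_random_uuid() is built
    -- in from PostgreSQL 13; older versions can take it from pgcrypto.
    CREATE TABLE Location (
        LocationUUID    uuid PRIMARY KEY DEFAULT gen_random_uuid(),
        AreaUUID        uuid        NOT NULL REFERENCES Area (AreaUUID),
        Description     text,
        UpdatedDtm      timestamptz NOT NULL,
        UpdatedUserUUID uuid        NOT NULL REFERENCES "User" (UserUUID)
    );

    CREATE TABLE Sensor (
        LocationUUID    uuid        NOT NULL REFERENCES Location (LocationUUID),
        SensorNo        integer     NOT NULL,
        UpdatedDtm      timestamptz NOT NULL,
        UpdatedUserUUID uuid        NOT NULL REFERENCES "User" (UserUUID),
        PRIMARY KEY (LocationUUID, SensorNo)
    );

Merging is then the same plain INSERT ... SELECT with no realistic chance of key collisions, at the cost of 16-byte keys and of losing an obvious "which site owns this row" column below the User level.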
Are there any alternatives I may be missing? I thought about ID remapping, but that seemed to introduce new, even worse nightmares.
The annoying part of this is that it's a very small number of cases that use this scenario, the rest are happy with one database serving many sites. However, the client wanting this scenario is the biggest ;-)
I thought maybe I could use UUIDs just for that client, but that means creating exceptions everywhere, so I may as well use them throughout (where necessary) if I'm going to use them at all.
If it makes any difference, I'm using PostgreSQL.
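That does help for the UUID route: gen_random_uuid() is built into PostgreSQL 13 and later, and older versions can get a generator from an extension:

    -- PostgreSQL 13+: available out of the box.
    SELECT gen_random_uuid();

    -- Pre-13: either of these extensions provides a generator.
    CREATE EXTENSION IF NOT EXISTS pgcrypto;      -- gen_random_uuid()
    CREATE EXTENSION IF NOT EXISTS "uuid-ossp";   -- uuid_generate_v4()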
PS: this discussion may be relevant to this case. I just keep wondering if there's another way, considering I have a nice guaranteed-unique SiteKey.