I'm facing the following challenge:

I have a bunch of databases in different geographical locations where the network may fail a lot (I'm using cellular network). I need to keep all the databases synchronized but there is no need to be in real time. I'm using Java but I have the freedom to choose any free database.

How can I achieve this?

philipxy
jassuncao

4 Answers

I am not aware of any database that gives you this functionality out of the box. There is a lot of complexity here because of the need for eventual consistency and conflict resolution. For example, what happens if the network splits into two halves, you update something to the value 123 while I update it to 321 on the other half, and then the networks reconnect?

You may have to roll your own.

For some ideas on how to do this, check out the design of Yahoo's PNUTS system: http://research.yahoo.com/node/2304 and Amazon's Dynamo: http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
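
To make the 123 versus 321 example concrete, here is a minimal sketch of a version vector, the kind of bookkeeping the Dynamo paper uses (there called vector clocks) to notice that two updates happened independently. The class `VersionVector`, its methods, and the site ids are hypothetical names chosen for illustration. The sketch only detects a conflict; deciding which value wins is still up to your application.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical version vector: one update counter per site (database location). */
final class VersionVector {
    enum Order { BEFORE, AFTER, EQUAL, CONCURRENT }

    private final Map<String, Long> counters = new HashMap<>();

    /** Records one local update made at the given site. */
    void increment(String siteId) {
        counters.merge(siteId, 1L, Long::sum);
    }

    /** Returns an independent copy, e.g. to ship alongside a replicated row. */
    VersionVector copy() {
        VersionVector v = new VersionVector();
        v.counters.putAll(counters);
        return v;
    }

    /** Compares this vector with another one; CONCURRENT signals a conflict. */
    Order compare(VersionVector other) {
        boolean less = false;
        boolean greater = false;
        Set<String> sites = new HashSet<>(counters.keySet());
        sites.addAll(other.counters.keySet());
        for (String site : sites) {
            long mine = counters.getOrDefault(site, 0L);
            long theirs = other.counters.getOrDefault(site, 0L);
            if (mine < theirs) less = true;
            if (mine > theirs) greater = true;
        }
        if (less && greater) return Order.CONCURRENT; // neither side saw the other's update
        if (less) return Order.BEFORE;
        if (greater) return Order.AFTER;
        return Order.EQUAL;
    }
}
```

If both halves start from the same vector and each increments its own site's counter during the split, `compare` returns `CONCURRENT` when they reconnect, which is the point where a conflict-resolution rule has to kick in.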

SquareCog
  • The Yahoo paper is very interesting. The idea of developing my own solution was already on my mind. I'd love to have something like Git for databases. – jassuncao Sep 24 '09 at 18:46
  • The thing about Git is that it makes you perform a manual merge when there are conflicting updates. That's generally not a viable option for databases, so you need a consistency model that leads to as few surprises as possible. – SquareCog Sep 24 '09 at 19:13
  • [Mirror for the Yahoo link](http://web.archive.org/web/20090926073638/http://research.yahoo.com/node/2304) – joseLuís Oct 08 '19 at 10:10

Check out SymmetricDS. SymmetricDS is web-enabled, database-independent data synchronization/replication software. It uses web and database technologies to replicate tables between relational databases in near real time. The software was designed to scale to a large number of databases, work across low-bandwidth connections, and withstand periods of network outage.
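
To give a feel for what "withstand periods of network outage" means in practice, here is a minimal sketch of the general store-and-forward idea: local changes are queued and only removed once the remote side acknowledges them. This is not SymmetricDS's API; `ChangeOutbox`, `record`, and `flush` are hypothetical names, and the queue is kept in memory only for brevity (a real one would have to live in the database so it survives restarts).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

/** Hypothetical store-and-forward queue; illustrative only, not SymmetricDS's API. */
final class ChangeOutbox<T> {
    private final Deque<T> pending = new ArrayDeque<>();

    /** Records a local change; it stays queued until delivery succeeds. */
    synchronized void record(T change) {
        pending.addLast(change);
    }

    /**
     * Tries to deliver queued changes in order. The sender returns true when
     * the remote side acknowledged the change. Delivery stops at the first
     * failure, so ordering is preserved and nothing is dropped while the
     * link is down. Returns the number of changes delivered in this attempt.
     */
    synchronized int flush(Predicate<T> sender) {
        int delivered = 0;
        while (!pending.isEmpty()) {
            T next = pending.peekFirst();
            boolean acked;
            try {
                acked = sender.test(next);
            } catch (RuntimeException e) {
                acked = false; // treat any exception as a network failure
            }
            if (!acked) {
                break; // keep this change and everything after it for the next attempt
            }
            pending.removeFirst();
            delivered++;
        }
        return delivered;
    }

    /** Number of changes still waiting to be delivered. */
    synchronized int size() {
        return pending.size();
    }
}
```

You would call `flush` from a timer whenever the cellular link might be up. Because an acknowledgement can itself get lost, the receiving side should tolerate seeing the same change twice.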

chenson42
  • Yeah. I already looked at it and gave it a spin. Looks pretty good. It's probably what's going to be used. – jassuncao Jan 30 '10 at 00:43

I don't know your requirements or your apps, but this isn't a quick-answer type of question. I'm very interested to see what others have to say. However, I have a suggestion that may or may not work for you, depending on your requirements and situation. In particular, this will not help if your users need to use the app even when the network is unavailable (offline access).

Keeping a bunch of small databases synchronized is a fairly complex task to do correctly. Is there any possibility of just having one centralized database, and either having the client applications connect directly to it or (my preferred solution) writing some web services to handle accessing/updating data, rather than having a bunch of client databases?

I realize this limits offline access, but there are various caching strategies you can use. (Which, of course, leads you back to your original question.)
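
As one example of such a caching strategy, here is a minimal sketch of a read-through cache that falls back to the last value it saw when the call to the central service fails. `StaleFallbackCache` and the `remote` function are hypothetical; I'm assuming the remote call throws a runtime exception on network failure. It only covers reads, so offline writes would still need something more.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical read-through cache that serves the last known value when the remote call fails. */
final class StaleFallbackCache<K, V> {
    private final Map<K, V> lastKnown = new ConcurrentHashMap<>();
    // Assumed to be something like a web-service call that throws on network failure.
    private final Function<K, V> remote;

    StaleFallbackCache(Function<K, V> remote) {
        this.remote = remote;
    }

    /** Returns the fresh value if reachable, otherwise the last value seen (possibly stale). */
    Optional<V> get(K key) {
        try {
            V fresh = remote.apply(key);
            if (fresh != null) {
                lastKnown.put(key, fresh); // remember it for later outages
            } else {
                lastKnown.remove(key);     // the central database no longer has it
            }
            return Optional.ofNullable(fresh);
        } catch (RuntimeException networkFailure) {
            // Network is down: fall back to whatever we saw last, if anything.
            return Optional.ofNullable(lastKnown.get(key));
        }
    }
}
```

The trade-off is that users may see out-of-date data during an outage, which is acceptable only if, as in the question, real-time consistency is not required.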

David