
Refer to this video: https://youtu.be/NsI51Mo6r3o?t=18m48s

The video dates from September 2013, which is quite old in technology terms. However, it raised several challenges NuoDB had at the time. Has NuoDB improved in the following areas?

  1. Race condition in the join process. If nodes join in the wrong order, they end up in a silent split-brain state and will lose data if rejoined later.
  2. Race conditions in database creation and schema operations.
  3. The system is tricky to configure and start in an automated way.
  4. When a node crashes, its storage manager or transaction manager is not brought back, which means data can suddenly become less durable, since you could be left with only 1 or 0 copies of the data.
  5. During a partition, transactions are blocked because CPU/storage resources are being hogged.
Neil Lunn
Howard Lee

1 Answer


Yeah, that was a while ago – but it was very helpful to our engineering team then. We did a lot of work to replicate those tests – and fix the problems they exposed. It’s all written up in a series of blog posts. The best place to start is here:

http://dev.nuodb.com/techblog/network-failure-handling-roundup

It’s the umbrella post for the others, which build up to the full response.

This next post was added a little later so it’s not linked in the above series, but it is still relevant:

http://dev.nuodb.com/techblog/testing-network-failure-aws

And with specific regard to your fourth point, about restarting crashed processes: NuoDB now has the concept of a Managed Database. That just means the database has a defined SLA that it adheres to automatically. The SLA levels range from Single Host, through Minimally Redundant and Multi-Host, to Geo-Distributed. The database will restart or replace lost processes automatically so that it continues to meet its SLA, and you can change the SLA while the database is running.
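To make the idea of "restart or replace lost processes to meet the SLA" concrete, here is a minimal sketch of that kind of reconciliation check. It is not NuoDB's API or implementation: the function name `processes_to_start`, the process labels, and the target counts are all hypothetical, chosen only to illustrate comparing a desired process count against what is currently running.

```python
from collections import Counter


def processes_to_start(target, running):
    """Return how many processes of each kind are missing from `running`.

    `target` maps a process kind to the desired count (a hypothetical
    stand-in for whatever an SLA requires); `running` lists the kinds of
    the processes that are currently alive.
    """
    have = Counter(running)
    return {
        kind: want - have[kind]
        for kind, want in target.items()
        if have[kind] < want
    }


# Hypothetical target: two storage managers and two transaction managers.
target = {"storage_manager": 2, "transaction_manager": 2}

# One storage manager was lost in a crash.
running = ["storage_manager", "transaction_manager", "transaction_manager"]

print(processes_to_start(target, running))  # {'storage_manager': 1}
```

A supervisor running a check like this in a loop would start whatever is reported missing, which is the general shape of keeping the number of data copies from silently dropping after a crash.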

Dai Clegg