I'm reading a document about repair in Cassandra. It says:

The comparison begins with the top node of the Merkle tree. If no difference is detected, the process proceeds to the left child node and compares and then the right child node.

However, the non-leaf nodes of a Merkle tree are described as follows:

Each Parent node higher in the tree is a hash of its respective children. Because higher nodes in the Merkle tree represent data further down the tree, Cassandra can check each branch independently without requiring the coordinator node to download the entire data set.

According to this, and to other data structure articles I have found, a comparison deeper than the root should proceed only if the roots of the two Merkle trees differ. Does the document describe this correctly and I have misunderstood something, or does it actually contain an error?

snowmantw
  • This is a mistake in the DataStax docs. I sent them a request to correct it about 3 months ago, but they haven't. – nevsv Jun 28 '17 at 06:57
  • Wow, 3 months ago. I don't know how to send a PR or an issue to them, but if you have the link I can submit new feedback about it. – snowmantw Jun 28 '17 at 07:34
  • I found an email address somewhere on their site. It was a general address, not related to the documentation, so I asked them to forward it to the department in charge. It probably got lost somewhere... You can try contacting their CTO or lead engineers by finding them on LinkedIn/Twitter. – nevsv Jun 28 '17 at 09:44

1 Answer

There is a mistake in the DataStax docs.

There is a good explanation of Merkle tree comparison here:

http://distributeddatastore.blogspot.co.il/2013/07/cassandra-using-merkle-trees-to-detect.html

Many eventually consistent databases use Merkle trees for anti-entropy. You can find documentation and explanations of this in the Riak/DynamoDB documentation.
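
To make the point concrete, here is a minimal sketch of the comparison in Python. It is not Cassandra's actual code: the names `build`, `diff` and `h`, the use of SHA-256, and the power-of-two leaf count are all assumptions made for illustration. It only shows the general idea that a subtree is descended into when the two hashes differ, and skipped when they match.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Hypothetical hash helper; SHA-256 is an arbitrary choice for this sketch.
    return hashlib.sha256(data).digest()


def build(leaves):
    """Return the tree as a list of levels: leaf hashes first, root last.

    Assumes len(leaves) is a power of two to keep the sketch short.
    """
    level = [h(x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        # Each parent is the hash of its two children concatenated.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def diff(a, b, depth=None, index=0):
    """Return the indices of the leaves that differ between trees a and b."""
    if depth is None:
        depth = len(a) - 1  # start at the root
    if a[depth][index] == b[depth][index]:
        return []  # hashes match: the whole subtree is identical, stop here
    if depth == 0:
        return [index]  # reached a differing leaf
    # Hashes differ: compare the left child, then the right child.
    return (diff(a, b, depth - 1, 2 * index)
            + diff(a, b, depth - 1, 2 * index + 1))


replica_a = build([b"k0=v0", b"k1=v1", b"k2=v2", b"k3=v3"])
replica_b = build([b"k0=v0", b"k1=v1", b"k2=XX", b"k3=v3"])
print(diff(replica_a, replica_b))  # only the third leaf differs
print(diff(replica_a, replica_a))  # roots match, nothing to descend into
```

With two identical trees the function returns right after comparing the roots, which is why the docs' "if no difference is detected, proceed to the children" reads as backwards.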

nevsv