How can I find out when a Cassandra table becomes "eventually consistent"? Is there a definitive way to determine this at a given point in time, preferably programmatically through the DataStax driver API? I checked the responses to the following related questions, but there does not seem to be anything more concrete than "check the nodetool netstats output":

  1. Methods to Verify Cassandra Node Sync
  2. how do i know if nodetool repair is finished
  • It's an eventually consistent system, which means that it's never consistent at a given point in time for all the records in the system. – dilsingi Feb 08 '18 at 01:52
  • Let's say I update a Cassandra table once a year. I find it hard to believe that there is not a way to check if that table is consistent after that one update. – Sundar Venkataraman Feb 08 '18 at 10:18
  • You can read a consistent result for any individual row that was written or updated. That's the key for a distributed system, rather than worrying about all the rows in the entire table at all times. Yes, if there was only one update in the entire year, followed by a repair with no nodes going down, you will get the row consistently. But almost certainly no real-time use case comes even close to satisfying that boundary. – dilsingi Feb 08 '18 at 13:57
  • If you write once a year, you can run a repair and it will report whether it's consistent for the snapshot in time at which the validation compaction ran (-seq for using a literal snapshot/hardlinks for a specific point in time). This is not something you can do through the driver. Instead, build around assumptions of time, i.e. if you run incremental repairs every 30 min, then you know that in the worst case your data will be consistent within an hour of writing. If it's a full repair every 7 days, then it can take a week in the worst case (avg likely sub 10ms same DC). – Chris Lohfink Feb 08 '18 at 16:12
  • https://www.youtube.com/watch?v=lwIA8tsDXXE – Chris Lohfink Feb 08 '18 at 16:15
  • @ChrisLohfink I think you hit the nail right on the head when you said "run a repair and it will report if its consistent for the snapshot in time the validation compaction ran". That is exactly what I am looking for. Could you let me know how to do this? I looked at **nodetool repair** documentation but there does not seem to be a way to specify the timestamp to verify consistency against? I am ok with settling for a command-line based solution if that will work properly. – Sundar Venkataraman Feb 09 '18 at 14:16
  • It's not a normal request, so do a repair with -seq. The logs will print when it started, and once it has all the Merkle trees it will report whether there are any inconsistencies or not. – Chris Lohfink Feb 09 '18 at 17:26
  • @ChrisLohfink Thanks. This works for me! – Sundar Venkataraman Feb 11 '18 at 13:15
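
The log check described in the comments above could be sketched roughly as follows. This is only an illustration: the two message shapes ("are consistent for" and "range(s) out of sync for") are an assumption about what the repair session writes to the log and may differ between Cassandra versions, and the function name `summarize_repair_log` is hypothetical.

```python
import re

# ASSUMPTION: repair sessions log one line per replica pair in roughly
# these shapes; check the wording against your own Cassandra version.
CONSISTENT = re.compile(r"Endpoints (\S+) and (\S+) are consistent for (\S+)")
OUT_OF_SYNC = re.compile(
    r"Endpoints (\S+) and (\S+) have (\d+) range\(s\) out of sync for (\S+)"
)


def summarize_repair_log(lines, table):
    """Count consistent replica pairs and out-of-sync ranges for one table."""
    consistent_pairs = 0
    out_of_sync_ranges = 0
    for line in lines:
        match = CONSISTENT.search(line)
        if match and match.group(3) == table:
            consistent_pairs += 1
            continue
        match = OUT_OF_SYNC.search(line)
        if match and match.group(4) == table:
            out_of_sync_ranges += int(match.group(3))
    return {
        "consistent_pairs": consistent_pairs,
        "out_of_sync_ranges": out_of_sync_ranges,
        # Only "in sync" if at least one pair was compared and none differed.
        "in_sync": consistent_pairs > 0 and out_of_sync_ranges == 0,
    }


# Usage (the log path is an assumption; it depends on the installation):
# with open("/var/log/cassandra/system.log") as log:
#     print(summarize_repair_log(log, "my_table"))
```

The result only describes the point in time at which the validation compaction ran, so any write that arrives afterwards is not covered by it.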

1 Answer

If your system is always online and performing operations, it may never become fully consistent at a single point in time unless you are using consistency level "ALL".

The repair process writes an error to the log file if it does not get a reply from other replica nodes because they were down, timed out, etc. You can check the logs: if there are no errors related to AntiEntropy or streaming, your system is almost consistent.
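
As a rough sketch of that log check, the snippet below filters log lines down to ERROR entries that mention AntiEntropy or streaming. The keyword matching is an assumption about how such errors appear in the log (for example in thread or class names), and the function name `repair_errors` is hypothetical.

```python
import re

# ASSUMPTION: repair-related failures show up as ERROR lines that mention
# "AntiEntropy" or "stream" somewhere (thread name, class name, or message).
LEVEL = re.compile(r"\bERROR\b")
KEYWORDS = re.compile(r"AntiEntropy|stream", re.IGNORECASE)


def repair_errors(lines):
    """Return the ERROR lines that look related to repair or streaming."""
    return [
        line.rstrip("\n")
        for line in lines
        if LEVEL.search(line) and KEYWORDS.search(line)
    ]


# Usage (the log path is an assumption; it depends on the installation):
# with open("/var/log/cassandra/system.log") as log:
#     for entry in repair_errors(log):
#         print(entry)
```

An empty result only suggests that the repair met no replica failures; it does not prove that every row is in sync.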

Payal