1

I have a 2-node Cassandra cluster with a replication factor of 2. The client sends data to node 1 only. If both nodes are running, the data is replicated from node 1 to node 2. However, if I start only node 1 first, the client sends data to node 1 and then stops sending. After that I start node 2. I expect the data to be replicated "late" (asynchronously) from node 1 to node 2, but it is not. How can I configure this to work?

My Cassandra version is 2.1.6.

duong_dajgja
  • Did you run `nodetool cleanup` on the first node? Or have you tried repairing the second node? – Aaron Jul 31 '15 at 04:44
  • I want this to be done automatically. Are there any configuration parameters for this? I mean, there is no problem with the second node. I just want the data it missed to be replicated from the first node. – duong_dajgja Jul 31 '15 at 04:48

2 Answers

2

Whenever a node is down while a write happens, it misses that write, so the coordinator stores a 'hint' and delivers the data to the node once it comes back online.

Hints are not collected forever, though: if the node is down for too long, the coordinator stops storing hints for it. You can configure this period with the `max_hint_window_in_ms` setting in cassandra.yaml. I believe the default is 3 hours. Increasing this window could resolve your issue.
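
As a rough illustration of the window, the sketch below converts an outage tolerance in hours into the millisecond value that `max_hint_window_in_ms` expects and checks whether a given downtime still falls inside the window. The helper names are hypothetical; only the setting name and the 3-hour default come from the answer above.

```python
from datetime import timedelta

# Default hint window as stated above: 3 hours, in milliseconds.
DEFAULT_HINT_WINDOW_MS = 3 * 60 * 60 * 1000


def hint_window_ms(hours):
    """Convert an outage tolerance in hours to a max_hint_window_in_ms value."""
    return int(timedelta(hours=hours).total_seconds() * 1000)


def within_hint_window(downtime, window_ms=DEFAULT_HINT_WINDOW_MS):
    """Return True if a node down for `downtime` is still inside the hint window."""
    return downtime.total_seconds() * 1000 <= window_ms


def yaml_line(hours):
    """Format the cassandra.yaml line for the chosen window (hypothetical helper)."""
    return "max_hint_window_in_ms: %d" % hint_window_ms(hours)


# Example: tolerate a 6-hour outage instead of the default 3 hours.
print(yaml_line(6))
print(within_hint_window(timedelta(hours=4)))
print(within_hint_window(timedelta(hours=4), hint_window_ms(6)))
```

With the default window, a 4-hour outage falls outside it, so the writes missed after the window closed would have to be recovered by a repair or read repair instead.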

Otherwise, the conflict will be resolved through a read repair when this row of data is requested. If you set a sufficient read consistency level, then this will be resolved before the result is returned to the client.
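
What counts as "sufficient" can be sketched with the usual overlap rule: a read is guaranteed to touch at least one replica that holds the latest write when the number of replicas read plus the number of replicas written exceeds the replication factor. The snippet below is only a minimal model of that arithmetic, not driver code, and it assumes the client in the question writes at consistency level ONE (the writes succeeded with a single node up).

```python
def quorum(rf):
    """Number of replicas in a quorum for replication factor rf."""
    return rf // 2 + 1


def replicas_required(level, rf):
    """Replicas that must respond for a given consistency level."""
    levels = {"ONE": 1, "TWO": 2, "QUORUM": quorum(rf), "ALL": rf}
    return levels[level]


def read_sees_latest_write(write_level, read_level, rf):
    """True if read and write replica sets must overlap (R + W > RF)."""
    r = replicas_required(read_level, rf)
    w = replicas_required(write_level, rf)
    return r + w > rf


# Replication factor 2, as in the question; writes assumed to use ONE.
for read_level in ("ONE", "QUORUM", "ALL"):
    print(read_level, read_sees_latest_write("ONE", read_level, rf=2))
```

With a replication factor of 2, QUORUM and ALL both require two replicas, so such reads only succeed while both nodes are up.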

http://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_about_hh_c.html

Alec Collier
  • Actually, the data was replicated in my cluster, but it was very slow. Is there any way to speed up the "late" replication described in my question? – duong_dajgja Jul 31 '15 at 08:47
  • If the hint is lost, the replication has to be triggered by a repair. If you have strong consistency requirements, increase the consistency level on the read. – Alec Collier Aug 01 '15 at 14:00
0

First, you have to set `max_hint_window_in_ms`. Second, you have to set the token for both machines: if, at insertion time, the row key's token does not map to the right Cassandra node, the write will go to another node. If you have a two-node cluster, also make both nodes seed nodes.