I have a nearly empty three-node Cassandra cluster running v2.0.9, and I ran nodetool repair on one node:

$ nodetool repair
[2014-10-11 13:11:17,862] Starting repair command #1, repairing 768 ranges for keyspace myks

The command then appears to hang indefinitely (at least a few hours so far). However, /var/log/cassandra/system.log shows the following:

$ date
Sat Oct 11 15:55:02 PDT 2014
$ grep 'session completed successfully' /var/log/cassandra/system.log | head -5
 INFO [AntiEntropySessions:1] 2014-10-11 13:11:21,023 RepairSession.java (line 282) [repair #c281ead0-5182-11e4-a96f-63ead2d0f3f0] session completed successfully
 INFO [AntiEntropySessions:2] 2014-10-11 13:11:23,544 RepairSession.java (line 282) [repair #c46020b0-5182-11e4-a96f-63ead2d0f3f0] session completed successfully
 INFO [AntiEntropySessions:3] 2014-10-11 13:11:25,631 RepairSession.java (line 282) [repair #c5df46a0-5182-11e4-a96f-63ead2d0f3f0] session completed successfully
 INFO [AntiEntropySessions:4] 2014-10-11 13:11:32,216 RepairSession.java (line 282) [repair #c71ef290-5182-11e4-a96f-63ead2d0f3f0] session completed successfully
 INFO [AntiEntropySessions:1] 2014-10-11 13:11:38,950 RepairSession.java (line 282) [repair #cb0b9610-5182-11e4-a96f-63ead2d0f3f0] session completed successfully
$ grep 'session completed successfully' /var/log/cassandra/system.log | tail -5
 INFO [AntiEntropySessions:3] 2014-10-11 13:30:49,527 RepairSession.java (line 282) [repair #7b9e2720-5185-11e4-a96f-63ead2d0f3f0] session completed successfully
 INFO [AntiEntropySessions:2] 2014-10-11 13:30:51,671 RepairSession.java (line 282) [repair #7cdbd740-5185-11e4-a96f-63ead2d0f3f0] session completed successfully
 INFO [AntiEntropySessions:4] 2014-10-11 13:30:56,139 RepairSession.java (line 282) [repair #7e232450-5185-11e4-a96f-63ead2d0f3f0] session completed successfully
 INFO [AntiEntropySessions:1] 2014-10-11 13:30:58,633 RepairSession.java (line 282) [repair #80ccc080-5185-11e4-a96f-63ead2d0f3f0] session completed successfully
 INFO [AntiEntropySessions:3] 2014-10-11 13:31:01,744 RepairSession.java (line 282) [repair #82497570-5185-11e4-a96f-63ead2d0f3f0] session completed successfully
$ grep 'session completed successfully' /var/log/cassandra/system.log | wc -l
357

This suggests that nodetool repair was moving along swiftly, completing the first 357 of (I assume) 768 repair sessions in about 20 minutes, and then hung for some reason.
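The three grep commands above can be folded into one pass over the log. This is only a sketch: `summarize_repair` is a hypothetical helper name, the field positions assume the log line format shown above, and comparing the count against 768 relies on my assumption that each range gets its own session.

```shell
# Hypothetical helper: summarize repair-session progress from system.log.
# $1 = path to the log file
# $2 = number of ranges reported by "Starting repair command"
summarize_repair() {
    log="$1"
    total="$2"
    awk -v total="$total" '
        /session completed successfully/ {
            n++
            # Fields 3 and 4 are the date and time in the format shown above.
            ts = $3 " " $4
            if (n == 1) first = ts
            last = ts
        }
        END {
            printf "completed: %d of %d\n", n, total
            if (n > 0) {
                printf "first: %s\n", first
                printf "last: %s\n", last
            }
        }
    ' "$log"
}
```

Running `summarize_repair /var/log/cassandra/system.log 768` would report the completed count along with the first and last completion timestamps, which makes the point where progress stopped easy to spot.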

How can I diagnose this? Nothing immediately jumps out at me in the logs, but there's a lot of information in them.
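One way I could cut the log down is to look only at WARN and ERROR lines written at or after the last successful session (13:31:01,744 in the tail output above). The sketch below assumes every log line follows the same "LEVEL [thread] date time ..." layout as the INFO lines shown; `problems_after` is a hypothetical helper name, and continuation lines such as stack traces are not printed.

```shell
# Hypothetical helper: print WARN/ERROR lines logged at or after a cutoff.
# $1 = path to the log file
# $2 = cutoff timestamp, e.g. "2014-10-11 13:31:01,744"
problems_after() {
    log="$1"
    cutoff="$2"
    awk -v cutoff="$cutoff" '
        ($1 == "WARN" || $1 == "ERROR") {
            # Thread names may contain spaces, so search for the first
            # field that looks like a YYYY-MM-DD date instead of using $3.
            for (i = 2; i < NF; i++) {
                if ($i ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/) {
                    # Timestamps in this format sort correctly as strings.
                    if (($i " " $(i + 1)) >= cutoff) print
                    break
                }
            }
        }
    ' "$log"
}
```

For example, `problems_after /var/log/cassandra/system.log "2014-10-11 13:31:01,744"` would list only the warnings and errors from the period in which the repair appears stuck.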

jonderry
  • It's very important to tell us what exact version of C* you are using since the repair has been heavily modified between 1.2, 2.0 and 2.1 – Lyuben Todorov Oct 12 '14 at 04:52
  • I edited the question. It's 2.0.9 – jonderry Oct 13 '14 at 17:48
  • I answered a similar question about monitoring repair operations here: http://stackoverflow.com/questions/25064717/how-do-i-know-if-nodetool-repair-is-finished/25081283#25081283 – Aaron Oct 13 '14 at 18:37

0 Answers