I do not quite understand what happens to the data in memtables if an instance fails. Since writes go to memory before they are written to disk, do we lose everything the memtable has not yet flushed to disk when the instance fails? For example:

1) User 1 inserts something into Cassandra.
2) My application sees that the insert succeeded, so it notifies the user that the data has been inserted.
3) The insert is still only in memory on the instance, and the instance fails before the commit log filled up, so no flush to disk occurred.

Did User 1 just lose his data?

user2924127

3 Answers

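It depends on how you have configured the commit log (the commitlog_sync setting). The default is periodic, which gives better performance but less durability, because writes are acknowledged before the commit log has been synced to disk. Alternatively, you can set it to batch, which will not ack writes until they have been written to disk. See more information here.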
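
To make the difference between the two modes concrete, here is a minimal toy model in Python. It is only a sketch of the behaviour described above, not Cassandra's actual implementation: the class `ToyCommitLog` and its methods are hypothetical names invented for illustration, and the periodic timer is simulated by calling `sync()` by hand.

```python
class ToyCommitLog:
    """Toy model of one node's commit log; NOT Cassandra's real code."""

    def __init__(self, sync_mode):
        if sync_mode not in ("periodic", "batch"):
            raise ValueError("sync_mode must be 'periodic' or 'batch'")
        self.sync_mode = sync_mode
        self.unsynced = []  # appended, but not yet synced to disk
        self.on_disk = []   # durable: survives a crash

    def write(self, mutation):
        """Append a mutation and return once the write would be acked."""
        self.unsynced.append(mutation)
        if self.sync_mode == "batch":
            # batch: sync to disk before acknowledging the write
            self.sync()
        # periodic: acknowledge right away; the sync happens later on a timer
        return "ack"

    def sync(self):
        """Move everything appended so far to durable storage."""
        self.on_disk.extend(self.unsynced)
        self.unsynced.clear()

    def crash_and_replay(self):
        """Simulate a crash: unsynced data is gone, synced data is replayed."""
        self.unsynced.clear()
        return list(self.on_disk)


periodic = ToyCommitLog("periodic")
periodic.write("row-1")
periodic.sync()                      # the periodic timer fires
periodic.write("row-2")              # acked, but the node dies before the next sync
print(periodic.crash_and_replay())   # ['row-1']

batch = ToyCommitLog("batch")
batch.write("row-1")
batch.write("row-2")
print(batch.crash_and_replay())      # ['row-1', 'row-2']
```

In the periodic case the application was told "ack" for row-2 even though this node could not recover it after the crash, which is exactly the scenario in the question.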

Jim Meyer

The short answer is yes, but only if (as @Jim Meyer said) commitlog_sync is set to periodic and all replicas crash within commitlog_sync_period_in_ms of receiving the write.

Eugen Constantin Dinca

If data has been written to the commit log on disk, it will not be lost (unless the disk is corrupted). Data that was not yet flushed from the memtable is replayed from the commit log when the node comes back up.

The periodic/batch options affect when data is written to the commit log on disk. If the node fails before the write reaches the commit log, then yes, data might be lost, but only if the replication factor is 1. If the replication factor is greater than one, all of the replicas have to go down simultaneously for the data to be lost.
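
The replication argument can be sketched as a small Python check. This is a simplified illustration under stated assumptions, not a model of Cassandra internals: the function names are hypothetical, every replica is assumed to have received the write, each replica is assumed to sync its commit log at exact multiples of the sync period, and the 10,000 ms default is the usual value of commitlog_sync_period_in_ms.

```python
def replica_loses_write(write_ms, crash_ms, sync_period_ms):
    """True if this replica crashes after the write but before its next sync.

    crash_ms is None for a replica that never crashes.
    Assumption: syncs happen at exact multiples of sync_period_ms.
    """
    if crash_ms is None:
        return False
    next_sync_ms = (write_ms // sync_period_ms + 1) * sync_period_ms
    return write_ms <= crash_ms < next_sync_ms


def write_is_lost(write_ms, replica_crash_ms, sync_period_ms=10_000):
    """The write is lost only if every replica loses it."""
    if not replica_crash_ms:
        raise ValueError("need at least one replica")
    return all(
        replica_loses_write(write_ms, crash_ms, sync_period_ms)
        for crash_ms in replica_crash_ms
    )


# Write arrives at t = 12,000 ms; the next sync is at t = 20,000 ms.
print(write_is_lost(12_000, [15_000]))                  # True  (RF = 1, node dies in time)
print(write_is_lost(12_000, [15_000, 16_000, None]))    # False (one replica survives)
print(write_is_lost(12_000, [15_000, 16_000, 19_999]))  # True  (all three die before syncing)
print(write_is_lost(12_000, [20_000]))                  # False (sync happened first)
```

With a replication factor of 3, the write survives as long as at least one replica stays up until its next commit log sync.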