
I want to perform a transaction with multiple write operations (~5 inserts/updates to different tables) in Cassandra. If any of them fails, the rest should not be written (either roll back each operation or fail the whole transaction).

Please let me know the proper approach for this in Cassandra and how to do it (an example would be welcome).

georgeliatsos

1 Answer


Yes, you can use the logged batch functionality to accomplish this atomically. Note that you do take a performance hit. See the BATCH Statements section of the C++ driver documentation.

Here is an example of how to do this in C++, taken from the documentation linked above. It shows how to batch an INSERT, an UPDATE, and a DELETE together:

/* This logged batch makes sure that all the mutations eventually succeed */
CassBatch* batch = cass_batch_new(CASS_BATCH_TYPE_LOGGED);

/* Statements can be immediately freed after being added to the batch */

{
   CassStatement* statement
      = cass_statement_new(cass_string_init("INSERT INTO example1(key, value) VALUES ('a', '1')"), 0);
   cass_batch_add_statement(batch, statement);
   cass_statement_free(statement);
}

{
   CassStatement* statement
      = cass_statement_new(cass_string_init("UPDATE example2 set value = '2' WHERE key = 'b'"), 0);
   cass_batch_add_statement(batch, statement);
   cass_statement_free(statement);
}

{
   CassStatement* statement
      = cass_statement_new(cass_string_init("DELETE FROM example3 WHERE key = 'c'"), 0);
   cass_batch_add_statement(batch, statement);
   cass_statement_free(statement);
}

CassFuture* batch_future = cass_session_execute_batch(session, batch);

/* Batch objects can be freed immediately after being executed */
cass_batch_free(batch);

/* This will block until the query has finished */
CassError rc = cass_future_error_code(batch_future);

printf("Batch result: %s\n", cass_error_desc(rc));

cass_future_free(batch_future);
Aaron
  • Hi Aaron, thank you for your thorough answer. Could it be a Cassandra transaction? Since you mentioned the performance hit on using the batch functionality of Cassandra, is there any alternative solution that has better performance? Thank you in advance. – georgeliatsos Aug 02 '16 at 09:09
  • @GeorgeL This is similar to the implementation of transactions in other databases. The main difference is that in Cassandra you take a performance hit (about 30% or so) because it has to use a coordinator node to talk to all of the nodes that may be affected by your batch. In the RDBMS world, by contrast, batching multiple statements into a "transaction" is considered a performance improvement. You can use what's called an "unlogged batch" for less of a performance hit, but an unlogged batch cannot guarantee atomicity if your batched statements affect multiple partitions. – Aaron Aug 02 '16 at 18:50
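
The logged/unlogged distinction mentioned in the last comment also exists in plain CQL: `BEGIN BATCH ... APPLY BATCH` is logged by default, and `BEGIN UNLOGGED BATCH` opts out of the batch log. As a rough sketch only, the C function below assembles such a string from individual statements. `build_cql_batch` is a hypothetical helper made up for illustration and is not part of the DataStax driver; the table names are the ones from the answer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (NOT a driver function): joins CQL statements into one
 * "BEGIN [UNLOGGED] BATCH ... APPLY BATCH;" string.
 * logged != 0 -> logged batch (atomic across partitions, slower)
 * logged == 0 -> unlogged batch (faster, no cross-partition atomicity)
 * Returns the number of characters written, or -1 if `out` is too small. */
int build_cql_batch(char* out, size_t out_size,
                    const char* const* statements, size_t count, int logged)
{
    size_t used = 0;
    size_t i;
    int n;

    /* Opening keyword line */
    n = snprintf(out, out_size, "%s\n",
                 logged ? "BEGIN BATCH" : "BEGIN UNLOGGED BATCH");
    if (n < 0 || (size_t)n >= out_size) return -1;
    used = (size_t)n;

    /* One indented, semicolon-terminated line per statement */
    for (i = 0; i < count; ++i) {
        n = snprintf(out + used, out_size - used, "  %s;\n", statements[i]);
        if (n < 0 || (size_t)n >= out_size - used) return -1;
        used += (size_t)n;
    }

    /* Closing keyword line */
    n = snprintf(out + used, out_size - used, "APPLY BATCH;");
    if (n < 0 || (size_t)n >= out_size - used) return -1;
    used += (size_t)n;

    return (int)used;
}
```

Passing the three statements from the answer with `logged` set to 1 yields a single CQL string that can be run from cqlsh. With the C/C++ driver itself, the equivalent switch is the batch type given to `cass_batch_new` (`CASS_BATCH_TYPE_LOGGED` versus `CASS_BATCH_TYPE_UNLOGGED`).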