I am doing a proof of concept on Cassandra using apache-cassandra-3.10 and CassandraCSharpDriver version 3.2.1.
I want to insert a large amount of tick data into Cassandra from C#.
My current schema looks like this:

CREATE TABLE my_keyspace.ticks (
    instrumentcode int,
    timestamp timestamp,
    type smallint,
    exchange smallint,
    price decimal,
    volume int,
    PRIMARY KEY (instrumentcode, timestamp, type, exchange)
) WITH CLUSTERING ORDER BY (timestamp ASC, type ASC, exchange ASC);

I am using a prepared statement in the following way:

//setup
Cluster = Cluster.Builder().AddContactPoints("localhost").Build();
Session = Cluster.Connect("my_keyspace");
ps = Session.Prepare("Insert into ticks (instrumentcode, timestamp, type, exchange, price, volume) values(?,?,?,?,?,?)");
// repeated for every tick, reusing the same prepared statement
var statement = ps.Bind(tickCassandra.Instrumentcode, tickCassandra.Timestamp, tickCassandra.Type, tickCassandra.Exchange, tickCassandra.Price, tickCassandra.Volume);
var x = Session.Execute(statement);

With this code I am stuck at around 600 inserts/second, both on my dev machine (i7) and on my prod-like machine (a 16-core beast).
Do you see any possible performance improvements in my schema or my C# code? Or do I just need to tune the Cassandra configuration further?

weismat

1 Answer

Try using:

//Execute a statement asynchronously
session.ExecuteAsync(statement);

This should give a big boost, roughly 3-4 times your current throughput, because the client no longer waits for each insert to complete before sending the next one.
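To make the idea concrete, here is a minimal sketch of starting several writes without waiting in between and then awaiting them together. The helper name WriteAllAsync and the `write` delegate are hypothetical, introduced only for illustration; in the question's code the delegate would presumably be something like `tick => Session.ExecuteAsync(ps.Bind(...))`. The sketch suits a modestly sized chunk of ticks; for very large volumes the number of requests in flight should be capped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncWrites
{
    // Starts one write per item without waiting in between, then awaits
    // all of them. `write` is a hypothetical stand-in for a call such as
    // tick => Session.ExecuteAsync(ps.Bind(...)).
    public static async Task<int> WriteAllAsync<T>(
        IEnumerable<T> items, Func<T, Task> write)
    {
        // ToList() forces every write to start before we await anything.
        List<Task> pending = items.Select(write).ToList();

        // Completes when all writes have finished; rethrows a failure if any.
        await Task.WhenAll(pending);
        return pending.Count;
    }
}
```

The important part is awaiting the returned tasks: calling ExecuteAsync and discarding the task would hide failed writes.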

Edit after comments:

You also need to be careful with retries and exception handling once you move your app out of the PoC stage. There are a couple of very good and helpful examples (suggested by xmas79, thanks!):

  1. https://stackoverflow.com/a/39643888/6570821
  2. https://stackoverflow.com/a/40524828/6570821
  3. https://stackoverflow.com/a/40794021/6570821
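As a rough illustration of the backpressure and retry points raised here, the sketch below caps the number of writes in flight with a SemaphoreSlim and retries a failed write a few times. It is not taken from the linked answers. The names ThrottledWrites, `write`, `isRetryable`, `maxInFlight` and `maxAttempts`, as well as the default values and the delay, are assumptions made for illustration. With the DataStax driver the predicate might be `ex => ex is WriteTimeoutException`, but which exceptions are safe to retry depends on whether the write is idempotent.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledWrites
{
    // Runs `write` for every item with at most `maxInFlight` running at once.
    // `write` is a hypothetical stand-in for a call such as
    // tick => Session.ExecuteAsync(ps.Bind(...)).
    public static async Task WriteAllAsync<T>(
        IEnumerable<T> items,
        Func<T, Task> write,
        Func<Exception, bool> isRetryable,
        int maxInFlight = 256,   // assumed value, tune for your cluster
        int maxAttempts = 3)     // assumed value
    {
        var gate = new SemaphoreSlim(maxInFlight);
        var pending = new List<Task>();
        foreach (T item in items)
        {
            // Backpressure: wait here until a slot is free.
            await gate.WaitAsync();
            pending.Add(WriteOneAsync(item, write, isRetryable, maxAttempts, gate));
        }
        // Surfaces any write that still failed after its retries.
        await Task.WhenAll(pending);
    }

    private static async Task WriteOneAsync<T>(
        T item, Func<T, Task> write, Func<Exception, bool> isRetryable,
        int maxAttempts, SemaphoreSlim gate)
    {
        try
        {
            for (int attempt = 1; ; attempt++)
            {
                try
                {
                    await write(item);
                    return;
                }
                catch (Exception ex) when (attempt < maxAttempts && isRetryable(ex))
                {
                    // Simple linear backoff before the next attempt.
                    await Task.Delay(TimeSpan.FromMilliseconds(50 * attempt));
                }
            }
        }
        finally
        {
            // Free the slot whether the write succeeded or failed.
            gate.Release();
        }
    }
}
```

The list of pending tasks grows with the number of items, so for millions of ticks it may be preferable to feed the helper in chunks.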
Marko Švaljek
  • You should also suggest applying some sort of backpressure and properly handling write timeouts, which will happen sooner or later. – xmas79 Apr 18 '17 at 08:23
  • Yes, once the app goes out of the PoC stage this should also be done. Thanks for reminding me. I guess it's OK if I just leave it with your comment. I just upvoted it so that it stays on top should anybody else comment. – Marko Švaljek Apr 18 '17 at 08:28
  • Do you have any suggestion for a good code example for a proper handling of write timeouts? – weismat Apr 18 '17 at 08:41
  • I guess this question might be a good starting point: http://stackoverflow.com/questions/30698379/handling-exceptions-for-deferred-tasks-in-cassandra – Marko Švaljek Apr 18 '17 at 08:48
  • I answered a couple of times: 1. http://stackoverflow.com/a/39643888/6570821 2. http://stackoverflow.com/a/40524828/6570821 3. http://stackoverflow.com/a/40794021/6570821 – xmas79 Apr 18 '17 at 08:58
  • Thanks for the help. Will totally bookmark your answers for future reference ;) – Marko Švaljek Apr 18 '17 at 08:59