I need to do a batch INSERT in Cassandra using Python.
I am using the latest DataStax Python driver.
Each batch of INSERTs writes columns that belong to the same row. I have many rows to insert overall, but the data arrives in chunks that each go into a single row.
I can do individual INSERTs in a for loop, as described in this post:
Parameterized queries with the Python Cassandra Module
I am passing a parameterized query and its values, as shown in that example.
This did not help: How to multi insert rows in cassandra
I am not clear on how to assemble a parameterized batch of INSERTs:
BEGIN BATCH
INSERT(query values1)
INSERT(query values2)
...
APPLY BATCH;
cursor.execute(batch_query)
Is this even possible? Will this speed up my INSERTs? I have to do millions. Even thousands take too long.
I found some Java info:
http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0
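To make the BEGIN BATCH / APPLY BATCH outline above concrete, here is a minimal sketch of one way the parameterized batch text and its flattened values might be assembled in Python. The helper name build_batch, the table readings, and its columns are hypothetical and only for illustration. The sketch assumes the driver accepts %s placeholders with a flat tuple of values for a non-prepared statement, and it does not show whether batching is actually faster than individual INSERTs.

```python
def build_batch(insert_cql, rows):
    """Build one parameterized BEGIN BATCH ... APPLY BATCH statement.

    insert_cql: a single INSERT using %s placeholders.
    rows: an iterable of value tuples, one tuple per INSERT.
    Returns (query, params), where params is every row's values
    flattened in order, matching the placeholders in query.
    """
    statements = []
    params = []
    for row in rows:
        # Repeat the same INSERT text once per row, ending with a semicolon.
        statements.append(insert_cql.rstrip().rstrip(";") + ";")
        params.extend(row)
    query = "BEGIN BATCH\n" + "\n".join(statements) + "\nAPPLY BATCH;"
    return query, tuple(params)


# Hypothetical table and columns, for illustration only.
INSERT_CQL = "INSERT INTO readings (sensor_id, ts, value) VALUES (%s, %s, %s)"
rows = [("s1", 1, 0.5), ("s1", 2, 0.7)]

query, params = build_batch(INSERT_CQL, rows)

# Assumption: `session` is an already connected driver session, in which
# case the batch would be sent with:
#     session.execute(query, params)
```

Both rows in the example share the same sensor_id, mirroring the case where a chunk of data goes into a single row. The DataStax Python driver also provides a BatchStatement class in cassandra.query that may be a better fit than building the CQL text by hand.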