I have several large CSV files with the same structure (on the order of a billion rows). I want to select the same columns from every file and import them into a Cassandra table. What is the best way to do this? The files are too large for the COPY command, and I don't think the SSTable approach (the Cassandra bulk loader) is suitable when only some columns are selected, right? My initial idea is to use Python multiprocessing together with batch inserts in the Cassandra Python driver, but I don't know how to use multiprocessing and batch inserts specifically.

Oak
  • Does this help? https://stackoverflow.com/questions/22920678/cassandra-batch-insert-in-python – Tarun Verma Aug 03 '17 at 10:41
  • I have checked this page, but it doesn't help much because it doesn't cover multiprocessing. Thanks anyway, good buddy. – Oak Aug 03 '17 at 11:20
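For illustration, here is a minimal sketch of how the multiprocessing idea from the question might look. It is not a tested solution for a billion-row load. It assumes one worker process per CSV file, each opening its own driver session, and it uses the driver's `execute_concurrent_with_args` helper with a prepared statement rather than batch statements, since batches that span many partitions generally do not speed up bulk loads. The contact point, keyspace, table, and column names (`CONTACT_POINTS`, `KEYSPACE`, `TABLE`, `COLUMNS`) are hypothetical placeholders, and the CSV header names are assumed to match the table's column names.

```python
import csv
import itertools
import sys
from multiprocessing import Pool

# Hypothetical placeholders: replace with the real cluster and schema.
CONTACT_POINTS = ["127.0.0.1"]
KEYSPACE = "my_keyspace"
TABLE = "my_table"
COLUMNS = ["id", "name", "value"]  # CSV header names to keep (assumed = table columns)
CHUNK_SIZE = 10_000                # rows handed to the driver at a time


def selected_rows(path, columns):
    """Stream one CSV file, yielding only the wanted columns as tuples."""
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            # Values are strings here; convert them to the column types
            # (int, float, ...) if the table is not all text.
            yield tuple(row[c] for c in columns)


def chunked(iterable, size):
    """Yield lists of at most `size` items so memory use stays bounded."""
    it = iter(iterable)
    while True:
        chunk = list(itertools.islice(it, size))
        if not chunk:
            return
        yield chunk


def load_file(path):
    """Worker: load one CSV file through a session owned by this process."""
    # Imported here so each worker process sets up the driver itself.
    from cassandra.cluster import Cluster
    from cassandra.concurrent import execute_concurrent_with_args

    cluster = Cluster(CONTACT_POINTS)
    session = cluster.connect(KEYSPACE)
    insert = session.prepare(
        "INSERT INTO {} ({}) VALUES ({})".format(
            TABLE, ", ".join(COLUMNS), ", ".join(["?"] * len(COLUMNS))
        )
    )
    count = 0
    try:
        for chunk in chunked(selected_rows(path, COLUMNS), CHUNK_SIZE):
            # Runs the prepared insert for every row in the chunk, keeping
            # up to `concurrency` requests in flight; raises on first error.
            execute_concurrent_with_args(session, insert, chunk, concurrency=100)
            count += len(chunk)
    finally:
        cluster.shutdown()
    return count


def main(paths, processes=4):
    """Load the given CSV files in parallel, one file per worker at a time."""
    with Pool(processes) as pool:
        for path, n in zip(paths, pool.imap(load_file, paths)):
            print(path, n)


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1:])
```

If saved as, say, `load_csv.py` (a hypothetical name), this could be run as `python load_csv.py part1.csv part2.csv`. Splitting the work by file keeps the workers independent; if there are only a few very large files, they could be split into smaller pieces first so that every process stays busy.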

0 Answers