3

I have a question: what happens if we try to insert the same data into a Cassandra database? By "same" I mean that a set of 100 rows is already present in Cassandra, say in a test column family. If we try to insert the same 100 rows again, i.e. rows with the same row keys, will they be inserted again?

Cœur
user1278493

1 Answer

5

The rows will not be duplicated; they will be overwritten. You only get a second copy if you place the data in a different column family or keyspace.

Docs:
The first column value in the VALUES list is the row key value to insert. List column values in the same order as the column names are listed in the INSERT list. If a row or column does not exist, it will be inserted. If it does exist, it will be updated.
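
To make the quoted behavior concrete, here is a minimal sketch that models it with a plain Python dict. It is a toy in-memory stand-in, not the Cassandra driver: the `upsert` helper, the `test_cf` table and the `key0`..`key99` row keys are all hypothetical names made up for illustration.

```python
# Toy in-memory model of "insert if absent, update if present".
# This is NOT the Cassandra API; upsert() and test_cf are hypothetical.

def upsert(table, row_key, columns):
    """Create the row if row_key is absent, otherwise overwrite the given columns."""
    row = table.setdefault(row_key, {})
    row.update(columns)
    return table


test_cf = {}  # stands in for the "test" column family
rows = [(f"key{i}", {"value": i}) for i in range(100)]

# First load: 100 rows are created.
for key, cols in rows:
    upsert(test_cf, key, cols)

# Second load of the same 100 rows: every row key already exists,
# so each row is overwritten in place and no new rows appear.
for key, cols in rows:
    upsert(test_cf, key, cols)

row_count = len(test_cf)
```

After both loads the table still holds 100 rows, which is the "overwritten, not duplicated" outcome described above.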

Lyuben Todorov
  • Thank you for the help. Is there any way to avoid overwriting? I mean, if there is only 1 new row to insert into Cassandra and the other 100 rows are duplicates, it should only take the time needed to insert that one new row, and save time by not overwriting the 100 rows again. – user1278493 Aug 13 '13 at 10:06
  • Why insert the other 100? Why not just insert the one row? It makes no sense to insert 100 duplicate rows. – Lyuben Todorov Aug 13 '13 at 10:21
  • Say, for example, I have a small dataset, but that data keeps getting updated with new data. So I copy the whole dataset to Cassandra each time, since I don't know which of the data is new. – user1278493 Aug 13 '13 at 10:32
  • Smells like poor design. If you are worried about efficiency, you should definitely track what is new and only commit that to Cassandra. – Lyuben Todorov Aug 13 '13 at 10:34
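
One possible way to "track what is new", sketched under assumptions: keep a fingerprint of each row that was already written, and on the next load send only the rows whose fingerprint is missing or different. The helper names (`row_fingerprint`, `select_changed_rows`), the `seen` dict and the sample data are hypothetical, and the actual write to Cassandra is deliberately left out.

```python
import hashlib
import json


def row_fingerprint(columns):
    """Stable hash of a row's columns (assumes the values are JSON-serializable)."""
    payload = json.dumps(columns, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def select_changed_rows(dataset, seen):
    """Return only rows that are new or modified since the last call.

    `seen` maps row key -> fingerprint and is updated in place. In a real
    loader you would likely update it only after the write has succeeded.
    """
    changed = {}
    for row_key, columns in dataset.items():
        fingerprint = row_fingerprint(columns)
        if seen.get(row_key) != fingerprint:
            changed[row_key] = columns
            seen[row_key] = fingerprint
    return changed


seen = {}
dataset = {f"key{i}": {"value": i} for i in range(100)}

first_batch = select_changed_rows(dataset, seen)   # all 100 rows are new

dataset["key100"] = {"value": 100}                 # one new row arrives
second_batch = select_changed_rows(dataset, seen)  # only that row is selected
```

Only the rows in `second_batch` would then need to be sent to Cassandra. For this to help across runs, the `seen` map would have to be persisted somewhere (a local file, for instance), which this sketch does not do.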