Cassandra clustering key uniqueness

Question

In the book Cassandra: The Definitive Guide it is said that the combination of partition key and clustering key guarantees a unique record in the database. My understanding is that the partition key is what guarantees the uniqueness of a record, since it determines the node where the record is stored, and that the clustering key is only for sorting the records. Can someone help me understand this? Thanks, and sorry for the question.

Does this answer your question? [Difference between partition key, composite key and clustering key in Cassandra?](https://stackoverflow.com/questions/24949676/difference-between-partition-key-composite-key-and-clustering-key-in-cassandra) — Ersoy, May 31 '20 at 05:06

Ersoy · Answer 1 · 2020-05-31T06:10:12.257

A partition key on its own (with no clustering key) is the whole primary key, so it has to be unique.
When the primary key is a partition key plus a clustering key, only the combination has to be unique; neither the partition key nor the clustering key has to be unique by itself.

For example, writing rows as (partition key, clustering key), you can insert:

(a,b) (first record)
(a,c) (same partition key as the first record)
(d,b) (same clustering key as the first record)

If you insert (a,b) again, it updates the non-primary-key values of the existing row with that primary key instead of creating a new row.
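The rule above can be sketched with a toy model in Python. This is only an illustration, not how Cassandra stores data: the `table` dict and the `insert` helper are hypothetical names, and the dict is keyed by the full (partition key, clustering key) pair so that only the combination has to be unique.

```python
# Toy model only: a dict keyed by (partition_key, clustering_key).
# This is NOT Cassandra's storage engine; it only mimics the uniqueness rule.
table = {}


def insert(partition_key, clustering_key, value):
    """Insert a row, or overwrite it if the full primary key already exists."""
    table[(partition_key, clustering_key)] = value


insert("a", "b", "first")
insert("a", "c", "second")   # same partition key as the first row: new row
insert("d", "b", "third")    # same clustering key as the first row: new row
insert("a", "b", "updated")  # same full primary key: overwrites the first row

row_count = len(table)
first_row_value = table[("a", "b")]
print(row_count, first_row_value)
```

Three rows remain after four inserts, because the last insert reused an existing (partition key, clustering key) pair and replaced its value.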

In the following example, userid is the partition key and date is the clustering key.

cqlsh:play> CREATE TABLE example (userid int, date int, name text, PRIMARY KEY (userid, date));
cqlsh:play> INSERT INTO example (userid, date, name) VALUES (1, 20200530, 'a');
cqlsh:play> INSERT INTO example (userid, date, name) VALUES (1, 20200531, 'a');
cqlsh:play> INSERT INTO example (userid, date, name) VALUES (2, 20200531, 'a');
cqlsh:play> SELECT * FROM example;

 userid | date     | name
--------+----------+------
      1 | 20200530 |    a
      1 | 20200531 |    a
      2 | 20200531 |    a

(3 rows)
cqlsh:play> INSERT INTO example (userid, date, name) VALUES (2, 20200531, 'b');
cqlsh:play> SELECT * FROM example;

 userid | date     | name
--------+----------+------
      1 | 20200530 |    a
      1 | 20200531 |    a
      2 | 20200531 |    b

(3 rows)
cqlsh:play>
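As a rough way to picture the session above, the sketch below replays the same four inserts against a small in-memory stand-in. The `partitions` structure and the `insert` and `select_all` helpers are hypothetical and exist only for illustration. Each partition key maps to its own group of rows, and rows inside a group are read back in clustering-key order. Real Cassandra orders partitions by token rather than by key value; sorting by `userid` here is an assumption made only to keep the output deterministic.

```python
from collections import defaultdict

# Hypothetical stand-in for the `example` table:
# partition key (userid) -> {clustering key (date) -> name}
partitions = defaultdict(dict)


def insert(userid, date, name):
    """Upsert: the same (userid, date) pair overwrites the stored name."""
    partitions[userid][date] = name


def select_all():
    """Return rows grouped by partition, sorted by clustering key within each."""
    rows = []
    for userid in sorted(partitions):  # assumption: real order is by token
        for date in sorted(partitions[userid]):
            rows.append((userid, date, partitions[userid][date]))
    return rows


insert(1, 20200530, "a")
insert(1, 20200531, "a")
insert(2, 20200531, "a")
insert(2, 20200531, "b")  # same primary key (2, 20200531): name becomes 'b'

rows = select_all()
for row in rows:
    print(row)
```

The partition for userid 1 holds two rows that differ only in date, which is why the partition key alone cannot identify a single row in this table.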

I think the first part of this answer is wrong. The `primary key` (same as in RDBMS, every row in a table must have a unique primary key) is `partition key` + `clustering columns`. The `partition key` (which is not unique for each row and does not exist in RDBMS) is only used to decide which node the rows in a partition will be stored on. — phonaputer, Jun 03 '20 at 02:42
Actually, "wrong" is a bit harsh. I think the phraseology is pretty confusing though. — phonaputer, Jun 03 '20 at 02:49
Thank you for the additional explanation, @phonaputer. The comment I made on the question has a link that covers almost all the cases of primary, partition, clustering, and composite keys. Since the OP didn't find it useful, I tried to answer in a different way, with examples. — Ersoy, Jun 03 '20 at 02:49
