
I want to use two fields as primary key (without clustering key).

PRIMARY KEY ((a, b)) => does that mean a + b is the primary key? Or is it just the partition key?

I'm confused

Aaron
sparkless

2 Answers


A primary key definition of:

PRIMARY KEY ((a, b))

...sets both a and b as a composite partition key. In this scenario, there is no clustering key.

This definition:

PRIMARY KEY (a, b)

...uses a as the partition key and b as the clustering key.
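Side by side, the two definitions might look like the sketch below. The table names `by_ab` and `by_a`, the column types, and the extra `val` column are hypothetical (the question does not give a schema), and a keyspace is assumed to be already selected.

```cql
-- Composite partition key: a and b together form the partition key.
-- There is no clustering key, so each (a, b) pair is its own partition.
CREATE TABLE by_ab (
    a int,
    b int,
    val text,
    PRIMARY KEY ((a, b))
);

-- Compound primary key: a is the partition key, b is the clustering key.
-- Rows that share the same a live in one partition, ordered by b.
CREATE TABLE by_a (
    a int,
    b int,
    val text,
    PRIMARY KEY (a, b)
);
```

In both tables the primary key is made up of `a` and `b`; what differs is how the rows are grouped into partitions.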

For more info, I recommend Carlo's famous answer to this question:

Difference between partition key, composite key and clustering key in Cassandra?

Aaron

To add to Aaron's response, the brackets (( and )) combine the 2 columns into one partition key. This means that you need to provide both columns in your filter in order to query the table:

SELECT ... FROM ... WHERE a = ? AND b = ?

Neither of these queries is valid because each filters on only 1 of the 2 columns:

SELECT ... FROM ... WHERE a = ?
SELECT ... FROM ... WHERE b = ?
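As a concrete sketch, assuming a hypothetical table `by_ab` with integer columns `a` and `b` (names and types are illustrative, not from the question):

```cql
-- Hypothetical table with a composite partition key.
CREATE TABLE by_ab (
    a int,
    b int,
    val text,
    PRIMARY KEY ((a, b))
);

-- Valid: both partition key columns are restricted,
-- so the partition can be located directly.
SELECT * FROM by_ab WHERE a = 1 AND b = 2;

-- Not valid as written: only part of the partition key is restricted.
SELECT * FROM by_ab WHERE a = 1;
SELECT * FROM by_ab WHERE b = 2;
```

Some Cassandra versions may accept the last two queries if `ALLOW FILTERING` is appended, but that is a scan rather than a direct partition lookup, so it is best not relied on.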

For what it's worth, I've explained the terms "composite partition key" and "compound primary key" with some real examples to illustrate the differences in this post -- https://community.datastax.com/questions/6171/. Cheers!

Erick Ramirez
  • Excellent addendum. Thanks Erick! – Aaron Sep 03 '21 at 14:46
  • If we set `PRIMARY KEY (a)`, the primary key and the partition key are both `a`. If we set `PRIMARY KEY ((a, b))`, the partition key and the primary key are both `a` and `b`. In that case, if `a` is the same but `b` is different, the two rows are stored in different partitions, because `a` and `b` together form the primary key (one row per partition). If we set `PRIMARY KEY (a, b)`, the partition key is `a` and the clustering key is `b`. In that case, if `a` is the same but `b` is different, the two rows are stored in the same partition, because the value of the partition key is the same. Am I right? – sparkless Sep 03 '21 at 20:40
  • No, this is incorrect -- "In that case if a is the same but b is different, these two data is stored in different partitions." The partition key `(a, b)` identifies where the partition is stored -- they are not 2 different partitions. Have a look at the link I posted for an explanation. Cheers! – Erick Ramirez Sep 04 '21 at 01:16
  • You are right, but the primary key is equal to the partition key if no clustering key is specified. Am I wrong? `PRIMARY KEY ((a, b))` means each unique combination of `a` and `b` is its own partition and primary key. `PRIMARY KEY (a, b)` means rows that have the same value of `a` go to the same partition. – sparkless Sep 04 '21 at 06:33
  • Correct, primary key = partition key if there is no clustering column. And yes, BOTH `a` AND `b` uniquely identify the partition in the cluster. There is no `a`, there is no `b` -- but BOTH `a,b` only. Cheers! – Erick Ramirez Sep 04 '21 at 07:15