Clustering order does not work with compound partition key

Question

With the following table definition:

CREATE TABLE device_by_create_date (
    year int,
    comm_nr text,
    created_at timestamp,
    PRIMARY KEY ((year, comm_nr), created_at)
) WITH CLUSTERING ORDER BY (created_at DESC)

Comm_nr is a unique identifier.

I would expect to see data ordered by created_at column, which is not the case when I add data.

Example entries:

Table CQL:

How can I issue select * from table; queries, which return data ordered by the created_at row?

score 4 · Accepted Answer · answered Feb 28 '18 at 17:52

4

TLDR: You need to create a new table.

Your partition key is (year, comm_nr). You're created_at key is ordered but it is ordered WITHIN that partition key. A query where SELECT * FROM table WHERE year=x AND comm_nr=y; will be ordered by created_at.

Additionally if instead of (year, comm_nr), created_at your key was instead year, comm_r, created_at even if your create table syntax only specifiied created_at as the having a clustering order, it would be created as WITH CLUSTERING ORDER BY (comm_nr DESC, created_at DESC). Data is sorted within SSTables by key from left to right.

The way to do this in true nosql fashion is to create a separate table where your key is instead year, created_at, comm_nr. You would write to both on user creation, but if you needed the answer for who created their account first you would instead query the new table.

answered Feb 28 '18 at 17:52

Highstead

2,291
3
26
30

Thank you! Lets say I want to display all devices from 2017 and 2018. In this case I would issue 2 queries - what about paging the result? PagingState is only available per ResultSet. Custom implentation? – Alex Tbk Feb 28 '18 at 18:04
You should be able to get away with `SELECT * FROM table WHERE year IN (2017, 2018) ...` This will be "1 query" from the application side and should be ordered in that fashion. Understand though that the 1 query will hit `*` nodes when it tries to achieve consistency. So your cluster will be doing more work than when on a regular query. – Highstead Feb 28 '18 at 18:13
2

Thats right, but wasnt the usage of IN discouraged? – Alex Tbk Feb 28 '18 at 18:15
Alternatively, you could instead do the coordination(custom implementation) of that sorting in the application layer as TYPICALLY the application layer will scale better than the Data layer. `IN` is discouraged because of that scaling issue above. You're putting more work on the coordinator node (memory pressure and network) and you're hitting more of the cluster with one query. – Highstead Feb 28 '18 at 18:16

Clustering order does not work with compound partition key

1 Answers1