
BigTable-like databases store rows sorted by their keys.

Cassandra uses the combination of partition and clustering keys to keep the data distributed and sorted, yet you are able to select rows with only the partition key.

How is Cassandra architected to work this way?

For example, one way around this in RocksDB is to keep a default column family keyed by the partition key and a second column family keyed by the combination of partition and clustering keys. You iterate over the sorted data in the second and retrieve the rows from the default one, but you end up with very high space complexity.

Update: I guess Cassandra stores each column under a different key. It starts at the partition key and iterates over the different "column names", which are perhaps combined with the clustering columns. Refer to the picture of the underlying storage engine.

SELECT * FROM authors WHERE name = 'Tom Clancy' AND year = '1993', in a table where "name" is the partition key and "year" and "title" are the clustering columns.

The visualization of the Cassandra storage layer for the above query.

Koen J.

2 Answers


All data in Cassandra is stored by partition, so when you have a condition only on the partition key(s), you retrieve all rows that have that partition key; they are written one after another. You can find more information in the DSE Architecture guide.
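
To make the idea concrete, here is a toy Python model of it. This is only a sketch, not how Cassandra is implemented: the class `ToyTable`, its method names, and the `note` values are made up for illustration. Each partition is a list of rows kept sorted by clustering key, so a lookup by partition key alone returns the whole list in clustering order.

```python
import bisect
from collections import defaultdict


class ToyTable:
    """Toy model (hypothetical): partition key -> rows sorted by clustering key."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, partition_key, clustering_key, values):
        rows = self.partitions[partition_key]
        # Find the slot that keeps the partition ordered by clustering key.
        keys = [ck for ck, _ in rows]
        idx = bisect.bisect_left(keys, clustering_key)
        if idx < len(rows) and rows[idx][0] == clustering_key:
            rows[idx] = (clustering_key, values)  # same primary key: overwrite
        else:
            rows.insert(idx, (clustering_key, values))

    def select_by_partition(self, partition_key):
        # Only the partition key is needed: the rows already sit next to each other.
        return list(self.partitions.get(partition_key, []))


table = ToyTable()
# Clustering key is (year, title), as in the question's table; notes are sample data.
table.insert("Tom Clancy", (1993, "Without Remorse"), {"note": "sample"})
table.insert("Tom Clancy", (1984, "The Hunt for Red October"), {"note": "sample"})
table.insert("Tom Clancy", (1987, "Patriot Games"), {"note": "sample"})

for clustering_key, values in table.select_by_partition("Tom Clancy"):
    print(clustering_key, values)
```

The rows come back ordered by year and then title regardless of insertion order, which is the property that makes range queries inside one partition cheap.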

Alex Ott
  • I understand that. To allow this duplication and keep the data sorted by clustering keys, I am guessing Cassandra stores each column under a different key in that order, so each row is a range query on columns (please refer to the updated question with the visualization that I have found). Am I correct? – Koen J. Nov 25 '18 at 19:01
  • Yes, something like - data is sorted by every clustering column before writing into the files... then Cassandra is able to do effective range queries, and even aggregations inside the same partition – Alex Ott Nov 25 '18 at 19:07
  • OK. Based on the picture above and what we said, do you have any input on how Cassandra recognizes the last column of a row, especially with NULL columns, considering they should not occupy any space? – Koen J. Nov 25 '18 at 19:30
  • Sorry, don’t understand you - you mean “how C* understands where to stop reading?” – Alex Ott Nov 25 '18 at 19:32
  • Yes, where does C* stop reading columns for a single row? Obviously, since each column is stored in a different key, I guess we have to stop once the number of columns is reached, but if we have NULL columns, how do we know we are at the end of a row? – Koen J. Nov 25 '18 at 19:36
  • I don't remember precisely, but most probably it's looking for the same values of the clustering keys, and if they are changed for cell, then it means that it went to another row already... – Alex Ott Nov 26 '18 at 11:12
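
The row-boundary idea from the last comment can be sketched in Python. This assumes a flat, already sorted stream of cells where each cell carries the partition key, the clustering values, and a column name; that layout and the names `cells` and `rows_from_cells` are assumptions for illustration, not Cassandra's real on-disk format. A NULL column is simply a cell that is absent, so no column count is needed to find where a row ends.

```python
from itertools import groupby

# Hypothetical cell stream: (partition_key, clustering_values, column_name, value).
# The second row has no "rating" cell, which models a NULL column.
cells = [
    ("Tom Clancy", (1987, "Patriot Games"), "format", "hardcover"),
    ("Tom Clancy", (1987, "Patriot Games"), "rating", 4),
    ("Tom Clancy", (1993, "Without Remorse"), "format", "paperback"),
]


def rows_from_cells(sorted_cells):
    """Group cells into rows; a row ends when the clustering values change."""
    rows = []
    for (pk, ck), group in groupby(sorted_cells, key=lambda c: (c[0], c[1])):
        # Every cell in this group shares the same partition and clustering values.
        columns = {name: value for _, _, name, value in group}
        rows.append((pk, ck, columns))
    return rows


for row in rows_from_cells(cells):
    print(row)
```

The reader never needs to know how many columns the table defines: it keeps consuming cells until the clustering prefix differs, and any column missing from the group is reported as NULL.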

Cassandra has a partition key and a clustering key, as you mentioned.

Here is a very short and clear explanation of the subject, with good examples: DataStax, "The most important thing to know in Cassandra data modeling: The primary key".

The important takeaways from this document are:

The first element in our PRIMARY KEY is what we call a partition key. The partition key has a special use in Apache Cassandra beyond showing the uniqueness of the record in the database. The other purpose, and one that very critical in distributed systems, is determining data locality.

This explains why selecting rows by the partition key alone is part of Cassandra's design.

If the primary key has more than one column in its definition:

All columns listed after the partition key are called clustering columns. This is where we take a huge break from relational databases. Where the partition key is important for data locality, the clustering column specifies the order that the data is arranged inside the partition.

When clustering columns are designed well, read queries should take less time than they would without clustering columns.

Aside from the link above, you can find a really good explanation and examples in this Stack Overflow question: "Difference between partition key, composite key and clustering key in Cassandra?".

Update:

The database stores and locates the data using a nested sort order. The data is stored in a hierarchy that the query must traverse. You have a shared key for different values of the clustering columns. Take a look here: Clustering columns
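
A nested sort order can be imitated in Python with tuples, which compare element by element. The sketch below assumes a single flat, sorted list of (name, year, title) keys and a helper named `prefix_scan`; both are illustrative choices, not Cassandra's API. One simplification to keep in mind: the real database finds the partition by hashing the partition key rather than by sorting on it, so only the ordering inside a partition is modeled faithfully here.

```python
import bisect

# Hypothetical flat layout, sorted by (partition key, year, title).
keys = sorted([
    ("Tom Clancy", 1984, "The Hunt for Red October"),
    ("Tom Clancy", 1987, "Patriot Games"),
    ("Tom Clancy", 1993, "Without Remorse"),
    ("Tom Clancy", 1994, "Debt of Honor"),
    ("Agatha Christie", 1934, "Murder on the Orient Express"),
])


def prefix_scan(sorted_keys, prefix):
    """Return every key starting with `prefix`: binary search, then scan forward."""
    # A shorter tuple sorts before any longer tuple it is a prefix of,
    # so bisect_left lands on the first matching key.
    start = bisect.bisect_left(sorted_keys, prefix)
    end = start
    while end < len(sorted_keys) and sorted_keys[end][:len(prefix)] == prefix:
        end += 1
    return sorted_keys[start:end]


# WHERE name = 'Tom Clancy'
print(prefix_scan(keys, ("Tom Clancy",)))
# WHERE name = 'Tom Clancy' AND year = 1993
print(prefix_scan(keys, ("Tom Clancy", 1993)))
```

Each extra clustering column in the WHERE clause just lengthens the prefix, which is why conditions have to follow the declared clustering order to stay a contiguous scan.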

Gal S
  • I know that. What I am primarily talking about is the storage layer of Cassandra; please refer to the update. I am guessing Cassandra stores each column value under a different key. Am I correct? – Koen J. Nov 25 '18 at 19:03
  • The database stores and locates the data using a nested sort order. The data is stored in a hierarchy that the query must traverse. You have shared key for different values of the clustering columns. Take a look here: [Clustering columns](https://docs.datastax.com/en/dse/5.1/cql/cql/cql_using/whereClustering.html) – Gal S Nov 25 '18 at 20:51