
It is known that a Cassandra partition has a theoretical limit of 2 billion cells. But how does that limit apply in a situation like the one below?

create table table1 (
    some_id int PRIMARY KEY,
    some_name text
);

create table table2 (
    other_id int PRIMARY KEY,
    other_name text
);

Assume we have 1 billion cells in partition (some_id = 1) on table1. If we had another 1 billion cells in partition (other_id = 1) on table2, would those add up to the 2 billion theoretical limit?

In other words, are equal partition keys in different tables stored together?

L. Sandes
  • Possible duplicate of [Cassandra has a limit of 2 billion cells per partition, but what's a partition?](http://stackoverflow.com/questions/20512710/cassandra-has-a-limit-of-2-billion-cells-per-partition-but-whats-a-partition) – RussS Apr 18 '16 at 18:05
  • Thanks RussS for your comment. It is slightly different from that post but this difference is interesting for me. Every post I found stated what is a partition key and how they are distributed throughout the cluster but none of them ever brought the info requested in my post (I guess). At least it still concerns me. – L. Sandes Apr 18 '16 at 19:10
  • Different tables have different partitions even with the same token – RussS Apr 18 '16 at 19:26
  • That's it RussS, thanks! It was not clear for me until you answered (even after reading the post you suggested). – L. Sandes Apr 18 '16 at 19:42
  • If they were the same "partition" then the partition would have heterogeneous composition (if the table had a different structure). The storage under the hood is also segregated based on table. Eh I'll just format all this into another answer. – RussS Apr 19 '16 at 00:23

1 Answer


Different tables have different partitions. This makes the structure of any particular partition homogeneous (it will always follow the prescribed schema of a single table), which allows for optimizations.

If you look at the storage engine under the hood, you'll see that every table even has its own directory structure, which makes it clear that a partition from one table will never interact with a partition from another (see /var/lib/cassandra/).
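
To make the counting concrete, here is a toy Python model of the idea. It is not Cassandra code and does not reflect how the storage engine is implemented; the names `cells`, `add_cells` and `CELL_LIMIT` are made up for illustration. The only assumption it encodes is the one stated above: a partition is identified by the table plus the partition key, never by the key alone.

```python
from collections import defaultdict

# Theoretical per-partition cell limit mentioned in the question (2 billion).
CELL_LIMIT = 2_000_000_000

# Toy model: cell counts are tracked per (table, partition key) pair.
cells = defaultdict(int)

def add_cells(table, partition_key, count):
    """Record `count` more cells in one partition and return its new total."""
    cells[(table, partition_key)] += count
    return cells[(table, partition_key)]

add_cells("table1", 1, 1_000_000_000)  # partition some_id = 1 in table1
add_cells("table2", 1, 1_000_000_000)  # partition other_id = 1 in table2

# Each partition is compared against the limit on its own; nothing is summed
# across tables, even though both partition keys are equal to 1.
for (table, key), total in sorted(cells.items()):
    print(table, key, total, "under limit" if total < CELL_LIMIT else "at limit")
```

In this sketch the two partitions stay at 1 billion cells each, so neither one reaches the 2 billion figure.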

RussS
  • For a single table on a single-node Cassandra cluster, does adding 1 billion cells in partitionKey=a and 1 billion cells in partitionKey=b result in Cassandra hitting that limit? – Apoorv Jun 30 '17 at 23:14
  • The theoretical limit is per partition, not per cluster or node. – RussS Jul 01 '17 at 00:20