Cassandra database model

Question

I just switched to Cassandra and I have such items to model:

1.House - Here is its view which has 2.city, 3.zip and 4.property type

I will also need all cities view and I need all property types and zips for search form completions (not implemented yet).

To the best of my knowledge I should create 4 tables and make Primary keys ..."foreign keys", should not I?

I have also small request - like Harry Truman “GIVE me a one-handed economist,” demanded a frustrated American president. “All my economists say, ‘on the one hand...on the other'” :-).

I do not have enough experience with Cassandra to make a choice if I get "or you can do in this way, or that way", so just give me 1 best schema and I will implement it.

Thank you

score 0 · Accepted Answer · answered Feb 08 '16 at 19:28

To the best of my knowledge I should create 4 tables and make Primary keys ..."foreign keys", should not I?

You definitely do not want to do that. First of all, foreign keys do not exist in Cassandra. Secondly, what you're talking about is modeling from a relational standpoint. With Cassandra, you don't want data for one query spread across multiple tables, because that could be spread across multiple nodes. And querying multiple nodes introduces more network time into the equation, which is slow.

In Cassandra, you want to take a query-based modeling approach. Sometimes that can mean one table for each query. Bearing that in mind, I am hearing that you need to query your properties two different ways:

By "house" (MLS?)
By city

Essentially, you should have a table to serve each of those queries:

CREATE TABLE housesbymls (
    mls text,
    city text,
    price bigint,
    propertytype text,
    state text,
    street text,
    year bigint,
    zip text,
    PRIMARY KEY (mls));

CREATE TABLE housesByCity (
  mls text,
  street text,
  city text,
  state text,
  zip text,
  propertyType text,
  price bigint,
  year bigint,
  PRIMARY KEY ((state,city),zip,mls));

After upserting some data, I can query by MLS:

aploetz@cqlsh:stackoverflow> SELECT * FROM housesByMLS WHERE mls='09110857';

 mls      | city     | price  | propertytype     | state | street                         | year | zip
----------+----------+--------+------------------+-------+--------------------------------+------+-------
 09110857 | Palatine | 104900 | Condominium Unit |    IL | 1025 North Serling Avenue, 211 | 1978 | 60067

(1 rows)

And I can query by state/city or state/city/zip:

aploetz@cqlsh:stackoverflow> SELECT * FROM housesByCity
    WHERE state='IL' AND city='Palatine';

or:

aploetz@cqlsh:stackoverflow> SELECT * FROM housesByCity
    WHERE state='IL' AND city='Palatine' AND zip='60067';

Both of those return:

 state | city     | zip   | mls      | price  | propertytype     | street                         | year
-------+----------+-------+----------+--------+------------------+--------------------------------+------
    IL | Palatine | 60067 | 09110857 | 104900 | Condominium Unit | 1025 North Serling Avenue, 211 | 1978

(1 rows)

The idea behind the PRIMARY KEY structure on this one, is that state and city make up the partitioning key (which helps Cassandra figure out where to put that row in the cluster) so they are both required. Then, as cities can have multiple zip codes, you can also focus your query on that. PRIMARY KEYs in Cassandra are unique, so I put mls on the end to ensure uniqueness.

Thank you SOO much, Aaron for your brilliant and comprehensive answer. This is exactly what I needed, I already went to incorrect way. I need some time to read your answer line by line... — Svitlana, Feb 08 '16 at 19:36
@Svitlana Glad I could help! Another post on SO that might help you, is this answer by Carlo about the different types of keys in Cassandra and their uses: http://stackoverflow.com/questions/24949676/difference-between-partition-key-composite-key-and-clustering-key-in-cassandra/24953331#24953331 — Aaron, Feb 08 '16 at 19:39
Thank you, Aaron. Will try to read and understand before implementations :-). Customers always want it to be done by tomorrow and usually I need to do it twice :-). Thank you for your time, Aaron! — Svitlana, Feb 08 '16 at 19:46
Aaron, thank you for your help. I read your article - We Shall Have Order.... maybe it's my women logic, but I cannot get how to handle the problem article described - CREATE TABLE house ( id text, since timestamp, built timestamp, mls text, beds text, baths text, type_id text, description text, image text, street text, city_id text, zip_id text, price bigint, PRIMARY KEY (id,since) )WITH CLUSTERING ORDER BY (since DESC); And for a query select * from house I want houses to sorted in desc order. Thx — Svitlana, Feb 24 '16 at 19:19

Cassandra database model

1 Answers1