I am new to Cassandra. I have two tables, EVENTS and TOWER, and I need to join them for some queries, but I am not able to do it.

Structure of EVENTS table:

eid int PRIMARY KEY,
a_end_tow_id text,
a_home_circle text,
a_home_operator text,
a_imei text,
a_imsi text,

Structure of TOWER table:

 tid int PRIMARY KEY,
 tower_address_1 text,
 tower_address_2 text,
 tower_azimuth text,
 tower_cgi text,
 tower_circle text,
 tower_id_no text,
 tower_lat_d text,
 tower_long_d text,
 tower_name text,

Now I want to join these tables on EID and TID so that I can fetch data from both tables.

ssuperczynski
BlueShark

3 Answers

Cassandra = No Joins. Your model is 100% relational. You need to rethink it for Cassandra. I would advise you to take a look at these slides. They dig deep into how to model data for Cassandra. Also, here is a webinar covering the topic. But stop thinking in terms of foreign keys and joining tables, because if you need relations, Cassandra isn't the tool for the job.

But Why?
Because then you would need to check consistency and do many of the other things that relational databases do, and so you would lose the performance and scalability that Cassandra offers.

What can I do?
DENORMALIZE! Lots of data in one table? But the table will have too many columns!
So? Cassandra can handle a very large number of columns in a table.
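As a rough illustration of denormalizing, the sketch below merges an event and its tower into one wide row before it is written, so a later read needs only one table. It is plain Python over dictionaries rather than real driver code, the `denormalize` helper and the sample values are hypothetical, and it follows the question in matching `eid` to `tid`.

```python
def denormalize(event, tower):
    """Merge one event row and its tower row into a single wide row.

    Hypothetical helper: the tower's key (tid) is dropped because the
    wide row is keyed by the event's eid.
    """
    wide = dict(event)
    wide.update({col: val for col, val in tower.items() if col != "tid"})
    return wide


# Made-up sample rows using column names from the question.
event = {"eid": 1, "a_imei": "imei-1", "a_imsi": "imsi-1"}
tower = {"tid": 1, "tower_name": "North", "tower_circle": "Delhi"}

wide_row = denormalize(event, tower)
print(wide_row)
```

The cost is that tower columns are copied into every event that uses that tower, so a change to a tower has to be written to all of its events.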

The other thing you can do is to simulate the join in your client application. Match the two datasets in your code, but this will be very slow because you'll have to iterate over all your information.
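A client-side join of this kind might look like the sketch below. It assumes both tables have already been read in full into lists of dictionaries (which is the expensive part), and it matches `eid` to `tid` as the question describes. The function name and sample rows are made up for illustration.

```python
def client_side_join(events, towers):
    """Join two fully loaded row lists in application code.

    Towers are indexed by tid first, so each event needs one dictionary
    lookup instead of a scan over every tower.
    """
    towers_by_tid = {tower["tid"]: tower for tower in towers}
    joined = []
    for event in events:
        tower = towers_by_tid.get(event["eid"])
        if tower is not None:
            # Combine the columns of both rows into one result row.
            joined.append({**event, **tower})
    return joined


# Made-up sample data; real rows would come from two full-table reads.
events = [{"eid": 1, "a_imei": "imei-1"}, {"eid": 2, "a_imei": "imei-2"}]
towers = [{"tid": 1, "tower_name": "North"}]

joined_rows = client_side_join(events, towers)
print(joined_rows)
```

Even with the index, every row of both tables still has to be fetched and held in the client, which is why this does not scale to large tables.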

Another way is to carry out multiple queries. Select the event you want, then the matching tower.
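The two-query approach could be sketched as follows. The `execute` argument is an assumption: any callable that takes a CQL string and a parameter tuple and returns rows as dictionaries, for example a thin wrapper around a driver session. Here it is replaced by a hypothetical in-memory stand-in (`fake_execute`) so the sketch runs without a cluster. The tower is looked up by `tid = eid`, as in the question.

```python
def fetch_event_with_tower(execute, eid):
    """Run two single-row lookups: the event first, then its tower."""
    events = list(execute("SELECT * FROM events WHERE eid = %s", (eid,)))
    if not events:
        return None
    towers = list(execute("SELECT * FROM tower WHERE tid = %s", (eid,)))
    return events[0], (towers[0] if towers else None)


# Hypothetical in-memory stand-in for a database session, for illustration only.
TABLES = {
    "events": {1: {"eid": 1, "a_imei": "imei-1"}},
    "tower": {1: {"tid": 1, "tower_name": "North"}},
}


def fake_execute(query, params):
    table = "events" if "FROM events" in query else "tower"
    row = TABLES[table].get(params[0])
    return [row] if row is not None else []


result = fetch_event_with_tower(fake_execute, 1)
print(result)
```

Both lookups hit a primary key, so each is cheap; the trade-off is one extra round trip per event.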

Lyuben Todorov
  • Is it okay to store redundant data in my Cassandra table? For example, I have a user_detail table and a comment table. Both tables have one common column, user_id. What would be the better way of doing things? Should I store redundant data in my comment table so that I don't need to query the other table? – HIRA THAKUR Mar 25 '15 at 08:11
  • `Another way is to carry out multiple queries. Select the event you want, then the matching tower.` Does that mean doing two SELECT queries, e.g. SELECT id FROM table1 and then SELECT col1, col2 FROM table2 WHERE col3 = ? Is that in line with data modeling in Cassandra? – Manish Kumar Jul 22 '15 at 13:14

There are a couple of ways to join tables together in Cassandra and query them, although you still have to rethink the data model.

  1. Use Apache Spark’s SparkSQL™ with Cassandra (either open source or in DataStax Enterprise – DSE).
  2. Use DataStax provided ODBC connectors with Cassandra and DSE.
Shay Rojansky
Mayank Raghav

PlayOrm is a good option for doing joins on scalable systems. It has a special Scalable SQL language in which you join partitions (i.e. you never want to join 1 billion rows with another billion rows). It has tons of NoSQL patterns and is a complete break from Hibernate and JPA, mimicking NoSQL patterns with client-side joins when needed.

Dean Hiller