I created a database "Movies" with three column families:

CREATE TABLE movies (
    movie_id int primary key,
    title text,
    avg_rating decimal,
    total_ratings int,
    genres set<text>
);

-- shows all ratings for a specific movie
CREATE TABLE ratings_by_movie (
    movie_id int,
    user_id int,
    rating decimal,
    ts int,
    primary key(movie_id, user_id)
);

-- shows all ratings of a specific user
CREATE TABLE ratings_by_user (
    user_id int,
    movie_id int,
    rating decimal,
    ts int,
    primary key(user_id, movie_id)
); 

Is it possible to make the following queries?

  1. Show the movie with the most reviews
  2. Show all movies with the average rating >= 4
  3. Show 100 best movies based on their ratings
KTBFFH
  • Show the query you have attempted so far – piyushj May 16 '16 at 10:44
  • 1. In PostgreSQL I can do something like this: `select movie_id, count(rating) as c from movierating group by (movie_id) order by c desc limit 1;` But I don't know how to use count on a specific column in Cassandra (the rating column, in my case). 2. I have no idea how to calculate an average value in Cassandra. – KTBFFH May 16 '16 at 11:16

1 Answer

Cassandra = No Joins. Your model is 100% relational. You need to rethink it for Cassandra. I would advise you to take a look at these slides. They dig deep into how to model data for Cassandra. Also, here is a webinar covering the topic. But stop thinking in terms of foreign keys and joining tables, because if you need relations, Cassandra isn't the tool for the job.

But Why?

Because then you need to check consistency and do many of the other things that relational databases do, and so you lose the performance and scalability that Cassandra offers.

What can I do?

DENORMALIZE! Lots of data in one table? But the table will have too many columns! So? Cassandra can handle a very large number of columns in a table.
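As a rough sketch of what denormalizing could look like for the three queries in the question: one option is a separate query table per access pattern, clustered on the value you want to sort by. The table names and the `bucket` column below are hypothetical and not part of the original schema. A single bucket value keeps everything in one partition, which is simple but would probably need to be split further for a large catalog.

```cql
-- Hypothetical query table for "100 best movies" and "avg_rating >= 4".
-- bucket is an assumed artificial partition key (always 0 in this sketch).
CREATE TABLE movies_by_avg_rating (
    bucket int,
    avg_rating decimal,
    movie_id int,
    title text,
    total_ratings int,
    PRIMARY KEY (bucket, avg_rating, movie_id)
) WITH CLUSTERING ORDER BY (avg_rating DESC, movie_id ASC);

-- Query 3: the 100 best movies by average rating
SELECT movie_id, title, avg_rating
FROM movies_by_avg_rating
WHERE bucket = 0
LIMIT 100;

-- Query 2: all movies with an average rating >= 4
SELECT movie_id, title, avg_rating
FROM movies_by_avg_rating
WHERE bucket = 0 AND avg_rating >= 4;

-- Hypothetical query table for "movie with the most reviews".
CREATE TABLE movies_by_total_ratings (
    bucket int,
    total_ratings int,
    movie_id int,
    title text,
    PRIMARY KEY (bucket, total_ratings, movie_id)
) WITH CLUSTERING ORDER BY (total_ratings DESC, movie_id ASC);

-- Query 1: the movie with the most reviews
SELECT movie_id, title, total_ratings
FROM movies_by_total_ratings
WHERE bucket = 0
LIMIT 1;
```

Because avg_rating and total_ratings are clustering columns here, they cannot be updated in place. In this sketch the application would compute the new values itself, then delete the old row and insert a new one whenever a rating arrives.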

For more details check: How to do a join queries with 2 or more tables in cassandra cql

piyushj
  • Thank you for your answer! So my queries are not possible in the way I created my tables (without using Spark)? – KTBFFH May 18 '16 at 22:18