1

I'm trying to build a news feed system using Cassandra, I was thinking of using a fan out approach wherein if a user posts a new post, I'll write a new record in all of his friends' feed table. The table structure looks like:

CREATE TABLE users (
   user_name TEXT,
   first_name TEXT,
   last_name TEXT,
   profile_pic TEXT,
   PRIMARY KEY (user_name)
); 

CREATE TABLE user_feed (
   user_name TEXT,
   posted_time TIMESTAMP,
   post_id UUID,
   posted_by TEXT, //posted by username
   posted_by_profile_pic TEXT,
   post_content TEXT,
   PRIMARY KEY ((user_name), posted_time)
) WITH CLUSTERING ORDER BY(posted_time desc);

Now, I can get a feed for a particular user in a single query all fine. What if the user who has posted a feed updates his profile pic. How do I go about updating the data in user_feed table?

gameOne
  • 609
  • 1
  • 10
  • 22
  • Your `users` table shows a primary key of `user_id` but theres no column for `user_id` ? – markc May 23 '17 at 11:09
  • sorry, the primary key is user_name – gameOne May 23 '17 at 11:10
  • So you'd write to both tables on a given event I would say, if the user updates their profile picture then it would also go into the `user_feed` table as a "profile updated" event would it not? – markc May 23 '17 at 11:13
  • No, I would have to update posted by user data based on user_name and posted_time which i wouldn't have all the time – gameOne May 23 '17 at 11:41

1 Answers1

2

You can use batch statements to achieve atomicity at your updates. So in this case you can create a batch with the update on tables users and user_feed using the same user_name partition key:

BEGIN BATCH
    UPDATE users SET profile_pic = ? WHERE user_name = ?;
    UPDATE user_feed SET posted_by_profile_pic = ? WHERE user_name = ?;
APPLY BATCH;

Take a look at CQL Batch documentation

Arthur Landim
  • 384
  • 2
  • 9
  • In the user_feed table, the user_name column is of the user whose feed are to be populated and not the user who has posted the post. The profile pic column is of the post owner, that wouldn't work. – gameOne May 23 '17 at 13:25
  • 1
    If you dont want to change your schema, I would go to another approach: storing the most updated picture with the same path (//current_profile_pic), so you wont need to keep updating the records. – Arthur Landim May 23 '17 at 14:01
  • That's a good idea, but works only for that case, say, the user edits the post and the post content has to be updated in all of his friends feed column. I'm willing to change the schema in case you can suggest a better one. – gameOne May 23 '17 at 14:06
  • If, somehow, during the post update, you can get the list of target users (to update the feed) you could create an index for `post_id` (or put as clustering column) and update with those info. You need to consider the workload to decide if the secondary index or clustering column. IMHO, with the few details that I have, you could try a clustering column approach as secondary index would be spread into multiple nodes (check http://docs.datastax.com/en/cql/3.3/cql/cql_using/useSecondaryIndex.html) – Arthur Landim May 23 '17 at 14:19
  • I'm not sure if i can add that in the clustering column, even if i did, i would need to know timestamp of the post – gameOne May 23 '17 at 14:36
  • If you have a third table with the author's posts, then you have the post_time. I thought that you had this third table as you were talking about post content editing. – Arthur Landim May 23 '17 at 14:59
  • Another approach would be to only store the post_id and get the content from the other table on the client side. But as I told you, it will depends on your workload/usage/etc. – Arthur Landim May 23 '17 at 15:00
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/144954/discussion-between-arthur-landim-and-gameone). – Arthur Landim May 23 '17 at 15:02
  • Seems possible, lemme design and get back – gameOne May 23 '17 at 15:13
  • There is a limitation on batch. By default If your batch size is greater than or equal to 50kb then your batch will fail. **Remember That** @gameOne Check this https://stackoverflow.com/a/34701936/2320144 – Ashraful Islam May 23 '17 at 18:27