I need help figuring out the best way to use Firebase for my use cases. I'm making a Twitter clone, but unlike FireFeed, my posts are mutable.

My database is structured like so:

Rooms, essentially a feed of posts you can subscribe to. Members of a room may write posts to that room:

  1. rooms/{roomId}/posts/{postId}/ -> a summary of a post with attributes like title, number of comments, likes, etc
  2. rooms/{roomId}/users/{userId}/ -> list of all the users subscribed to this room
  3. rooms/{roomId}/info/ -> room size, type, name, etc

Users, global list of all the users:

  1. users/{userId}/feed/{postId} -> an aggregate of all rooms/{roomId}/posts/ from all rooms a user is subscribed to
  2. users/{userId}/rooms/{roomId} -> list of all the rooms a user is subscribed to

Posts, global list of actual post content:

  1. posts/{postId}/comments/{commentId} -> nested comments
  2. posts/{postId}/users/{userId} -> users who want push notifications on this post
  3. posts/{postId}/info/ -> number of comments, likes, author, date posted, etc
  4. posts/{postId}/content/ -> actual content

Schools, lists of rooms:

  1. schools/{schoolId}/rooms/{roomName}/info -> list of rooms in a school. Room names are unique within a school
  2. schools/{schoolId}/admins/{userId} -> list of admins that have different authorisation rules

Now, the first two use cases are fine. The confusion comes with how to go about making and editing references to posts. My use cases are:

Use case 0: User wants to create a room.

  1. Do a multi-location update that writes the room object to /rooms/ and the corresponding room summary object to schools/{schoolId}/rooms/ and users/{userId}/rooms/.
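
As a minimal sketch of this step, the three writes could be collected into one path-to-value map, which is the kind of object a multi-location update takes. The helper name `buildCreateRoomUpdate` and the fields of `RoomInfo` are assumptions made for illustration, and so is the choice to store `true` as the marker under users/{userId}/rooms/.

```typescript
// Hypothetical shape of the room info; the field names are assumptions.
interface RoomInfo {
  name: string;
  type: string;
  size: number;
}

// Builds one path -> value map so that all three locations are written together.
function buildCreateRoomUpdate(
  schoolId: string,
  roomId: string,
  userId: string,
  info: RoomInfo
): { [path: string]: unknown } {
  const updates: { [path: string]: unknown } = {};
  // The room object itself.
  updates["rooms/" + roomId + "/info"] = info;
  // The room summary under the school, keyed by the room name.
  updates["schools/" + schoolId + "/rooms/" + info.name + "/info"] = info;
  // The creator's own list of rooms (true is an assumed marker value).
  updates["users/" + userId + "/rooms/" + roomId] = true;
  return updates;
}
```

Because every path is in a single map, the write either lands at all three locations or at none of them.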

Use case 1: User wants to subscribe to a room.

  1. Run a transaction on rooms/{roomId}/. Add {userId} to rooms/{roomId}/users/ and increment room size by 1.
  2. On transaction complete, take that snapshot and do a multi-location update: copy rooms/{roomId}/posts over to users/{userId}/feed, add {roomId} to users/{userId}/rooms/, and write the new size to /schools/{schoolId}/rooms/{roomName}/info.
  3. If the multi-location update fails, remove {userId} from rooms/{roomId}/users/.
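
Step 1 can be sketched as a pure function in the shape of a transaction update function: current value in, new value out. `RoomState` and `addSubscriber` are hypothetical names, only the fields needed here are modelled, and the wiring into the SDK is left out.

```typescript
// Hypothetical shape of rooms/{roomId}; only the fields used here are modelled.
interface RoomState {
  info: { size: number };
  users?: { [userId: string]: boolean };
}

// Returns the new value for rooms/{roomId}. A null or already-subscribed
// room is returned unchanged, so running the function twice is harmless.
function addSubscriber(room: RoomState | null, userId: string): RoomState | null {
  if (room === null) return room;
  const existing = room.users || {};
  if (existing[userId] === true) return room;
  // Copy the member list instead of mutating the input.
  const users: { [id: string]: boolean } = {};
  Object.keys(existing).forEach((id) => {
    users[id] = existing[id];
  });
  users[userId] = true;
  return { info: { size: room.info.size + 1 }, users: users };
}
```

Keeping the member list and the size in the same returned value is what lets the two stay consistent with each other inside the transaction.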

Use case 2: User wants to make a post to a room.

  1. Retrieve list of users from rooms/{roomId}/users
  2. Do a multi-location update that writes the post summary to every user's feed and the post object to the global posts table.

The problem with this is that if another user joins the room right after our user retrieves the list of users, that user is not part of the multi-location update and never receives the new post.
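
To make the race concrete, here is a small sketch of the fan-out built from a subscriber list read at one point in time. `buildPostFanout`, `FeedPostSummary` and the sample IDs are all hypothetical, and writing the summary to rooms/{roomId}/posts/ in the same update is an assumption.

```typescript
// Hypothetical summary shape; the field names are assumptions.
interface FeedPostSummary {
  title: string;
  comments: number;
  likes: number;
}

// Builds the path -> value map for one new post, using the subscriber list
// that was read in step 1.
function buildPostFanout(
  roomId: string,
  postId: string,
  summary: FeedPostSummary,
  content: string,
  subscriberIds: string[]
): { [path: string]: unknown } {
  const updates: { [path: string]: unknown } = {};
  updates["posts/" + postId + "/content"] = content;
  updates["rooms/" + roomId + "/posts/" + postId] = summary;
  subscriberIds.forEach((userId) => {
    updates["users/" + userId + "/feed/" + postId] = summary;
  });
  return updates;
}

// The list is a snapshot: anyone who joins after it was read is not covered.
const subscribersAtRead = ["alice", "bob"];
const fanout = buildPostFanout(
  "r1",
  "p1",
  { title: "Hi", comments: 0, likes: 0 },
  "Hello",
  subscribersAtRead
);
const lateJoiner = "carol";
const carolGetsPost = ("users/" + lateJoiner + "/feed/p1") in fanout; // false
```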

I face a similar conundrum when I'm trying to update the number of comments on a post. If a post has 15 comments, and 2 people add a new comment to this post simultaneously, the global post object will correctly update to 17 when done in a transaction, but depending on the order of the fan-outs after, everyone's post summary object might have 16 comments. The same is true for the size of a room when a user joins a room.
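
The comment-count scenario can be reproduced with a tiny simulation. Nothing here touches Firebase: `makeCounter` stands in for the transactional counter and `landFanouts` for the copied summaries, and both names are hypothetical.

```typescript
// Simulates a transactional counter: each call commits and returns the next value.
function makeCounter(start: number): () => number {
  let value = start;
  return () => {
    value += 1;
    return value;
  };
}

// Fan-out writes simply overwrite the copied count; the last one to land wins.
function landFanouts(initial: number, arrivals: number[]): number {
  let copy = initial;
  arrivals.forEach((count) => {
    copy = count;
  });
  return copy;
}

const commitComment = makeCounter(15);
const firstCommit = commitComment(); // 16
const secondCommit = commitComment(); // 17
// The second writer's fan-out lands first, then the first writer's overwrites it.
const staleSummary = landFanouts(15, [secondCommit, firstCommit]); // 16
```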

How should I go about this? Is there a way to better model my data, or is there a way to correctly do multi location updates during a transaction?

Vinay Nagaraj
  • There is no API to do a multi-location update that takes the current value of (some of) the nodes it updates. The common options are: 1) run the transaction high enough in the tree (this limits concurrency though), 2) change the data structure to push the shared data down in the tree, 3) use a trusted process (e.g. a small node.js server) to run the logic that would be in a transaction. – Frank van Puffelen Jun 06 '16 at 15:15
  • Also see my (slightly related) answer here: http://stackoverflow.com/questions/30693785/how-to-write-denormalized-data-in-firebase – Frank van Puffelen Jun 06 '16 at 15:15
  • @FrankvanPuffelen could you please elaborate on 2? How could I change the data structure? – Vinay Nagaraj Jun 13 '16 at 03:20

1 Answer

Looking at your use cases, I think you're making your data model a bit too complicated.

For me, Firebase is a combination of:

  1. no-sql data storage
  2. pub/sub server
  3. web-sockets enabled front-end server
  4. client-side library

This means I would not store users/{userId}/feed/{postId}. Instead, when a client logs in, do the following:

  1. "connect" to every room the user is subscribed to
  2. fetch all recent posts from the subscribed rooms
  3. order the posts client-side

This way, when someone makes a new post or comment, it is written to only one location: rooms/{roomId}/posts/
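
A minimal sketch of steps 2 and 3, assuming each room's recent posts have already been fetched into an array. The `PostSummary` fields (in particular `postedAt`) and the helper name `mergeRecentPosts` are assumptions for illustration.

```typescript
// Hypothetical summary shape; postedAt is assumed to be a numeric timestamp.
interface PostSummary {
  postId: string;
  title: string;
  postedAt: number;
}

// Merges the recent posts of every subscribed room and keeps the newest `limit`.
function mergeRecentPosts(
  postsByRoom: { [roomId: string]: PostSummary[] },
  limit: number
): PostSummary[] {
  let merged: PostSummary[] = [];
  Object.keys(postsByRoom).forEach((roomId) => {
    merged = merged.concat(postsByRoom[roomId]);
  });
  // Newest first.
  merged.sort((a, b) => b.postedAt - a.postedAt);
  return merged.slice(0, limit);
}
```

The ordering is only correct up to what was fetched: a room that contributed too few posts can still hide older ones that belong on the page.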

I would also remove your like/post/comment counters. When I need one of these numbers, I would call numChildren on the corresponding node instead, which avoids unnecessary concurrent writes to a shared counter.
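
The idea behind deriving the count can be sketched without the SDK: count the keys of the node instead of maintaining a separate number. `countChildren` is a hypothetical helper, and the node is assumed to be a plain object keyed by child ID.

```typescript
// Derives a count from the node's children instead of a stored counter.
function countChildren(node: { [key: string]: unknown } | null): number {
  return node === null ? 0 : Object.keys(node).length;
}

const commentsNode = { c1: { text: "first" }, c2: { text: "second" } };
const commentCount = countChildren(commentsNode); // 2
```

The trade-off is that the children have to be loaded on the client before they can be counted.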

0xCrema.eth
  • Hmm, I had previously structured my data as you said, but the client-side re-ordering and pagination wasn't trivial. Getting a page of the most recent 100 posts would mean getting 100 posts from each room to ensure they are ordered by recency. This gets out of hand very quickly. Apart from that, I wanted to optimize for reads over writes. Tbh, I'm fine with off-by-one errors and inconsistent counts for stats like likes, users, etc., but I'm not fine with the possibility of a user missing out on a post in a room. – Vinay Nagaraj Jun 06 '16 at 13:04
  • I have to admit the first load might be tricky to achieve, but why not get the most recent post from each room and show those to the client? Once your page/app is loaded with some posts, check whether there are interesting older posts, as is done in some chat apps where only the most recent messages are loaded and not all unread messages. – 0xCrema.eth Jun 06 '16 at 13:16
  • If all rooms were approximately equally active, then getting the 100/n most recent posts from each room to show ~100 posts would work, but my domain does not let me make any such assumption about the activity of different rooms. Even if the rooms were equally active, each read requires one network RTT per room a user is subscribed to, whereas writes require just one. I'd rather have it the other way around and optimize for reads. – Vinay Nagaraj Jun 06 '16 at 13:46
  • I never said it was a perfect solution; I just gave an idea. But you said "each read requires one network RTT per room a user is subscribed to whereas writes require just one". No! Firebase works over WebSockets. Any update on the client is transmitted to the server without initiating a new connection, because a connection already exists. The server pushes the update to other connected (and interested) clients over their existing connections. In other words, your loading time depends only on the size of your data; the RTT is negligible with Firebase. – 0xCrema.eth Jun 06 '16 at 14:08
  • Ah, I wasn't aware of that. Thanks! – Vinay Nagaraj Jun 06 '16 at 14:41