Optimizing Firebase data structure for two large paths

Question

I think I've wrapped my head around denormalization as a primary method of optimization when storing data in Firebase, as mentioned in questions like this one and in this blog post, but I'm getting stuck on one small detail.

Assuming I have two things in my domain, users and posts as in the blog article I mentioned, I might have 20,000 users and 20,000 posts. Because I denormalized everything like a good boy, root/users/posts exists as does root/posts. root/users/posts has a set of post keys with a value of true so that I can get all post keys for a user.

users: {
    userid: {
        name: 'johnny',
        posts: {
            -Kofijdjdlehh: true,
            -Kd9isjwkjfdj: true
        }
    }
}

posts: {
    -Kofijdjdlehh: {
        title: 'My hot post',
        content: 'this was my content',
        postedOn: '3987298737'
    },        
    -Kd9isjwkjfdj: {
        title: 'My hot post',
        content: 'this was my content',
        postedOn: '3987298737'
    }
}

Now, I want to list the title of all posts a user has posted. I don't want to load all 20,000 posts in order to get the title. I can only think of the following options:

1. Query the root/posts path in some way using the subset of keys that are set to true in the root/users/posts path (if this is possible, I haven't figured out how).
2. Store the title in root/users/posts so that each entry in that path has the title duplicated, so that this:

   posts: { -Kofijdjdlehh: true }

   becomes

   posts: { -Kofijdjdlehh: { title: 'This was my content' } }

   This seems reasonable, but I haven't seen a single example of doing this, so I'm concerned that it's some anti-pattern.
3. Another way I haven't been able to find.

I appreciate any pointers you might have or documentation I might have missed on this use case.

Anid Monsur · Accepted Answer · 2015-10-12T23:37:25.600

Both are valid solutions. #1 means more work for whoever reads the data, while #2 means more work when data is saved. For #2, you'd also have to handle updates to post titles, though this is fairly easy with the new multi-path updates.
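As a rough sketch of what keeping the duplicated title in sync for #2 might look like with a multi-path update: the helper names `buildTitleFanOut` and `renamePost` are made up for illustration, `ref` is assumed to be the root Firebase reference, and the paths assume the structure from the question with the title stored under users/<userId>/posts/<postId>/title.

```javascript
// Sketch only: buildTitleFanOut and renamePost are hypothetical helpers,
// not part of the Firebase SDK.
// Builds one update object whose keys are the paths of every copy of the title.
function buildTitleFanOut(userId, postId, newTitle) {
    var updates = {};
    updates['posts/' + postId + '/title'] = newTitle;
    updates['users/' + userId + '/posts/' + postId + '/title'] = newTitle;
    return updates;
}

// `ref` is assumed to be the root reference, as in the rest of this answer.
function renamePost(ref, userId, postId, newTitle) {
    // A single update() call with path keys writes both locations together.
    return ref.update(buildTitleFanOut(userId, postId, newTitle));
}
```

Keeping the path-building separate from the write makes it easy to check that every copy of the title is covered before anything is sent to Firebase.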

To implement #1, you'd essentially have to do two queries. Here's a really basic solution that only handles posts being added. It listens for posts being added to the user, and then attaches a listener to each post's title.

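// Maps each post id to its current title.
var usersPosts = {};

// Fires once for every existing post id under the user, then for each new one.
ref.child('users').child(userId).child('posts').on('child_added', function(idSnap) {
    var id = idSnap.key();

    // Fires with the current title first, then again whenever the title changes.
    ref.child('posts').child(id).child('title').on('value', function(titleSnap) {
        usersPosts[id] = titleSnap.val();
    });
});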

For a third solution, you could use firebase-util, which automagically handles the above scenario and more. This code would essentially do the same as the code above, except it comes with the bonus of giving you one ref to handle.

new Firebase.util.NormalizedCollection(
        [ref.child('users').child(userId).child("posts"), "posts"],
        [ref.child("posts"), "post"]
).select(
        {
            key: "posts.$value",
            alias: "x"
        },
        {
            key: "post.title",
            alias: "title"
        }
).ref();

Note that the x value will always be true. It's necessary to select that because firebase-util requires you to select at least one field from each path.
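If live updates aren't needed, approach #1 could also be written as a one-time read. The following is only a sketch: `loadPostTitlesOnce` is a hypothetical helper, `ref` is assumed to be the root reference, and it keeps the legacy SDK style used above, where `key()` is a method (in later SDK versions `key` is a property).

```javascript
// Sketch only: loadPostTitlesOnce is a hypothetical helper, not part of Firebase.
// Reads the user's post ids once, then reads each post's title once.
function loadPostTitlesOnce(ref, userId, done) {
    ref.child('users').child(userId).child('posts').once('value', function(idsSnap) {
        // Collect the post ids stored under the user.
        var ids = [];
        idsSnap.forEach(function(idSnap) {
            ids.push(idSnap.key());
        });

        var titles = {};
        var remaining = ids.length;
        if (remaining === 0) {
            done(titles);
            return;
        }

        // Fetch only the title of each post, never the whole posts path.
        ids.forEach(function(id) {
            ref.child('posts').child(id).child('title').once('value', function(titleSnap) {
                titles[id] = titleSnap.val();
                remaining -= 1;
                if (remaining === 0) {
                    done(titles);
                }
            });
        });
    });
}
```

It would be called as `loadPostTitlesOnce(ref, userId, function(titles) { ... })`, where `titles` maps each post id to its title. Only the titles of that user's posts are downloaded.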

These seem to assume I'm already connected and waiting for data to change. What about when a user connects and I need to query all of the post titles at once for a given user that they've historically added? Can I somehow grab the subset of posts from the `posts` path that match any of the ids in the `user/posts` path? — sonicblis, Oct 13 '15 at 13:42
No, the two solutions I've written out do not assume you're connected and waiting for data to change. They retrieve what you're asking for *and* will continue to listen for more data. *child_added* is called for every child, old and new. *value* is called for the existing value first and then on every change. — Anid Monsur, Oct 13 '15 at 15:46
Ah, right right right. That's a tough paradigm to retain for me. Ok, I think this is starting to sink in. The looped data access is well trained out of me, so relearning that part of it is tough. I appreciate the help. — sonicblis, Oct 14 '15 at 16:37
@AnidMonsur can you provide the android (java) code to implement #1? Thank you! — Hon, Mar 11 '17 at 09:46
