I'm new to Node.js and Cloud Functions for Firebase, so I'll try to be specific with my question.

I have a Firebase database whose objects include a "score" field. I want to retrieve the data ordered by that field, which is easy to do on the client side.

The issue is that, if the database grows large, I'm worried that the query will take too long to return, consume a lot of resources, or both. That's why I was thinking of an HTTP service built with Cloud Functions that stores a cache of the top N objects and uses a listener to update it whenever the score of any object changes.

Then the client just has to call something like https://myexampleprojectroute/givemethetoplevels to receive a JSON document with the top N levels.

Is this reasonable? If so, how can I approach it? Which data structures do I need for this cache, and how do I return them in JSON format via HTTP?

For the moment I'll keep doing it on the client side, but I'd really like to have this, both for performance and for learning purposes.

Thanks in advance.

EDIT:

In the end I did not implement the optimization. First, the Firebase database does not expose a "child count", so I couldn't find a way to implement it with my beginner-level JavaScript knowledge. Second, and more importantly, I'm fairly sure the data won't scale up to millions of entries (it will have at most 10K), and Firebase has rules for optimizing sorted reads. For more information, please check out this link.

I'll also post a simple code snippet that retrieves data from the database via an HTTP request using Cloud Functions, in case someone is looking for it. Hope this helps!

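// Simple test function that returns the whole 'CustomLevels' node as JSON.
// Warning: no security measures (authentication, request-method checks, etc.) are applied.
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp(); // skip this if the app is already initialized elsewhere in the file

exports.request_all_levels = functions.https.onRequest((req, res) => {
  const ref = admin.database().ref('CustomLevels');
  // Return the promise so the function stays alive until the response is sent.
  return ref.once('value')
    .then(snapshot => res.status(200).json(snapshot.val()))
    .catch(error => res.status(500).send(error.message));
});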
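
A possible variant that returns only the top N levels by score, closer to what the question asks for. This is a sketch under assumptions: `makeTopLevelsHandler` and `snapshotToDescendingArray` are hypothetical helper names, and `ref` is expected to behave like a Realtime Database reference (`orderByChild`, `limitToLast`, `once`).

```javascript
// Hypothetical helper: turn a query snapshot into an array, highest score first.
function snapshotToDescendingArray(snapshot) {
  const levels = [];
  // forEach visits children in query order (ascending "score" here).
  // Braces matter: returning a truthy value from the callback stops the iteration.
  snapshot.forEach(child => {
    levels.push(Object.assign({ key: child.key }, child.val()));
  });
  return levels.reverse();
}

// Hypothetical factory: builds an HTTP handler that sends the top n entries as JSON.
function makeTopLevelsHandler(ref, n) {
  return (req, res) =>
    ref.orderByChild('score').limitToLast(n).once('value')
      .then(snapshot => res.status(200).json(snapshotToDescendingArray(snapshot)))
      .catch(error => res.status(500).send(error.message));
}
```

It could be wired up with something like `exports.givemethetoplevels = functions.https.onRequest(makeTopLevelsHandler(admin.database().ref('CustomLevels'), 10));`. For the query to stay efficient, the database rules would presumably need an `.indexOn` entry for "score" on that node.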
Carles

1 Answer

You're duplicating data upon writes, to gain better read performance. That's a completely reasonable approach. In fact, it is so common in NoSQL databases to keep such derived data structures that it even has a name: denormalization.

A few things to keep in mind:

  • While Cloud Functions run in a more predictable environment than the average client, their resources are still limited. So reading a huge list of items just to determine the latest 10 is still a suboptimal approach. For simple operations, you'll want to keep the derived data structure up to date on every write operation.
  • So if you have a "latest 10" list and a new item comes in, you remove the oldest item and add the new one. With this approach you have at most 11 items to consider, compared to having your Cloud Function query the full list for the latest 10 on every write, which is an operation whose cost grows with the total number of items n.
  • The same goes for an averaging operation: an incrementally updated average (a running or moving average) will be the most performant, because it needs only the stored average and item count rather than any of the previous items.
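
The incremental updates described above can be sketched as two small pure functions. This is only an illustration under assumptions: `updateTopN` and `updateAverage` are hypothetical names, and items are assumed to look like `{ id, score }`.

```javascript
// Hypothetical helper: keep a small "top N by score" list current on each write.
function updateTopN(top, item, n) {
  // Drop any stale copy of the same item, then add its new version.
  const next = top.filter(entry => entry.id !== item.id);
  next.push(item);
  // At most n + 1 entries are sorted, regardless of the total database size.
  next.sort((a, b) => b.score - a.score);
  return next.slice(0, n);
}

// Hypothetical helper: running average that needs only the stored average and count.
function updateAverage(state, value) {
  const count = state.count + 1;
  return { count, average: state.average + (value - state.average) / count };
}
```

In a Cloud Function these would typically be called from a database write trigger, with the result stored under a separate path (for example inside a `transaction` on that path so that concurrent writes do not overwrite each other). One caveat: if a score can decrease, an item dropping out of the cached list may require a fresh query to find its replacement.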
Frank van Puffelen
  • Thanks! So instead of using alternative JS structures, I just duplicate the top X in a different path of the database. I'll try this out and update the post once it's done! – Carles Apr 20 '17 at 15:03