37

I have an application where new children get added to Firebase every 5 seconds or so. I have thousands of children.

On application load, I'd like to process the initial thousands differently from the subsequent children that trickle in every 5 seconds.

You might suggest I use `value`, process everything, and then attach `child_added`. But I believe that if the processing takes too long, I could miss a point.

Is there a way to do this in Firebase that guarantees I don't miss a point?

Wilder Pereira
Keith Carter
  • 1
    If you add a timestamp to the children with `Firebase.ServerValue.TIMESTAMP`, this will be trivial with `startAt`. Did you try anything already? If so, can you share the code and where (you fear) it fails? – Frank van Puffelen Jan 16 '15 at 12:44
  • 1
    possible duplicate of [how to discard initial data in a Firebase DB](http://stackoverflow.com/questions/19883736/how-to-discard-initial-data-in-a-firebase-db) – Kato Jan 16 '15 at 16:00
  • 1
    See also: [How to retrieve only new data](http://stackoverflow.com/questions/18270995/how-to-retreive-only-new-data) – Kato Jan 16 '15 at 16:05
  • @FrankvanPuffelen how do you manage timestamp differences between the Firebase server's timestamp and whatever is requesting the data? – Keith Carter Jan 16 '15 at 16:32
  • See the first link that Kato posted. – Frank van Puffelen Jan 16 '15 at 17:10
  • I did. There is the possibility of missing data with that solution. See my comment on my answer. – Keith Carter Jan 16 '15 at 18:13

4 Answers

69

Since `child_added` events for the initial, pre-loaded data fire before the `value` event fires on the parent, you can use the `value` event as a sort of "initial data loaded" notification. Here is some code I slightly modified from a similar Stack Overflow question.

var initialDataLoaded = false;
var ref = new Firebase('https://<your-Firebase>.firebaseio.com');

ref.on('child_added', function(snapshot) {
  if (initialDataLoaded) {
    var msg = snapshot.val().msg;
    // do something here
  } else {
    // we are ignoring this child since it is pre-existing data
  }
});

ref.once('value', function(snapshot) {
  initialDataLoaded = true;
});

Thankfully, Firebase will smartly cache this data, meaning that creating both a child_added and a value listener will only download the data one time. Adding new Firebase listeners for data which has already crossed the wire is extremely cheap and you should feel comfortable doing things like that regularly.
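
If the goal is to process the pre-existing children as one batch instead of ignoring them, the same flag can route events into a buffer. What follows is only a minimal sketch under stated assumptions: `watchInitialThenNew` and `makeFakeRef` are hypothetical names, and the fake ref is an in-memory stand-in that merely mimics the ordering described above (`child_added` for existing children first, then `value`), so the snippet runs without a Firebase connection.

```javascript
// Hypothetical helper: routes pre-existing children to onInitial (as one
// batch) and children that arrive later to onNew (one at a time).
function watchInitialThenNew(ref, onInitial, onNew) {
  var initialDataLoaded = false;
  var initialBatch = [];

  ref.on('child_added', function (snapshot) {
    if (initialDataLoaded) {
      onNew(snapshot);
    } else {
      initialBatch.push(snapshot); // buffer pre-existing data
    }
  });

  ref.once('value', function () {
    initialDataLoaded = true; // flip the flag first, then process the batch
    onInitial(initialBatch);
  });
}

// Hypothetical in-memory stand-in for a Firebase reference, for illustration
// only. It assumes the event ordering described in this answer.
function makeFakeRef(existing) {
  var children = Object.assign({}, existing);
  var listeners = [];
  function snap(value) {
    return { val: function () { return value; } };
  }
  return {
    on: function (eventType, callback) {
      if (eventType !== 'child_added') return;
      listeners.push(callback);
      Object.keys(children).forEach(function (key) {
        callback(snap(children[key]));
      });
    },
    once: function (eventType, callback) {
      if (eventType === 'value') callback(snap(Object.assign({}, children)));
    },
    // Simulates another client adding a child after the initial load.
    add: function (key, value) {
      children[key] = value;
      listeners.forEach(function (callback) { callback(snap(value)); });
    }
  };
}

var seenInitial = [];
var seenNew = [];
var fakeRef = makeFakeRef({ a: { msg: 'old 1' }, b: { msg: 'old 2' } });

watchInitialThenNew(
  fakeRef,
  function (batch) {
    seenInitial = batch.map(function (s) { return s.val().msg; });
  },
  function (snapshot) { seenNew.push(snapshot.val().msg); }
);
fakeRef.add('c', { msg: 'new' });

console.log(seenInitial, seenNew);
```

Because the flag is flipped before the batch is processed, any child that arrives afterwards goes to `onNew` rather than being dropped, even if the batch processing itself is slow.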

If you are worried about downloading all that initial data when you don't actually need it, I would follow @FrankvanPuffelen's suggestions in the comments to use a timestamped query. This works really well and is optimized using Firebase queries.
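
A rough sketch of what such a timestamped query might look like follows. It assumes each child is written with a `createdAt` field (a field name chosen here for illustration) set from `Firebase.ServerValue.TIMESTAMP`. `listenForNewSince` and `estimateServerNow` are hypothetical helper names, and the recording stub stands in for a real reference so the snippet runs on its own. The clock difference raised in the comments can be approximated with the offset Firebase exposes at the special `.info/serverTimeOffset` path; that value is an estimate, so treat the cutoff as approximate.

```javascript
// Hypothetical helper: subscribes only to children whose `createdAt` field
// (an assumed field name) is at or after `sinceMs`. Note that startAt is
// inclusive, so a child stamped exactly `sinceMs` is delivered.
function listenForNewSince(ref, sinceMs, onNew) {
  return ref
    .orderByChild('createdAt')
    .startAt(sinceMs)
    .on('child_added', onNew);
}

// Estimates the server's "now" from the local clock plus the offset read from
// `.info/serverTimeOffset`, e.g. with the legacy client used in this answer:
//   new Firebase('https://<your-Firebase>.firebaseio.com/.info/serverTimeOffset')
function estimateServerNow(localNowMs, serverTimeOffsetMs) {
  return localNowMs + serverTimeOffsetMs;
}

// Recording stub (illustration only) so the sketch runs without Firebase.
var calls = [];
var stubRef = {
  orderByChild: function (field) { calls.push(['orderByChild', field]); return stubRef; },
  startAt: function (value) { calls.push(['startAt', value]); return stubRef; },
  on: function (eventType, callback) { calls.push(['on', eventType]); return callback; }
};

// Example values: a local clock reading and an offset of -250 ms.
var sinceMs = estimateServerNow(1421412240000, -250);
listenForNewSince(stubRef, sinceMs, function (snapshot) {
  // handle a newly added child here
});

console.log(calls);
```

With a real reference, the same chain would let the server filter out older children so they are never downloaded, at the cost of relying on the estimated offset.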

Community
jwngr
  • 1
    This is an excellent answer. I was unaware of the smart caching and haven't seen that mentioned anywhere else. Thank you Jacob. I do have two questions: what happens if that callback of value doesn't occur immediately or if the callback itself takes a long time to execute (due to data processing or something else)? Isn't there a possibility that a child could be added in that time? – Keith Carter Jan 17 '15 at 02:42
  • 2
    Firebase should guarantee that the `value` event will fire before any *new* `child_added` events fire, so you should be fine there. As for data processing of the callback, that shouldn't matter as long as you are checking the `initialDataLoaded` variable right away. If you check it later in the callback after a bunch of blocking code, it may have switched to `true`, which is what I think you are getting at. You should follow the usual JavaScript best practices here as that is not really a Firebase-specific question. – jwngr Jan 18 '15 at 00:16
  • @jacobawenger I keep reading that it is "by design" that Firebase doesn't have a built-in way to grab the initial data and then only grab what changed. Is that because, by design, 3-way data binding is supposed to already manage the real-time syncing? I'm just worried that I'm trying to sync things manually when Firebase is built to handle that declaratively. – Nick Pineda Nov 26 '15 at 08:15
  • Bless you Jacob. No idea this was a thing. – Tyler McGinnis Dec 10 '15 at 08:42
  • 1
    In iOS at least the order of when the handlers are installed is important: if the value handler is installed before the add handler, the add callback is called nonetheless. – TheEye Feb 23 '16 at 13:58
  • Thank you! This solution is clever and super helpful! My use case is: On start of an HTTP server listening to my Firebase Realtime Database I want to fetch all the existing data at that reference and do something with it. Once I've read all existing data then I want to listen for new data and do something else with new data. This solution helps me do exactly that. I'd be interested to know more about the smart caching, too, because I was unaware of it. – Lucy Mar 26 '17 at 21:21
3

I added a `createdAt` timestamp to each child in the database and restricted the `child_added` listener to children created at or after the current time. It seems to work fine for me.

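// `date` is assumed to be a Date captured when the listeners are attached.
// Only children whose createdAt is at or after that moment fire child_added.
const onChildAdded = database()
  .ref(refURL)
  .limitToLast(constValues.LAST_MESSAGE_ADDED)
  .orderByChild('createdAt')
  .startAt(date.getTime())
  .on('child_added', (snapshot) => {
    parseMessage(snapshot);
  });

// Load the most recent existing messages once, as a single snapshot.
database()
  .ref(refURL)
  .limitToLast(constValues.LAST_MESSAGES_LIMIT)
  .once('value', (snapshot) => {
    parseInitialMessages(snapshot);
    setLoading(false);
  });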
-1

I improved the answer from @jacobawenger so that it does not use a global state variable:

var ref = new Firebase('https://<your-Firebase>.firebaseio.com');

ref.once('value', function(snapshot) {
  // do something here with initial value

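  // Note: child_added also fires once for each pre-existing child,
  // so children already handled above will be seen again here.
  ref.on('child_added', function(childSnapshot) {
    // do something here with added children
  });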

});
Tibi
  • Does this not introduce a race condition? What if a child is added during the execution of: // do something here with initial value – Keith Carter Jun 14 '16 at 16:19
  • It looks like something I am trying, but I get duplicate data. I think the reason is that child_added adds ALL data, even from start - that's just the way it seems to work. So we need some check like in http://stackoverflow.com/a/27995609/129202 so that we don't add duplicate data. – Jonny May 22 '17 at 03:30
-2

There are two possible solutions to this problem, neither satisfying:

1) Timestamp your children and only request children that have a timestamp greater than a "now" value. This is not great because your application's "now" may be out of sync with the "now" on whatever is pushing the data to Firebase, or with the Firebase server's value for "now".

2) Use `value` and `child_added`. Any duplicates seen in `child_added` that had already been seen in `value` may be discarded. This is unusable for large data sets as it requires downloading all historical data TWICE!

Why can't there be a command like "child_added" that DOESN'T give everything ever?

Keith Carter
  • 1
    Please don't post this as an answer, unless you think it is an answer. Kato gave two links to similar questions. If the answers given there don't work for you, edit your question (there's a handy edit link right under it) to include a minimal code sample that shows what you've done and describe what goes wrong with it. – Frank van Puffelen Jan 16 '15 at 17:08
  • I've summarized my takeaways from the two links Kato gave and posted it as an answer. Also: one of the solutions linked to has the exact problem I am trying to avoid: missing data for large data sets. I'll post the problem in that thread (http://stackoverflow.com/questions/19883736/how-to-discard-initial-data-in-a-firebase-db) – Keith Carter Jan 16 '15 at 18:08
  • Actually, I'm not allowed to comment because I am below 50 karma. Setting aside the issue of having to download the data twice, the problem is that with large data sets there is a gap in time between when child_added finishes and when value finishes. Since children added are only processed AFTER value finishes, you could miss data. – Keith Carter Jan 16 '15 at 18:12
  • If the "solutions" are no solutions to you, then don't post them as answers. You are always entitled to update/edit your question with additional info. – cfi Jan 16 '15 at 21:26
  • My question is adequately stated and no further explanation is needed. I've gone and summarized the solutions others have provided for this very problem and explained why they aren't perfect. If there is a "better" solution, anyone is welcome to provide it and I'll indicate it as correct. – Keith Carter Jan 16 '15 at 21:33