Will Firebase timestamp each record for me, or do I need to write the time into each record?
Nope, but you can use Firebase.ServerValue.TIMESTAMP as mentioned in the docs. Firebase stores only what you ask it to store.
Is it best using the unix Epoch or a more understandable date time?
Use Firebase.ServerValue.TIMESTAMP (which is a Unix Epoch in milliseconds) for all datetimes (if possible). This ensures consistency and correctness compared with new Date().getTime() or any other method that depends on the local machine's time (which is often wrong, so you'll end up with messed up data).
Unix Epochs are also integers, which work very well with Firebase's querying abilities: specifically, we can use .startAt() and .endAt() to fetch things from a specific date range (as we'll see below in the answer).
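Storing an epoch does not stop you from showing a readable date: you can convert at display time. A minimal sketch in plain JavaScript; the helper names `formatTimestamp` and `isWithinRange` are hypothetical and not part of Firebase.

```javascript
// Hypothetical helper: convert a stored epoch (milliseconds) into a
// human-readable ISO 8601 string at display time.
function formatTimestamp(epochMillis) {
  return new Date(epochMillis).toISOString();
}

// Hypothetical helper: check whether a stored epoch falls inside a date
// range, inclusive on both ends.
function isWithinRange(epochMillis, startDate, endDate) {
  return epochMillis >= startDate.getTime() && epochMillis <= endDate.getTime();
}
```

Because the stored value is a plain integer, range checks like this reduce to numeric comparisons, which is why epochs are easier to work with than formatted date strings.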
How should I structure this data in firebase?
The first question you need to ask is "how will I be consuming this data?" Firebase isn't a big SQL database where we can get our structure kind of right then lean on complex querying to make up for our mistakes.
When you build a structure in Firebase, ensure that it allows you to load your data in a specific way. This means that if you know you're going to have a list of room_ids that you'll want to load data from, then your room structure should be based around those IDs.
Consider a structure like this for a simple chat room (we'll use $ notation to indicate wild cards).
{
  "rooms": {
    $room_id: {
      "users": {
        $user_id: true
      },
      "_meta": {
        closed: Boolean
      },
      "messages": {
        $message_id: {
          "user_id": $user_id,
          "text": ""
        }
      }
    }
  },
  "users": {
    $user_id: {...}
  }
}
When a user with an id of abe joins a room with a room_id of room_one, we know that they need to mark themselves as an active member of the chat room by setting the location /rooms/room_one/users/abe to true.
Our function to join a room would look like this.
function joinRoom(room_id) {
  // We assume `ref` is a Firebase reference to the root of our Firebase
  // and `myUserId` holds the current user's ID
  var roomRef = ref.child("rooms").child(room_id);
  roomRef.child("users").child(myUserId).set(true);
  return roomRef;
}
This is being specific. We're given some information and because our data structure is logical we can easily make assumptions about what data needs to be written without loading any data from Firebase.
This isn't good enough for your situation though, since you also want reporting. We'll incrementally improve our structure based on your needs.
How many unique users have we seen from x to y date & time?
Assuming you're talking on a per-room basis, this is an easy change.
{
  "rooms": {
    $room_id: {
      "users": {
        $user_id: true
      },
      "users_history": {
        $push_id: {
          user_id: ...,
          timestamp: ...
        }
      },
      "messages": {
        $message_id: {...}
      }
    }
  },
  "users": {
    $user_id: {...}
  }
}
We add the /rooms/$room_id/users_history location. This is a list of every time a user enters this room. We've added a bit of complexity, so our join room function would look like this.
function joinRoom(room_id) {
  var roomRef = ref.child("rooms").child(room_id);
  roomRef.child("users_history").push({
    user_id: myUserId,
    timestamp: Firebase.ServerValue.TIMESTAMP
  });
  roomRef.child("users").child(myUserId).set(true);
  return roomRef;
}
Now we can easily report how many users have been in a room in a given time using a Firebase Query.
function roomVisitors(room_id, start_datetime, end_datetime) {
  var roomRef = ref.child("rooms").child(room_id),
      // Query the users_history list, which holds the timestamped entries
      queriedRoomRef = roomRef
        .child("users_history")
        .orderByChild("timestamp")
        .startAt(start_datetime.getTime())
        .endAt(end_datetime.getTime());
  // Assuming a Promise implementation is available (native ES6 or a library)
  return new Promise(function (resolve, reject) {
    queriedRoomRef.once("value", function (users) {
      /* users will be a snapshot of every users_history entry
         recorded in the room during the given range of time. */
      resolve(users.val());
    }, function (err) {
      reject(err);
    });
  });
}
We'll talk about whether or not doing this is truly "specific" in a moment, but this is the general idea.
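The query returns one entry per visit, so a user who entered the room three times appears three times. To get the number of unique users, the entries still have to be de-duplicated by user_id. A minimal sketch, assuming `history` is the plain object produced by `snapshot.val()` on the users_history query (the helper name `countUniqueVisitors` is hypothetical):

```javascript
// Hypothetical helper: count distinct user_ids in a users_history object.
// `history` is keyed by push ID; it may be null when the query matches nothing.
function countUniqueVisitors(history) {
  var seen = {};
  var count = 0;
  Object.keys(history || {}).forEach(function (pushId) {
    var userId = history[pushId].user_id;
    if (!Object.prototype.hasOwnProperty.call(seen, userId)) {
      seen[userId] = true;
      count += 1;
    }
  });
  return count;
}
```

With the function above, usage would look something like `roomVisitors("room_one", start, end).then(countUniqueVisitors)`.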
Time spent online for 1 user from x to y date & time
We haven't fleshed out our /users/$user_id structure yet, but we'll have to do that here. In this situation the only information we'll have to look up a user's time spent online will be their user_id. So we'll have to store this information under /users/$user_id, because if we stored it under /rooms/ we would have to load data for all the rooms and loop through it to find the relevant user information, and that's not very specific.
{
  "rooms": {
    $room_id: {
      "users": {
        $user_id: true
      },
      "users_history": {
        $push_id: {
          user_id: ...,
          timestamp: ...
        }
      },
      "messages": {
        $message_id: {...}
      }
    }
  },
  "users": {
    $user_id: {
      "online_history": {
        $push_id: {
          "action": "", // "online" or "offline"
          "timestamp": ...
        }
      }
    }
  }
}
Now we can build a ref.onAuth(func) callback that tracks our time online.
var userRef;
ref.onAuth(function (auth) {
  if (!auth && userRef) {
    // If we have no auth, i.e. we logged out, cancel any onDisconnect's
    userRef.onDisconnect().cancel();
    // and push a record saying the user went offline
    userRef.child("online_history").push({
      action: "offline",
      timestamp: Firebase.ServerValue.TIMESTAMP
    });
  } else if (auth) {
    userRef = ref.child("users").child(auth.uid);
    // add a record that we went online
    userRef.child("online_history").push({
      action: "online",
      timestamp: Firebase.ServerValue.TIMESTAMP
    });
    // and if the user disconnects, add a record of going offline
    userRef.child("online_history").push().onDisconnect().set({
      action: "offline",
      timestamp: Firebase.ServerValue.TIMESTAMP
    });
  }
});
Using this method we can now write a function to loop through the online/offline log and add up time for a given range using the same method of querying used above, but I'll leave this as an exercise for the reader.
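For anyone who wants a starting point for that exercise, here is one possible sketch of the summing step in plain JavaScript. It assumes `history` is the object from `snapshot.val()` for /users/$user_id/online_history, that repeated "online" events without an "offline" in between belong to the same session, and that a session still open at the end of the log counts up to the end of the range. The name `timeOnline` is hypothetical.

```javascript
// Hypothetical helper: total milliseconds online between startMs and endMs.
function timeOnline(history, startMs, endMs) {
  // Flatten the push-ID keyed object into a list sorted by timestamp.
  var events = Object.keys(history || {}).map(function (key) {
    return history[key];
  }).sort(function (a, b) {
    return a.timestamp - b.timestamp;
  });

  var total = 0;
  var onlineSince = null; // timestamp of the unmatched "online" event, if any

  events.forEach(function (event) {
    if (event.action === "online" && onlineSince === null) {
      onlineSince = event.timestamp;
    } else if (event.action === "offline" && onlineSince !== null) {
      // Clip the session to the requested range before adding it.
      var from = Math.max(onlineSince, startMs);
      var to = Math.min(event.timestamp, endMs);
      if (to > from) total += to - from;
      onlineSince = null;
    }
  });

  // Assumption: a session with no closing "offline" runs until endMs.
  if (onlineSince !== null) {
    var openFrom = Math.max(onlineSince, startMs);
    if (endMs > openFrom) total += endMs - openFrom;
  }
  return total;
}
```

One caveat: if you only query events inside the range, a session that began before the range start loses its "online" event and is not counted, so you may need to widen the query or handle that case separately.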
Notes about specificity and performance
Neither of the reporting functions is specific. When we get the list of users who visited a room in the first query, we're grabbing a big object filled with user IDs, pulling all that data down, and then parsing it client-side, when what we really want is just an integer value of the number of unique visitors.
This is a situation where you really want to employ a Node.js worker using the server-side SDK. This worker can sit and watch changes to your data structure and automatically summarize data as it changes, so your client can then look at a location like /rooms/$room_id/_meta/analytics/uniqueVisitorsThisWeek and simply get a number like 10.
The point is, storage is cheap, and summarizing and caching data like this is cheap, but only if it's done server-side. If you're not specific and you load too much and attempt to perform summarizing client-side, you'll waste CPU cycles and bandwidth.
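As a rough illustration of such a worker, the sketch below keeps a running count of distinct users for one room and writes it to a summary location. It assumes `roomRef` is a Firebase reference to /rooms/$room_id and uses only `child()`, `on("child_added", ...)`, `snapshot.val()`, and `set()`. The function name `summarizeUniqueVisitors` and the `uniqueVisitors` location (an all-time count rather than a weekly one) are hypothetical.

```javascript
// Sketch of a server-side summarizer for a single room.
function summarizeUniqueVisitors(roomRef) {
  var seen = {}; // user_ids already counted, held in worker memory
  var count = 0;
  var summaryRef = roomRef
    .child("_meta")
    .child("analytics")
    .child("uniqueVisitors"); // hypothetical summary location

  // "child_added" fires once for each existing entry, then for each new one.
  roomRef.child("users_history").on("child_added", function (snapshot) {
    var userId = snapshot.val().user_id;
    if (!Object.prototype.hasOwnProperty.call(seen, userId)) {
      seen[userId] = true;
      count += 1;
      summaryRef.set(count); // clients read this single integer
    }
  });
}
```

The client then reads one small value instead of the whole history. A weekly figure would additionally need the worker to filter on timestamp and expire old entries, which is left out here.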
If you're ever loading data onto a client from Firebase and not displaying that data, you should be reworking your data structure to be more specific.