Some background:
My question is very similar to this clarification question about denormalization, but I want to change the situation a bit.
In the Considerations section of this blog post on denormalization, the Firebase people say the following about updating data.
Let’s discuss some consequences of a [denormalized data structure]. You will need to ensure that every time some data is created (in this case, a comment) it is put in the right places.
The example includes three paths, one to store the comment's data, and two paths under which to store pointers to that comment.
...
Modification of comments is easy: just set the value of the comment under /comments to the new content. For deletion, simply delete the comment from /comments — and whenever you come across a comment ID elsewhere in your code that doesn’t exist in /comments, you can assume it was deleted and proceed normally:
But this only works because, as the answer to the other question says,
The structure detailed in the blog post does not store duplicate comments. We store comments once under /comments, then store the name of those comments under /links and /users. These function as pointers to the actual comment data.
Basically, the content is only stored in one location.
The question:
What if the situation were such that storing duplicate data is necessary? In that case, what is the recommended way to update data?
My attempt at an answer:
An answer to this question exists, but it is directed at MongoDB, and I'm not sure it quite addresses the issue in Firebase.
The most sensible way I could think of, just for reference, is as follows.
I have a helper class to which I give a catalog of paths in Firebase, which somewhat resembles a schema. This class has methods that wrap Firebase methods, so that I can perform writes and updates under all the paths specified by my schema. The helper class iterates over every path where there is a reference to the object, and at each location performs a write, update, or delete. In my case, no more than 4 paths exist for any individual operation like that, and most have 2.
Example:
Imagine I have three top-level keys, Users, Events, and Events-Metadata. Users post Images to Events, and both Events and Users have a nested record for all their respective Images. Events-Metadata is its own top-level key for the case where I want to display a bunch of events on a page, but I don't want to pull down potentially hundreds of Image records along with them.
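To make the duplication concrete, here is a rough sketch of that tree as a plain JavaScript object. Every ID and value is a placeholder, and the "name" field under Events-Metadata is purely hypothetical; only the three top-level keys and the nested images records come from my actual setup.

```javascript
// Hypothetical shape of the tree; all IDs and values are placeholders.
const imageRecord = {
  eventID: "eventID",
  userID: "userID",
  imageID: "imageID",
  caption: "caption text",
};

const tree = {
  Users: {
    userID: {
      // Each user carries a full copy of every image they posted.
      images: { imageID: Object.assign({}, imageRecord) },
    },
  },
  Events: {
    eventID: {
      // Each event carries a second full copy of the same image.
      images: { imageID: Object.assign({}, imageRecord) },
    },
  },
  "Events-Metadata": {
    // Lightweight record for list pages; "name" is a hypothetical field.
    eventID: { name: "event name" },
  },
};

// The caption lives in two places, so both copies must change together.
const userCopy = tree.Users.userID.images.imageID;
const eventCopy = tree.Events.eventID.images.imageID;
console.log(userCopy.caption === eventCopy.caption);
```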
Images can have captions, and thus, when updating an Image's caption, I should update these paths:
new Firebase("path/to/eventID/images/imageID/caption")
and
new Firebase("path/to/userID/images/imageID/caption")
I give my helper class both of these paths and a wrapper method, so that any time a caption is updated, I can call helperclass.updateCaption(imageObj, newCaptionData), and it iteratively updates the data at each path.
Images are stored with attributes including eventID, userID, and imageID, so that the skeletons of those paths can be filled in correctly.
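A minimal sketch of what I mean is below. It is not my real code: the ":name" placeholder syntax, the fillPath function, and the injected writeFn are all assumptions made for illustration, and the write function is passed in so the sketch runs without the Firebase SDK. In the real helper, writeFn would presumably wrap something like new Firebase(path).set(value).

```javascript
// Hypothetical path skeletons. The ":name" placeholder syntax is an
// assumption of this sketch; "path/to" is left as written above.
const CAPTION_PATHS = [
  "path/to/:eventID/images/:imageID/caption",
  "path/to/:userID/images/:imageID/caption",
];

// Replace each ":name" placeholder with the matching attribute of obj.
function fillPath(skeleton, obj) {
  return skeleton.replace(/:(\w+)/g, (placeholder, key) => {
    if (obj[key] === undefined) {
      throw new Error("Missing attribute for path: " + key);
    }
    return String(obj[key]);
  });
}

class HelperClass {
  // writeFn(path, value) performs a single write. It is injected here so
  // that the sketch does not depend on a real database connection.
  constructor(writeFn) {
    this.writeFn = writeFn;
  }

  // Write the new caption at every path that holds a copy of it and
  // return the list of paths that were written.
  updateCaption(imageObj, newCaptionData) {
    const paths = CAPTION_PATHS.map((skeleton) => fillPath(skeleton, imageObj));
    paths.forEach((path) => this.writeFn(path, newCaptionData));
    return paths;
  }
}

// Usage with an in-memory recorder standing in for the database.
const recordedWrites = [];
const helperclass = new HelperClass((path, value) => {
  recordedWrites.push({ path: path, value: value });
});
const imageObj = { eventID: "event1", userID: "user1", imageID: "image1" };
helperclass.updateCaption(imageObj, "A new caption");
console.log(recordedWrites);
```

Note that in this sketch each write is issued independently, so if one of them fails partway through, the copies can end up out of sync. That is part of why I am asking whether this approach is appropriate.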
Is this a recommended and/or appropriate way to approach this issue? Am I doing this wrong?