I need help with a scenario involving multi-path updates to fanned-out data. If a new fan-out path is added between the moment we work out which paths to update and the moment the update is written, the data at the newly added path ends up inconsistent.
For example, below is the data for blog posts. A post can be tagged with multiple terms such as "tag1" and "tag2". To find how many posts carry a specific tag, I can fan out the post data to the tags path as well:
/posts/postid1: {"Title": "Title 1", "body": "About Firebase", "tags": {"tag1": true, "tag2": true}}
/tags/tag1/postid1: {"Title": "Title 1", "body": "About Firebase"}
/tags/tag2/postid1: {"Title": "Title 1", "body": "About Firebase"}
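To make the fan-out concrete, here is a minimal TypeScript sketch that derives the three paths above from a single post object. The `Post` type and the `fanOutPost` helper are hypothetical names used only for illustration; they are not part of any Firebase SDK, and the resulting map is only the kind of object one might pass to a multi-path update.

```typescript
// Hypothetical shape of a post, matching the example data above.
type Post = { Title: string; body: string; tags: Record<string, boolean> };

// Hypothetical helper: builds a path -> value map that writes the post
// itself plus one copy under every tag it carries.
function fanOutPost(postId: string, post: Post): Record<string, unknown> {
  const updates: Record<string, unknown> = {};
  updates[`/posts/${postId}`] = post;
  for (const tag of Object.keys(post.tags)) {
    if (post.tags[tag]) {
      updates[`/tags/${tag}/${postId}`] = { Title: post.Title, body: post.body };
    }
  }
  return updates;
}

const fanOut = fanOutPost("postid1", {
  Title: "Title 1",
  body: "About Firebase",
  tags: { tag1: true, tag2: true },
});
console.log(Object.keys(fanOut));
```

The set of tag paths is computed from the tags as they were when the post was read, which is exactly where the race described in this question comes from.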
Now consider two concurrent operations:
1a) User1 wants to modify the title of postid1 and builds the following multi-path update:
/posts/postid1/Title: "Title 1 modified"
/tags/tag1/postid1/Title: "Title 1 modified"
/tags/tag2/postid1/Title: "Title 1 modified"
1b) At the same time, User2 wants to add tag3 to postid1 and builds the following multi-path update:
/posts/postid1/tags: {"tag1": true, "tag2": true, "tag3": true}
/tags/tag3/postid1: {"Title": "Title 1", "body": "About Firebase"}
Both updates can succeed one after the other, leaving /tags/tag3/postid1 out of sync because it still holds the old title.
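The race can be reproduced without Firebase at all. The sketch below uses a plain in-memory object as a stand-in for the database and a hypothetical `applyUpdate` helper that writes every path of one update in a single call. It is only a simulation under those assumptions, not how the real server is implemented.

```typescript
type Tree = { [key: string]: any };

// Hypothetical stand-in for a multi-path update: sets every path in one call.
function applyUpdate(root: Tree, updates: Record<string, unknown>): void {
  for (const [path, value] of Object.entries(updates)) {
    const keys = path.split("/").filter((k) => k.length > 0);
    const last = keys.pop();
    if (last === undefined) continue; // ignore an empty path
    let node: Tree = root;
    for (const key of keys) {
      if (typeof node[key] !== "object" || node[key] === null) node[key] = {};
      node = node[key];
    }
    node[last] = value;
  }
}

// Initial state from the example.
const db: Tree = {};
applyUpdate(db, {
  "/posts/postid1": { Title: "Title 1", body: "About Firebase", tags: { tag1: true, tag2: true } },
  "/tags/tag1/postid1": { Title: "Title 1", body: "About Firebase" },
  "/tags/tag2/postid1": { Title: "Title 1", body: "About Firebase" },
});

// Both users build their updates from the same snapshot, before either writes.
const user1Update = {
  "/posts/postid1/Title": "Title 1 modified",
  "/tags/tag1/postid1/Title": "Title 1 modified",
  "/tags/tag2/postid1/Title": "Title 1 modified",
};
const user2Update = {
  "/posts/postid1/tags": { tag1: true, tag2: true, tag3: true },
  "/tags/tag3/postid1": { Title: "Title 1", body: "About Firebase" },
};

applyUpdate(db, user1Update);
applyUpdate(db, user2Update);

console.log(db.posts.postid1.Title);     // "Title 1 modified"
console.log(db.tags.tag3.postid1.Title); // "Title 1" (stale)
```

Swapping the order of the two `applyUpdate` calls gives the same stale result, because User1's update never mentions the tag3 path.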
I can think of using security rules to handle this, but I am not sure whether that is correct or whether it will work.
For example, each post could carry updatedAt and lastUpdated fields, and a rule could check that we are updating the same version of the post that we read:
"posts": {
  "$postid": {
    ".write": true,
    ".read": true,
    ".validate": "newData.hasChildren(['userId', 'updatedAt', 'lastUpdated', 'Title']) && (!data.exists() || data.child('updatedAt').val() === newData.child('lastUpdated').val())"
  }
}
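The condition in this rule is a form of optimistic concurrency control. As a rough illustration, the TypeScript sketch below evaluates the same comparison on plain objects. `VersionedPost`, `validatePostWrite` and `tryWrite` are hypothetical names, the timestamps are made-up numbers, and the sketch assumes every write to the post carries both `updatedAt` and `lastUpdated`; it does not model how the Firebase server actually evaluates rules.

```typescript
type VersionedPost = { userId: string; Title: string; updatedAt: number; lastUpdated: number };

// Mirrors the .validate condition: accept when no post exists yet, or when the
// writer's lastUpdated equals the updatedAt currently stored.
function validatePostWrite(stored: VersionedPost | null, incoming: VersionedPost): boolean {
  return stored === null || stored.updatedAt === incoming.lastUpdated;
}

// Hypothetical helper: applies the write only if the check passes.
function tryWrite(store: { post: VersionedPost | null }, incoming: VersionedPost): boolean {
  if (!validatePostWrite(store.post, incoming)) return false;
  store.post = incoming;
  return true;
}

const store: { post: VersionedPost | null } = {
  post: { userId: "u0", Title: "Title 1", updatedAt: 100, lastUpdated: 0 },
};

// Both users read the post at updatedAt = 100.
const user1Accepted = tryWrite(store, { userId: "u1", Title: "Title 1 modified", updatedAt: 200, lastUpdated: 100 });
const user2Accepted = tryWrite(store, { userId: "u2", Title: "Title 1", updatedAt: 201, lastUpdated: 100 });

console.log(user1Accepted, user2Accepted); // true false
```

In this model the second writer is rejected and would have to re-read the post and rebuild its update before retrying.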
For the tags we do not want to repeat that check; instead we can check that /tags/$tag/$postid/updatedAt is the same as /posts/$postid/updatedAt:
"tags": {
  "$tag": {
    "$postid": {
      ".write": true,
      ".read": true,
      ".validate": "newData.hasChildren(['userId', 'updatedAt', 'lastUpdated', 'Title']) && (newData.child('updatedAt').val() === root.child('posts').child($postid).child('updatedAt').val())"
    }
  }
}
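The tag rule boils down to comparing the copy's version against the version at the primary path. A minimal TypeScript sketch of that comparison follows; `Versioned` and `validateTagWrite` are hypothetical names, and which snapshot of the post the real rule sees (before or after the same multi-path write) is deliberately left to the caller here, since that is part of what this question asks.

```typescript
type Versioned = { updatedAt: number };

// Accept a tag copy only when its updatedAt matches the post it was copied from.
function validateTagWrite(incomingTagCopy: Versioned, sourcePost: Versioned | null): boolean {
  return sourcePost !== null && incomingTagCopy.updatedAt === sourcePost.updatedAt;
}

const currentPost: Versioned = { updatedAt: 200 };

const freshCopyOk = validateTagWrite({ updatedAt: 200 }, currentPost);
const staleCopyOk = validateTagWrite({ updatedAt: 100 }, currentPost);
const orphanCopyOk = validateTagWrite({ updatedAt: 200 }, null);

console.log(freshCopyOk, staleCopyOk, orphanCopyOk); // true false false
```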
This way /posts/$postid has concurrency control built in, and users can only overwrite the version of a post that they actually read. /posts/$postid also becomes the source of truth, and the other fan-out paths check that their updatedAt field matches the one at that primary path.
Will this bring consistency, or are there still problems? Could it degrade performance at scale?
Are multi-path updates and rules atomic together? By that I mean: for multi-path updates like 1a and 1b above, are the rules evaluated together against the whole update, or is each rule evaluated separately in isolation?