I'm trying to work out how to organise user data that mixes public and private entities. The entities in question are Activity Stream events. A user's followers can see their Activity Stream events; however, the user can mark certain categories of events as 'public' or 'private', and private events are not shared with followers.

One way to do it would be to create activity_streams_private and activity_streams_public paths, each with its own security rules. A Cloud Function would then handle what is added to and removed from activity_streams_public as the user updates their privacy settings.

"rules": {
    "activity_streams_settings": {
        "$userID": {
            ".read": "auth.uid == $userID",
            ".write": "auth.uid == $userID"
        }
    },
    "activity_streams_private": {
        // Users only write Activity Stream events here.
        "$userID": {
            ".read": "auth.uid == $userID",
            ".write": "auth.uid == $userID"
        }
    },
    "activity_streams_public": {
        "$userID": {
            // Every signed-in user can read public activity stream data
            ".read": "auth.uid != null",

            // Only Cloud Functions can update what appears in
            // a user's public activity stream by determining
            // if an activity added to activity_streams_private
            // is public or private using the
            // activity_streams_settings data. Every settings
            // change would cause the Cloud Function to read
            // this entire subtree of data and write/delete a lot.
            ".write": false
        }
    }
}
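
To make the cost of this first approach concrete, here is a minimal sketch of the fan-out the Cloud Function would have to compute after a settings change. It is only an illustration: the `ActivityEvent` and `PrivacySettings` shapes and the `buildPublicFanOut` name are hypothetical (the data model isn't pinned down above), and it assumes each event carries a single `category` that the settings map to a visibility.

```typescript
// Hypothetical shapes for illustration; not defined anywhere in the question.
type Visibility = "public" | "private";

interface ActivityEvent {
  category: string; // assumption: every event belongs to exactly one category
  payload: unknown;
}

// Assumption: settings map a category to its visibility.
// Categories missing from the map are treated as private.
type PrivacySettings = Record<string, Visibility>;

// Builds a multi-path update for one user. Events in public categories are
// copied under activity_streams_public; all others are mapped to null, which
// deletes the node when the update is applied to the Realtime Database.
function buildPublicFanOut(
  userID: string,
  privateStream: Record<string, ActivityEvent>,
  settings: PrivacySettings
): Record<string, ActivityEvent | null> {
  const updates: Record<string, ActivityEvent | null> = {};
  for (const eventID of Object.keys(privateStream)) {
    const event = privateStream[eventID];
    const isPublic = settings[event.category] === "public";
    updates[`activity_streams_public/${userID}/${eventID}`] = isPublic ? event : null;
  }
  return updates;
}
```

The function could pass the resulting object to a single root-level `update()` call. Note that the number of paths written grows with the size of the user's whole stream, not with the size of the settings change, which is the scalability worry.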

Alternatively, I could keep all of a user's public and private Activity Stream events in a single path like activity_streams and use query-based security rules that only allow a follower to query events whose visibility property is public. This would still require the user's device (or a Cloud Function) to update the visibility property on every affected item in activity_streams each time a privacy setting is changed.

"rules": {
    "activity_streams_settings": {
        "$userID": {
            ".read": "auth.uid == $userID",
            ".write": "auth.uid == $userID"
        }
    },
    "activity_streams": {
        "$userID": {
            // The owner can read everything. Any other signed-in
            // user can only run a query restricted to children
            // whose 'visibility' property equals 'public'.
            // This method will mean that the ID of each Activity
            // Stream event will be in the form of:
            // timestampDescending_eventUUID so that the events
            // are automatically sorted by newest first.
            ".read": "auth.uid == $userID || (auth.uid != null && query.orderByChild == 'visibility' && query.equalTo == 'public')",

            // Users can append Activity Stream events with a
            // preset visibility property (public or private)
            // but a Cloud Function will change the visibility
            // property on events if a user changes their settings.
            ".write": "auth.uid == $userID"
        }
    }
}
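
For reference, the timestampDescending_eventUUID key mentioned in the rule comments could be generated roughly like this. This is a sketch under assumptions: the `descendingEventKey` name and the 13-digit ceiling are my own choices for illustration, and it relies on string keys being ordered lexicographically.

```typescript
// Assumed ceiling: the largest 13-digit millisecond timestamp.
const MAX_TIMESTAMP_MS = 9999999999999;

// Returns a key of the form <invertedTimestamp>_<eventUUID>. Subtracting the
// timestamp from a fixed maximum makes newer events produce smaller numbers,
// and zero-padding to a fixed width keeps lexicographic order identical to
// numeric order, so newer events sort first.
function descendingEventKey(timestampMs: number, eventUUID: string): string {
  if (!Number.isInteger(timestampMs) || timestampMs < 0 || timestampMs > MAX_TIMESTAMP_MS) {
    throw new RangeError("timestampMs must be an integer between 0 and MAX_TIMESTAMP_MS");
  }
  const inverted = MAX_TIMESTAMP_MS - timestampMs;
  const padded = ("0000000000000" + inverted).slice(-13);
  return `${padded}_${eventUUID}`;
}
```

The catch with this second approach is that a follower's query has to order by visibility to satisfy the rule, so the newest-first ordering from the key only applies among events that share the same visibility value.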

I guess I'm asking what the drawbacks and benefits of each method are. Do query-based security rules slow down reads? Is either solution more scalable as the user base grows? In a short (~5 minute) session, a user will generate between 25 and 75 activity stream events.

My concern with the activity_streams_public approach is that the writes may cause scalability issues at ~1M users, even with infrequent privacy settings changes. Thoughts?

Socceroos
  • This is a bit broad. Also a bit difficult to visualize, without actual rules and database structure to look at. – Doug Stevenson Jan 22 '19 at 23:20
  • @DougStevenson I've updated the question with hopefully some more clarification. Is that enough to explain the question of which method scales better and is considered as 'idiomatic NoSQL'? – Socceroos Jan 23 '19 at 00:30
  • The drawback of your second approach is that you can't query for only the public information. Having separate top-level nodes for the public and private information is the only way to implement this in a way that also allows users to query the public information. See https://stackoverflow.com/questions/38648669/firebase-how-to-structure-public-private-user-data, https://stackoverflow.com/questions/19891762/firebase-security-rules-public-vs-private-data – Frank van Puffelen Jan 23 '19 at 03:51
  • @FrankvanPuffelen Hmmm, I thought if a follower ran a query like `orderBy('visibility').equalTo('public')` then it would allow someone to query just the public activities in the stream as the rule allows this query? `".read": "auth.uid == $userID || (auth.uid != null && query.orderByChild == 'visibility' && query.equalTo == 'public')"` I could have sworn I recently read that this was possible. I'll see if I can dig it up. – Socceroos Jan 23 '19 at 04:44
  • That means each child node in the stream is either entirely public or entirely private. Is that indeed the case? If so, I misunderstood the structure you're aiming for. – Frank van Puffelen Jan 23 '19 at 05:18
  • So, under `/activity_streams/[userID]` you would have a list of `activity` objects for a particular user. Each `activity` object contains, amongst other things, a property called `visibility` that can be either set to `public` or `private`. So when one of my followers queries _my_ `/activity_streams/[myUserID]` then they would use the query `orderBy('visibility').equalTo('public')` which would retrieve only those objects which have a public visibility, leaving out any others that have a private visibility. Perhaps I'm misunderstanding if this is even possible. I thought it was with query rules – Socceroos Jan 23 '19 at 06:10
