In the NoSQL world, we structure a database according to the queries that we want to perform.
What is the logic of having comments as a top-level collection as opposed to having comments as a subcollection under posts?
Neither is better than the other. However, there are some differences:
Is this not the better way to store hierarchical data?
There is no "perfect", "the best" or "the correct" solution for structuring a Cloud Firestore database. We always choose to create a structure for our database that satisfies our queries. So in your case, I would create a schema that looks like this:
Firestore-root
  |
  --- users (collection)
  |    |
  |    --- $uid (document)
  |         |
  |         --- //user fields.
  |
  --- posts (collection)
       |
       --- $postId (document)
            |
            --- uid: "veryLongUid"
            |
            --- //post fields.
            |
            --- comments (sub-collection)
                 |
                 --- $commentId (document)
                      |
                      --- uid: "veryLongUid"
                      |
                      --- //comment fields.
Using this schema you can:
- Get all users.
- Get all posts in the database.
- Get all posts that correspond to only a particular user.
- Get all comments of all posts, of all users in the database. Requires a collection group query.
- Get all comments on all posts that correspond to a particular user. Requires a collection group query.
- Get all comments of all users that correspond to a particular post.
- Get all comments of a particular user that correspond to a particular post.
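To make the list above concrete, here is a minimal in-memory sketch in Python. It does not use the Firestore SDK; `Doc`, `collection`, `collection_group` and the sample data are hypothetical names made up for illustration. It only models which queries read a single collection path and which ones have to span every `comments` sub-collection, which is what a collection group query is for. It also assumes that "comments that correspond to a particular user" means comments whose `uid` field is that user.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    path: str   # full document path, e.g. "posts/p1/comments/c1"
    data: dict  # document fields


def collection(docs, collection_path):
    """Documents that sit directly inside one specific collection path."""
    prefix = collection_path + "/"
    return [d for d in docs
            if d.path.startswith(prefix) and "/" not in d.path[len(prefix):]]


def collection_group(docs, collection_id):
    """Documents in every collection with this ID, whatever the parent is."""
    return [d for d in docs if d.path.split("/")[-2] == collection_id]


# Hypothetical sample data that follows the schema above.
DOCS = [
    Doc("users/alice", {"name": "Alice"}),
    Doc("users/bob", {"name": "Bob"}),
    Doc("posts/p1", {"uid": "alice", "title": "First"}),
    Doc("posts/p2", {"uid": "bob", "title": "Second"}),
    Doc("posts/p1/comments/c1", {"uid": "bob", "text": "Nice"}),
    Doc("posts/p1/comments/c2", {"uid": "alice", "text": "Thanks"}),
    Doc("posts/p2/comments/c3", {"uid": "bob", "text": "Hello"}),
]

# Single-collection queries.
all_users = collection(DOCS, "users")
all_posts = collection(DOCS, "posts")
posts_by_alice = [p for p in all_posts if p.data["uid"] == "alice"]

# These two span all "comments" sub-collections: collection group queries.
all_comments = collection_group(DOCS, "comments")
comments_by_bob = [c for c in all_comments if c.data["uid"] == "bob"]

# These two stay inside the sub-collection of one known post.
comments_on_p1 = collection(DOCS, "posts/p1/comments")
bob_on_p1 = [c for c in comments_on_p1 if c.data["uid"] == "bob"]
```

The last two queries need no collection group because the post ID is already part of the path; only the queries that cross posts have to look at every `comments` sub-collection at once.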
Am I missing something?
If you think that all the comments of a post will fit within the 1 MiB maximum document size, then you should consider storing all comments in an array inside the post document. If not, I highly recommend you read the following approach:
Where I have explained how we can store up to billions of comments and replies in Firestore.
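As a rough way to judge whether the array option is realistic for the 1 MiB limit mentioned above, you could estimate the size of the comments before committing to that layout. The sketch below is only an approximation under stated assumptions: it counts the UTF-8 bytes of string values and ignores field names and any per-document overhead, so it is a lower bound. The function names and the `safety_margin` parameter are hypothetical and not part of any Firestore API.

```python
MAX_DOC_BYTES = 1_048_576  # 1 MiB


def rough_comments_size(comments):
    """Lower-bound estimate: UTF-8 bytes of every string value in every comment."""
    return sum(
        len(value.encode("utf-8"))
        for comment in comments
        for value in comment.values()
        if isinstance(value, str)
    )


def might_fit_in_one_document(comments, safety_margin=0.5):
    """True if the estimate stays under a fraction of the limit.

    safety_margin is an arbitrary assumption that leaves room for the
    post's own fields and for everything this estimate ignores.
    """
    return rough_comments_size(comments) <= MAX_DOC_BYTES * safety_margin


few = [{"uid": "veryLongUid", "text": "Nice post"}]
many = [{"uid": "veryLongUid", "text": "x" * 10} for _ in range(100_000)]

few_fits = might_fit_in_one_document(few)
many_fits = might_fit_in_one_document(many)
```

If the estimate is anywhere near the limit, or the number of comments is unbounded, the sub-collection layout is the safer choice.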