Here are some rules of thumb that I have found useful:
If there is only one logical copy of a piece of information, it should live in one document. For example, if you have comments on a post, the simplest approach is to embed them in the post.
If you would denormalize data in SQL land into some other table to avoid joins and whatnot, the same behavior applies in document storage: denormalize from one "main" location into copies in other locations. Treat the copies as copies, not as the original information, so they can be overwritten by future denormalization actions.
If you have to access one canonical set of data, like a user account, from multiple locations, store references as ObjectIds in MongoDB, then execute a second query for the related document. Be aware in your application that the second query is not a join and will not lock both documents to ensure consistency, so the results may be inconsistent.
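Here is a rough sketch of the first and third rules, using plain Python dictionaries as stand-ins for MongoDB collections so that it runs without a server. Everything named here is hypothetical: the `users` and `posts` collections, the field names, and the `load_post_with_authors` helper. String ids stand in for ObjectId values, and the two dictionary lookups stand in for two separate queries.

```python
# In-memory stand-ins for two MongoDB collections, keyed by _id.
# Hypothetical data; real documents would use ObjectId values for _id.
users = {
    "u1": {"_id": "u1", "name": "alice"},
    "u2": {"_id": "u2", "name": "bob"},
}

posts = {
    "p1": {
        "_id": "p1",
        "title": "Schema design",
        # Comments have one logical home, so they are embedded in the post.
        "comments": [
            # The user account is canonical elsewhere, so only a reference is stored.
            {"user_id": "u1", "text": "Nice post"},
            {"user_id": "u2", "text": "Agreed"},
        ],
    }
}


def load_post_with_authors(post_id):
    """Fetch a post, then resolve comment authors with a second lookup.

    The second lookup is not a join: a user could change between the two
    reads, so the combined result is not guaranteed to be consistent.
    """
    post = posts[post_id]  # first "query"
    user_ids = {c["user_id"] for c in post["comments"]}
    authors = {uid: users[uid] for uid in user_ids if uid in users}  # second "query"
    return [
        {"author": authors.get(c["user_id"], {}).get("name"), "text": c["text"]}
        for c in post["comments"]
    ]


resolved = load_post_with_authors("p1")
```

With a real driver, the window between the first and second read is where the inconsistency mentioned above can creep in.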
Essentially, you should think of your database as consistent at the document level. Any query of related documents may be inconsistent, so if you need consistency, you can denormalize that data into one document.
If you need the user account to be exactly consistent with your comments, you will have to copy the relevant information next to your comments at the same time that you write the comments into the document. This means you have to think about consistency at the application level, all the time. If not, as I suspect is the case, just issue another query for the user.
If you are concerned about performance when querying for data on all of the users that participate in your page, I would recommend copying some data from the user account next to the comment, but only read from this copy. Writes should always go to the original user account.
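The copy-next-to-the-comment idea could look roughly like the sketch below. Again, plain dictionaries stand in for collections, and the names (`add_comment`, `rename_user`, the `user_name` field) are my own assumptions for illustration, not anything MongoDB provides.

```python
# Hypothetical in-memory stand-ins for the users and posts collections.
users = {"u1": {"_id": "u1", "name": "alice", "email": "alice@example.com"}}
posts = {"p1": {"_id": "p1", "comments": []}}


def add_comment(post_id, user_id, text):
    """Embed a comment and copy the author's display name next to it."""
    user = users[user_id]
    posts[post_id]["comments"].append({
        "user_id": user_id,         # reference back to the canonical account
        "user_name": user["name"],  # denormalized copy, read-only by convention
        "text": text,
    })


def rename_user(user_id, new_name):
    """Write to the canonical account first, then overwrite the copies."""
    users[user_id]["name"] = new_name
    for post in posts.values():
        for comment in post["comments"]:
            if comment["user_id"] == user_id:
                comment["user_name"] = new_name


add_comment("p1", "u1", "First!")
rename_user("u1", "alice_b")
names = [c["user_name"] for c in posts["p1"]["comments"]]
```

Rendering the page only needs the post document, since the name is already there. The price is that the refresh in `rename_user` is a separate step from the write to the account, so against a real database readers may see stale copies until it has finished.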
That's all that comes to mind for now, but I may edit as things occur to me :)