I read that embedding is better from a performance point of view: "If performance is an issue, embed." (http://www.mongodb.org/display/DOCS/Schema+Design), and most guides say that "contains" relationships should be embedded.
However, I am not sure this is always the case. Suppose we have two objects: Blog and Post. A Blog contains Posts.
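To make the two options concrete, here is a minimal sketch of the layouts I am comparing, using plain Python dicts in place of real MongoDB documents. The field names (`_id`, `title`, `body`, `posts`, `blog_id`) are made up for illustration.

```python
# Option A: posts embedded in the blog document (hypothetical field names).
embedded_blog = {
    "_id": 1,
    "title": "My blog",
    "posts": [
        {"title": "First post", "body": "hello world"},
        {"title": "Second post", "body": "more text"},
    ],
}

# Option B: posts kept in their own collection, each referencing its blog.
blog = {"_id": 1, "title": "My blog"}
posts = [
    {"_id": 10, "blog_id": 1, "title": "First post", "body": "hello world"},
    {"_id": 11, "blog_id": 1, "title": "Second post", "body": "more text"},
]
```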
Now making all posts embedded in blog will have the following issues:
- Paging. As far as I can tell, it's not possible to filter embedded objects in the query, so we will always get all posts back and need to page through them in the application.
- Filtering. Same as above: when searching for a word inside posts, it will not be possible to have MongoDB filter the embedded posts for us.
- Insert. I assume inserting into a collection is faster than inserting into an embedded array. Is this correct? Is this documented anywhere?
- Update. Same as above: updating a field in place inside a smaller document (Post) might be faster than updating the post in place inside the big Blog document. Is this correct?
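The paging and filtering concern can be sketched as follows, assuming the whole blog document has already been loaded from the database. This is plain Python over a dict, not actual MongoDB driver code; the helper names `page_posts` and `search_posts` and the field names are hypothetical.

```python
def page_posts(blog_doc, page, page_size):
    """Return one page of embedded posts from an already-loaded blog document."""
    start = page * page_size
    return blog_doc["posts"][start:start + page_size]


def search_posts(blog_doc, word):
    """Return the embedded posts whose body contains the given word."""
    return [post for post in blog_doc["posts"] if word in post["body"]]


# A blog with 1000 embedded posts, the upper bound assumed in this question.
blog_doc = {
    "_id": 1,
    "title": "My blog",
    "posts": [{"title": f"Post {i}", "body": f"body {i}"} for i in range(1000)],
}

# All 1000 posts are in memory even though only 10 are needed.
second_page = page_posts(blog_doc, page=1, page_size=10)
matches = search_posts(blog_doc, "body 99")
```

With posts in a separate collection, I would expect the equivalent skip/limit and text match to be done by the database instead of in application code.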
Taking all of the above into account, I would go for keeping posts in a separate collection that references Blog. Is this the correct conclusion?
(Note: Please do not factor the document size limit into the response; let's assume each blog will have at most 1000 posts.)