There are a few criteria for deciding when to turn a sub-collection into its own separate collection. Ask yourself:
- Can this sub-collection grow unboundedly? If yes, it needs to be its own collection, since a single MongoDB document is capped at 16 MB.
- Can these sub-entities exist on their own, or make sense on their own? If yes, this should most likely be its own collection.
- Can or will another entity ever reference an item in this sub-collection? For instance, in your example an Event has multiple Teams. Can a Team be referenced by some other entity? If yes, it more likely needs to be its own collection.
- How often will you need to reference elements in this sub-collection by themselves? For instance, if you constantly need to query a single team from an event and the teams are embedded, every such query has to go through the WHOLE event document just to get that one team.
The appropriate answer to your question is most likely a hybrid between both approaches, but to answer your assumptions first:
> If I put Event, Team, IndividualCampaign, and Donor each in its own collection, getting this total would require multiple queries I assume.
You can use the Aggregation Framework and rely on the $lookup stage to join these collections in a single query. This could technically accomplish your requirement to fetch this total in a single round trip to the database.
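As a rough sketch of what such a pipeline could look like, here it is written as plain Python data so it can be handed to a driver such as pymongo. The collection names (teams, campaigns, donors) and the field names (eventId, teamId, campaignId, amount) are assumptions made for illustration; your schema will differ.

```python
def total_donations_pipeline(event_id):
    """Build an aggregation pipeline that sums all donations for one event.

    Assumed (hypothetical) schema:
      teams.eventId        -> events._id
      campaigns.teamId     -> teams._id
      donors.campaignId    -> campaigns._id
      donors.amount        -> numeric donation amount
    """
    return [
        # Start from the single event we care about.
        {"$match": {"_id": event_id}},
        # Join the event's teams.
        {"$lookup": {
            "from": "teams",
            "localField": "_id",
            "foreignField": "eventId",
            "as": "teams",
        }},
        # Join the campaigns of those teams ("teams._id" is an array of ids).
        {"$lookup": {
            "from": "campaigns",
            "localField": "teams._id",
            "foreignField": "teamId",
            "as": "campaigns",
        }},
        # Join the donors of those campaigns.
        {"$lookup": {
            "from": "donors",
            "localField": "campaigns._id",
            "foreignField": "campaignId",
            "as": "donors",
        }},
        # Keep only the event name and the summed donation amounts.
        {"$project": {
            "name": 1,
            "totalDonations": {"$sum": "$donors.amount"},
        }},
    ]


# With pymongo (not imported here) this would be run roughly as:
#   result = list(db.events.aggregate(total_donations_pipeline(event_id)))
```

Each $lookup is still a join performed by the server, so indexes on the assumed eventId, teamId, and campaignId fields would matter for performance.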
> If I put this all in one giant Event document with nested in arrays for the rest, this doesn't sound like a good idea as well.
Yep, it doesn't sound like a great idea to me either.
So then you have a few options:
- It sounds to me, without knowing much about your Domain, that Teams and Donors are important enough to merit their own collections.
- You can denormalize. In other words, at the expense of building fancier logic, you can have these items in their own collection and STILL keep SOME of their data denormalized in this event object.
For instance, you can have a totalDonations field on the event that you update every time something changes in the event's teams, campaigns, or donors. But obviously you're in charge of maintaining consistency here.
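A minimal sketch of the denormalized counter, assuming the event document carries a totalDonations field and that your application code runs this whenever a donation is added, changed, or removed. The helper name and the shape of the call are hypothetical; only the $inc update operator is MongoDB's own.

```python
def donation_delta_update(event_id, delta):
    """Return the (filter, update) pair that adjusts an event's cached total.

    delta is positive for a new donation and negative for a refund or deletion.
    """
    event_filter = {"_id": event_id}
    update = {"$inc": {"totalDonations": delta}}
    return event_filter, update


# With pymongo (not imported here) this would be applied roughly as:
#   event_filter, update = donation_delta_update(event_id, 25)
#   db.events.update_one(event_filter, update)
```

Writing the donor and bumping the counter are two separate writes, so the cached total can drift if one of them fails. Wrapping both in a transaction, or periodically recomputing the total from the source collections, are two ways to keep it honest.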
Lastly, you should evaluate how relational your data is. If you run into this problem across much of your domain, a relational database might be your best option.