0

Let's assume we have the following collections:

Users
{
  "id": MongoId,
  "username": "jsloth",
  "first_name": "John",
  "last_name": "Sloth",
  "display_name": "John Sloth"
}

Places
{
  "id": MongoId,
  "name": "Conference Room",
  "description": "Some longer description of this place"
}

Meetings
{
  "id": MongoId,
  "name": "Very important meeting",
  "place": <?>,
  "timestamp": "1506493396",
  "created_by": <?>
}

Later on, we want to return (e.g. from REST webservice) list of upcoming events like this:

[
   {
      "id": MongoId(Meetings),
      "name": "Very important meeting",
      "created_by": {
         "id": MongoId(Users),
         "display_name": "John Sloth",
      },
      "place": {
         "id": MongoId(Places),
         "name": "Conference Room",
      }
   },
   ...
]

It's important to return basic information that need to be displayed on the main page in web ui (so no additional calls are needed to render the table). That's why, each entry contains display_name of the user who created it and name of the place. I think that's a pretty common scenario.

Now my question is: how should I store this information in db (question mark values in Metting document)? I see 2 options:

1) Store references to other collections:

place: MongoId(Places)

(+) data is always consistent

(-) additional calls to db have to be made in order to construct the response

2) Denormalize data:

  "place": {
     "id": MongoId(Places),
     "name": "Conference room",
  }

(+) no need for additional calls (response can be constructed based on one document)

(-) data must be updated each time related documents are modified

What is the proper way of dealing with such scenario?

If I use option 1), how should I query other documents? Asking about each related document separately seems like an overkill. How about getting last 20 meetings, aggregate the list of related documents and then perform a query like db.users.find({_id: { $in: <id list> }})?

If I go for option 2), how should I keep the data in sync?

Thanks in advance for any advice!

jbacic
  • 53
  • 1
  • 1
  • 4

1 Answers1

0

You can keep the DB model you already have and still only do a single query as MongoDB introduced the $lookup aggregation in version 3.2. It is similar to join in RDBMS.

$lookup

Performs a left outer join to an unsharded collection in the same database to filter in documents from the “joined” collection for processing. The $lookup stage does an equality match between a field from the input documents with a field from the documents of the “joined” collection.

So instead of storing a reference to other collections, just store the document ID.

str
  • 42,689
  • 17
  • 109
  • 127
  • I wonder about the part "unsharded collection". To be honest, it will work for me now as I don't have any sharded collections. However, any suggestions what to do if the related collection is sharded? – jbacic Sep 27 '17 at 07:18
  • As far as I know, there no other way for sharded collections https://stackoverflow.com/questions/34633111/mongodb-to-use-sharding-with-lookup-aggregation-operator. But usually you have to store a loooot of data and have a loooot of users to justify sharding. – str Sep 27 '17 at 07:24