0

This question has previously been marked as a duplicate of this question I can with certainty confirm that it is not.

This is not a duplicate of the linked question because the elements in question are not an array but embedded in individual objects of an array as fields. I am fully aware of how the query in the linked question should work, however that scenario is different from mine.

I have a question regarding the $lookup query of MongoDb. My data structure looks as follows:

My "Event" collection contains this single document:

{
   "_id": ObjectId("mongodbobjectid..."),
   "name": "Some Event",
   "attendees": [
        {
           "type": 1,
           "status": 2,
           "contact": ObjectId("mongodbobjectidHEX1")
        },
        {
           "type": 7,
           "status": 4,
           "contact": ObjectId("mongodbobjectidHEX2")
        }
    ]
}

My "Contact" collection contains these documents:

{
    "_id": ObjectId("mongodbobjectidHEX1"),
    "name": "John Doe",
    "age": 35
},
{
    "_id": ObjectId("mongodbobjectidHEX2"),
    "name": "Peter Pan",
    "age": 60
}

What I want to do is perform an aggregate query with the $lookup operator on the "Event" collection and get the following result with full "contact" data:

{
   "_id": ObjectId("mongodbobjectid..."),
   "name": "Some Event",
   "attendees": [
        {
           "type": 1,
           "status": 2,
           "contact": {
              "_id": ObjectId("mongodbobjectidHEX1"),
              "name": "John Doe",
              "age": 35
           }
        },
        {
           "type": 7,
           "status": 4,
           "contact": {
              "_id": ObjectId("mongodbobjectidHEX2"),
              "name": "Peter Pan",
              "age": 60
           }
        }
    ]
}

I have done the same with single elements of "Contact" referenced in another document but never when embedded in an array. I am unsure of which pipeline arguments to pass to get the above shown result?

I also want to add a $match query to the pipeline to filter the data, but that is not really part of my question.

Community
  • 1
  • 1
Philipp Jahoda
  • 50,880
  • 24
  • 180
  • 187
  • 1
    Just to clarify, I marked it as duplicate because the answers in that dupe still apply to your question; you need to apply the `$unwind` pipeline to flatten the array of embedded documents for you to apply the `$lookup` operator on the "attendees.contact" field and then after the `$lookup`, pipe another `$unwind` and `$group` the resulting docs to get your desired output. The other answer also applies if you are using the upcoming MongoDB 3.4 – chridam Nov 15 '16 at 12:00
  • Okay, so what you are saying is that I should first `{"$unwind": "$attendees"}`, then `{ "$lookup" : {"from" : "Contact", "localField" : "attendees.contact","foreignField": "_id", "as" : "contactlist" }}`, then `{"$unwind": "$contactlist"}` and then `$group`? – Philipp Jahoda Nov 15 '16 at 12:07
  • Correct. A tip on debugging a pipeline; run the aggregation query at each pipeline stage, test whether the results tally for each step. So for example, in the aggregation above, first run `db.event.aggregate([ { "$unwind": "$attendees" } ]);` , check the result to see if the attendees array is deconstructed properly, add the next pipeline stage `db.event.aggregate([ { "$unwind": "$attendees" },{ "$lookup" : {"from" : "Contact", "localField" : "attendees.contact","foreignField": "_id", "as" : "contactlist" }} ]);`, run that and repeat the steps till you get to the final pipeline step. – chridam Nov 15 '16 at 12:14
  • Thank you very much, I managed to debug it to the last $unwind step and everything seems correct. At that stage the query on the "Events" collection which holds 1 document returns 2 documents (because of 2 attendees), each having a field "contactList" with the data of one contact. Would that be the expected outcome? Now I am unsure how to group this result to get the output described above – Philipp Jahoda Nov 15 '16 at 12:26
  • Yes, that's expected because of the initial `{"$unwind": "$attendees"}` pipeline where for each input document, it outputs `n` documents where `n` is the number of array elements and can be zero for an empty array. You can safely group all these documents by the main `_id` field i.e. `{ "$group": { "_id": "$_id", "name": { "$first": "$name" }, "attendees": { "$push": "$contactlist" } } }` – chridam Nov 15 '16 at 12:33
  • Thank you, that worked, really appreciate your help. I just had to modify your last "group" command a little bit: `"attendees" : { "$push": {"contact": "$contactlist"}}` to get the "contact" field. I suppose that should be fine. My last problem is that "type" and "status" of the attendee are not yet present in my output, how can I get those included in the output as well? – Philipp Jahoda Nov 15 '16 at 12:38
  • Thanks. The $first with the "name" worked fine, however "name" is related to "Event" whereas "type" and "status" are related to the individual "attendees". The query you mentioned adds a field "type" and "status" only to my "event", but of course not to the "attendees". Is there a way to map type and status to the individual attendees? – Philipp Jahoda Nov 15 '16 at 13:01
  • Do the same with the `name` field, use the [**`$first`**](https://docs.mongodb.com/v3.2/reference/operator/aggregation/first/#grp._S_first) operator as `{ "$group": { "_id": "$_id", "name": { "$first": "$name" }, , "attendees": { "$push": { "type": "$attendees.type", "status": "$attendees.status", "contact": "$contactlist" } } } }` – chridam Nov 15 '16 at 13:02
  • 1
    Thank you again for the effort, that is the correct answer. I will create a post and answer below. – Philipp Jahoda Nov 15 '16 at 13:26
  • In case you are not tired of me yet, I'd really appreciate if you take a look at this: http://stackoverflow.com/questions/40635501/mongodb-aggregation-query-with-group-and-push-into-subdocument :-) – Philipp Jahoda Nov 16 '16 at 15:20

1 Answers1

0

Try this one

 db.getCollection('Event').aggregate([{ "$unwind": "$attendees" },
        { "$lookup" : { "from" : "Contact", "localField" : "attendees.contact", "foreignField": "_id", "as" : "contactlist" } },
        { "$unwind": "$contactlist" },
         { "$project" :{
                    "attendees.type" : 1,
                    "attendees.status" : 1,
                    "attendees.contact" : "$contactlist",
                      "name": 1, "_id": 1
                       }
        },
        {
            "$group" : {
                _id : "$_id" ,
                "name" : { $first : "$name" }, 
                "attendees" : { $push : "$attendees" }
            }
        }
        ])
Parshuram Kalvikatte
  • 1,616
  • 4
  • 20
  • 40