2

I have a question regarding the $group argument of MongoDb aggregations. My data structure looks as follows:

My "Event" collection contains this single document:

{
   "_id": ObjectId("mongodbobjectid..."),
   "name": "Some Event",
   "attendeeContainer": {
       "min": 0,
       "max": 10,
       "attendees": [
            {
               "type": 1,
               "status": 2,
               "contact": ObjectId("mongodbobjectidHEX1")
            },
            {
               "type": 7,
               "status": 4,
               "contact": ObjectId("mongodbobjectidHEX2")
            }
        ]
    }
}

My "Contact" collection contains these documents:

{
    "_id": ObjectId("mongodbobjectidHEX1"),
    "name": "John Doe",
    "age": 35
},
{
    "_id": ObjectId("mongodbobjectidHEX2"),
    "name": "Peter Pan",
    "age": 60
}

What I want to do is perform an aggregate query on the "Event" collection and get the following result with full "contact" data:

{
   "_id": ObjectId("mongodbobjectid..."),
   "name": "Some Event",
   "attendeeContainer": {
       "min": 0, 
       "max": 10,
       "attendees": [
             {
                "type": 1,
                "status": 2,
                "contact": {
                   "_id": ObjectId("mongodbobjectidHEX1"),
                   "name": "John Doe",
                   "age": 35
                }
             },
             {
                "type": 7,
                "status": 4,
                "contact": {
                   "_id": ObjectId("mongodbobjectidHEX2"),
                   "name": "Peter Pan",
                   "age": 60
                }
             }
         ]
     }
}

The arguments I am using right now look as follows (shortened version):

"$unwind" : "$attendeeContainer.attendees",
"$lookup" : { "from" : "contactinfo", "localField" : "attendeeContainer.attendees.contact","foreignField" : "_id", "as" : "contactInfo" },
"$unwind" : "$contactInfo",
"$group"  : { "_id": "$_id", 
              "name": { "$first" : "$name" }, 
              ...
              "contact": { "$push": { "contact": "$contactInfo"} }
            }

However, this leads to the "contact" array being on "Event" level (because of the grouping) instead of one document of the array being at each "attendeeContainer.attendees". How can I push the "contact" array to be at "attendeeContainer.attendees"? (as shown in the desired output above)

I tried things like:

"attendeeContainer.attendees.contact": { "$push": { "contact": "$contactInfo"} }

But mongodb apparently does not allow "." at $group stage.

Philipp Jahoda
  • 50,880
  • 24
  • 180
  • 187
  • Wait! Did you just create a duplicate of your own [question](https://stackoverflow.com/questions/40609209/mongodb-lookup-query-with-multiple-fields-from-objects-array)? – styvane Nov 16 '16 at 15:42
  • No. I did not. The previous question was related to the $lookup query. This one relates to grouping and features a more complex scenario. – Philipp Jahoda Nov 16 '16 at 15:49

1 Answers1

3

Try running the following aggregation pipeline, the key is using a final $project pipeline to create the attendeeContainer subdocument:

db.event.aggregate([
    { "$unwind": "$attendeeContainer.attendees" },
    {
        "$lookup" : { 
            "from" : "contactinfo", 
            "localField" : "attendeeContainer.attendees.contact",
            "foreignField" : "_id", 
            "as" : "attendeeContainer.attendees.contactInfo" 
        }
    },
    { "$unwind": "$attendeeContainer.attendees.contactInfo" },
    {
        "$group": {
            "_id" : "$_id",
            "name": { "$first": "$name" },   
            "min" : { "$first": "$attendeeContainer.min" },
            "max" : { "$first": "$attendeeContainer.max" },
            "attendees": { "$push": "$attendeeContainer.attendees" }            
        }
    },
    {
        "$project": {
            "name": 1,
            "attendeeContainer.min": "$min",
            "attendeeContainer.max": "$min",
            "attendeeContainer.attendees": "$attendees"
        }
    }
])

Debugging Tips

Debugging the pipeline at the 4th stage, you would get the result

db.event.aggregate([
    { "$unwind": "$attendeeContainer.attendees" },
    {
        "$lookup" : { 
            "from" : "contactinfo", 
            "localField" : "attendeeContainer.attendees.contact",
            "foreignField" : "_id", 
            "as" : "attendeeContainer.attendees.contactInfo" 
        }
    },
    { "$unwind": "$attendeeContainer.attendees.contactInfo" },
    {
        "$group": {
            "_id": "$_id",
            "name": { "$first": "$name" },   
            "min" : { "$first": "$attendeeContainer.min" },
            "max" : { "$first": "$attendeeContainer.max" },
            "attendees": { "$push": "$attendeeContainer.attendees" }            
        }
    }/*,
    {
        "$project": {
            "name": 1,
            "attendeeContainer.min": "$min",
            "attendeeContainer.max": "$min",
            "attendeeContainer.attendees": "$attendees"
        }
    }*/
])

Pipeline result

{
    "_id" : ObjectId("582c789282a9183adc0b53f5"),
    "name" : "Some Event",
    "min" : 0,
    "max" : 10,
    "attendees" : [ 
        {
            "type" : 1,
            "status" : 2,
            "contact" : ObjectId("582c787682a9183adc0b53f3"),
            "contactInfo" : {
                "_id" : ObjectId("582c787682a9183adc0b53f3"),
                "name" : "John Doe",
                "age" : 35
            }
        }, 
        {
            "type" : 7,
            "status" : 4,
            "contact" : ObjectId("582c787682a9183adc0b53f4"),
            "contactInfo" : {
                "_id" : ObjectId("582c787682a9183adc0b53f4"),
                "name" : "Peter Pan",
                "age" : 60
            }
        }
    ]
}

and the final $project pipeline will give you the desired result:

db.event.aggregate([
    { "$unwind": "$attendeeContainer.attendees" },
    {
        "$lookup" : { 
            "from" : "contactinfo", 
            "localField" : "attendeeContainer.attendees.contact",
            "foreignField" : "_id", 
            "as" : "attendeeContainer.attendees.contactInfo" 
        }
    },
    { "$unwind": "$attendeeContainer.attendees.contactInfo" },
    {
        "$group": {
            "_id": "$_id",
            "name": { "$first": "$name" },   
            "min" : { "$first": "$attendeeContainer.min" },
            "max" : { "$first": "$attendeeContainer.max" },
            "attendees": { "$push": "$attendeeContainer.attendees" }            
        }
    },
    {
        "$project": {
            "name": 1,
            "attendeeContainer.min": "$min",
            "attendeeContainer.max": "$min",
            "attendeeContainer.attendees": "$attendees"
        }
    }/**/
])

Desired/Actual Output

{
    "_id" : ObjectId("582c789282a9183adc0b53f5"),
    "name" : "Some Event",
    "attendeeContainer" : {
        "min" : 0,
        "max" : 10,
        "attendees" : [ 
            {
                "type" : 1,
                "status" : 2,
                "contact" : ObjectId("582c787682a9183adc0b53f3"),
                "contactInfo" : {
                    "_id" : ObjectId("582c787682a9183adc0b53f3"),
                    "name" : "John Doe",
                    "age" : 35
                }
            }, 
            {
                "type" : 7,
                "status" : 4,
                "contact" : ObjectId("582c787682a9183adc0b53f4"),
                "contactInfo" : {
                    "_id" : ObjectId("582c787682a9183adc0b53f4"),
                    "name" : "Peter Pan",
                    "age" : 60
                }
            }
        ]
    }
}
chridam
  • 100,957
  • 23
  • 236
  • 235
  • Thank you. The `$project` argument puts the "attendees" in place. Is there any way to make the grouping "simpler". I actually always want to get **all the data** from both the "Event" document and the "Contact" document. Shown above is only a small portion of a full "Event" document (the actual is much larger), is there any way to perform the output I require without having to create a grouping for every single field? – Philipp Jahoda Nov 16 '16 at 15:47
  • Unfortunately within the `$group` pipeline you would need to specify the fields you need, similar to the SQL's GROUP BY clause. In SQL, you can't use GROUP BY unless you use any of the aggregation functions. The same way, you have to use an aggregation function in MongoDB as well. Unless, you can create the `$group` pipeline object dynamically outside the `aggregate()` function but this approach is a bit complex since you'd need a mechanism to get the list of keys needed first and that requires some mapReduce querying. – chridam Nov 16 '16 at 16:09
  • Okay, thanks, that's a bummer. My "Event" document has more than 50 fields so I guess grouping all the fields first and then projecting them is not really an option for me since that would result in a huge query. – Philipp Jahoda Nov 16 '16 at 16:16
  • Do you know if there is any way around this: `can't add an expression for field "eventAttendeeContainer" because there is already an expression for that field or one of its sub-fields`? I tried to project `"eventAttendeeContainer" : "$eventAttendeeContainer", "eventAttendeeContainer.attendees": "$attendees"` but it does not seem to work. – Philipp Jahoda Nov 16 '16 at 16:33