19

MongoDB aggregation gets exponentially complicated by the minute!

I am in so far as to $unwind a nested array and then perform a $lookup by the _id of each object from the unwinded nested array. My final attempt is to reverse the unwinding with $group. However, I am unable to reconstruct the original embedded array, with its original property name, along with the rest of the original immediate properties of each document.

Here is my attempt so far:

db.users.aggregate([
    {
        $unwind: "$profile",
        $unwind: {
            path: "$profile.universities",
            preserveNullAndEmptyArrays: true
        }
    },
    {
        $lookup: {
            from: "universities",
            localField: "profile.universities._id",
            foreignField: "_id",
            as: "profile.universities"
        }
    },
    {
        $group: {
            _id: "$_id",
            emails: { "$first": "$emails" },
            profile: { "$first": "$profile" },
            universities: { "$push": "$profile.universities" }
        }
    }
]).pretty()

What I get is something like this:

{
    "_id" : "A_USER_ID",
    "emails" : [
        {
            "address" : "AN_EMAIL_ADDRESS",
            "verified" : false
        }
    ],
    "profile" : {
        "name" : "NAME",
        "company" : "A COMPANY",
        "title" : "A TITLE",
        "phone" : "123-123-1234",
        "disabled" : false,
        "universities" : [
            {
                "_id" : "ID_1",
                "name" : "UNIVERSITY_NAME_1",
                "code" : "CODE_1",
                "styles" : {AN_OBJECT}
            }
        ]
    },
    "universities" : [
        [
            {
                "_id" : "ID_1",
                "name" : "UNIVERSITY_NAME_1",
                "code" : "CODE_1",
                "styles" : {AN_OBJECT}
            }
        ],
        [
            {
                "_id" : "ID_2",
                "name" : "UNIVERSITY_NAME_2",
                "code" : "CODE_2",
                "styles" : {AN_OBJECT}
            }
        ]
    ]
}

There are 2 issues with this result:

  1. The resulting universities is an array of arrays of one object each, since the $lookup returned a single element array for the original $profile.universities nested array. It should be just an array of objects.
  2. The resulting universities should take its original place as nested under profiles. I am aware why the original profile.universities is the way it is, because I am using the $first operator. My intent behind this is to retain all of the original properties of profile, in junction with retaining the original nested universities array.

Ultimately, what I need is something like this:

{
    "_id" : "A_USER_ID",
    "emails" : [
        {
            "address" : "AN_EMAIL_ADDRESS",
            "verified" : false
        }
    ],
    "profile" : {
        "name" : "NAME",
        "company" : "A COMPANY",
        "title" : "A TITLE",
        "phone" : "123-123-1234",
        "disabled" : false,
        "universities" : [
            {
                "_id" : "ID_1",
                "name" : "UNIVERSITY_NAME_1",
                "code" : "CODE_1",
                "styles" : {AN_OBJECT}
            },
            {
                "_id" : "ID_2",
                "name" : "UNIVERSITY_NAME_2",
                "code" : "CODE_2",
                "styles" : {AN_OBJECT}
            }
        ]
    }
}

Is there another operator that I can use instead of $group to achieve this? Or am I understanding the purpose of $group incorrectly?

Edit: This is the original post, for context: If Mongo $lookup is a left outer join, then how come it excludes non-matching documents?

Community
  • 1
  • 1
Sun Lee
  • 879
  • 2
  • 8
  • 17

1 Answers1

36

Because the $lookup operator produces an array field, you need to $unwind the new field before the $group pipeline to get the desired result:

db.users.aggregate([
    { "$unwind": "$profile" },
    { "$unwind": {
        "path": "$profile.universities",
         "preserveNullAndEmptyArrays": true
    } }, 
    { "$lookup": {
        "from": "universities",
        "localField": "profile.universities._id",
        "foreignField": "_id",
        "as": "universities"
    } },    
    { "$unwind": "$universities" },
    { "$group": {
        "_id": "$_id",
        "emails": { "$first": "$emails" },
        "profile": { "$first": "$profile" },
        "universities": { "$push": "$universities" }
    } },
    { "$project": {
        "emails": 1,  
        "profile.name" : 1,
        "profile.company": 1,
        "profile.title" : 1,
        "profile.phone" : 1,
        "profile.disabled": 1,          
        "profile.universities": "$universities"
    } }
]).pretty()
chridam
  • 100,957
  • 23
  • 236
  • 235
  • 5
    Thanks @chridam, the second `$unwind` and `$project` did the trick! I had to modify two parts: 1. change the second unwind to `"$unwind": {"path": "$profile.universities", "preserveNullAndEmptyArrays": true}`, and 2. change the `$group` "universities" to `"universities": { "$push": "$profile.universities" }` and I was able to get my results. – Sun Lee Sep 14 '16 at 16:53
  • Brilliant, thank you. I had multiple levels of useless arrays after looking up multiple levels, and unwinding that did the trick – vintagexav Nov 18 '19 at 15:30