This really is not a great structure to work with, and there are very good reasons for that. As such, doing a $lookup
here is not a simple task as there are various implications of the "nested arrays"
You basically want either
db.users.aggregate([
{ "$match": { "name": "Michael" } },
{ "$lookup": {
"from": "schools",
"localField": "starred",
"foreignField": "faculties.subjects.id",
"as": "subjects"
}},
{ "$addFields": {
"subjects": {
"$filter": {
"input": {
"$reduce": {
"input": {
"$reduce": {
"input": "$subjects.faculties.subjects",
"initialValue": [],
"in": { "$concatArrays": [ "$$value", "$$this" ] }
}
},
"initialValue": [],
"in": { "$concatArrays": [ "$$value", "$$this" ] }
}
},
"cond": { "$in": ["$$this.id", "$starred"] }
}
}
}}
])
Or with MongoDB 3.6 or greater maybe instead:
db.users.aggregate([
{ "$match": { "name": "Michael" } },
{ "$lookup": {
"from": "schools",
"let": { "starred": "$starred" },
"pipeline": [
{ "$match": {
"$expr": {
"$setIsSubset": [
"$$starred",
{ "$reduce": {
"input": "$faculties.subjects.id",
"initialValue": [],
"in": { "$concatArrays": [ "$$value", "$$this" ] }
}}
]
}
}},
{ "$project": {
"_id": 0,
"subjects": {
"$filter": {
"input": {
"$reduce": {
"input": "$faculties.subjects",
"initialValue": [],
"in": { "$concatArrays": [ "$$value", "$$this" ] }
}
},
"cond": { "$in": [ "$$this.id", "$$starred" ] }
}
}
}},
{ "$unwind": "$subjects" },
{ "$replaceRoot": { "newRoot": "$subjects" } }
],
"as": "subjects"
}}
])
Both approaches essentially rely on $reduce
and $concatArrays
in order to "flatten" the "nested array" content into a form that can be used for comparison. The main difference between the two being that prior to MongoDB 3.6 you are essentially pulling all the "possible" matches from the document before you can do anything about "filtering" the inner array entries to only those that match.
Failing having at least MongoDB 3.4 with the $reduce
and $in
operators, then you are essentially resorting to $unwind
:
db.users.aggregate([
{ "$match": { "name": "Michael" } },
{ "$lookup": {
"from": "schools",
"localField": "starred",
"foreignField": "faculties.subjects.id",
"as": "subjects"
}},
{ "$unwind": "$subjects" },
{ "$unwind": "$subjects.faculties" },
{ "$unwind": "$subjects.faculties.subjects" },
{ "$redact": {
"$cond": {
"if": {
"$setIsSubset": [
["$subjects.faculties.subjects.id"],
"$starred"
]
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}},
{ "$group": {
"_id": "$_id",
"id": { "$first": "$id" },
"name": { "$first": "$name" },
"starred": { "$first": "$starred" },
"subjects": { "$push": "$subjects.faculties.subjects" }
}}
])
Using of course the $redact
stage to filter the logical comparison since there is only $expr
with MongoDB 3.6 and $setIsSubset
to compare to an array of "starred"
.
Then of course due to all the $unwind
operations, you typically want to $group
in order to reform the array.
Or otherwise do the $lookup
from the other direction:
db.schools.aggregate([
{ "$unwind": "$faculties" },
{ "$unwind": "$faculties.subjects" },
{ "$lookup": {
"from": "users",
"localField": "faculties.subjects.id",
"foreignField": "starred",
"as": "users"
}},
{ "$unwind": "$users" },
{ "$match": { "users.name": "Michael" } },
{ "$group": {
"_id": "$users._id",
"id": { "$first": "$users.id" },
"name": { "$first": "$users.name" },
"starred": { "$first": "$users.starred" },
"subjects": {
"$push": "$faculties.subjects"
}
}}
])
The last form not really being ideal since you don't filter the "users" until after the $lookup
is done ( or technically speaking "during" the $lookup
really ). But at any rate, it needs to work with the whole "schools" collection first.
All forms return the same output:
{
"_id" : ObjectId("5aea649526a94676bb981df4"),
"id" : 1,
"name" : "Michael",
"starred" : [
1,
2
],
"subjects" : [
{
"id" : 1,
"name" : "sub1"
},
{
"id" : 2,
"name" : "sub2"
}
]
}
Where you have only the detail from the "subjects"
inner array from the related document which actually matched the values of "starred"
for the current user.
All of that said, it's not really a good idea to "nest arrays" with MongoDB. Before MongoDB 3.6 you cannot even do atomic updates of "nested arrays" and even with the changes that allow it, it is still "difficult" at best to do any query operations, and especially those involving joins and filtering.
It's a common "newbie" mistake to structure "nested arrays" since you seem to think you are "organizing" things better. But in truth it's more of an "anti-pattern" and you really should consider a "flatter" structure such as :
{
"_id" : ObjectId("5aea651326a94676bb981df5"),
"id" : 1,
"name" : "Uni",
"subjects" : [
{
"id" : 1,
"name" : "sub1",
"facultyId": 1000,
"facultyName": "faculty1"
},
{
"id" : 2,
"name" : "sub2",
"facultyId": 1000,
"facultyName": "faculty1"
},
{
"id" : 3,
"name" : "sub3",
"facultyId": 1000,
"facultyName": "faculty1"
}
]
}
Which is "much" easier to work with and of course perform "joins" on where required.