I'm using Mongo 3.4.10, and can not upgrade now.
I have a "left join" that need to be filtered (And in 3.4 I can't use uncorrelated sub-queries). For now, I use $addFields with $arrayElemAt and $filter, but this fail on large arrays (more than 2000 elements).
I tried to $unwind just after the $lookup as recommended in this SO answer but I can't manage to also get my primary elements that doesn't have a secondary.
My schema is as follow :
- A primary collection (let's call it 'primary'), with ids, and some more useful fields.
- A secondary collection, with link to the primary, which can contain a lot of elements for the same primary id. Most of them are irrelevants.
- Let's suppose that for each primary id, I have one or none relevant element in the secondary collection. If there is more, I would like to use the last one (Element have a timestamp field), but this can be overlooked for now.
Here is the first solution I used :
return db.primary.aggregate([
{
$match:{
'id':{
$in:queryParam.ids
},
}
},
{
$lookup:{
from:'secondary',
localField:'id',
foreignField:'primary_id',
as:'secondaries'
}
},
{
$addFields:{
relevantSecondary:{
$arrayElemAt:[
{
$filter:{
input:'secondaries',
as:'secondary',
cond:{
$eq:[
'$$secondary.relevant',
'true'
]
}
}
},
0
]
},
}
}
]);
This work nice, giving me the excepted results. Problem appear when I have a big number of elements in secondary for a given primary id. I get
"Total size of documents in secondary matching { $match: { $and: [ { primary_id: { $eq: \"XXXXX\" } }, {} ] } } exceeds maximum document size"
Here is what I tried, using $unwind first :
return db.primary.aggregate([
{
$match:{
'id':{
$in:queryParam.ids
},
}
},
{
$lookup:{
from:'secondary',
localField:'id',
foreignField:'primary_id',
as:'secondaries'
}
},
{
$unwind: {
path: '$secondaries',
preserveNullAndEmptyArrays: true
}
},
{
$match: {
$or: [
{
'secondaries.relevant': {$eq: 'true'}
},
{
'secondaries': []
}
]
}
}
]);
This work on large arrays, but I can't any more get the primary elements that doesn't have a matching secondary.
Any clue on how to make this request ?
I know the real solution would be to move to Mongo 4.0 but this is not an option in my context.
EDIT: Included @Fanpark suggestion to get primary with no secondary. Problem is that this still does not return primary that have only irrelevant secondary.