2

I'm using Mongo 3.4.10, and can not upgrade now.

I have a "left join" that need to be filtered (And in 3.4 I can't use uncorrelated sub-queries). For now, I use $addFields with $arrayElemAt and $filter, but this fail on large arrays (more than 2000 elements).

I tried to $unwind just after the $lookup as recommended in this SO answer but I can't manage to also get my primary elements that doesn't have a secondary.

My schema is as follow :

  • A primary collection (let's call it 'primary'), with ids, and some more useful fields.
  • A secondary collection, with link to the primary, which can contain a lot of elements for the same primary id. Most of them are irrelevants.
  • Let's suppose that for each primary id, I have one or none relevant element in the secondary collection. If there is more, I would like to use the last one (Element have a timestamp field), but this can be overlooked for now.

Here is the first solution I used :

return db.primary.aggregate([
    {
        $match:{
            'id':{
                $in:queryParam.ids
            },
        }
    },
    {
        $lookup:{
            from:'secondary',
            localField:'id',
            foreignField:'primary_id',
            as:'secondaries'
        }
    },
    {
        $addFields:{
            relevantSecondary:{
                $arrayElemAt:[
                    {
                        $filter:{
                            input:'secondaries',
                            as:'secondary',
                            cond:{
                                $eq:[
                                    '$$secondary.relevant',
                                    'true'
                                ]
                            }
                        }
                    },
                    0
                ]
            },
        }
    }
]);

This work nice, giving me the excepted results. Problem appear when I have a big number of elements in secondary for a given primary id. I get

"Total size of documents in secondary matching { $match: { $and: [ { primary_id: { $eq: \"XXXXX\" } }, {} ] } } exceeds maximum document size"

Here is what I tried, using $unwind first :

return db.primary.aggregate([
    {
        $match:{
            'id':{
                $in:queryParam.ids
            },
        }
    },
    {
        $lookup:{
            from:'secondary',
            localField:'id',
            foreignField:'primary_id',
            as:'secondaries'
        }
    },
    {
        $unwind: {
            path: '$secondaries',
            preserveNullAndEmptyArrays: true
        }
    },
    {
        $match: {
            $or: [
                {
                    'secondaries.relevant': {$eq: 'true'}
                },
                {
                    'secondaries': []
                }
            ]
        }
    }
]);

This work on large arrays, but I can't any more get the primary elements that doesn't have a matching secondary.

Any clue on how to make this request ?

I know the real solution would be to move to Mongo 4.0 but this is not an option in my context.

EDIT: Included @Fanpark suggestion to get primary with no secondary. Problem is that this still does not return primary that have only irrelevant secondary.

Azzabi Haythem
  • 2,318
  • 7
  • 26
  • 32
Kevin FONTAINE
  • 845
  • 8
  • 17
  • 1
    Your `$match` should be this `{ '$match': { '$or': [ { 'secondaries.relevant': { '$eq': 'true' } }, { 'secondaries': [] } ] }}` But I think even you have to use `$group` to reconstruct the `$unwind`ed array. – Ashh Jun 05 '19 at 09:44
  • @Fanpark The problem is that this match only the primary that have no secondary at all, but I struggle to get the ones that have only irrelevant secondaries. I don't need to re-group the array as I have only one relevant element. – Kevin FONTAINE Jun 05 '19 at 09:52
  • You can do this way `{ '$match': { '$or': [ { 'secondaries.relevant': { '$eq': 'true' } }, { $expr: { '$gte': [{ $size: '$secondaries' }, 0] }} ] }}` – Ashh Jun 06 '19 at 07:55
  • Sadly, $expr is only available in mongo 3.6 : https://stackoverflow.com/questions/49500536/mongodb-unknown-top-level-operator-expr-nodejs But I'm tr – Kevin FONTAINE Jun 06 '19 at 09:34
  • This give me every primary that have a secondary, but not the relevant one. I am kind of hopeless on this, and trying other way (by combining 2 queries), but without luck for now. – Kevin FONTAINE Jun 06 '19 at 10:01

0 Answers0