I have 2 collections like that, being A
a 1.5kk documents collection:
A
_id
relation: { alias, ref: 'B/<_id>' }
B
_id
name
I want to query for the elements in A
, that aren't in B
. This is what I get so far:
db.getCollection('A').aggregate([
{
$project: {
'relation.ref': {
$arrayElemAt: [
{ $split: ['$relation.ref', '/'] }, 1
]
},
a: "$$ROOT"
}
},
{
$lookup: {
from: 'B',
localField: 'relation.ref',
foreignField: '_id',
as: 'b'
}
}
]);
This does return things in this pattern:
{
_id,
relation.ref,
a: {},
b: [{}]
}
Well I was expecting to add a $match clause to include only the ones with b being an empty array. Something like ${match: { b: {$size: 0} } }
The problem is that this query seems to take too long. I never saw it working, thus I dont know it is the right way.
Is there anything to make it perform better or a different way to get this?
For example, if I force a wrong join, it will return an empty array on b
. But querying the size of this array is always slow, even when I query for {$size: 1}
, that is 99,9% of the results. I wonder if i could do something like this to select only the missing values during the join:
{$lookup.pipeline: [{$myself: null}] }
Ideas, please?
Thanks you