Let us consider I have two collections collection-A
and collection-B
they contain data like the following example
collection-A
{adId:1,adName:'x'}
{adId:2,adName:'y'}
{adId:3,adName:'z'}
.
.
collection-B [ this collection has some millions of rows for each
adId
]
{adId:1, ip:'1.2.3.4',viewCount:2}
{adId:1, ip:'1.2.3.5',viewCount:3}
{adId:2, ip:'1.2.3.6',viewCount:1}
{adId:3, ip:'1.2.3.4',viewCount:1}
.
.
Now suppose I want to query collection-A
such that I want the name of all rows in collection-A
who do not have the corresponding adId
row in collection-B
with specified Ip: 1.2.3.4
and count
more than 2.
I have written following MongoDB aggregate query for above condition which seems to work
[
{
$match: {}
},
{
$lookup: {
from:'collection-B',
localField:"adId",
foreignField: "adId",
as:'transaction'
}
},
{
$match:{
transaction:{
$not:{
$elemMatch:{
"ip":"1.2.3.4",
count:{$gte:2}
}
}
}
}
},
]
Above query works perfectly if the number of matching rows in collection-B
are some thousands but if the number of matching rows in collection-B
are in some lakhs/millions then I get the error saying MongoError: Total size of documents exceeds maximum document size
which I understand why because all the matched rows are combined into one array and the array is appended to the parent row of collection-A
which consequently exceeds max documents allowed size.
I found another solution to the problem which is to use $unwind
operator immediately after lookup which does exactly opposite that is, it adds parent collection-A
row to every corresponding row of collection-B
Using unwind solves the problem but in that case number of rows to be searched increases to such a large extent that searching takes some 30 seconds. Now I was thinking about any third approach like put search criteria in lookup operator itself but I don't know can I use search query inside lookup operator or not, I'm not even sure if my approach to the solution is correct.
Any comment or hint to the possible solution will be appreciated