I have 3 collections:
- vendor (_id, title)
- product (_id, title, vendorId)
- vendor.location (_id, coords, vendorId)
Every vendor has multiple locations (up to 10,000) and multiple products (up to 10,000).
I need to find all products near given location and sort them by distance.
What I have came up so far is:
1) Find all vendors near given location:
db.collection('vendor.locations').aggregate([
{
$geoNear: {
near: givenLocation,
distanceField: 'distance',
maxDistance: 1000,
limit: 1000,
spherical: true
}
},
{
$group: {
_id: "$vendor",
distance: {$min: "$distance"}
}
},
{
$sort: {distance: -1}
}
]);
2) Find all products for matched vendors with $in operator, but AFAIK it does not guarantee the order of the matched items.
db.collection('products').aggregate([
{
$match: {
vendor: {$in: vendorIds}
}
},
{
$sort: {} //is it possible to sort items by their position in vendorIds array?
},
{
$limit: limit
},
{
$skip: skip
}
)
So, basically the problem boils down to finding a way to sort products.
Another potential solution is to denormalize vendor location into product location, but the potential size of collection (and geo index) is worrying me. Collection of products can be up to 1M with up to 10K embedded locations array, so it would result in up 10B items in geo index. How bad is it?