My use case is to union two MongoDB collections that have roughly the same business meaning but are stored separately. I found an approach via this link.
To merge the two collections into a view, I use the following pipeline:
// Here we do a union of the employees and freelancers using a single aggregation query.
db.createView(
    "employeesAndFreelancers",
    "employees", // 1. Base the view on any collection that has at least one document.
    [
        { $limit: 1 }, // 2. Keep only one document of that collection.
        { $project: { _id: '$$REMOVE' } }, // 3. Remove all fields from that document.
        // 4. Look up the collections to union together.
        { $lookup: { from: 'employees', pipeline: [{ $match: { department: 'sales' } }], as: 'employees' } },
        { $lookup: { from: 'freelancers', pipeline: [{ $match: { department: 'sales' } }], as: 'freelancers' } },
        // 5. Union the collections together with a projection.
        { $project: { union: { $concatArrays: ["$employees", "$freelancers"] } } },
        // 6. Unwind and replace root so you end up with a flat result set.
        { $unwind: '$union' },
        { $replaceRoot: { newRoot: '$union' } }
    ]);
Then I query the view with a simple query:
db.getCollection('employeesAndFreelancers')
.find({"updated" : { "$gte" : ISODate("2018-07-22T09:45:00.000Z")}})
.limit(5)
.sort({"updated": 1})
.explain("executionStats")
The source collection has an index on the updated field, so running this query directly against the source collection produces execution stats that use the index.
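For reference, this is roughly the index setup I am assuming (single-field ascending index on updated; the exact key direction is illustrative, not shown above):

// Index on the "updated" field of each source collection (direction is illustrative).
db.employees.createIndex({ updated: 1 })
db.freelancers.createIndex({ updated: 1 })

// The same query run directly against a source collection reports an index
// scan (IXSCAN) in its executionStats.
db.getCollection('employees')
    .find({ "updated": { "$gte": ISODate("2018-07-22T09:45:00.000Z") } })
    .sort({ "updated": 1 })
    .limit(5)
    .explain("executionStats")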
So my question is: how will the query on the view behave if each of my collections is 1 GB or larger? Will the whole data set be loaded into memory while the pipeline executes? If so, is it possible to improve this somehow?
MongoDB version is 4.0.