0

My use case is to union two different collections in MongoDB which have approximately the same business meaning but are stored in two different collections. I found the answer by link

So to merge two collections and create view I use the next pipeline

// Here we do a union of the employees and freelancers using a single aggregation query.
db.createView(
   "employeesAndFreelancers",
   "employees",
  [
    { $limit: 1 }, // 2. Keep only one document of the collection.
    { $project: { _id: '$$REMOVE' } }, // 3. Remove everything from the document.

    // 4. Lookup collections to union together.
    { $lookup: { from: 'employees', pipeline: [{ $match: { department: 'sales' } }], as: 'employees' } },
    { $lookup: { from: 'freelancers', pipeline: [{ $match: { department: 'sales' } }], as: 'freelancers' } },

    // 5. Union the collections together with a projection.
    { $project: { union: { $concatArrays: ["$employees", "$freelancers"] } } },

    // 6. Unwind and replace root so you end up with a result set.
    { $unwind: '$union' },
    { $replaceRoot: { newRoot: '$union' } }
  ]);

Then I want to query the view with some simple query.

db.getCollection('employeesAndFreelancers')
.find({"updated" : { "$gte" : ISODate("2018-07-22T09:45:00.000Z")}})
.limit(5)
.sort({"updated": 1})
.explain("executionStats")

The result is next enter image description here

Source collection has the index on the updated field. So query on the source collection will produce execution stats with index.

So my question how the query on view will behave if the size of my collections is 1 GB each or greater? Will it load the whole data to memory in process of execution of the pipeline? If yes, is it possible to improve it somehow?

MongoDB version is 4.0.

user2105282
  • 724
  • 1
  • 12
  • 26

1 Answers1

0

I tried to execute some performance tests, the result that the approach can't be used in production. The aggregation pipeline at the first step tries to do an in-memory join of two tables and then to perform the query. No indexes are involved.

user2105282
  • 724
  • 1
  • 12
  • 26