0

I have 2 schemas, schema A and B like the following:

const A = new Schema({
  paymentId: Number,
  date: Date,
  ...data
})

const B = new Schema({
  paidId: Number,
  date: Date,
  ...data
})

I want to return records from both A and B like it is a single table, where I can get records from both A and B that can be used with .sort(), .skip(), and .limit() functions ideally.

I could just do a .find() on both tables, concatenate them, and manually sort / skip / limit, but I find that highly inefficient.

EDIT: To clarify. It shouldn't matter if both collections are related or not. All I want is to query from both collections like both are in one collection.

For example, if I have the following documents

// Documents in A
{ date: '2020-01-01', A_id: 1 },
{ date: '2020-01-03', A_id: 2 },
{ date: '2020-01-05', A_id: 3 },

// Documents in B
{ date: '2020-01-02', B_id: 1 },
{ date: '2020-01-04', B_id: 2 },
{ date: '2020-01-06', B_id: 3 },

Doing a query with the options .sort('date').skip(0).limit(5) should result in the following:

// Documents in A and B
{ date: '2020-01-01', A_id: 1 },
{ date: '2020-01-02', B_id: 1 },
{ date: '2020-01-03', A_id: 2 },
{ date: '2020-01-04', B_id: 2 },
{ date: '2020-01-05', A_id: 3 },
edrichhans
  • 113
  • 1
  • 2
  • 11
  • How are these documents related, is it by fields paymentId and paidId ? Is MongoDB query fine? Sample JSON document will be helpful. – Mallik Oct 02 '20 at 04:47
  • @Mallik B contains an array of A's. But with the way I want to query them, it shouldn't matter if they are related or not. MongoDB query is fine. – edrichhans Oct 02 '20 at 06:05
  • thanks for providing a sample document. I understand with sample JSON documents. Please see my suggestion below. – Mallik Oct 02 '20 at 07:58

2 Answers2

1
//Can I suggest, $merge to merge these two independent collections into another different collection 
//and then use aggregation to doing sort(),skip() and limit()
> db.version();
4.2.6
> db.colA.find();
{ "_id" : ObjectId("5f76d969975ec8826bbcaab5"), "date" : "2020-01-01", "A_id" : 1 }
{ "_id" : ObjectId("5f76d969975ec8826bbcaab6"), "date" : "2020-01-03", "A_id" : 2 }
{ "_id" : ObjectId("5f76d969975ec8826bbcaab7"), "date" : "2020-01-05", "A_id" : 3 }
> db.colB.find();
{ "_id" : ObjectId("5f76d969975ec8826bbcaab8"), "date" : "2020-01-02", "B_id" : 1 }
{ "_id" : ObjectId("5f76d969975ec8826bbcaab9"), "date" : "2020-01-04", "B_id" : 2 }
{ "_id" : ObjectId("5f76d969975ec8826bbcaaba"), "date" : "2020-01-06", "B_id" : 3 }
> db.colA.aggregate([
{$match:{}},
{$merge:{into: "colAB"}}
]);
>db.colB.aggregate([
{$match:{}},
{$merge:{into: "colAB"}}
]);
> db.colAB.find();
{ "_id" : ObjectId("5f76d969975ec8826bbcaab5"), "A_id" : 1, "date" : "2020-01-01" }
{ "_id" : ObjectId("5f76d969975ec8826bbcaab6"), "A_id" : 2, "date" : "2020-01-03" }
{ "_id" : ObjectId("5f76d969975ec8826bbcaab7"), "A_id" : 3, "date" : "2020-01-05" }
{ "_id" : ObjectId("5f76d969975ec8826bbcaab8"), "B_id" : 1, "date" : "2020-01-02" }
{ "_id" : ObjectId("5f76d969975ec8826bbcaab9"), "B_id" : 2, "date" : "2020-01-04" }
{ "_id" : ObjectId("5f76d969975ec8826bbcaaba"), "B_id" : 3, "date" : "2020-01-06" }
> > db.colAB.aggregate([
... {$project:{_id:0}},
... {$sort:{date:1}},
... {$skip:0},
... {$limit:5}
... ]);
{ "A_id" : 1, "date" : "2020-01-01" }
{ "B_id" : 1, "date" : "2020-01-02" }
{ "A_id" : 2, "date" : "2020-01-03" }
{ "B_id" : 2, "date" : "2020-01-04" }
{ "A_id" : 3, "date" : "2020-01-05" }
Mallik
  • 336
  • 2
  • 6
  • Hi, thank you for answering. I think you misunderstood my question. the `pmt` and `paid` are not linked by their ids. Furthermore, it shouldn't matter if they have relation to each other. All I want to query are the documents from both collections sorted by date. Please see my edit in the question for clarification. – edrichhans Oct 02 '20 at 07:14
  • Thanks for this! So I'd have to delete `db.colAB` afterwards? How efficient is having to $merge every time, then get only the first 20 records, and then deleting the collection afterwards? Is there any way to improve performance? – edrichhans Oct 02 '20 at 08:48
  • Also, I was recently made aware of the `$unionWith` aggregation, which is perfect for my use case. However this is only available on MongoDB 4.4, which my cloud provider does not support for my specific subscription – edrichhans Oct 02 '20 at 08:49
  • @edirchhans: for performance improvement, if you have a large collection you can create an index on a date and try to $match documents with the date. You can use explain to check your query ````> db.colAB.explain("executionStats").aggregate([ ... {$match:{"date":{$gt:"2019-01-01"}}}, ... {$project:{_id:0, ... date:1} ... }, ... {$sort:{date:1}}, ... {$skip:0}, ... {$limit:5} ... ]);```` This plan will use index scan instead of collection scan. – Mallik Oct 02 '20 at 10:41
  • Is there a way to do this without having to use $merge to create a new collection? Seems your solution still needs to create the collection `colAB` which I would like to avoid if possible. – edrichhans Oct 02 '20 at 16:36
0

From https://stackoverflow.com/a/55289023/3793648:

Doing unions in MongoDB in a 'SQL UNION' fashion is possible using aggregations along with lookups, in a single query.

Something like this:

    db.getCollection("AnyCollectionThatContainsAtLeastOneDocument").aggregate(
    [
      { $limit: 1 }, // Reduce the result set to a single document.
      { $project: { _id: 1 } }, // Strip all fields except the Id.
      { $project: { _id: 0 } }, // Strip the id. The document is now empty.

      // Lookup all collections to union together.
      { $lookup: { from: 'collectionToUnion1', pipeline: [...], as: 'Collection1' } },
      { $lookup: { from: 'collectionToUnion2', pipeline: [...], as: 'Collection2' } },
      { $lookup: { from: 'collectionToUnion3', pipeline: [...], as: 'Collection3' } },

      // Merge the collections together.
      {
        $project:
        {
          Union: { $concatArrays: ["$Collection1", "$Collection2", "$Collection3"] }
        }
      },

      { $unwind: "$Union" }, // Unwind the union collection into a result set.
      { $replaceRoot: { newRoot: "$Union" } } // Replace the root to cleanup the resulting documents.
    ]);

More details are in the post above. Adding $sort, $skip, and $limit is just a matter of adding them to the aggregate pipeline. Many thanks to @sboisse!

edrichhans
  • 113
  • 1
  • 2
  • 11