I am noticing huge performance differences in what appears to be the same aggregate, at least conceptually. The tests were run on a collection with a simple structure, just an _id, a name, and a createdAt, but there are 20 million documents in it. There is an index on createdAt. It's hosted on an mLab cluster running MongoDB 3.6.9 with the WiredTiger storage engine.
I am trying to get simple paging working using aggregate. I know I could use find with a limit instead, but I'd like to add more stages to the pipeline later; the example I give here is very distilled.
db.getCollection("runnablecalls").aggregate([
  {
    $facet: {
      docs: [
        { $sort: { createdAt: -1 } },
        { $limit: 25 },
        { $skip: 0 },
      ],
      page_info: [
        { $group: { _id: null, total: { $sum: 1 } } }
      ],
    }
  }
])
That takes almost 40s. Now if I move the $sort and $limit outside of the $facet, it takes 0.042s.
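For reference, the plans of both pipelines can be compared with the shell's standard explain() helper; a minimal sketch, nothing here beyond that built-in (default queryPlanner verbosity):

// Explain the slow pipeline ($sort inside the $facet).
db.getCollection("runnablecalls").explain().aggregate([
  { $facet: { docs: [ { $sort: { createdAt: -1 } }, { $limit: 25 } ] } }
])

// Explain the fast pipeline ($sort and $limit before the $facet).
db.getCollection("runnablecalls").explain().aggregate([
  { $sort: { createdAt: -1 } },
  { $limit: 25 }
])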
db.getCollection("runnablecalls").aggregate([
  { $sort: { createdAt: -1 } },
  { $limit: 25 },
  {
    $facet: {
      docs: [
        { $skip: 0 },
      ],
      page_info: [
        { $group: { _id: null, total: { $sum: 1 } } }
      ],
    }
  },
])
The page_info facet makes no difference at the end; I can take it out without changing the timings, I'm just leaving it in because I'd like to use it. I know how to solve the problem with two queries, a count and an aggregate without a $facet. I would just like to understand why this happens.
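For completeness, the two-query version I'm referring to would look roughly like this (a sketch; the pageSize and page variables are only for illustration):

// Hypothetical paging inputs; page is zero-based.
var pageSize = 25;
var page = 0;

// Query 1: the page itself. A leading $sort can use the createdAt
// index; $skip comes before $limit so later pages work too.
var docs = db.getCollection("runnablecalls").aggregate([
  { $sort: { createdAt: -1 } },
  { $skip: page * pageSize },
  { $limit: pageSize }
]).toArray();

// Query 2: the total document count, done separately instead of
// via a $group inside a $facet.
var total = db.getCollection("runnablecalls").count();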