18

I am seriously baffled by mongodb's aggregate function. All I want is to find the newest document in my collection. Let's say each record has a field "created"

db.collection.aggregate({
    $group: { 
        _id:0,
        'id':{$first:"$_id"},
        'max':{$max:"$created"}
    }
})

yields the correct result, but I want the entire document in the result? How would I do that?

This is the structure of the document:

{
    "_id" : ObjectId("52310da847cf343c8c000093"),
    "created" : 1389073358,
    "image" : ObjectId("52cb93dd47cf348786d63af2"),
    "images" : [
        ObjectId("52cb93dd47cf348786d63af2"),
        ObjectId("52f67c8447cf343509d63af2")
        ],
    "organization" : ObjectId("522949d347cf3402c3000001"),
    "published" : 1392601521,
    "status" : "PUBLISHED",
    "tags" : [ ],
    "updated" : 1392601521,
    "user_id" : ObjectId("52214ce847cf344902000000")
}
Neil Lunn
  • 148,042
  • 36
  • 346
  • 317
Jamgold
  • 1,716
  • 1
  • 14
  • 18
  • 1
    Why aren't you using a sorted `find`? – JohnnyHK Mar 12 '14 at 18:28
  • because I want to understand how to use aggregate – Jamgold Mar 12 '14 at 18:49
  • `aggregate` isn't well suited for this type of thing as you can't easily include the full doc in a `$group`, you have to bring in each field separately. – JohnnyHK Mar 12 '14 at 19:02
  • I saw somewhere an example that forEach-ed over the result of the aggregation and then used results.push( findOne ), but I couldn't make it work. All I want to figure out is how to get to the documents either filtering by some complex relation or even adding aggregate fields to a document. – Jamgold Mar 12 '14 at 21:29
  • I see in the comments that you are actually trying to do something different to this. Can you please edit your question to explain what you actually want to achive? And please include a sample document. – Neil Lunn Mar 13 '14 at 03:39
  • I edited the title to better reflect what I am looking for – Jamgold Mar 13 '14 at 22:48

5 Answers5

21

In the documentation i found that the $$ROOT expression addresses this problem.

From the DOC: http://docs.mongodb.org/manual/reference/operator/aggregation/group/#group-documents-by-author

Volox
  • 1,068
  • 1
  • 12
  • 23
4
query = [
    {
        '$sort': {
            'created': -1
        }
    },
    {
        $group: { 
            '_id':null,
            'max':{'$first':"$$ROOT"}
        }
    }
]
db.collection.aggregate(query)
4
db.collection.aggregate([
 {

  $group: {
   '_id':"$_id",
   'otherFields':{ $push: { fields: $ROOT } }
  } 
 }
])
Med MANS
  • 199
  • 1
  • 2
  • 13
1

I think I figured it out. For example, I have a collection containing an array of images (or pointers). Now I want to find the document with the most images

results=[];
db.collection.aggregate([
    {$unwind: "$images"},
    {$group:{_id:"$_id", 'imagecount':{$sum:1}}},
    {$group:{_id:"$_id",'max':{$max: "$imagecount"}}},
    {$sort:{max:-1}},
    {$group:{_id:0,'id':{$first:'$_id'},'max':{$first:"$max"}}}
]).result.forEach(function(d){
    results.push(db.stories.findOne({_id:d.id}));
});

now the final array will contain the document with the most images. Since images is an array, I use $unwind, I then group by document id and $sum:1, pipe that into a $group that finds the max, pipe it into reverse $sort for max and $group out the first result. Finally I fetchOne the document and push it into the results array.

Jamgold
  • 1,716
  • 1
  • 14
  • 18
-3

You should be using db.collection.find() rather than db.collection.aggregate():

db.collection.find().sort({"created":-1}).limit(1)
Anand Jayabalan
  • 12,294
  • 5
  • 41
  • 52
  • That makes sense in my example, but it still does not answer my question about how to yield entire documents in the result of an aggregate. Lets assume I really wanted to aggregate over a set of documents and the result were more than one. – Jamgold Mar 12 '14 at 18:49
  • 1
    In the aggregation pipeline, if you had just a $match phase, the entire document will be returned. However, after a $group phase, there's no way the whole document can be returned. If you think about it, it wouldn't even make sense. When you group by a field, you'll only have those fields specified in the group phase for the subsequent phases in the pipeline. Since those fields could be aggregated fields like sum() and count() (i.e., their values have been computed from several documents), it isn't logical to show the entire document corresponding to the result. – Anand Jayabalan Mar 12 '14 at 18:53
  • what if I wanted to get an aggregation from a collection with a subset of the document fields added to the result? Lets say I want all documents with a count of some aggregated field >= 5? eg. show me all posts with more than 10 comments. – Jamgold Mar 12 '14 at 21:28
  • We can add a $match phase in the pipeline, after the $group phase to filter the results of the $group phase. – Anand Jayabalan Mar 12 '14 at 21:34
  • Anand, can you give an example for the syntax, please? – Jamgold Mar 12 '14 at 21:39
  • Here you go: `db.foo.aggregate([{$group:{_id:"$a", count:{$sum:1}}}, {$match:{"count":{$gt:1}}}])`. I'm grouping by the field "a" and getting the count. In the $match phase, I'm filtering the results for records with count > 1. Let me know if this helps – Anand Jayabalan Mar 12 '14 at 22:25
  • Sorry, that did not work. Using your example I can run the following aggregation on my collection (see document structure above) `db.collection.aggregate([{$unwind: "$images"},{$group:{_id:"$organization", 'imagecount':{$sum:1}}},{$match:{imagecount:{$gt:10}}}])` but that only matches the winning record `"result" : [ { "_id" : ObjectId("522949d347cf3402c3000001"), "imagecount" : 20 } ]` Maybe it is not possible to output an entire document with aggregate – Jamgold Mar 13 '14 at 23:14