How to aggregate in MongoDB?

Question

I have a list of books in JSON, as follows:

{
    "_id" : ObjectId("1"),
    "author" : [
        "Mary",
        "Tony"
    ],
    "booktitle" : "Book1",
    "Category" : "Children"
}
{
    "_id" : ObjectId("2"),
    "author" : [
        "Joe",
        "Tony"
    ],
    "booktitle" : "Book2",
    "Category" : "Children"
}
{
    "_id" : ObjectId("3"),
    "author" : [
        "Joe"
    ],
    "booktitle" : "Book3",
    "Category" : "comedy"
}
.......

I want to get the top 10 authors who have written books in the "Children" category. Some books have more than one author, so I don't know how to count them.

Given the example above, Book1 and Book2 belong to "Children"; Tony wrote 2 books, and Mary and Joe wrote 1 book each. Thus, the top 3 writers are, in order, Tony, Mary, Joe.

So far I have written:

db.table.find({Category: "Children"}).aggregate({$group:{_id: '', count : {$sum : 1}}}, {$sort: {count : -1}})

but I don't know how to write the group part. Thanks.

score 0 · Accepted Answer · answered Apr 13 '16 at 22:16

tl;dr

Actually, it is pretty easy.

db.books.aggregate([
    { $match:{ Category:"Children" }},
    { $project:{ _id:0, author:1 }},
    { $unwind:"$author" },
    { $group:{ _id:"$author", count:{ $sum: 1} }},
    { $sort:{ count:-1 }},
    { $limit: 10 } 
])

In detail

Sample data

{ "_id" : ObjectId("570ec0a29307105e18d42bae"), "author" : [ "Mary", "Tony" ], "booktitle" : "Book1", "Category" : "Children" }
{ "_id" : ObjectId("570ec0cf9307105e18d42baf"), "author" : [ "Joe", "Tony" ], "booktitle" : "Book2", "Category" : "Children" }
{ "_id" : ObjectId("570ec0f59307105e18d42bb0"), "author" : [ "Joe" ], "booktitle" : "Book3", "Category" : "comedy" }

The aggregation explained

{ $match:{ Category:"Children" }} First, we find all books in the category in question.
{ $project:{ _id:0, author:1 }} We only want the author array, which we then
{ $unwind:"$author" }. That means that for each array element we create a new document, which retains the other fields of the original document. In our case there are none left after the projection, and hence, at this stage, the result of our aggregation looks like this:
```
{ "author" : "Mary" }
{ "author" : "Tony" }
{ "author" : "Joe" }
{ "author" : "Tony" }
```
{ $group:{ _id:"$author", count:{ $sum: 1} }} Next, we simply group by author and increment a counter for each occurrence of the same author name.
{ $sort:{ count:-1 }} We sort by the field we just created, count, in descending order, and
{ $limit: 10 } limit the output to the ten authors with the highest count.
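
To make the stages a bit more tangible, here is a rough emulation of them in plain JavaScript, working on an in-memory array instead of a collection. This is only a sketch of what each stage does conceptually, not how MongoDB implements it; the variable names (books, matched, unwound, grouped, topAuthors) are made up for illustration, and the _id fields are left out because the projection drops them anyway.

```javascript
// Plain-JavaScript emulation of the pipeline stages (no MongoDB required).
// The array below mirrors the sample data, minus the _id fields.
const books = [
  { author: ["Mary", "Tony"], booktitle: "Book1", Category: "Children" },
  { author: ["Joe", "Tony"], booktitle: "Book2", Category: "Children" },
  { author: ["Joe"], booktitle: "Book3", Category: "comedy" },
];

// $match: keep only the books in the requested category.
const matched = books.filter((book) => book.Category === "Children");

// $project + $unwind: emit one { author } document per array element.
const unwound = matched.flatMap((book) =>
  book.author.map((name) => ({ author: name }))
);

// $group: count how often each distinct author name occurs.
const counts = new Map();
for (const { author } of unwound) {
  counts.set(author, (counts.get(author) || 0) + 1);
}
const grouped = [...counts].map(([_id, count]) => ({ _id, count }));

// $sort + $limit: highest count first, at most ten results.
const topAuthors = grouped.sort((a, b) => b.count - a.count).slice(0, 10);

console.log(topAuthors);
```

The point to take away is that the unwind step turns two books into four author documents, which is what lets a book with several authors count once for each of them.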

For the sample data, the aggregation gives the following:

Result

{ "_id" : "Tony", "count" : 2 }
{ "_id" : "Joe", "count" : 1 }
{ "_id" : "Mary", "count" : 1 }
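
One caveat: Joe and Mary have the same count, and since the pipeline sorts on count only, MongoDB does not promise any particular order between them. If a reproducible order matters, one option (a sketch, not the only way) is to add a secondary sort key. The variable name pipeline is just for illustration, and the alphabetical tie-break on _id is an assumption about what you want; for the sample data it should put Joe ahead of Mary.

```javascript
// Same pipeline as above, with a tie-breaker added to the $sort stage.
const pipeline = [
  { $match: { Category: "Children" } },
  { $project: { _id: 0, author: 1 } },
  { $unwind: "$author" },
  { $group: { _id: "$author", count: { $sum: 1 } } },
  // Sort by count (descending) first, then by author name (ascending)
  // so that authors with equal counts come back in a stable order.
  { $sort: { count: -1, _id: 1 } },
  { $limit: 10 },
];

// In the mongo shell this would be run as: db.books.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

The order of the keys inside $sort matters: count is compared first, and _id only decides between authors whose counts are equal.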