2

I have a document that looks something like this

{
  name : james,
  books : [
    {
      title: title1,
      year: 1990
    },
    {
      title: title2,
      year: 1990
    },
    {
      title: title3,
      year: 1991
    }
  ]
}

Say if I want to count how many books james owns with the year of 1990, how would I go about doing that? I've tried the following. But I realized it doesn't work because 'books' is an array.

db.collection(collectionName).find({name:james, books: {year: 1990}}).count(function(book_count){
  console.log(book_count);
  }

Any pointers would be much appreciated. Thanks!

EDIT:

I did see on another answer than you can use this code below to get the size of the whole array. But I am wondering how to get a count of items in the array with a particular parameter. ie. instead of seeing how many books james owns. I want to know how many of james' book are published in 1990.

db.mycollection.aggregate({$project: { count: { $size:"$foo" }}})
Community
  • 1
  • 1
davidx1
  • 3,525
  • 9
  • 38
  • 65
  • Possible duplicate of [MongoDB: count the number of items in an array](http://stackoverflow.com/questions/21387969/mongodb-count-the-number-of-items-in-an-array) – Blakes Seven Mar 08 '16 at 20:33
  • 2
    Seriously, if there is only one document to be returned from this then the aggregation framework ( which can get the size from the array and filter it down first as well ) would be a poor choice here. Better to just test the size of the array returned in the document matching your condition instead. `db.collection(collectionName).findOne({ "name": "james", "books.year": 1990 },function(err,result) { console.log( result.books.filter(function(book) { return book.year == 1990 }).length) })`. Pretty simple stuff. Only aggregate when you intend to and what to use that number for further totals. – Blakes Seven Mar 08 '16 at 20:54

2 Answers2

2

The aggregation framework is ideal for such. Consider running the following pipeline to get the desired result.

pipeline = [
    {
        "$match": {
            "name": "james",
            "books.year": 1990
        }
    },
    {
        "$project": {
            "numberOfBooks": {
                "$size": {                  
                    "$filter": {
                        "input": "$books",
                        "as": "el",
                        "cond": { "$eq": [ "$$el.year", 1990 ] }
                    }                   
                }
            }
        }
    }
];
db.collection.pipeline(pipeline);

The above pipeline uses the new $filter operator available for MongoDB 3.2 to produce an array which meets the specified condition i.e. it filters outer elements that do not satisfy the criteria. The initial $match pipeline is necessary to filter out documents getting into the aggregation pipeline early as a pipeline optimization strategy.

The $size operator which accepts a single expression as argument then gives you the number of elements in the resulting array, thus you have your desired book count.


For an alternative solution which does not use the $filter operator not found in earlier versions, consider the following pipeline operation:

pipeline = [
    {
        "$match": {
            "name": "james",
            "books.year": 1990
        }
    },
    {
        "$project": {
            "numberOfBooks": {
                "$size": {                  
                    "$setDifference": [
                        {
                            "$map": {
                                "input": "$books",
                                "as": "el",
                                "in": {
                                    "$cond": [
                                        { "$eq": [ "$$el.year", 1990 ] },
                                        "$$el",
                                        false
                                    ]
                                }
                            }
                        },
                        [false]
                    ]                   
                }
            }
        }
    }
];
db.collection.pipeline(pipeline);

The $project pipeline stage involves fittering the books array so that you remove the documents which do not have the year 1990. This is made possible through the $setDifference and $map operators.

The $map operator in essence creates a new array field that holds values as a result of the evaluated logic in a subexpression to each element of an array. The $setDifference operator then returns a set with elements that appear in the first set but not in the second set; i.e. performs a relative complement of the second set relative to the first. In this case it will return the final books array that has elements with year 1990 and subsequently the $size calculates the number of elements in the resulting array, thus giving you the book count.


For a solution that uses the $unwind operator, bearing in mind that (thanks to this insightful response from @BlakesSeven in the comments):

Since there is only a single document returned with nothing but a null key and a count, there is no more chance for this breaking that limit than the previous operation with the same output. It's not that $unwind "breaks the limit", it's that it "produces a copy of each document per array entry", which uses more memory ( possible memory cap on aggregation pipelines of 10% total memory ) and therefore also takes "time" to produce as well as "time" to process.

and as a last resort, run the following pipeline:

pipeline = [
    {
        "$match": {
            "name": "james",
            "books.year": 1990
        }
    },
    { "$unwind": "$books" },
    {
        "$match": { "books.year": 1990 }
    },
    {
        "$group": {
            "_id": null
            "count": { "$sum": 1 }
        }
    }
]
db.collection.pipeline(pipeline)
chridam
  • 100,957
  • 23
  • 236
  • 235
  • The reasons to not use `$unwind` here do not include the 16MB BSON limit. Since there is only a single document returned with nothing but a `null` key and a count, there is no more chance for this breaking that limit than the previous operation with the same output. It's not that `$unwind` "breaks the limit", it's that it *"produces a copy of each document per array entry"*, which uses more memory ( possible memory cap on aggregation pipelines of 10% total memory ) and therefore also takes "time" to produce as well as "time" to process. – Blakes Seven Mar 08 '16 at 20:47
  • You also seem to have copied the `$map`/`$setDifference` example from another post since the fields are not related to this question. – Blakes Seven Mar 08 '16 at 20:49
  • @BlakesSeven Yes, it's from one of my answers http://stackoverflow.com/a/34763109/122005. Thanks for the corrections. – chridam Mar 08 '16 at 20:51
0

You can use $elemMatch in projection to retrieve the document with only the matching books.

db.collection(collectionName).findOne({name:james, books: {year: 1990}}, { books: { $elemMatch: { year: 1990 } } }). // returned document will only contains books having the year 1990.

If you want only the count then you need to use aggregation framework. First match the documents, then unwind the books array, then match against year field. Something like following should work:

db.collection(collectionName).aggregate([{$match: {name: "james"}}, {$unwind:"$books"}, {$match:{"books.year":1990}}]
cubbuk
  • 7,800
  • 4
  • 35
  • 62