
I'm trying to test MongoDB's performance (university research) by inserting and querying the same data (with the same queries) in different ways, but I'm having trouble with the response size. I'll try to explain what I'm doing.

My original file has this format:

[{"field":"aaaa","field2":"bbbbb","field3":"12345"},{"field":"cccc","field2":"ddddd","field3":"12345"},{"field":"ffff","field2":"ggggg","field3":"12345"},{"field":"hhhhh","field2":"iiiii","field3":"12345"},{"field":"jjjj","field2":"kkkkk","field3":"12345"},{"field":"lllll","field2":"mmmmm","field3":"12345"}]

1st approach - I insert the whole file as a single document. Mongo doesn't accept a bare top-level array, so I have to wrap it in an "Array" field, like this: {"Array":[{..},{..},{..},...]}. Once inserted, I query it with

db.collection.aggregate([
     { $match: { _cond_ } },
     { $unwind: "$Array" },
     { $match: { _cond_ } },
     { $group: { _id: null, count: { $sum: 1 }, Array: { $push: "$Array" } } },
     { $project: { "Numero HIT": "$count", Array: 1 } }
])

to retrieve the inner documents and count the number of HITs (_cond_ is of course something like "Array.field": "aaaa" or "Array.field": /something to search/).
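If only the hit count is needed, the `$push` can be dropped entirely, so no single result document can grow past 16MB. A sketch, not the original query (the `$count` stage is an assumption that the server runs MongoDB 3.4+):

```javascript
// Sketch: count matching inner elements without accumulating them
// into one size-limited result document.
const countPipeline = [
  { $match: { "Array.field": "aaaa" } },   // prune whole files first
  { $unwind: "$Array" },                   // one document per inner element
  { $match: { "Array.field": "aaaa" } },   // keep only matching elements
  { $count: "Numero HIT" }                 // replaces $group + $push + $project
];
// db.collection.aggregate(countPipeline)
```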

2nd approach - I insert each inner document by itself: I split the original file (it's ALL on one line) into an array, then I loop over it, inserting each element. Then I query it with:

db.collection2.find({field: "aaaa"}) (or field: /something to search/)
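The split-and-insert step from the 2nd approach can be sketched like this (the `insertMany` call is illustrative; the original code inserted one element at a time):

```javascript
// Sketch of the approach-2 load: parse the one-line JSON array and
// store each inner object as its own document.
const raw = '[{"field":"aaaa","field2":"bbbbb","field3":"12345"},' +
            '{"field":"cccc","field2":"ddddd","field3":"12345"}]';
const docs = JSON.parse(raw);           // one array element per future document
// db.collection2.insertMany(docs);     // or insertOne() in a loop, as in the post
// find() then matches elements directly; same count as:
const hits = docs.filter(d => d.field === "aaaa").length;
```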

I'm using two different collections, one for each approach, each of about 207/208MB. Everything seemed fine, but then a query with the 1st approach gave me this error:

BSONObj size: 24002272 (0x16E3EE0) is invalid. Size must be between 0 and 16793600(16MB)

I remembered that the response from a MongoDB query MUST be smaller than 16MB, OK, but how is it possible that the 1st approach gives me this error while the SAME* query in the 2nd approach works fine? And how do I fix it? I mean: OK, the response is >16MB, but how do I handle it? Can't I do this kind of query at all? I hope it's clear what I mean.
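One way to handle a result larger than 16MB, assuming the matching inner elements are what's wanted back: return them as many small documents through the cursor instead of `$push`-ing them into one document, since the 16MB limit applies per document. A sketch (`$replaceRoot` assumes MongoDB 3.4+; `allowDiskUse` only lifts the per-stage memory limit, not the 16MB document limit):

```javascript
// Sketch: emit each matching inner element as its own document,
// so no single document in the result ever approaches 16MB.
const streamPipeline = [
  { $match: { "Array.field": "aaaa" } },
  { $unwind: "$Array" },
  { $match: { "Array.field": "aaaa" } },
  { $replaceRoot: { newRoot: "$Array" } }  // drop the wrapper, keep the element
];
// db.collection.aggregate(streamPipeline, { allowDiskUse: true })
//   .forEach(doc => printjson(doc));
```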

Thanks in advance

*By SAME I mean something like:

1st approach:

 db.collection.aggregate([
         { $match: { "Array.field": "aaa", "Array.field3": 12345 } },
         { $unwind: "$Array" },
         { $match: { "Array.field": "aaa", "Array.field3": 12345 } },
         { $group: { _id: null, count: { $sum: 1 }, Array: { $push: "$Array" } } },
         { $project: { "Numero HIT": "$count", Array: 1 } }
    ])

2nd approach:

db.collection2.find({field: "aaa", field3: 12345}) 
  • Because you are using `$unwind` in the first query, which splits the data and breaches the BSON limit – Ashh Jun 06 '18 at 16:41
  • Why is my question considered a duplicate? I didn't ask how to retrieve queried elements in an object or how to count them; I asked why my two different approaches (which query the DB for the same documents) give different results (one gives me the BSON size error, the other gives the normal result of the query), and how (if there's a way) to handle the BSON object size problem. – Andrea Cristiani Jun 07 '18 at 07:24
  • @Ashish what do you mean by 'breaches the BSON limit'? I thought `$unwind` just unrolls the array: if I have a document like `{"field": [{"subfield": "val"},{"subfield": "val2"},...]}` it should give me `{"field": {"subfield": "val"}} {"field": {"subfield": "val2"}}`, and with `$group` I bring the query results back together, right? – Andrea Cristiani Jun 07 '18 at 07:30
  • Nope, you are wrong... your BSON limit is exceeded in the `$unwind` stage, and it throws the error before it reaches your `$group` stage... – Ashh Jun 07 '18 at 07:51
  • Ooh! Got it! And is there a solution to this problem? How do I handle this? Oh, and by the way, why does this happen only with the aggregate approach? The collections have the same data (stored differently, but the same data) and the query asks for the same things (e.g., field: "aaa"). – Andrea Cristiani Jun 07 '18 at 07:59
  • It is not a problem with the aggregate, it's a problem with `$unwind`, and the solution is always to create a new collection instead of pushing the elements into an array, which makes searching easier – Ashh Jun 08 '18 at 10:28
