Mongo aggregation vs Java for loop and performance

Question

I have a below mongo document stored

{
 "Field1": "ABC",
 "Field2": [
    { "Field3": "ABC1","Field4": [ {"id": "123" }, { "id" : "234" }, { "id":"345" }] }, 
    { "Field3": "ABC2","Field4": [ {"id": "123" }, { "id" : "234" }, { "id":"345" }] }, 
    { "Field3": "ABC3","Field4": [{ "id":"345" }] },  
    ]
}

from the above, I want to fetch the subdocuments which is having id "123"

ie.

{ 
    "Field3" : "ABC1",
    "Field4" : [ { "id": "123"} ]
} ,
{
    "Field3" : "ABC2",
    "Field4" : [ { "id": "123"} ]
}

1. Java way  
    A. use Mongo find method to get the ABC document from Mongo DB
    B. for Loop to Iterate the Field2 Json Array
    C. Again for Loop to Iterate over Field4 Json Array
    D. Inside the nested for loop I've if condition to Match id value to "123"
    E. Store the Matching subdocument into List 

2. Mongo Way
   A. Use Aggregation query to get the desired output from DB.No Loops and conditions in the Java side. 
   B. Aggregation Query below stages
    I)  $Match - match the ABC document
    II) $unwind - Field2
    III) $unwind - Field4
    IV) $match - Match the with id ( value is "123")
    V) $group - group the document  based on Field3 (based on "ABC1" or "ABC2") 
    VI) execute aggregation and return results

Both are working good and returning proper results.
Question is which one is the better to follow and why ? I used the aggregation in restful service get method, So executing aggregation queries 1000 or more times in parallel will cause any performance problems?

IMO:Unless we analyze your dataset&DB we can't suggest on one over the other, there could be multiple reasons an aggregate could go slow,Proper indexes, a small dataset & proper querying(using `$match`, `$limit` or `$skip` as early as possible)could help you to yield results faster,Most likely for these king of question I would say OP has to dig in thru their code to determine what's best, try check `executionStats` via `db.collection.explain('executionStats').aggregate([])`.Some prefer less logic on DB to ease on its performance, otherway is to say if DB can handle well get less data from DB. — whoami - fakeFaceTrueSoul, Dec 28 '19 at 06:13

prasad_ · Accepted Answer · 2019-12-28T12:18:34.713

With Aggregation, the whole query is executed as a single process on the MongoDB server - the application program will get the results cursor from the server.

With Java program also you are getting a cursor from the database server as input to the processing in the application. The response cursor from the server is going to be larger set of data and will use more network bandwidth. And then there is processing in the application program, and this adds more steps to complete the query.

I think the aggregation option is a better choice - as all the processing (the initial match and filtering the array) happens on the database server as a single process.

Also, note the aggregation query steps you had posted can be done in an efficient way. Instead of multiple stages (2, 3, 4 and 5) you can do those operations in a two stages - use a $project with $map on the outer array and then $filter on the inner array and then $filter the outer array.

The aggregation:

db.test.aggregate( [
  { 
      $addFields: { 
          Field2: { 
              $map: {
                   input: "$Field2",
                      as: "fld2",
                      in: {
                           Field3: "$$fld2.Field3",
                           Field4: { 
                               $filter: {
                                   input: "$$fld2.Field4",
                                      as: "fld4",
                                    cond: {  $eq: [ "$$fld4.id", "123" ] }
                               }
                           }
                       }
                 } 
          }
      }
  },
  { 
      $addFields: { 
          Field2: { 
              $filter: {
                   input: "$Field2",
                      as: "f2",
                    cond: {  $gt: [ { $size: "$$f2.Field4" }, 0 ] }
              }
          }
      }
  },
] )

,. I tried with $project, $map and $filter but unsuccessful.. could you please give me the sample query for the above Json. — PrabaharanKathiresan, Dec 28 '19 at 11:32
You can try the answer from this SO post (it has a similar array within array structure and filtering the inner array): [MongoDB- Fetching exact array element, excluding others](https://stackoverflow.com/questions/59492131/mongodb-fetching-exact-array-element-excluding-others). — prasad_, Dec 28 '19 at 11:39
I am posting the query. Actually it will have two stages instead of four stages (I had mentioned one stage in the answer). — prasad_, Dec 28 '19 at 12:10
Hi Prasad_, Thanks for the help. Answer works as expected in mongo db but I stuck in implementing the same in spring data. posted new question https://stackoverflow.com/questions/59534705/spring-mongo-aggregation-filter-with-and-in-and-eq-conditions. problem comes when we have $filter.cond with multiple conditions. Unlike criteria, ArrayOperators.Filter / ComparisonOperators.Filter are not providing chaining. — PrabaharanKathiresan, Dec 30 '19 at 17:42

score 1 · Answer 2 · answered Dec 28 '19 at 06:08

1

The second way is probably better because it returns a smaller result from the datastore; shlepping bits over the wire is expensive.

answered Dec 28 '19 at 06:08

Floegipoky

3,087
1
31
47

Mongo aggregation vs Java for loop and performance

2 Answers2

Linked