13

I have an employee collection with half a million records. Each record will have the following details.

The mongo document is as follows.

{
  "_id": "234463456453643563456",
  "name": "Mike",
  "empId": "10",
  "managerId": "8",
  "projects" : [ "123", "456", "789"]
}
  1. When I give an empId, the query should return the complete hierarchy from that manager down to the bottom level, with the following filters applied:

a. filter on location
b. filter on projects

The result should be like,

      10     ->>> Manager
      /\
     /  \
    8    6  ---->> 8 & 6 reporting to manager 10
    /\    /\
   /  \  /  \
  4    5 2   1  ---->> 4 & 5 reporting to manager 8 ...

Any help getting the hierarchical results with levels would be appreciated.

I am not able to get the result as expected.

Sample Data :-

db.getCollection("employees").insert({"_id":"10","empId": "10","name":"Employee10","managerId":"15" });

db.getCollection("employees").insert({"_id":"8","empId": "8","name":"Employee8","managerId":"10" });

db.getCollection("employees").insert({"_id":"6","empId": "6","name":"Employee6","managerId":"10" });

db.getCollection("employees").insert({"_id":"4","empId": "4","name":"Employee4","managerId":"8" });

db.getCollection("employees").insert({"_id":"5","empId": "5","name":"Employee5","managerId":"8" });

db.getCollection("employees").insert({"_id":"2","empId": "2","name":"Employee2","managerId":"6" });

db.getCollection("employees").insert({"_id":"1","empId": "1","name":"Employee1","managerId":"6" });

Query :-

db.getCollection('employees').aggregate([
{
    $match: {
        empId : "10"
    }
},
{
   $graphLookup: {
      from: "employees",
      startWith: "$empId",
      connectFromField: "empId",
      connectToField: "managerId",
      as: "reportees",
      maxDepth: 4,
      depthField: "level"
   }
},
{
   $project: {
     "empId":1,
     "managerId":1,
     "reportees.empId":1,
     "reportees.name":1,
     "reportees.managerId":1,
     "reportees.level":1
   }
}
]);

Actual Result :-

{ 
    "_id" : "10", 
    "empId" : "10", 
    "managerId" : "15", 
    "reportees" : [
        {
            "empId" : "1", 
            "name" : "Employee1", 
            "managerId" : "6", 
            "level" : NumberLong(1)
        }, 
        {
            "empId" : "4", 
            "name" : "Employee4", 
            "managerId" : "8", 
            "level" : NumberLong(1)
        }, 
        {
            "empId" : "2", 
            "name" : "Employee2", 
            "managerId" : "6", 
            "level" : NumberLong(1)
        }, 
        {
            "empId" : "5", 
            "name" : "Employee5", 
            "managerId" : "8", 
            "level" : NumberLong(1)
        }, 
        {
            "empId" : "6", 
            "name" : "Employee6", 
            "managerId" : "10", 
            "level" : NumberLong(0)
        }, 
        {
            "empId" : "8", 
            "name" : "Employee8", 
            "managerId" : "10", 
            "level" : NumberLong(0)
        }
    ]
}

Expected Result :-

{ 
    "_id" : "10", 
    "empId" : "10", 
    "managerId" : "15", 
    "reportees" : [
        {
            "empId" : "6", 
            "name" : "Employee6", 
            "managerId" : "10", 
            "level" : NumberLong(0),
            "reportees" : [
              {
               "empId" : "1", 
               "name" : "Employee1", 
               "managerId" : "6", 
               "level" : NumberLong(1)
              }, 
              {
               "empId" : "2", 
               "name" : "Employee2", 
               "managerId" : "6", 
               "level" : NumberLong(1)
              }
            ]
        }, 
        {
            "empId" : "8", 
            "name" : "Employee8", 
            "managerId" : "10", 
            "level" : NumberLong(0),
            "reportees" : [
              {
                "empId" : "5", 
                "name" : "Employee5", 
                "managerId" : "8", 
                "level" : NumberLong(1)
              },
              {
                "empId" : "4", 
                "name" : "Employee4", 
                "managerId" : "8", 
                "level" : NumberLong(1)
              }
             ]
        }
    ]
}

Questions :-

  1. Is it possible to get the expected output with $graphLookup?
  2. Is it also possible to get the count at the top level and for each sub-level?
  3. How can I apply a projection at every level?
  4. How can I apply a filter on top of this?
halfer
user1578872

4 Answers

16

I believe that, with the level field available, we can build the hierarchical structure from the flat array using $reduce. To do that, the reportees must be ordered by level descending after $graphLookup. At the time of writing, the only way to do this is $unwind + $sort + $group, which makes the aggregation quite long.

Then we can process that ordered array using $reduce. In each step we add one employee to the result set, together with his reportees from the previous (deeper) level. We also need to detect when the level changes during processing and swap the helper arrays in that case.

In the final stage, $addFields simply replaces the existing reportees field. $concatArrays appends the current employee ($$this) to the result, and $filter picks that employee's reportees out of the lower level.
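
To make the bookkeeping easier to follow, here is a minimal plain-JavaScript sketch of the same reduction that runs without MongoDB. The function name `nestByLevel` is hypothetical, and the input is assumed to be the flat reportees array already sorted by level descending (deepest employees first).

```javascript
// Plain-JavaScript mirror of the $reduce step; no database needed.
// Assumption: sortedReportees is ordered by level DESCENDING.
function nestByLevel(sortedReportees) {
  const initial = {
    currentLevel: -1,
    currentLevelEmployees: [],
    previousLevelEmployees: [],
  };
  const state = sortedReportees.reduce((value, emp) => {
    const sameLevel = value.currentLevel === emp.level;
    // When the level changes, the level just finished becomes the pool
    // of candidate children for the new level.
    const prev = sameLevel ? value.previousLevelEmployees : value.currentLevelEmployees;
    const current = sameLevel ? value.currentLevelEmployees : [];
    return {
      currentLevel: emp.level,
      previousLevelEmployees: prev,
      currentLevelEmployees: current.concat([
        // Same idea as $mergeObjects + $filter in the pipeline.
        { ...emp, reportees: prev.filter((e) => e.managerId === emp.empId) },
      ]),
    };
  }, initial);
  // The last level processed is level 0, i.e. the direct reports.
  return state.currentLevelEmployees;
}

// Sample data from the question, already sorted by level descending.
const flat = [
  { empId: "1", managerId: "6", level: 1 },
  { empId: "4", managerId: "8", level: 1 },
  { empId: "2", managerId: "6", level: 1 },
  { empId: "5", managerId: "8", level: 1 },
  { empId: "6", managerId: "10", level: 0 },
  { empId: "8", managerId: "10", level: 0 },
];
const tree = nestByLevel(flat);
console.log(JSON.stringify(tree, null, 2));
```

The sort matters: each employee can only pick up reportees from the level processed immediately before, so an unsorted input would leave some reportees arrays empty.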

db.getCollection('employees').aggregate([
    {
        $match: {
            empId : "10"
        }
    },
    {
        $graphLookup: {
            from: "employees",
            startWith: "$empId",
            connectFromField: "empId",
            connectToField: "managerId",
            as: "reportees",
            maxDepth: 4,
            depthField: "level"
        }
    },
    {
        $project: {
            "empId":1,
            "managerId":1,
            "reportees.empId":1,
            "reportees.name":1,
            "reportees.managerId":1,
            "reportees.level":1
        }
    },
    {
        $unwind: "$reportees"
    },
    {
        $sort: { "reportees.level": -1 }
    },
    {
        $group: {
            _id: "$_id",
            empId: { $first: "$empId" },
            managerId: { $first: "$managerId" },
            reportees: { $push: "$reportees" }
        }
    },
    {
        $addFields: {
            reportees: {
                $reduce: {
                    input: "$reportees",
                    initialValue: {
                        currentLevel: -1,
                        currentLevelEmployees: [],
                        previousLevelEmployees: []
                    },
                    in: {
                        $let: {
                            vars: {
                                prev: { 
                                    $cond: [ 
                                        { $eq: [ "$$value.currentLevel", "$$this.level" ] }, 
                                        "$$value.previousLevelEmployees", 
                                        "$$value.currentLevelEmployees" 
                                    ] 
                                },
                                current: { 
                                    $cond: [ 
                                        { $eq: [ "$$value.currentLevel", "$$this.level" ] }, 
                                        "$$value.currentLevelEmployees", 
                                        [] 
                                    ] 
                                }
                            },
                            in: {
                                currentLevel: "$$this.level",
                                previousLevelEmployees: "$$prev",
                                currentLevelEmployees: {
                                    $concatArrays: [
                                        "$$current", 
                                        [
                                            { $mergeObjects: [ 
                                                "$$this", 
                                                { reportees: { $filter: { input: "$$prev", as: "e", cond: { $eq: [ "$$e.managerId", "$$this.empId"  ] } } } } 
                                            ] }
                                        ]
                                    ]
                                }
                            }
                        }
                    }
                }
            }
        }
    },
    {
        $addFields: { reportees: "$reportees.currentLevelEmployees" }
    }
]).pretty()

The above solution should work for multiple levels. Output:

{
    "_id" : "10",
    "empId" : "10",
    "managerId" : "15",
    "reportees" : [
        {
            "empId" : "6",
            "name" : "Employee6",
            "managerId" : "10",
            "level" : NumberLong(0),
            "reportees" : [
                {
                        "empId" : "1",
                        "name" : "Employee1",
                        "managerId" : "6",
                        "level" : NumberLong(1),
                        "reportees" : [ ]
                },
                {
                        "empId" : "2",
                        "name" : "Employee2",
                        "managerId" : "6",
                        "level" : NumberLong(1),
                        "reportees" : [ ]
                }
            ]
        },
        {
            "empId" : "8",
            "name" : "Employee8",
            "managerId" : "10",
            "level" : NumberLong(0),
            "reportees" : [
                {
                    "empId" : "5",
                    "name" : "Employee5",
                    "managerId" : "8",
                    "level" : NumberLong(1),
                    "reportees" : [ ]
                },
                {
                    "empId" : "4",
                    "name" : "Employee4",
                    "managerId" : "8",
                    "level" : NumberLong(1),
                    "reportees" : [ ]
                }
            ]
        }
    ]
}
mickl
  • Utter magic!!! I forgot about that `depthField` parameter... I shall review and award that happily should there not be hidden surprises (or better ideas even)... – dnickless Oct 05 '18 at 09:55
  • This is the perfect answer! It saved my day. Any idea about the performance of this query when there are millions of records in the collection? – Nishchit Mar 16 '21 at 09:13
  • @NishchitDhanani It hasn't been tested, but you're welcome to test it on your dataset and share your results here :) – mickl Mar 16 '21 at 10:19
  • This works fine, but there is one glitch: a manager who has no one reporting to them is not listed in the response. Any suggestions? – Warrior Jun 22 '21 at 13:47
  • Figured it out: by changing the unwind block, we can get parent nodes that have no child nodes, like this: { $unwind: { path: "$reportees", includeArrayIndex: "arrayIndex", preserveNullAndEmptyArrays: true } } – Warrior Jun 23 '21 at 04:05
5

That's precisely what you would use $graphLookup for (the traversal bit at least). For the filtering part you could simply use $filter or $match, depending on how exactly you want to filter.
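
As a rough sketch of the $filter option, the helper below builds a pipeline that keeps only reportees assigned to a given project. The function name `buildFilteredPipeline` and its parameters are hypothetical, and the `projects` field is taken from the document shape in the question. It only constructs the pipeline array, which you would then pass to `db.employees.aggregate(...)`.

```javascript
// Hypothetical helper: builds (but does not run) an aggregation pipeline.
function buildFilteredPipeline(empId, projectId) {
  return [
    { $match: { empId: empId } },
    {
      $graphLookup: {
        from: "employees",
        startWith: "$empId",
        connectFromField: "empId",
        connectToField: "managerId",
        as: "reportees",
        depthField: "level",
      },
    },
    {
      // Keep only reportees whose projects array contains projectId.
      // $ifNull guards against documents that have no projects field.
      $addFields: {
        reportees: {
          $filter: {
            input: "$reportees",
            as: "r",
            cond: { $in: [projectId, { $ifNull: ["$$r.projects", []] }] },
          },
        },
      },
    },
  ];
}

const pipeline = buildFilteredPipeline("10", "456");
console.log(JSON.stringify(pipeline, null, 2));
```

Filtering after the lookup drops non-matching employees from the flat list regardless of where they sit in the hierarchy. $graphLookup also accepts a `restrictSearchWithMatch` option that filters during traversal, but a manager who fails that match also hides everyone who is reachable only through them.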

Have a look at the results of this query:

db.employees.aggregate([
    {
        // walks upwards: follows each employee's managerId to the manager's empId
        $graphLookup: {
            from: "employees",
            startWith: "$managerId",
            connectFromField: "managerId",
            connectToField: "empId",
            as: "managers"
        }
    }
])

UPDATE 1 based on your clarification:

In order to get the hierarchical structure that you'd like, you could do the following. However, I wouldn't call this a pretty solution, since it requires you to statically define the number of levels you want to go down and to repeat sections, but it does the job for your example. I'm not sure if or how easily this can be extended to more levels, either. Personally, I think a client-side loop would be more suitable for this kind of job:

db.employees.aggregate([
{
    $match: {
        empId : "10"
    }
},
// level 0
{
   $graphLookup: {
      from: "employees",
      startWith: "$empId",
      connectFromField: "empId",
      connectToField: "managerId",
      as: "reportees",
      maxDepth: 0
   }
},
{
    $unwind: "$reportees" // flatten
},
{
    $addFields: {
        "reportees.level": 0 // add level field
    }
},
// level 1
{
   $graphLookup: {
      from: "employees",
      startWith: "$reportees.empId",
      connectFromField: "reportees.empId",
      connectToField: "managerId",
      as: "reportees.reportees",
      maxDepth: 0
   }
},
{
    $group: { // group previously flattened documents back together
        _id: "$_id",
        empId: { $first: "$empId" },
        name: { $first: "$name" },
        managerId: { $first: "$managerId" },
        reportees: { $push: "$reportees" },
    }
},
{
    $addFields: {
        "reportees.reportees.level": 1 // add level field
    }
}
])
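
The client-side approach mentioned above could be sketched roughly as follows: run a single $graphLookup to fetch the flat list of reportees, then nest it in application code. The function name `buildTree` is hypothetical, and the sketch assumes unique empId values and no reporting cycles.

```javascript
// Nests the flat array returned by one $graphLookup into a tree.
// Assumptions: empId is unique and the data contains no reporting cycles.
function buildTree(rootEmpId, employees) {
  // Group employees by their manager in a single pass.
  const byManager = new Map();
  for (const emp of employees) {
    if (!byManager.has(emp.managerId)) byManager.set(emp.managerId, []);
    byManager.get(emp.managerId).push(emp);
  }
  // Recursively attach reportees; level 0 means direct reports,
  // matching what depthField produces.
  const attach = (managerId, level) =>
    (byManager.get(managerId) || []).map((emp) => ({
      ...emp,
      level,
      reportees: attach(emp.empId, level + 1),
    }));
  return attach(rootEmpId, 0);
}

// Flat reportees of employee 10, as in the sample data.
const employees = [
  { empId: "8", name: "Employee8", managerId: "10" },
  { empId: "6", name: "Employee6", managerId: "10" },
  { empId: "4", name: "Employee4", managerId: "8" },
  { empId: "5", name: "Employee5", managerId: "8" },
  { empId: "2", name: "Employee2", managerId: "6" },
  { empId: "1", name: "Employee1", managerId: "6" },
];
const tree = buildTree("10", employees);
console.log(JSON.stringify(tree, null, 2));
```

This works for any depth without repeating stages, at the cost of shipping the whole flat list to the client.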

UPDATE 2:

The following query gets you to where you want to be from an output structure point of view (I omitted the level field but it should be easy to add). It is, however, not particularly pretty and, again, requires you to define a maximum organisational depth upfront.

db.employees.aggregate([
{
    $match: {
        empId : "10"
    }
},
{
   $graphLookup: { // get the relevant documents out of our universe of employees
      from: "employees",
      startWith: "$empId",
      connectFromField: "empId",
      connectToField: "managerId",
      as: "reportees"
   }
},
{
    $project: { // add the employee we are interested in into the array of employees we're looking at
        _id: 0,
        reportees: { $concatArrays: [ "$reportees", [ { _id: "$_id", empId: "$empId", name: "$name", managerId: "$managerId" } ] ] }
    }
},
{
    $project: {
        reportees: {
            $let: {
                vars: {
                    managers: {
                        $filter: { // remove employees with no reportees, so only managers are kept
                            input: {
                                $map: {
                                    input: "$reportees",
                                    as: "this",
                                    in: {
                                        $mergeObjects: [
                                            "$$this",
                                            {
                                                reportees: {
                                                    $filter: { // extract reportees from list of employees
                                                        input: "$reportees",
                                                        as: "that",
                                                        cond: {
                                                            $eq: [ "$$this.empId", "$$that.managerId" ] // managerId refers to empId; _id only matches in the sample data
                                                        }
                                                    }
                                                }
                                            }
                                        ]
                                    }
                                }
                            },
                            as: "this",
                            cond: { $ne: [ "$$this.reportees", [] ] }
                        }
                    }
                },
                in: {
                    $cond: [ // this is to break the processing once we have reached a top level manager
                        { $eq: [ "$$managers", [] ] },
                        "$reportees",
                        "$$managers"
                    ]
                }
            }
        }
    }
},
// second level: exactly identical to the previous stage
// third level: exactly identical to the previous stage
// basically, from here onwards you would need to repeat an exact copy of the previous stage to go one level deeper
]);
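
Rather than pasting the nesting stage by hand, the pipeline can be assembled in code. The sketch below stores that stage once and repeats it; `nestOneLevel`, `buildPipeline`, and the `maxDepth` parameter are hypothetical names, and the stage compares on empId (which equals _id in the sample data).

```javascript
// The nesting $project stage, stored once so it can be repeated.
const nestOneLevel = {
  $project: {
    reportees: {
      $let: {
        vars: {
          managers: {
            $filter: {
              input: {
                $map: {
                  input: "$reportees",
                  as: "this",
                  in: {
                    $mergeObjects: [
                      "$$this",
                      { reportees: { $filter: {
                        input: "$reportees",
                        as: "that",
                        cond: { $eq: ["$$this.empId", "$$that.managerId"] },
                      } } },
                    ],
                  },
                },
              },
              as: "this",
              cond: { $ne: ["$$this.reportees", []] },
            },
          },
        },
        in: { $cond: [{ $eq: ["$$managers", []] }, "$reportees", "$$managers"] },
      },
    },
  },
};

// maxDepth is an assumed upper bound on the organisational depth.
function buildPipeline(empId, maxDepth) {
  const head = [
    { $match: { empId: empId } },
    { $graphLookup: { from: "employees", startWith: "$empId",
        connectFromField: "empId", connectToField: "managerId", as: "reportees" } },
    { $project: { _id: 0, reportees: { $concatArrays: ["$reportees",
        [{ _id: "$_id", empId: "$empId", name: "$name", managerId: "$managerId" }]] } } },
  ];
  // Append one copy of the nesting stage per level.
  return head.concat(Array.from({ length: maxDepth }, () => nestOneLevel));
}

const pipeline = buildPipeline("10", 3);
console.log(pipeline.length);
```

The result would be passed to `db.employees.aggregate(pipeline)`. The depth limit is still there, but it becomes a single parameter instead of duplicated code.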
dnickless
  • Already tried, but it is not working as expected. I am getting hierarchies for each record as opposed to just the one for a given manager ID. – user1578872 Sep 21 '18 at 05:56
  • You could just add a `$match` stage before the lookup that limits the results to one specific manager! – dnickless Sep 21 '18 at 06:21
  • It returns all the levels under reportees of the first level and returns the level correctly as 0 for immediate reportees, 1 for the following, and so on. But I want the result in a hierarchical format. – user1578872 Sep 21 '18 at 14:42
  • So it's not possible with a single query irrespective of levels. – user1578872 Oct 01 '18 at 14:18
  • Well, technically speaking you could have a single query following the `Update 2` pattern which caters for 100 levels or so, but I admit that this would still be a limitation and probably slow and nasty, too. I cannot think of a different solution at this stage, though. I'm sorry. – dnickless Oct 01 '18 at 15:43
0

The official documentation on $graphLookup may be of some help.

https://docs.mongodb.com/manual/reference/operator/aggregation/graphLookup/

Just a kind reminder.

CandyCrusher
0

If you are using Node.js, a simpler solution is to use mongoose and its populate feature. I answered it here:

https://stackoverflow.com/a/69017771/2956628

Charles-Eugene Loubao