18

I have a list of documents in MongoDB with tree structure, where Model Tree Structures with Parent References pattern used. I want a single aggregation query which returns ancestor list(till the root), given the 'name' property.

Structure:

{
  '_id': '1',
  'name': 'A',
  'parent': '',
},
{
  '_id': '2',
  'name': 'B',
  'parent': 'A',
},
{
  '_id': '3',
  'name': 'C',
  'parent': 'B',
},
{
  '_id': '4',
  'name': 'D',
  'parent': 'C',
}

Aggregation result:(Given, name = 'D')

{
  '_id': '4',
  'name': 'D',
  'ancestors': [{name:'C'}, {name:'B'}, {name:'A'}]
}

Note: I can't change the document structure now. It will cause many problems. I saw many solutions which suggest to use Model Tree Structures with an Array of Ancestors. But I cannot use it now. Is there any way to achieve it with the above pattern using single aggregation query? Thank you

RaR
  • 3,075
  • 3
  • 23
  • 48

2 Answers2

26

Starting from MongoDB 3.4, we can do this with the Aggregation Framework.

The first and most important stage in our pipeline is the $graphLookup stage. $graphLookup allows us to recursively match on the "parent" and "name" field. As result, we get the ancestors of each "name".

The next stage in the pipeline is the $match stage where we simply select the "name" we are interested in.

The final stage is the $addFields or $project stage where we apply an expression to the "ancestors" array using the $map array operator.

Of course with the $reverseArray operator we reverse our array in order to get the expected result.

db.collection.aggregate(
    [ 
        { "$graphLookup": { 
            "from": "collection", 
            "startWith": "$parent", 
            "connectFromField": "parent", 
            "connectToField": "name", 
            "as": "ancestors"
        }}, 
        { "$match": { "name": "D" } }, 
        { "$addFields": { 
            "ancestors": { 
                "$reverseArray": { 
                    "$map": { 
                        "input": "$ancestors", 
                        "as": "t", 
                        "in": { "name": "$$t.name" }
                    } 
                } 
            }
        }}
    ]
)
Community
  • 1
  • 1
styvane
  • 59,869
  • 19
  • 150
  • 156
  • Hi, how do I need to change this, if I am given name = 'A' instead of name = 'D'? I need all documents whose ancestor is A. – user3629892 Mar 12 '18 at 10:30
  • @user3629892, I'm not sure I understand your question. Do you mean you have 'A' instead of 'D'? well, if that's the case, simply replace 'D' by 'A' or even better, you can assign the *name* to a variable and use that variable in your pipeline. – styvane Mar 17 '18 at 16:59
  • 1
    What I meant was that I have entries that each reference their parent. Given a leaf node (with no children), i want to find its parent and the parent of the parent and so on... As input, I dont get the parent node, but the leaf node, so in the OP's example it would be D instead of A. I will try out your suggestion – user3629892 Apr 09 '18 at 06:51
  • Hi, i have a similar situation but, i have users with followers and those followers have many followers, i need get recursively followers of first user to down recursively. – Alejandro Reyes Apr 27 '21 at 23:20
  • I have one problem with this solution it does not maintain the correct. The order comes randomly – Dijiflex Jul 22 '22 at 10:52
2

If you are open to use client side javascript, you can use recursion on the mongo shell to achieve this:

var pushAncesstors = function (name, doc) {
  if(doc.parent) {
    db.collection.update({name : name}, {$addToSet : {"ancesstors" : {name : doc.parent}}});
    pushAncesstors(name, db.collection.findOne({name : doc.parent}))
  }
}

db.collection.find().forEach(function (doc){
  pushAncesstors(doc.name, doc);
})

This will give you complete hirearchy for all products. Sample output:

{ "_id" : "1", "name" : "A", "parent" : "" }
{ "_id" : "2", "name" : "B", "parent" : "A", "ancesstors" : [ { "name" : "A" } ] }
{ "_id" : "3", "name" : "C", "parent" : "B", "ancesstors" : [ { "name" : "B" }, { "name" : "A" } ] }
{ "_id" : "4", "name" : "D", "parent" : "C", "ancesstors" : [ { "name" : "C" }, { "name" : "B" }, { "name" : "A" } ] }

If your requirement is not to update the correct collection, insert the data in a diffferent collection and update there. The pushAncesstors function will change to:

var pushAncesstors = function (name, doc) {
  if(doc.parent) {
    db.outputColl.save(doc)
    db.outputColl.update({name : name}, {$addToSet : {"ancesstors" : {name : doc.parent}}});
    pushAncesstors(name, db.collection.findOne({name : doc.parent}))
  }
}
ares
  • 4,283
  • 6
  • 32
  • 63
  • Thanks for the answer. Yes, I am open to using client side javascript. But, the above one will update the existing document, right? The need is to get the hierarchy but not to update the document. – RaR Jan 30 '17 at 08:25
  • Updated the answer to leave the current collection unmodified. – ares Jan 30 '17 at 10:44
  • that's the classic example of bringing data to logic (code) which is super expensive. i'd suggest you to send your logic(code) to data, by using mongo apis (cheap op.). – Ravi Soni Feb 04 '21 at 12:50