0

I have kind of graph connections in MongoDB:

{ 
    from: ObjectID(user),
    to: ObjectID(user) 
}

if I want to get all users at depth 1, is easy:

db.connections.find({from:ObjectID(myUser)});

But when I want to find all users at depth 2, is hard: The foolish idea is "find all people at depth 1, then do a query like the previous one, for each person. No way, plus it would return also all circular paths.

e.g 
1->2 
2->1 
1->3
2->4
results of user(1).findDepth2() would be 2,1,3,4 instead of 4. 

The example in pseudo node.js:

var myFriends = db.connections.find({from:ObjectID(myUser)});
var friendAtDepth2 = [];
for(friend in myFriends){
    // I find friends of each friend
    var frendsOfFriend = db.connections.find({from:ObjectID(friend)});
    for(friendOfFriend in frendsOfFriend){
        if(friendAtDepth2.indexof(friendOfFriend)==-1 
             && myFriends.indexof(friendOfFriend)==-1
             && myUser!=friendOfFriend){
             // I haven't already found this user, so I it is actually at depth 2
             friendAtDepth2.push(friendOfFriend);
        }
    }
}

But in this way, I do really a lot of queries, and I hope exist a kind of join query like in mysql helipng me doing this in one query only

Is a query like this feasible?

M4rk
  • 2,172
  • 5
  • 36
  • 70
  • 1
    I think you need to add a clear example to this really. There is nothing in your question that demonstrates an "arbitrary depth". – Neil Lunn Jul 02 '14 at 11:44
  • The mean is: if you find a query to find people at depth 2, it should be possible also find a query for depth 3, or 4 – M4rk Jul 02 '14 at 11:48
  • Still not enough information. A sample document to illustrate is good. A failing code attempt is better. Just trying to advise as people will close this. Up to you. – Neil Lunn Jul 02 '14 at 12:07
  • Perhaps rather than commenting on things that do not answer your question you could concentrate on explaining your question more clearly as you were asked to. – Neil Lunn Jul 02 '14 at 12:33
  • You could do it by browsing only one time your collection: 1. You browse your collection and build the matrix of 1-to-1 relationships (symmetric or not), 2. you square the matrix. This is not an optimal theoretical complexity but it is a binary matrix and if you take into account the queries execution time, you could save a lot of time... – dgiugg Jul 02 '14 at 13:03

1 Answers1

0

Implement the Breath-First Search algorithm to find all the nodes at distance two from a specific node.

It has a linear time complexity O(|E|+|V|), thus you can't really do better than this.

If you want all nodes of distance two, three, etc. then you might need to look into the All-Pairs Shortest path algorithms and customize them a bit.

By customizing it I mean you should run a BFS from every node and stop at the required depth. If you want some concrete code examples you should take a look at similar question, like All paths of length L from node n using python.

Also depending on your data you can add explicit neighbor lists to your nodes, that have excessive space requirements, you can get all your data into the memory with one query or you can try to use the aggregation framework (there it might be a bit trickier to implement, but that should be the fastest solution).

Alternatively if you're application is really heavy on graphs and graph algorithms you should consider graph databases, like Neo4j.

Community
  • 1
  • 1
Pio
  • 4,044
  • 11
  • 46
  • 81
  • 1
    Unless you are going to show some code of how to do this in this case then this is not an answer. Links tend to change over time so either explain the answer or accept that you have not provided one that is complete. – Neil Lunn Jul 02 '14 at 12:05
  • The Breath First Search algorithm is very well known, I just provided the first Google hit. Also the question does not provide clear data structures thus there is no way I can provide code for this question. This is a theoretical answer. – Pio Jul 02 '14 at 12:07
  • Yep. Google search only is also "not an answer". You need to show "how to get the desired result". Anyone can "Google". Maybe better as a comment unless you can show how to do it. That is what an "answer" is. – Neil Lunn Jul 02 '14 at 12:10
  • yes it's very known, but how to implement it? As I said "The foolish idea is "find all people at depth 1, then do a query like the previous one, for each person. No way" is practically the implementation of BFS, but it is very inefficient – M4rk Jul 02 '14 at 12:10
  • @rodi, maybe I wasn't explicit enough. Check my edit in the answer. – Pio Jul 02 '14 at 13:21
  • I know very well BFS and many other algorithm that concerns graphs, and neo4j too. But I really need to figure out how to compose a query on mongodb, to do heavy map-reduce jobs and aggregation of data. But I really don't know how to do it – M4rk Jul 03 '14 at 09:16
  • My suggestion is to do the work at once on database side. I assume you use Mongodb. It's [aggregation framework](http://docs.mongodb.org/manual/core/aggregation-introduction/) supports [Map-Reduce](http://docs.mongodb.org/manual/core/map-reduce/). I'll try to do an example in the near future. – Pio Jul 03 '14 at 12:28