
I'm using Neo4j to find the degree of connection between users. I have data in the following shape:

(user)-[:INTERACTS_WITH]->(user)

So if user_1 interacts with user_2 and user_2 interacts with user_3, then user_1 and user_3 share a second-degree connection.

Ideally, I would like to get a dataset like this in return:

degree count
NULL   123
1      1050
2      3032
3      2110
...    ...

Is there a better way to do this than to simply run shortestPath() function for every pair of users? If not, then what is the best way to loop over users in Neo4j?

Also, I imagine the direction plays a role here, so would you recommend making this relationship bidirectional, so that for every (user1)-[:INTERACTS_WITH]->(user2) I would also create the reverse relationship (user2)-[:INTERACTS_WITH]->(user1)?

If you have any tips on how to create the above dataset, please let me know.

Many thanks!

Bruno Peres
IVR
  • You don't have to create opposite relations, just don't specify the direction in your query: ()-[]-() – Jerome_B Jun 25 '17 at 09:56
  • Thanks, but it looks like all relationships in Neo4j must have a direction (according to [this](https://stackoverflow.com/questions/24010932/neo4j-bidirectional-relationship) post). – IVR Jun 26 '17 at 08:55
  • Indeed, all relations are created with a direction. But you are not forced to specify a direction when you query it for results. As per the example in the link you provided: MATCH (A)-[FRIEND]-(B) RETURN A, B – Jerome_B Jun 26 '17 at 12:29
  • @JeromeB, thanks for the suggestion, but I think it's still important to define the [FRIEND] going both ways. Imagine a scenario where node A flows into node B and it also flows into node C. In this situation I'd like nodes C and B to share a 2nd degree connection, but unless you define A -> B AND A <- B and do the same for A -> C and A <- C, there is no way to connect B and C. – IVR Jun 28 '17 at 06:16

1 Answer

Is there a better way to do this than to simply run shortestPath() function for every pair of users? If not, then what is the best way to loop over users in Neo4j?

I believe that running shortestPath() for every pair of users is a reasonable choice, but keep in mind that it will be very expensive, because the number of pairs grows quadratically with the number of users.
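As a rough sketch of that approach, the whole table can be produced by a single query rather than by a loop on the client side. This assumes the nodes carry a `:User` label (the label name is my assumption, it is not stated in the question) and that direction is ignored when matching:

```cypher
// Assumption: user nodes are labelled :User.
// id(a) < id(b) keeps each unordered pair exactly once.
MATCH (a:User), (b:User)
WHERE id(a) < id(b)
// OPTIONAL MATCH leaves p as null when no path connects the pair,
// so unconnected pairs end up in the NULL row.
OPTIONAL MATCH p = shortestPath((a)-[:INTERACTS_WITH*]-(b))
RETURN length(p) AS degree, count(*) AS count
ORDER BY degree
```

If direction does matter for you, the filter would become `a <> b` and the pattern would need an arrow, so that a to b and b to a are counted separately. Putting an upper bound on the path length (for example `*..6`) should reduce the cost, but pairs further apart than the bound will then also be counted as NULL.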

Also, I imagine the direction plays a role here, so would you recommend making this relationship bidirectional, so that for every (user1)-[:INTERACTS_WITH]->(user2) I would also create the reverse relationship (user2)-[:INTERACTS_WITH]->(user1)?

No, you do not need another relationship. Relationship direction can be ignored at query time in Neo4j, so a relationship that is naturally bidirectional should be modeled with a single relationship; when querying the graph we can traverse it from a to b and from b to a. You only need an extra relationship when some data on the relationship differs between the two directions. Suppose that the interaction between the users in your model has a weight, and this weight can be different from a to b and from b to a. In that case you can create one relationship per direction and store each weight as a property on it. Example:

(a)-[:INTERACTS_WITH {weight:10}]->(b)
(b)-[:INTERACTS_WITH {weight:6}]->(a)
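To illustrate ignoring direction at query time, here is a small sketch of the scenario from the comments, where a interacts with both b and c. The `:User` label and the `name` property are hypothetical, introduced only for this example:

```cypher
// Hypothetical sample data: one stored direction per relationship.
CREATE (a:User {name: 'a'}), (b:User {name: 'b'}), (c:User {name: 'c'}),
       (a)-[:INTERACTS_WITH]->(b),
       (a)-[:INTERACTS_WITH]->(c);

// The pattern has no arrowhead, so direction is ignored while matching.
MATCH p = shortestPath((b:User {name: 'b'})-[:INTERACTS_WITH*]-(c:User {name: 'c'}))
RETURN length(p) AS degree;
```

On this data the path found is b, a, c, so b and c come out as a second-degree connection without any reverse relationships being created.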

Take a look at this link about modelling bidirectional relationships.

Bruno Peres
  • Thanks for your answer. What do you mean by "shortest path only from a to b and not a to b AND b to a"? Should it be "A to B and NOT B to A"? If so, wouldn't I get different results for A to B and for B to A in a unidirectional graph? If so, then I'd need both numbers – IVR Jun 26 '17 at 08:58
  • Hello @de1pher. Yes, you are right. I removed it from my answer. Thanks. – Bruno Peres Jun 26 '17 at 11:23
  • Great, thanks! Any tips on how you'd go about constructing this query? Do you think it might be a good idea to run through the actual loop in R / Python which would send an individual query for every pair of users? – IVR Jun 27 '17 at 06:02