1

Consider the following statements

Directed Graph-1:

a1->p1->b1
b1->p2->c1
b1->p3->c2
b1->p4->c3

Directed Graph-2:

a1->p4->c2
a1->p1->b1
a2->p2->b1
b1->p3->c1
a2->p5->b2

Here 'a', 'b' and 'c' are vertices and all 'p' are edges. The direction is from node a to node b/c. When a graph is drawn, it looks similar to the graph in slide-8 of http://www.slideshare.net/fvanvollenhoven/network-analysis-with-hadoop-and-neo4j

In Graph-1 there is a cluster starting from node b1 and in Graph-2 there are two clusters connected at b1. By cluster I mean, all the outgoing edges connected to one vertex and also the vertices involved in that group (of out-going edges). Is there a quick and easy way to find these clusters using any of the existing java based graph APIs? I also would like to find the edges connecting to the clusters (like a1 p1 b1 in Graph-1 and b1 p3 c1 in Graph-2). Am I missing/misusing some graph terminology here? I looked at Good Java graph algorithm library? but didn't find exactly what I am looking for.

The graphs are very small, about 20 vertices and 10 edges.

Note: added Neo4j tag since I felt it is a good candidate for this. Neo4j specific question: Is there a way to get all out going edges whose count is greater than 1? (Exploring cypher now).

Thanks in advance.

Community
  • 1
  • 1
Raghava
  • 947
  • 4
  • 15
  • 29
  • Are these directed or undirected graphs? What constitutes a cluster? Is that just the vertex with the highest degree of incoming/outgoing edges? – Matt Apr 12 '13 at 04:42
  • @Matt Made edits to my question based on your comment. Yes, it is a directed graph. About the cluster -- yes, but need not be the highest. One cluster might consist of a vertex with 5 out-going edges and another with 3. But the cluster consists of not just that vertex but other vertices in that group (or out-going edges). – Raghava Apr 12 '13 at 05:28
  • "Cluster: all outgoing edges connected to one vertex, and the vertices connected to those out-going edges". So does that mean each returned cluster should comprise of: a central node, that node's out-edges, and the nodes connected to the other side of those out-edges? – Alex Averbuch Apr 12 '13 at 20:48
  • @AlexAverbuch, yes that is correct (#out-edges > 1). Now, I get a feeling that I should do it myself rather than looking for something in an api. – Raghava Apr 12 '13 at 21:45
  • "I also would like to find the edges connecting to the clusters" - looking at the example, you're talking about both incoming and outgoing edges, right? – Michal Bachman Apr 13 '13 at 10:20
  • @MichalBachman, for the cluster it is outgoing edges. For the connection between clusters, incoming edges should be considered. But considering that incoming edge to one vertex is same as saying its an outgoing edge from its source, we can may be stick to using the term outgoing edges. – Raghava Apr 13 '13 at 17:07
  • 2
    First of all, I wouldn't call what you're looking for "clusters". You are just looking for "high" degree nodes and their neighbors. What exactly should the algorithm return? [1] just the cluster centers (you can retrieve the neighbors later) [2] the cluster centers and their out edges [3] the cluster centers, their out edges, and the relationship that connects them to another cluster? – Alex Averbuch Apr 16 '13 at 12:13
  • @AlexAverbuch, yes you is right. Those are the things that I am looking for. – Raghava Apr 16 '13 at 16:18

0 Answers0