8

I'm working on a project which will involve running algorithms on large graphs. The largest two have around 300k and 600k vertices (fairly sparse I think). I'm hoping to find a java library that can handle graphs that large, and also trees of a somewhat smaller size, as one of the algorithms I'll be using involves decomposing a graph into a tree. Ideally the library would also include breadth first search and Dijkstra's or other shortest-path algorithms.

Based on another question, I've been looking at a few libraries (JGraphT, JUNG, jdsl, yworks) but I'm having a hard time finding out how many vertices they can realistically handle. Looking at their documentation, all I could find was a bit in the JUNG FAQ that said it could easily handle graphs of upwards of 150k vertices, which is still quite a bit smaller than my graphs... I'm hoping someone here has used one or more of these libraries and can tell me if it'll handle the graph sizes I need, or if there's some other library that would be better.

For the record I don't need any visualization tools; this is strictly about representing the graphs and trees in data structures and running algorithms on them.

Background if anyone really cares: for a class I'm supposed to implement an algorithm described in a research paper, and run the experiments run in the paper as best I can. The paper and datasets I'll be using can be found here. My professor says I can use any library I can find as long as I can tell what the time/space complexity of the algorithms/data structures are.

Community
  • 1
  • 1
Maltiriel
  • 793
  • 2
  • 11
  • 28
  • 1
    Just found some info on [JGraphT](http://jgrapht-users.107614.n3.nabble.com/Max-limit-of-vertices-td1194057.html). Apparently it should handle these graphs no problem... – Maltiriel Mar 29 '12 at 19:16

4 Answers4

3

You should take a look at Neo4J which is a graphical database which might be a good solution for your problems.

khmarbaise
  • 92,914
  • 28
  • 189
  • 235
  • Thanks, I'm looking into this now. It can definitely handle those datasets. – Maltiriel Mar 29 '12 at 18:15
  • 1
    I'm first going to try one of the in-memory libraries, as that's what's done in the paper so I think my prof would like that better, but if that doesn't work I'll go with Neo4J. It looks easy to use and it has all the algorithms I need. Thanks for the suggestion! – Maltiriel Apr 02 '12 at 14:26
3

Checkout JGraph as well. However it is oriented towards visualization.

Also, maybe Apache Hama - a distributed computing framework for massive scientific computations e.g., matrix, graph and network algorithms.

Annas may also interest you - open-source Java framework that was built for developers and researchers in the fields of Graph Theory - AI, Path finding, distributed systems, etc.

tenorsax
  • 21,123
  • 9
  • 60
  • 107
  • Hmm. The info I've seen makes it seem like this wouldn't be as suitable... In the user manual they start off by going on about swing for example. I don't want to have to mess with the visualization stuff at all. Is that possible, do you know? – Maltiriel Mar 29 '12 at 18:19
  • @Maltiriel, you could potentially work on graph model standalone. However, if you don't need to visualize the graph, it is an overkill. – tenorsax Mar 29 '12 at 18:43
  • Thanks for the additional suggestions. Hama may be a bit much for what I'm doing, but Annas looks very interesting. I haven't come across either one in my searches before this. – Maltiriel Mar 29 '12 at 21:20
1

Cassovary https://github.com/twitter/cassovary -project from Twitter can handle very big graphs with Scala (thus JVM) in memory.

Alternatively, GraphChi's Java version can handle even bigger graphs, by using disk: http://code.google.com/p/graphchi-java/

However, GraphChi will not be efficient for exact shortest-path type algorithms, as they require fast random access.

Aapo Kyrola
  • 1,100
  • 8
  • 8
0

My answer may be a bit late for the you, but for others who have similar questions, I would recommend using GraphScope, a one-stop graph computing system developed by Alibaba. It will meet your needs for large-scale graphs.

If you want to write your graph algorithm in Java, then you're in luck. GraphScope provides comprehensive Java support. You can use GRAPE-JDK's user-friendly Java interface to write your own graph algorithms.

In addition to the ease of use of the interface, GRAPE-JDK also provides high query performance. Using the LLVM4JNI technology provided by FastFFI, it can accelerate JNI calls between Java and C++, thus reducing program execution time. According to performance report provided by GraphScope, GRAPE-JDK can even provide execution efficiency that is close to that of C++ programs.

If you would like to learn more about GRAPE-JDK, please visit this blog post: Breaking the Language Barrier in Large Scale Graph Computing.

Disclaimer: I'm an author of GraphScope.

Lei Zhang
  • 1
  • 1