9

I am looking for something that will make it easy to run (correctly coded) embarrassingly parallel JVM code on a cluster (so that I can use Clojure + Incanter).

I have used Parallel Python in the past to do this. We have a new PBS cluster and our admin will soon set up IPython nodes that use PBS as the backend. Both of these systems make it almost a no-brainer to run certain types of code in a cluster.

I made the mistake of using Hadoop in the past (Hadoop is just not suited to the kind of data that I use): the latency made even small runs take 1-2 minutes.

Is JPPF or GridGain better for what I need? Does anyone here have any experience with either? Is there anything else you can recommend?

Alex Miller

4 Answers

3

Check out Cascalog: http://github.com/nathanmarz/cascalog

simon-says
2

Clojure is reported to work on Terracotta, subject to some patching.

Stuart Sierra
1

Look at Skandium

Edit: The original link is no longer live, so here is the GitHub link:

https://github.com/mleyton/Skandium

Dhananjay
Aravind Yarram
0

I suggest you look at Skandium. Alternative licenses to the GPL can be negotiated with the developers upon request.

user559370