
I am looking to do some quite processor-intensive brute force processing for string matching. I have run my prototype in a multi-threaded environment and compared the performance to an implementation using Gridgain with a couple of nodes (also multithreaded).

What I observed was that my Gridgain implementation ran slower than my multithreaded implementation. There may have been a flaw in my Gridgain implementation, but it was only a prototype, and I thought the results were indicative. So my question is this:

What are the advantages of learning and then building an implementation for a particular grid platform (Hadoop, Gridgain, or EC2 if going hosted; other suggestions welcome), when one could fairly easily put together a lightweight compute grid platform with a much shallower learning curve? In other words, what do we get for free with these cloud/grid platforms that is worth having or tricky to implement?

(Please note, I don't have any need for a data grid)

Cheers,

-James

(P.S. Happy to make this community wiki if need be.)

James B
  • Just a note that if you're looking at self-hosting you could check out Platform.com. Their grid computing platform is pretty widely used in the chip design area; not sure how their prices are, though. – tloach Mar 26 '10 at 15:25
  • @tloach - thanks for the link, I'd not seen them before. This is exactly my point: why would I spend time and money on Platform.com's consulting and API when I could host my own implementation of a private cloud with seemingly less processing and learning-curve overhead? – James B Mar 26 '10 at 15:42
  • EC2 provides virtual machine hosting while Hadoop is a framework for distributed computing. That's not comparing apples and oranges but apples and ... something completely different (obvious lack of creativity here :) ). So what are you looking for? Whether to host your own hardware/virtual machines, or whether to build your software to distribute and coordinate tasks across multiple nodes? – sfussenegger Mar 26 '10 at 16:16
  • @sfussenegger - "apples and carrots"? I see what you mean. I stuck EC2 in that list more to signal that it was part of the tech stack I have under review. I'll probably host the grid locally (so perhaps I should remove EC2 from my original question). What I'm looking for is a technical justification for learning and building to a toolkit like Hadoop or Gridgain, or whether my time is better spent rolling my own, more lightweight platform. – James B Mar 26 '10 at 17:31

1 Answer


What kind of grid are you dealing with? A grid of a dozen hosts running the same OS is pretty straightforward to run yourself. All you really have to deal with is sending work to each host, maybe a little load balancing, maybe what to do if a host goes down, and maybe distributing new service code to the hosts when you update your service. Even if you don't handle any of those, it's not a big deal, since the grid is a manageable size. If you're dealing with thousands of hosts, or with a service that should never be down or return errors because a single host went down, then you suddenly have to worry about:

  • not overloading any single host
  • distributing new service code
  • detecting when a host isn't responding and not sending it new work, as well as resending whatever it was working on
  • possibly working across different OSes and architectures (little vs. big endian)
  • energy savings - shutting down hosts during low load and bringing them back up for high load
  • scaling - if you add 100 hosts to your grid tomorrow how long does it take to get them connected and working?
  • reliability - some services may actually perform calculations on 2-3 different hosts and only return an answer that all the hosts agree on

That's a short list of things that most grid software should do for you if you need it. If you're working on something small or non-critical then by all means, roll your own. If you're working on something that has to work, or is big enough that having any manual steps in a deployment process would be a maintenance nightmare then you probably want to go with something that already exists.
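As a rough illustration of one item from the list above (detecting a host that isn't responding, no longer sending it work, and resending whatever it was working on), here is a minimal Java sketch. The names `Host`, `LocalHost` and `MiniGrid` are hypothetical and invented for this example, the hosts are simulated in-process rather than reached over a network, and a thrown exception stands in for a real health check or timeout. It is not how Gridgain or Hadoop implement failover.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a remote worker; a real grid would call it over the network.
interface Host {
    // Runs one unit of work; throws if the host is down or not responding.
    int countMatches(String text, String pattern) throws Exception;
}

// In-process simulation of a host that is either up or down (assumption for the demo).
final class LocalHost implements Host {
    private final boolean up;

    LocalHost(boolean up) { this.up = up; }

    @Override
    public int countMatches(String text, String pattern) throws Exception {
        if (!up) throw new Exception("host not responding");
        return MiniGrid.bruteForceCount(text, pattern);
    }
}

final class MiniGrid {
    private final List<Host> live;   // hosts still considered healthy
    private int next = 0;            // round-robin cursor

    MiniGrid(List<Host> hosts) { this.live = new ArrayList<>(hosts); }

    // Sends the work to hosts round-robin; a failing host is dropped and the work is resent.
    int submit(String text, String pattern) {
        while (!live.isEmpty()) {
            Host host = live.get(next % live.size());
            try {
                int result = host.countMatches(text, pattern);
                next = (next + 1) % live.size();
                return result;
            } catch (Exception e) {
                live.remove(host);   // stop sending new work to this host
            }
        }
        throw new IllegalStateException("no live hosts left");
    }

    int liveHostCount() { return live.size(); }

    // Brute-force count of (possibly overlapping) occurrences of pattern in text.
    static int bruteForceCount(String text, String pattern) {
        if (pattern.isEmpty()) return 0;
        int count = 0;
        for (int i = 0; i + pattern.length() <= text.length(); i++) {
            if (text.regionMatches(i, pattern, 0, pattern.length())) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // First host is down, so the work should be resent to the second one.
        MiniGrid grid = new MiniGrid(Arrays.asList(new LocalHost(false), new LocalHost(true)));
        System.out.println("matches: " + grid.submit("abababa", "aba"));
        System.out.println("live hosts: " + grid.liveHostCount());
    }
}
```

Even this toy version leaves out most of the list: it never brings a recovered host back, it has no timeouts, and it does nothing about code distribution or scaling. Those gaps are the kind of thing an existing platform is meant to cover.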

tloach
  • @tloach, this is exactly the sort of list I was looking for, thanks very much. I was surprised that this question didn't attract more answers or more views...Perhaps grid/cloud computing is not quite as big as the media is making it out to be, or people are so busy deploying interesting apps to their grids that they don't have time for SO! – James B Mar 29 '10 at 08:47