
I have been spending some time on Google looking for a queuing / load balancer package for R.

What I am trying to achieve:

  • executing multiple R functions, independent of each other, from remote front ends
  • taking advantage of 2 dual-core servers as the R backend

Knowing that:

  • each function usually takes 10 to 30 seconds to process
  • a set of 8-15 functions to be executed is sent to the backend every 5 minutes on average and queued for processing (first in, first out); since 5 minutes is only an average, several sets can arrive at the same time
  • the 2x2 R instances (two per dual-core server) would already be running with the required packages loaded; the packages are always the same, so there is no need to reload them each time
  • the amount of input data transferred is very low: 50 KB max

This is not about code parallelization (snow, snowfall, foreach, Condor, and other traditional cluster solutions).
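To make the workload concrete, here is a rough sketch of what one submitted set looks like from a front end's point of view. The function names are made up, and submit_job() is a stand-in for exactly the piece I am looking for:

    ## One incoming set: 8-15 independent calls, each taking 10-30 s on the backend.
    ## submit_job() is a placeholder stub for the missing queuing tool; a real one
    ## would enqueue the call (FIFO) and return immediately, handing at most
    ## ~50 KB of input to one of the 4 backend R instances.
    submit_job <- function(job) message("queued: ", job$fun)

    jobs <- list(
      list(fun = "scoreClient",   args = list(client_id = 101)),
      list(fun = "scoreClient",   args = list(client_id = 102)),
      list(fun = "refreshReport", args = list(region = "EU"))
      ## ... up to 15 of these per set
    )

    for (job in jobs) submit_job(job)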

Do you know of a good package/tool designed for R that could help?

Thanks a lot!

Sam
  • I'm not aware of anything existing. As a starting point, I'd look to something like Redis + the doRedis package (see the worker sketch after these comments). Resque is a popular Ruby queue manager built on Redis (https://github.com/defunkt/resque). – Noah May 18 '11 at 19:56
  • Hi Noah, thanks for your answer. From what I understand, Redis is a database that can be accessed by clients other than R, but what's the advantage compared to MySQL with non-parallel computation? – Sam May 19 '11 at 08:30
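To Sam's question: the practical advantage of Redis over a MySQL table here is its blocking list pop, so an idle worker waits on the queue instead of repeatedly polling a table. Below is a minimal worker-loop sketch using the rredis client (the low-level package doRedis builds on); the queue name, job structure, and host are illustrative conventions of this sketch, not part of rredis:

    ## FIFO worker sketch on Redis via rredis. Each of the 4 backend R instances
    ## (2 per dual-core server) runs this loop with its packages already loaded,
    ## so nothing is reloaded per job.
    library(rredis)
    redisConnect(host = "queue-server")  # hypothetical Redis host

    ## Front ends enqueue with, e.g.:
    ##   redisLPush("jobs", list(id = "job-1", fun = "myFun", args = list(...)))
    ## rredis serializes/deserializes the R list automatically.
    repeat {
      ## LPUSH on the front end + BRPOP here = first in, first out.
      ## timeout = 0 means block indefinitely until a job arrives.
      job <- redisBRPop("jobs", timeout = 0)[[1]]
      result <- do.call(job$fun, job$args)
      redisLPush(paste("results", job$id, sep = ":"), result)  # return result keyed by job id
    }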

1 Answer


This sounds like a reasonable context for using RApache, which can keep several R instances running with the necessary packages already loaded.
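For illustration, a minimal handler sketch; the file path, URL, and parameter names are assumptions of mine, and it is rapache's REvalOnStartup directive that keeps the required packages loaded across requests:

    ## Hypothetical RApache handler script, e.g. /var/www/R/run.R, mapped with:
    ##   <Location /run>
    ##     SetHandler r-handler
    ##     RFileHandler /var/www/R/run.R
    ##   </Location>
    ## Preload packages once per Apache child via the rapache directive:
    ##   REvalOnStartup "library(myPackage)"

    setContentType("text/plain")

    ## POST is the list of request parameters RApache exposes to the script;
    ## the parameter names "fun" and "x" are illustrative, not an RApache API.
    fun <- POST$fun
    x   <- as.numeric(POST$x)

    ## Dispatch against a fixed whitelist rather than evaluating arbitrary input.
    allowed <- list(square = function(z) z^2, double = function(z) 2 * z)
    cat(allowed[[fun]](x), "\n")

    DONE  # rapache constant signalling the request was fully handled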

Iterator