I'm making a bunch of AWS calls to create/delete rules from security groups and want to speed things up with parallelization.

Is there a general way to parallelize an I/O-bound operation across a fixed-size collection? A method that takes a collection, a batch size, and a block would be nice.

the Tin Man
crizCraig
  • Write your question in the form of a question, setting it up with the appropriate issues you ran into, as if it was a "real" question, which it really is. – the Tin Man Mar 09 '16 at 22:15
  • That helps. You might show your old attempt, since your answer says you got a ~100x speed-up. Think of a self-answered Q/A like this: Because SO is a reference-book/cookbook of programming questions with answers, it'd help people who are searching for a similar solution to know where you started. – the Tin Man Mar 09 '16 at 23:35

1 Answer

The following method worked great for me, yielding a ~100x speedup:

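# A method to parallelize an operation across a collection.
# Example:
#
#   fan_out [1, 2, 3, 4], 2 do |batch|
#     puts batch.to_s
#   end
#
# Output (order may vary, since the threads run concurrently):
#   [3, 4]
#   [1, 2]
def fan_out(arr, num_batches, &block)
  return [] if arr.empty?
  # Round up so the slice size is never 0 (each_slice(0) raises ArgumentError)
  # and no more than num_batches threads are started.
  batch_size = (arr.size.to_f / num_batches).ceil
  threads = arr.each_slice(batch_size).map do |batch|
    Thread.new { block.call(batch) }
  end
  # Wait for every batch to finish before returning.
  threads.each(&:join)
end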
crizCraig
  • This is a pretty good solution, but if some batches run more quickly than others you'll have threads shutting down while there's still work to do. A [Queue](http://ruby-doc.org/core-2.3.0/Queue.html) object is useful for spreading work across N threads. – tadman Mar 09 '16 at 22:39
  • Agreed, a Queue with a pool of threads pulling in tasks would be more efficient. In my case, each operation took roughly the same time, so this worked well. – crizCraig Mar 09 '16 at 23:22
  • You might want to reconsider using `"""` in Ruby code. Ruby isn't Python. http://stackoverflow.com/q/28511229/128421 – the Tin Man Mar 09 '16 at 23:41
  • This will be fine as long as the blocks are well behaved and don't block. You probably want to keep track of the outcomes of the threads with some logging. – Amias Jul 10 '17 at 08:45
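
The Queue-based pool suggested in the comments, combined with the advice to keep track of each thread's outcome, could look roughly like the sketch below. It is not part of the original answer: `fan_out_pool`, its parameters, and the `[item, result, error]` return shape are hypothetical names chosen for illustration. It assumes all work items are known up front, so each worker can use a non-blocking `pop` and exit once the queue is empty.

```ruby
# Hypothetical helper (not from the original answer): calls the block once per
# item from a fixed pool of worker threads that pull work from a shared Queue.
# Returns an array of [item, result, error] triples in completion order.
def fan_out_pool(items, num_workers, &block)
  queue = Queue.new
  items.each { |item| queue << item }

  outcomes = Queue.new # thread-safe collector for per-item outcomes

  workers = Array.new(num_workers) do
    Thread.new do
      loop do
        begin
          item = queue.pop(true) # non-blocking; raises ThreadError when empty
        rescue ThreadError
          break # no work left, so this worker exits
        end

        begin
          outcomes << [item, block.call(item), nil]
        rescue StandardError => e
          # Record the failure instead of letting it kill the worker.
          outcomes << [item, nil, e]
        end
      end
    end
  end

  workers.each(&:join)
  Array.new(outcomes.size) { outcomes.pop }
end

# Usage: item 3 fails; the other items still complete.
results = fan_out_pool([1, 2, 3, 4, 5], 2) do |n|
  raise "boom" if n == 3
  n * 10
end
results.each do |item, result, error|
  puts error ? "#{item}: failed (#{error.message})" : "#{item}: #{result}"
end
```

Because each worker takes the next item as soon as it finishes the previous one, a slow item does not leave the other threads idle, which is the weakness of fixed batches that tadman points out. The lines are printed in completion order, so they may not match the input order.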