
I have a process which uses the concurrent-ruby gem to handle a large number of API calls concurrently using Concurrent::Future.execute, and, after some time, it dies:

ERROR -- : can't create Thread (11) (ThreadError)
/current/vendor/bundler_gems/ruby/2.0.0/bundler/gems/concurrent-ruby-cba3702c4e1e/lib/concurrent/executor/ruby_thread_pool_executor.rb:280:in `initialize'

Is there a simple way I can tell Concurrent to limit the number of threads it spawns, given I have no way of knowing in advance just how many API calls it's going to need to make?

Or is this something I need to code for explicitly in my app?

I am using Ruby 2.0.0 (alas, I don't currently have the option to change that).

David Moles
Dave Sag

2 Answers

6

After some reading and some trial and error I have worked out the following solution. Posting here in case it helps others.

You control the way Concurrent uses threads by specifying a RubyThreadPoolExecutor.¹

So, in my case the code looks like:

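# Bound the pool: the thread limits lie between 2 and the processor count.
thread_pool = Concurrent::ThreadPoolExecutor.new(
  min_threads: [2, Concurrent.processor_count].min,
  max_threads: [2, Concurrent.processor_count].max,
  max_queue:   [2, Concurrent.processor_count].max * 5,
  # When the queue is full, run the task on the calling thread.
  # (Later concurrent-ruby releases name this option fallback_policy.)
  overflow_policy: :caller_runs
)

# Start one future per thing, all sharing the bounded pool.
result_things = massive_list_of_things.map do |thing|
  Concurrent::Future.execute(executor: thread_pool) do
    expensive_api_call using: thing
  end
end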

My laptop has 4 processors, so this uses between 2 and 4 threads and allows up to 20 tasks to wait in the queue. Once the queue is full, the :caller_runs policy forces new work to run on the calling thread. As threads free up, the Concurrent library reuses them for the queued work.
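To make the "queue is full, so the caller runs it" behaviour concrete, here is a minimal sketch using only Ruby's standard library. It is not concurrent-ruby's implementation; the class name `CallerRunsPool` and its methods are hypothetical and exist only for this illustration.

```ruby
require 'thread'

# Hypothetical illustration of a caller-runs policy; not concurrent-ruby's code.
class CallerRunsPool
  def initialize(thread_count, max_queue)
    @max_queue = max_queue
    @queue = Queue.new
    @lock = Mutex.new
    @workers = Array.new(thread_count) do
      Thread.new do
        # Each worker pulls tasks until it receives the :stop marker.
        while (task = @queue.pop) != :stop
          task.call
        end
      end
    end
  end

  # Queues the task if there is room; otherwise runs it on the calling thread.
  # Returns :queued or :caller_ran so the caller can see which path was taken.
  def post(&task)
    queued = @lock.synchronize do
      if @queue.length < @max_queue
        @queue << task
        true
      else
        false
      end
    end
    return :queued if queued
    task.call
    :caller_ran
  end

  # Sends one :stop marker per worker, then waits for all of them to finish.
  def shutdown
    @workers.each { @queue << :stop }
    @workers.each(&:join)
  end
end

# Usage: 2 workers and room for 4 waiting tasks; the rest run on the caller.
pool = CallerRunsPool.new(2, 4)
outcomes = (1..20).map { pool.post { sleep 0.01 } }
pool.shutdown
puts "tasks run by the caller: #{outcomes.count(:caller_ran)}"
```

Because the submitting thread is busy whenever it has to run a task itself, it stops producing new work for a while, which is what keeps the number of threads and pending tasks bounded.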

Choosing the right multiplier for the max_queue value looks like a matter of trial and error, but 5 is a reasonable guess.
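To experiment with the multiplier, the sizing arithmetic can be pulled into a small helper. This is only a sketch: `pool_settings` and `queue_multiplier` are hypothetical names, and the processor count is passed in as a plain integer so the snippet runs without the gem.

```ruby
# Hypothetical helper: reproduces the sizing formulas used above for a
# given processor count and queue multiplier.
def pool_settings(processor_count, queue_multiplier = 5)
  max_threads = [2, processor_count].max
  {
    min_threads: [2, processor_count].min,
    max_threads: max_threads,
    max_queue:   max_threads * queue_multiplier
  }
end

# Print the settings the formulas produce on a few machine sizes.
[1, 4, 16].each do |cores|
  puts "#{cores} cores: #{pool_settings(cores).inspect}"
end
```

With 4 processors and the default multiplier this gives 2 to 4 threads and a queue of 20, matching the figures described above.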

¹ The docs describe a different way to do this, but the actual code disagrees with them, so the code I have presented here is based on what actually works.

Dave Sag
  • Do let us know if more than 1 processor actually gets used. I'd be keen to find out. – suranyami Jan 16 '15 at 00:28
  • Looking at Activity Monitor, it's certainly using both physical cores and also pushing work out to the Hyper-Threaded virtual cores, so yes it is. Reliably sitting on 98% of CPU overall with between 10 and 15 actual OS threads running. – Dave Sag Jan 16 '15 at 00:45
0

The typical answer to this is to create a thread pool.

Create a finite number of threads and keep track of which are active and which aren't. When a thread finishes an API call, mark it as inactive so that it can handle the next call.

The gem you're using already has thread pools.
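For anyone who wants to see the idea without the gem, here is a minimal standard-library sketch. Rather than flagging threads as active or inactive, idle workers simply pull the next job from a shared queue, which has the same effect. The helper name `map_with_pool` is hypothetical.

```ruby
require 'thread'

# Hypothetical helper: calls the block for each item on at most thread_count
# threads and returns the results in the same order as the input.
def map_with_pool(items, thread_count, &work)
  jobs = Queue.new
  items.each_with_index { |item, index| jobs << [item, index] }
  results = Array.new(items.size)

  workers = Array.new(thread_count) do
    Thread.new do
      loop do
        begin
          # A non-blocking pop raises ThreadError once the queue is empty.
          item, index = jobs.pop(true)
        rescue ThreadError
          break
        end
        # Each job writes to its own slot, so results keep the input order.
        results[index] = work.call(item)
      end
    end
  end

  workers.each(&:join)
  results
end

# Usage: ten simulated API calls, never more than three running at once.
doubled = map_with_pool((1..10).to_a, 3) do |n|
  sleep 0.01 # stands in for a slow API call
  n * 2
end
puts doubled.inspect
```

This only works when the full list of jobs is known before the workers start; for work that arrives over time, the gem's own thread pools are the better fit.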

suranyami
  • Yeah - just been poring over the docs for `Concurrent::ThreadPoolExecutor`. Very poor documentation, but I think I have it worked out. Will write up my answer once properly tested. – Dave Sag Jan 15 '15 at 23:17