I am playing around with concurrency in Ruby (1.9.3-p0), and have created a very simple, I/O-heavy proxy task. First, I tried the non-blocking approach:
require 'rack'
require 'rack/fiber_pool'
require 'em-http'
require 'em-synchrony'
require 'em-synchrony/em-http'
proxy = lambda {|*|
result = EM::Synchrony.sync EventMachine::HttpRequest.new('http://google.com').get
[200, {}, [result.response]]
}
use Rack::FiberPool, :size => 1000
run proxy
=begin
$ thin -p 3000 -e production -R rack-synchrony.ru start
>> Thin web server (v1.3.1 codename Triple Espresso)
$ ab -c100 -n100 http://localhost:3000/
Concurrency Level: 100
Time taken for tests: 5.602 seconds
HTML transferred: 21900 bytes
Requests per second: 17.85 [#/sec] (mean)
Time per request: 5602.174 [ms] (mean)
=end
Hmm, I thought I must be doing something wrong. An average request time of 5.6s for a task where we are mostly waiting for I/O? I tried another one:
require 'sinatra'
require 'sinatra/synchrony'
require 'em-synchrony/em-http'
get '/' do
EM::HttpRequest.new("http://google.com").get.response
end
=begin
$ ruby sinatra-synchrony.rb -p 3000 -e production
== Sinatra/1.3.1 has taken the stage on 3000 for production with backup from Thin
>> Thin web server (v1.3.1 codename Triple Espresso)
$ ab -c100 -n100 http://localhost:3000/
Concurrency Level: 100
Time taken for tests: 5.476 seconds
HTML transferred: 21900 bytes
Requests per second: 18.26 [#/sec] (mean)
Time per request: 5475.756 [ms] (mean)
=end
Hmm, a little better, but not what I would call a success. Finally, I tried a threaded implementation:
require 'rack'
require 'excon'
proxy = lambda {|*|
result = Excon.get('http://google.com')
[200, {}, [result.body]]
}
run proxy
=begin
$ thin -p 3000 -e production -R rack-threaded.ru --threaded --no-epoll start
>> Thin web server (v1.3.1 codename Triple Espresso)
$ ab -c100 -n100 http://localhost:3000/
Concurrency Level: 100
Time taken for tests: 2.014 seconds
HTML transferred: 21900 bytes
Requests per second: 49.65 [#/sec] (mean)
Time per request: 2014.005 [ms] (mean)
=end
That was really, really surprising. Am I missing something here? Why is EM performing so badly here? Is there some tuning I need to do? I tried various combinations (Unicorn, several Rainbows configurations, etc), but none of them came even close to the simple, old I/O-blocking threading.
Ideas, comments and - obviously - suggestions for better implementations are very welcome.