Ruby performance with multiple threads vs one thread

Question

I am writing a program that loads data from four XML files into four different data structures. It has methods like this:

require 'rexml/document'

def loadFirst(year)
  File.open("games_#{year}.xml",'r') do |f|
    doc = REXML::Document.new f
    ...
  end
end

def loadSecond(year)
  File.open("teams_#{year}.xml",'r') do |f|
    doc = REXML::Document.new f
    ...
  end
end

etc...

I originally just used one thread and loaded one file after another:

def loadData(year)
  time = Time.now
  loadFirst(year)
  loadSecond(year)
  loadThird(year)
  loadFourth(year)
  puts Time.now - time
end

Then I realized that I should be using multiple threads. My expectation was that loading from each file on a separate thread would be very close to four times as fast as doing it all sequentially (I have a MacBook Pro with an i7 processor):

def loadData(year)
  time = Time.now
  t1 = Thread.start{loadFirst(year)}
  t2 = Thread.start{loadSecond(year)}
  t3 = Thread.start{loadThird(year)}
  loadFourth(year)
  t1.join
  t2.join
  t3.join
  puts Time.now - time
end

What I found was that the version using multiple threads is actually slower than the other. How can this possibly be? The difference is around 20 seconds with each taking around 2 to 3 minutes.

There are no shared resources between the threads. Each opens a different data file and loads data into a different data structure than the others.

Which version of the language and which VM are you using? I believe most ruby runtimes are still using "green" threads (read: not actually multithreaded, but instead emulated in a single thread) — Iron Savior, Jul 09 '13 at 13:38
I'm just using regular Ruby version 1.9.3 (on Windows). I just did some more investigation and realized one of the files has much more data than the others, so that explains why the performance wouldn't change by a factor of four. But the three others still collectively take over a minute, so I would expect to see a performance boost in the area of a minute using multiple threads... — drew.cuthbert, Jul 09 '13 at 13:45
classic ruby has a GIL - you don't get compute parallelism because in general only one thread runs at a time (with exceptions for IO and some other cases). try your code with jruby — Frederick Cheung, Jul 09 '13 at 13:47
Perhaps you can have the program print out the current time (down to usec if necessary) when each thread starts and ends. Then you will have a better idea of what is happening. In particular, you should be able to see whether the (n+1)th thread starts before the nth thread ends. — sawa, Jul 09 '13 at 13:47
See [this link](http://www.igvita.com/2008/11/13/concurrency-is-a-myth-in-ruby) for details about the thread implementations in various ruby runtimes. You're not actually getting concurrent execution with Ruby 1.9.3. It's a fantastic language, but the implementations are still somewhat young in some respects. — Iron Savior, Jul 09 '13 at 13:48
This is interesting to note, I timed each thread individually and this was the result: t1 took 125 seconds, t2 took 23 seconds, t3 took 9 seconds, and t4 took 40 seconds. So how could the sequential version of my code take only around 2 minutes?? It doesn't make any sense to me. — drew.cuthbert, Jul 09 '13 at 13:49
Thanks for the info on the Ruby implementations, definitely good to know. I had no idea regular Ruby couldn't achieve true parallelism. I think I will use jRuby in the future. Still though, I can't understand what I explained in my previous comment. — drew.cuthbert, Jul 09 '13 at 13:51
And Frederick, maybe a dumb question but what does GIL mean? — drew.cuthbert, Jul 09 '13 at 13:51
@andrew.cuthbert GIL = Global Interpreter Lock. It basically prevents any concurrent execution of ruby code within the same VM/host process. — Iron Savior, Jul 09 '13 at 13:52
@Iron Savior okay thanks. So basically, the CRuby interpreter doesn't trust you to write thread-safe code no matter what? — drew.cuthbert, Jul 09 '13 at 13:54
Take a look at the following link: http://stackoverflow.com/questions/56087/does-ruby-have-real-multithreading . Also see http://ruby-doc.org/core/Fiber.html — Anand Shah, Jul 09 '13 at 13:54
Thanks for all of the information guys, I learned a lot here. Out of curiosity, is there any benefit at all to using CRuby over JRuby? This situation seems like a serious limitation and a good reason to never use CRuby again. — drew.cuthbert, Jul 09 '13 at 14:10
@andrew.cuthbert I don't think it's a matter of trust because you can *definitely* still write unsafe code. I imagine that it was for technical reasons--I speculate that the GIL provides some guarantees that might have been more difficult/complex to otherwise achieve. — Iron Savior, Jul 09 '13 at 14:11
@andrew.cuthbert CRuby is Matz's interpreter and I consider it to be the canonical runtime for 1.9.3, which is what most of my ruby work is done in. I would expect that for 1.9 CRuby will provide the most compatibility. If you want real concurrency, you might also look into having more than one ruby process. It doesn't hurt to experiment with the other runtimes, either. "Use the best tool for the job" — Iron Savior, Jul 09 '13 at 14:17
my guess is having multiple threads causes some kind of swapping contention to be added. You could profile it I suppose... — rogerdpack, Jul 09 '13 at 15:48
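
The timing suggestion from the comments above could be sketched roughly like this. It is only an illustration: `timed_thread` is a hypothetical helper, and the `sleep` calls stand in for the real `loadFirst(year)` and `loadSecond(year)` calls, which are not shown in full in the question.

```ruby
require 'thread' # needed for Queue on Ruby 1.9.x; harmless on newer versions

# Hypothetical helper: runs the given block in a new thread and records
# when that thread started and finished.
def timed_thread(label, log)
  Thread.new do
    started = Time.now
    yield
    finished = Time.now
    log << [label, started, finished] # Queue#<< is safe to call from several threads
  end
end

log = Queue.new
t0 = Time.now
threads = [
  timed_thread("first", log)  { sleep 0.2 }, # stand-in for loadFirst(year)
  timed_thread("second", log) { sleep 0.1 }  # stand-in for loadSecond(year)
]
threads.each(&:join)

# Drain the queue and print the timings relative to t0, ordered by start time.
entries = []
entries << log.pop until log.empty?
entries.sort_by { |_, started, _| started }.each do |label, started, finished|
  printf("%-6s start=%.6f end=%.6f\n", label, started - t0, finished - t0)
end
```

If the printed intervals overlap, the threads really were in flight at the same time. With real parsing work in place of `sleep`, each interval would presumably stretch, because the threads take turns holding the interpreter lock. That would also be consistent with the per-thread timings reported above adding up to more than the sequential run.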

score 3 · Accepted Answer · answered Jul 14 '13 at 00:11

I think (but I'm not sure) the problem is that your threads are all reading files stored on the same disk, so they can't run simultaneously: each one spends its time waiting for disk I/O.

A few days ago I had to do something similar (but fetching data from the network), and the difference between the sequential and threaded versions was huge.

A possible solution could be to load the entire content of each file up front instead of reading it the way your code does. In your code you read the contents line by line. If you load all the content first and then process it, it should perform much better, because the threads would not have to wait for I/O.
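
A minimal sketch of that idea, assuming the loaders can be restructured to read first and parse afterwards. `load_document` is a hypothetical helper, and the temporary file only stands in for something like `games_#{year}.xml`:

```ruby
require 'rexml/document'
require 'tempfile'

# Hypothetical helper: reads the whole file into a String, then parses it.
def load_document(path)
  xml = File.read(path)     # one bulk read; all disk I/O happens here
  REXML::Document.new(xml)  # parsing then works purely from memory
end

# Tiny stand-in file so the sketch runs on its own.
sample = Tempfile.new(['games', '.xml'])
sample.write('<games><game id="1"/><game id="2"/></games>')
sample.close

doc = load_document(sample.path)
game_count = doc.root.elements.to_a('game').size
puts game_count

sample.unlink
```

Whether this helps depends on where the time actually goes. REXML is written in pure Ruby, so if parsing rather than disk access dominates, the threads would still be limited by the interpreter lock discussed in the comments on the question.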

score 0 · Answer 2 · answered Jul 09 '13 at 14:25

It's impossible to give a conclusive answer as to why your parallel program is slower than the sequential one without a lot more information, but one possibility is:

With the sequential program, your disk seeks to the first file, reads it all out, seeks to the 2nd file, reads it all out, and so on.

With the parallel program, the disk head keeps moving back and forth trying to service I/O requests from all 4 threads.

I don't know if there's any way to measure disk seek time on your system: if so, you could confirm whether this hypothesis is true.

Slightly off-topic: I have used thread parallelism on Ruby for processing multiple network requests concurrently, and it did wonders for my program's efficiency. That was on MRI (CRuby). So it's not as if you necessarily have to move to JRuby to get any benefit from using threads for parallel I/O. — Alex D, Jul 09 '13 at 14:28
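
A rough way to see that effect on MRI, with `sleep` standing in for a blocking network request (an assumption made so the sketch runs on its own; `fake_request` and `elapsed` are hypothetical names):

```ruby
# Stand-in for a blocking network call. Like blocking I/O, sleep lets
# other Ruby threads run while this one waits.
def fake_request
  sleep 0.2
end

# Returns the wall-clock time, in seconds, taken by the given block.
def elapsed
  started = Time.now
  yield
  Time.now - started
end

sequential = elapsed { 4.times { fake_request } }
threaded   = elapsed { 4.times.map { Thread.new { fake_request } }.each(&:join) }

printf("sequential: %.2fs, threaded: %.2fs\n", sequential, threaded)
```

The threaded run should finish in roughly the time of a single request, because the waits overlap. Swapping `fake_request` for CPU-heavy Ruby code, such as parsing XML, would likely remove most of that gain on MRI.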
