2

I'm using savon gem to interact with a soap api. I'm trying to send three parallel request to the api using parallel gem. Normally each request takes around 13 seconds to complete so for three requests it takes around 39 seconds. After using parallel gem and sending three parallel requests using 3 threads it takes around 23 seconds to complete all three requests which is really nice but I'm not able to figure out why its not completing it in like 14-15 seconds. I really need to lower the total time as it directly affects the response time of my website. Any ideas on why it is happening? Are network requests blocking in nature?

I'm sending the requests as follows

Parallel.map(["GDSSpecialReturn", "Normal", "LCCSpecialReturn"], :in_threads => 3){ |promo_plan| self.search_request(promo_plan) }

I tried using multiple processes also but no use.

Super Engineer
  • 1,966
  • 1
  • 13
  • 21
  • 1
    Does the api you use enforce any usage limits? Any limitations in ruby's threading wouldn't explain why separate processes are no faster – Frederick Cheung Jul 27 '13 at 13:33
  • NO I don't think so. I tried running same search in two different browser windows at same time, and both searches completed in normal time. – Super Engineer Jul 29 '13 at 10:23

2 Answers2

3

I have 2 theories:

  • Part of the work load can't run in parallel, so you don't see 3x speedup, but a bit less than that. It's very rare to see multithreaded tasks speedup 100% proportionally to the number of CPUs used, because there are always a few bits that have to run one at a time. See Amdahl's Law, which provides equations to describe this, and states that:

The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program

  • Disk I/O is involved, and this runs slower in parallel because of disk seek time, limiting the IO per second. Remember that unless you're on an SSD, the disk has to make part of a physical rotation every time you look for something different on it. With 3 requests at once, the disk is skipping repeatedly over the disk to try to fulfill I/O requests in 3 different places. This is why random I/O on hard drives is much slower than sequential I/O. Even on an SSD, random I/O can be a bit slower, especially if small-block read-write is involved.

I think option 2 is the culprit if you're running your database on the same system. The problem is that when the SOAP calls hit the DB, it gets hit on both of these factors. Even blazing-fast 15000 RPM server hard drives can only manage ~200 IO operations per second. SSDs will do 10,000-100,000+ IO/s. See figures on Wikipedia for ballparks. Though, most databases do some clever memory caching to mitigate the problems.

A clever way to test if it's factor 2 is to run an H2 Database in-memory DB and test SOAP calls using this. They'll complete much faster, probably, and you should see similar execution time for 1,3, or $CPU-COUNT requests at once.

BobMcGee
  • 19,824
  • 10
  • 45
  • 57
1

That's actually is big question, it depends on many factors.

1. Ruby language implementation

It could be different between MRI, Rubinus, JRuby. Tho I am not sure if the parallel gem support Rubinus and JRuby.

2. Your Machine

How many CPU cores do you have in your machine, you can leverage this using parallel process? Have you tried using process do this if you have multiple cores?

Parallel.map(["GDSSpecialReturn", "Normal", "LCCSpecialReturn"]){ |promo_plan| self.search_request(promo_plan) } # by default it will use [number] of processes if you have [number] of CPUs

3. What happened underline self.search_request?

If you running this in MRI env, cause the GIL, it actually running your code not concurrently. Or put it precisely, the IO call won't block(MRI implementation), so only the network call part will be running concurrently, but not all others. That's why I am interesting about what other works you did inside self.search_request, cause that would have impact on the overall performance.


So I recommend you can test your code in different environments and different machines(it could be different between your local machine and the real production machine, so please do try tune and benchmark) to get the best result.

Btw, if you want to know more about the threads/process in ruby, highly recommend Jesse Storimer's Working with ruby threads, he did a pretty good job explaining all this things.

Hope it helps, thanks.

Community
  • 1
  • 1
Mike Li
  • 3,336
  • 2
  • 22
  • 27