
I'm programming a web application backend in Clojure, using among other things:

  • http-kit as an HTTP server and client (nonblocking)
  • monger as my DB driver (blocking)
  • clj-aws-s3 as an S3 client (blocking)

I am aware of the performance benefits of event-driven, non-blocking stacks like the ones you find in NodeJS and the Play Framework (this question helped me), and of how they yield much better load capacity. For that reason, I'm considering making my backend asynchronous using core.async.

My question is: can you recreate the performance benefits of non-blocking web stacks by using core.async on top of blocking client/driver libraries?


Elaborating:

What I'm currently doing is making the usual synchronous calls:

(defn handle-my-request [req]
  (let [data1 (db/findData1)
        data2 (db/findData2)
        data3 (s3/findData3)
        result (make-something-of data1 data2 data3)]
    (ring.util.response/response result)))

What I plan to do is wrap any call involving IO in a thread block and synchronize inside a go block:

(defn handle-my-request! [req resp-chan] ;; resp-chan is a core.async channel through which the response must be pushed
  (go
    (let [data1-ch (thread (db/findData1)) ;; spin off threads to fetch the data (involves IO)
          data2-ch (thread (db/findData2))
          data3-ch (thread (s3/findData3))
          result (make-something-of (<! data1-ch) (<! data2-ch) (<! data3-ch))] ;; synchronize
      (>! resp-chan (ring.util.response/response result))))) ;; send response
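One detail I noticed: clojure.core.async/thread runs its body on an expanding ("cached") thread pool, so under heavy load the pattern above could spawn a lot of threads. If I wanted to cap the number of IO threads instead, something like this might work (a sketch only — the pool size of 50 and the io-thread name are my own made-up choices, not anything from core.async):

```clojure
(require '[clojure.core.async :refer [chan put!]])
(import '[java.util.concurrent Executors ExecutorService])

;; Fixed-size pool for blocking IO -- the size 50 is an arbitrary example.
(def ^ExecutorService io-pool (Executors/newFixedThreadPool 50))

(defn io-thread
  "Like core.async/thread, but runs f on the bounded io-pool instead of
   core.async's unbounded cached pool. Returns a channel that will
   receive f's result (which must be non-nil, since channels reject nil)."
  [f]
  (let [out (chan 1)]
    (.submit io-pool ^Runnable (fn [] (put! out (f))))
    out))

;; Usage inside a go block: (<! (io-thread #(db/findData1)))
```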

Is there a point doing it that way?

I'm doing this because it seems to be the recommended practice, but its performance benefits are still a mystery to me. I thought the issue with synchronous stacks was that they use one thread per request; now it seems they use more than one.

Thanks in advance for your help, have a beautiful day.

Valentin Waeselynck
    A *slight* benefit of doing it this way, is that when/if async drivers become available, the integration will be fairly seamless (assuming the APIs follow convention). – Dax Fohl Oct 21 '14 at 20:15
  • You opened the bounty after dAni's answer was posted, and it seems to answer the question fully. What are you still unsure of? – Dax Fohl Oct 23 '14 at 11:30
  • dAni's answer is more about the speed of processing an isolated request; my question is more about the overall load capacity. – Valentin Waeselynck Oct 23 '14 at 17:59

2 Answers


The benefit in your example is that findData1, findData2 and findData3 run in parallel, which can decrease the response time at the cost of using more threads.

In my experience, what usually happens is that the call to findData2 depends on the result of findData1, and findData3 depends on the result of findData2, which means the calls cannot be parallelized, in which case there is no point in using core.async.
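For instance (a hypothetical sketch — the lookup functions below are made-up stubs standing in for the blocking db/s3 calls), a dependent chain like this gains nothing from thread/go, since each step must wait for the previous one anyway:

```clojure
;; Stubbed lookups standing in for blocking IO calls (illustrative only).
(defn find-user [user-id] {:user-id user-id :account-id 42})
(defn find-account [account-id] {:account-id account-id :doc-key "report.pdf"})
(defn fetch-document [doc-key] (str "contents of " doc-key))

;; Each call needs the previous call's result, so the three cannot run
;; in parallel: wrapping them in thread/go would not reduce latency.
(defn handle-dependent-request [req]
  (let [user    (find-user (:user-id req))
        account (find-account (:account-id user))
        doc     (fetch-document (:doc-key account))]
    doc))
```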

DanLebrero
  • Thanks, and what about the potential load capacity benefit? Is it compromised by the fact that the driver is blocking? – Valentin Waeselynck Oct 19 '14 at 23:40
  • That is correct. The load capacity is not going to improve – DanLebrero Oct 20 '14 at 08:06
  • @ValentinWaeselynck The overhead of launching new threads will actually make it worse. You'd have to run some load tests to determine how much worse. Offhand, I'd guess not much unless you're running clojure-clr. – Dax Fohl Oct 21 '14 at 20:19
  • Those threads come from a ThreadPool, so the cost of creating them can be ignored unless you have a very spiky load – DanLebrero Oct 21 '14 at 21:00
  • On the other hand, if I do have non-blocking libraries for other operations (e.g. I have a truly non-blocking HTTP client and server like http-kit), then I won't block any thread for those operations; this way, the blocking time per thread is reduced, so that's still an improvement. – Valentin Waeselynck Oct 22 '14 at 06:35

The simple answer is no, you're not going to increase capacity at all this way. If you've got memory to hold 100 threads, then you've got 300 "thread seconds" of capacity for each 3-second interval. So, say each of your blocks takes one second to execute. It doesn't matter if each request runs synchronously, holding the thread for the full three seconds, or blockingly-asynchronously, holding a thread for one second three times, you're never going to serve more than 100 requests per three seconds.

However if you make one step asynchronous, then suddenly your code needs only two thread-seconds per request, so you can now serve 300/2=150 requests per three seconds.
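The arithmetic above can be written out as a back-of-the-envelope calculation (using the same illustrative numbers: 100 threads, 3-second window, 1 second per step):

```clojure
;; Capacity in "thread-seconds" per 3-second window: 100 threads * 3 s.
(def capacity-thread-seconds (* 100 3)) ;; 300

;; Fully blocking request: 3 blocking steps of 1 s each = 3 thread-seconds.
(def blocking-capacity (/ capacity-thread-seconds 3)) ;; 100 requests / 3 s

;; One step made truly asynchronous: only 2 thread-seconds per request.
(def async-capacity (/ capacity-thread-seconds 2)) ;; 150 requests / 3 s
```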

The more complicated answer is that it might make things better or worse, depending on how your client or web server handles timeouts, how quickly/often clients retry the request, how parallelizable your code is, how expensive thread swapping is, etc. If you try to do 200 requests in the synchronous implementation, then 100 will get through after 3 secs and the remaining 100 in 6 secs. In the async implementation, since they're all competing for threads at various async junctures, most of them will take 5-6 secs to complete, which works against you. But if the blocks are parallelizable, then some requests may complete in just one second, which works in your favor.

So at the edges it kind of depends, but ultimately the capacity is thread-seconds, and by that standard, sync and blocking-async are all the same. This isn't Clojure-specific, and there are certainly plenty of more in-depth resources out there covering the edge cases in more detail than I have here.

Dax Fohl