Non-blocking IO with Haskell

Question

Possible Duplicate:
What is the Haskell response to Node.js?
How can I watch multiple files/socket to become readable/writable in Haskell?

Is it possible to write a Haskell program that performs IO in a non-blocking way like in nodejs?

For example, I would like to get 10 records from a database that is far away, so I would like to fire 10 requests concurrently and, when the results are available, return the collection. The IO monad is not going to help, because the monad explicitly serializes the computations with bind. I think continuation-passing style, where you pass around the computation you want to run next, has the same problem: again, it serializes the computation. I do not want to work with threads; I am looking for another solution. Is this possible?

When you say you don't want to work with threads, would it be acceptable to use a library implemented with threads so long as you don't have to manage them yourself? — John L, Dec 10 '12 at 23:51
You should say what it is about threads that you don't like, rather than just that you don't want to use them. — Daniel Wagner, Dec 10 '12 at 23:52
Why the artificial exclusion of threads? That would be the natural solution in Haskell. — Chuck, Dec 10 '12 at 23:52
Incidentally, there is at least one way to accomplish this using just IO without any extra threads or libraries, just an unsafe function. I can't really recommend it though. — John L, Dec 10 '12 at 23:54
I think you mean you want event driven IO. Is it for web server programming? — AndrewC, Dec 10 '12 at 23:57
(I should point out that Haskell's take on threads is extraordinarily lightweight compared with OS threads, and that's one of the reasons the existing web server frameworks scale up very well.) — AndrewC, Dec 10 '12 at 23:58
Well, actually I would like to develop a future library for Haskell where unfinished computations would be captured with a future object. You will have one future object essentially for each function call, so using threads would be heavyweight (you might have hundreds of thousands of outstanding requests). If Haskell can express all kinds of new control structures, then how can it do this one? — mmaroti, Dec 11 '12 at 01:03
Haskell's threads *are* lightweight enough to have many many of them. How exactly do you think this can work without some form of "run multiple things at once" capability anyway? If you've sent off the request, either you wait for it to respond, or you arrange for something else (such as a thread) to wait for it to respond so that you can later ask the "something else" if it has received a response. If you do neither of those things, then nobody will be listening when the response comes in. — Ben, Dec 11 '12 at 01:29
5 seconds of googling hints to me that node.js' non-blocking operations *are implemented* using threads. So if you want to *implement* something like that, you need threads to be involved. — Ben, Dec 11 '12 at 01:33
@mmaroti Right, I thought that might be your complaint. Haskell threads scale to the hundreds of thousands range, so just use them! (This is also why I suggested the other StackOverflow question I did, which had the same concerns as you.) — Daniel Wagner, Dec 11 '12 at 02:50
You say, that Haskell threads scale to the hundreds of thousands, and I do not doubt it, but would like to know how it is actually implemented? Is there an event loop for each machine thread? Do they use work stealing? Does it use epoll internally? — mmaroti, Dec 12 '12 at 05:45
@Ben No, node.js does it via an event loop system. You could argue there are *threads* involved given your own definition, but I am not sure it could be called one by any acceptable definition of threads. There are not really two execution paths, basically. There's a listener which listens for interrupts from IO. — nawfal, Jan 22 '17 at 06:02
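
To make the point in the comments above concrete, here is a minimal sketch of a hand-rolled future built from a lightweight thread and an `MVar`. The names `Future`, `spawn`, and `await` are hypothetical and chosen only for illustration; the async package offers a more complete version of the same idea. The sketch assumes the action does not throw: an exception in the background thread is not propagated to the waiter here.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)

-- A hypothetical minimal future: a box that a background thread fills in.
newtype Future a = Future (MVar a)

-- Start the action on its own lightweight (green) thread.
-- Exceptions raised by the action are not propagated in this sketch.
spawn :: IO a -> IO (Future a)
spawn act = do
  box <- newEmptyMVar
  _ <- forkIO (act >>= putMVar box)
  return (Future box)

-- Block the calling Haskell thread until the result is available.
await :: Future a -> IO a
await (Future box) = readMVar box

main :: IO ()
main = do
  -- One thread per future, as the comments suggest is affordable in GHC.
  futures <- mapM (spawn . return . (* 2)) [1 .. 100000 :: Int]
  results <- mapM await futures
  print (sum results)
```

Each call to `spawn` creates one Haskell thread, which is the "something else" that waits for the response on the caller's behalf.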

score 21 · Accepted Answer · edited Dec 08 '19 at 18:01

Haskell threads are exceedingly lightweight. What is more, GHC's IO monad uses event-driven scheduling much of the time, meaning ordinary Haskell code is like continuation-passing-style node.js code (only compiled to native code and run on multiple CPUs...)

Your example is trivial:

import Control.Concurrent.Async (mapConcurrently)

-- given a list of requests (each one an IO action returning some result)
-- you can run them concurrently and collect the results in order
getRequests :: [IO a] -> IO [a]
getRequests requests = mapConcurrently id requests

Control.Concurrent.Async is probably exactly what you are looking for with respect to a library for futures. Haskell should never choke on mere thousands of (ordinary) threads. I haven't ever written code that uses millions of IO threads, but I would guess your only problems would be memory related.
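
As a rough illustration of the "10 records from a far-away database" scenario, the sketch below replaces the real database with a simulated lookup. `Record`, `fetchRecord`, and `fetchAll` are hypothetical names, and the 100 ms `threadDelay` merely stands in for network latency; only `mapConcurrently` comes from the async package.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (mapConcurrently)

-- Hypothetical stand-in for a remote record.
newtype Record = Record Int deriving (Show, Eq)

-- Simulated slow lookup: sleeps 100 ms instead of talking to a real database.
fetchRecord :: Int -> IO Record
fetchRecord key = do
  threadDelay 100000
  return (Record key)

-- Fire all lookups at once and collect the results in input order.
fetchAll :: [Int] -> IO [Record]
fetchAll = mapConcurrently fetchRecord

main :: IO ()
main = do
  records <- fetchAll [1 .. 10]
  print records
```

Because the ten lookups overlap, the whole batch should take roughly as long as one lookup rather than ten in sequence.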

edited Dec 08 '19 at 18:01

mkrieger1

answered Dec 11 '12 at 01:09

Philip JF

I accept the solution, but this does not really answer the question I had. Is it possible to devise a control structure that allows a "single thread" to perform non-blocking concurrent programming (with some underlying thread pool)? By a "single thread" I mean making sure that only one thread is executing at any time (all others are blocked waiting for IO), so one can use regular IORefs with no synchronization/blocking. – mmaroti Dec 12 '12 at 05:58
@mmaroti Hm, one way to do this would be to build a simple monad with both references (implemented with IORefs) and FFI actions/system calls, separated into two universes by the type system (not so hard to do). Then, any time you want to perform an FFI call in your monad with references, the code you write is the equivalent of `async foo >>= unsafeInterleaveIO . wait`. This would give you a straight-ahead style of programming and perform all IO asynchronously, but it has all the downsides of lazy IO. – Philip JF Dec 12 '12 at 06:08
Hmm, I have to read up on this. I have no problem with the "downsides" of lazy IO, since most of what I want to do is "functional": read immutable data from a database (e.g. git) – mmaroti Dec 12 '12 at 06:15
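
The expression `async foo >>= unsafeInterleaveIO . wait` from the comment above can be wrapped in a small helper, sketched here. `lazyFuture` and `slowValue` are hypothetical names, and the delay is a stand-in for a real IO call. As the comment warns, this inherits the downsides of lazy IO: the wait, and any exception from the action, surfaces wherever the value happens to be forced.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (async, wait)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Start the action now, but only block when the result is actually demanded.
lazyFuture :: IO a -> IO a
lazyFuture act = async act >>= unsafeInterleaveIO . wait

-- Simulated slow IO call: sleeps 100 ms, then returns its argument.
slowValue :: Int -> IO Int
slowValue n = threadDelay 100000 >> return n

main :: IO ()
main = do
  -- Both actions start running immediately on their own threads.
  x <- lazyFuture (slowValue 20)
  y <- lazyFuture (slowValue 22)
  -- Forcing x and y here is what performs the hidden waits.
  print (x + y)
```

The calling code reads like ordinary sequential code, while the two slow calls overlap in the background.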

score 8 · Answer 2 · answered Dec 11 '12 at 04:24

To flesh out the comments on Control.Concurrent.Async, here is an example using the async package.

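import Network.HTTP.Conduit (simpleHttp)
import Control.Concurrent.Async (async, wait)
import qualified Data.ByteString.Lazy as L

main :: IO ()
main = do
    -- start all three requests; simpleHttp needs a full URL including the scheme
    xs <- mapM (async . simpleHttp) [ "http://www.stackoverflow.com"
                                    , "http://www.lwn.net"
                                    , "http://www.reddit.com/r/linux_gaming"]
    -- block until every response body has arrived
    [so,lwn,lg] <- mapM wait xs
    -- parse these however you'd like; here we only print the body sizes
    mapM_ (print . L.length) [so,lwn,lg]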

So in the above we define three HTTP GET requests for three different websites, launch those requests asynchronously, and wait for all three to finish before proceeding.

answered Dec 11 '12 at 04:24

Thomas M. DuBuisson

Threaded code is not the same thing as single-threaded, event-driven, non-blocking code. (I know that nodejs uses a thread pool internally; that is not the point.) You have to use MVars to communicate between the threads, which involves synchronization and, on the larger scale, transactions. However, with events I know that nothing else is modifying the program state if I do not call anything that needs a callback. Your solution uses "wait", which will block, so any caller of such an async method needs to be put in a separate thread to be able to continue: one thread per method call, no? – mmaroti Dec 12 '12 at 05:52
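
One possible way to approximate the event style described in this comment, offered only as a sketch and not as something proposed in the thread, is to let worker threads post completion events to a queue and have a single thread handle them one at a time. `Event`, `startRequest`, and `runLoop` are hypothetical names, and the requests are simulated with `threadDelay`. The `Chan` itself is synchronized internally, but the handler state lives in a plain `IORef` that only the loop thread touches.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Monad (forM_, replicateM_)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hypothetical completion event: which request finished and what it returned.
data Event = Done Int String

-- Simulated request: sleeps on its own thread, then posts its result.
startRequest :: Chan Event -> Int -> IO ()
startRequest events key = do
  _ <- forkIO $ do
    threadDelay (1000 * key)
    writeChan events (Done key ("record " ++ show key))
  return ()

-- Start n requests, then handle completions one at a time on the calling
-- thread. Only this thread touches the IORef, so it needs no locking.
runLoop :: Int -> IO [(Int, String)]
runLoop n = do
  events <- newChan
  state <- newIORef []
  forM_ [1 .. n] (startRequest events)
  replicateM_ n $ do
    Done key body <- readChan events
    modifyIORef' state ((key, body) :)
  readIORef state

main :: IO ()
main = runLoop 10 >>= mapM_ print
```

The order in which events arrive is not guaranteed, so callers should not rely on the order of the returned list.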
