How to retrieve output from a process without blocking the thread in Haskell

Question

What is the best way to write to the stdin and read from the stdout of a subprocess without blocking?

The subprocess is created via System.Process.createProcess, which returns handles for writing to and reading from the subprocess. Both the writing and the reading are done in text mode.

For example, my best attempt at a non-blocking read is timeout 1 $ hGetLine out, which returns Just "some line", or Nothing if no line is available to be read. However, this seems like a hack to me, so I am looking for a more "standard" way.

Thanks

What semantics do you want this function to have? What if the process writes the characters "ABC" to `out` and then your program calls `hGetLineNonBlocking`, reading the characters "ABC", but unable to return `Just` something, because a newline hasn't been read yet; and it can't block until there are more characters like `hGetLine` (obviously). Do you throw away that partial line? This is almost certainly wrong. I suspect that in this case you *do* want to block, waiting for the rest of the line. If so, just check if the handle is empty with `hReady` before reading it with `hGetLine`. — user2407038, Oct 20 '15 at 00:48
@user2407038 Sorry but I do not understand your comment, since doing `timeout 1 $ hGetLine ...` always returns `Just` something if there is a full line to be read, otherwise it returns `Nothing`. — mljrg, Oct 20 '15 at 00:58
In order for `hGetLine` to read a line, it must read characters sequentially until it reaches a newline character, at which point it returns. However, if `hGetLine` reads a bunch of characters, but no newline character, the line isn't finished - so hGetLine will block until there are more characters. If your function is sensible, it won't read characters from a buffer and discard them, but it can't block - what does it do with the characters that are already read? Does it return the partial line, which *isn't* terminated by a newline? — user2407038, Oct 20 '15 at 01:06
In general, you don't not block. Instead, you spawn a thread and just go ahead and let it block. See also [How can I watch multiple files/socket to become readable/writable in Haskell?](http://stackoverflow.com/q/11744527/791604). If that addresses your question, I'd be happy to mark it as a duplicate; what do you think? — Daniel Wagner, Oct 20 '15 at 01:15
@DanielWagner My question is better addressed by the very detailed answer of ErikR (see below). Nevertheless, thanks for the link to the other question. — mljrg, Oct 20 '15 at 10:21
@user2407038 Are you saying that doing `timeout 1 $ hGetLine` can return a half-read line? I don't think it will; instead it will return `Nothing`. However, I really do not know what happens to the output channel when `hGetLine` sees only some characters and hangs waiting for a newline, and the timeout occurs in the meantime. Will the characters still be there in the channel the next time I try `hGetLine`? (I have not tested this). — mljrg, Oct 20 '15 at 10:26
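
The `hReady` check suggested in the comments above could be sketched roughly as follows. The name `tryGetLine` is hypothetical and used only for illustration. As the comments point out, this can still block when only part of a line has arrived, because `hGetLine` waits for the newline; `hReady` also raises an EOF error once the handle has reached end of file.

```haskell
import System.IO

-- Hypothetical helper: read a line only if some input is already available.
-- Caveat: if only a partial line is available, hGetLine still blocks until
-- the terminating newline (or EOF) arrives.
tryGetLine :: Handle -> IO (Maybe String)
tryGetLine h = do
  ready <- hReady h              -- True if at least one item can be read
  if ready
    then Just <$> hGetLine h
    else return Nothing
```

This only avoids blocking in the case where nothing at all has been written yet, which is why the thread-based approach in the answer is usually the more robust choice.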

score 7 · Accepted Answer · answered Oct 20 '15 at 08:15

Here are some examples of how to interact with a spawned process in a fashion mentioned by @jberryman.

The program interacts with a script ./compute which simply reads lines from stdin in the form <x> <y> and returns x+1 after a delay of y seconds. More details at this gist.

There are many caveats when interacting with spawned processes. In order to avoid "suffering from buffering" you need to flush the outgoing pipe whenever you send input and the spawned process needs to flush stdout every time it sends a response. Interacting with the process via a pseudo-tty is an alternative if you find that stdout is not flushed promptly enough.
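
As a minimal sketch of the flushing advice on the sending side, the helper below writes one line and flushes explicitly instead of relying on the handle's buffering mode. The names `sendLine` and `echoOnce` are hypothetical, and `cat` is assumed to be available as a stand-in child process that echoes its input promptly; it is not the `./compute` script.

```haskell
import System.IO
import System.Process

-- Hypothetical helper: send one line and flush, so the text does not sit
-- in our own output buffer while the child waits for it.
sendLine :: Handle -> String -> IO ()
sendLine h s = hPutStrLn h s >> hFlush h

-- Round trip through `cat` (assumed to be on the PATH).
echoOnce :: String -> IO String
echoOnce s = do
  (Just inp, Just outp, _, ph) <-
    createProcess (proc "cat" []) { std_in = CreatePipe, std_out = CreatePipe }
  sendLine inp s
  reply <- hGetLine outp         -- blocks until the child has written a full line
  hClose inp                     -- EOF on stdin makes cat exit
  hClose outp
  _ <- waitForProcess ph
  return reply
```

Flushing on our side does not help if the child itself buffers its stdout, which is the case the pseudo-tty remark refers to.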

Also, the examples assume that closing the input pipe will lead to termination of the spawned process. If this is not the case you will have to send it a signal to ensure termination.
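
If the child cannot be relied on to exit after its input pipe is closed, one possible approach is to poll for its exit status and fall back to `terminateProcess`. This is only a sketch: the name `waitOrTerminate` and the 0.1 second polling interval are assumptions made for illustration, and on POSIX systems `terminateProcess` sends SIGTERM, which a process may ignore.

```haskell
import Control.Concurrent (threadDelay)
import System.Exit (ExitCode(..))
import System.Process

-- Hypothetical helper: poll up to `polls` times, 0.1 s apart, for the child
-- to exit on its own; if it has not exited by then, ask it to terminate and
-- wait for it.
waitOrTerminate :: Int -> ProcessHandle -> IO ExitCode
waitOrTerminate polls ph = go polls
  where
    go n = do
      status <- getProcessExitCode ph      -- non-blocking check
      case status of
        Just code -> return code
        Nothing
          | n <= 0    -> terminateProcess ph >> waitForProcess ph
          | otherwise -> threadDelay 100000 >> go (n - 1)
```

In the examples here it could stand in for the final `waitForProcess phandle`, for instance as `waitOrTerminate 20 phandle` to allow roughly two seconds after `hClose inp`.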

Here is the example code - see the main routine at the end for sample invocations.

import System.Environment (getArgs)
import System.Timeout (timeout)
import Control.Concurrent (forkIO, killThread)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

import System.Process
import System.IO

-- blocking IO
main1 cmd tmicros = do
  r <- createProcess (proc "./compute" []) { std_out = CreatePipe, std_in = CreatePipe }
  let (Just inp, Just outp, _, phandle) = r

  hSetBuffering inp NoBuffering
  hPutStrLn inp cmd     -- send a command

  -- block until the response is received
  contents <- hGetLine outp
  putStrLn $ "got: " ++ contents

  hClose inp            -- and close the pipe
  putStrLn "waiting for process to terminate"
  waitForProcess phandle

-- non-blocking IO, send one line, wait the timeout period for a response
main2 cmd tmicros = do
  r <- createProcess (proc "./compute" []) { std_out = CreatePipe, std_in = CreatePipe }
  let (Just inp, Just outp, _, phandle) = r

  hSetBuffering inp NoBuffering
  hPutStrLn inp cmd   -- send a command; the script replies after the delay given in cmd

  mvar <- newEmptyMVar
  tid  <- forkIO $ hGetLine outp >>= putMVar mvar

  -- wait the timeout period for the response
  result <- timeout tmicros (takeMVar mvar)
  killThread tid

  case result of
    Nothing -> putStrLn "timed out"
    Just x  -> putStrLn $ "got: " ++ x

  hClose inp            -- and close the pipe
  putStrLn "waiting for process to terminate"
  waitForProcess phandle

-- non-blocking IO, send one line, report progress every timeout period
main3 cmd tmicros = do
  r <- createProcess (proc "./compute" []) { std_out = CreatePipe, std_in = CreatePipe }
  let (Just inp, Just outp, _, phandle) = r

  hSetBuffering inp NoBuffering
  hPutStrLn inp cmd   -- send command

  mvar <- newEmptyMVar
  tid  <- forkIO $ hGetLine outp >>= putMVar mvar

  -- loop until response received; report progress every timeout period
  let loop = do result <- timeout tmicros (takeMVar mvar)
                case result of
                  Nothing -> putStrLn  "still waiting..." >> loop
                  Just x  -> return x
  x <- loop
  killThread tid

  putStrLn $ "got: " ++ x

  hClose inp            -- and close the pipe
  putStrLn "waiting for process to terminate"
  waitForProcess phandle

{-

Usage: ./prog which delay timeout

  where
    which   = main routine to run: 1, 2 or 3
    delay   = delay in seconds to send to compute script
    timeout = timeout in seconds to wait for response

E.g.:

  ./prog 1 4 3   -- note: timeout is ignored for main1
  ./prog 2 2 3   -- should get response (delay 2s < timeout 3s)
  ./prog 2 4 3   -- should time out (delay 4s > timeout 3s)
  ./prog 3 4 1   -- should see "still waiting..." a couple of times

-}

main = do
  (which : vtime : tout : _) <- fmap (map read) getArgs
  let cmd = "10 " ++ show vtime
      tmicros = 1000000*tout :: Int
  case which of
    1 -> main1 cmd tmicros
    2 -> main2 cmd tmicros
    3 -> main3 cmd tmicros
    _   -> error "huh?"

Thanks a lot for your thorough answer: that's a really good example. Just a few questions: don't you need to `hClose outp`? Also, you are not doing `terminateProcess phandle`; is this because `hClose inp` is expected to shut down the external process? If the spawned process hangs, how will your solution force it to stop? — mljrg, Oct 20 '15 at 10:19
