Why doesn't print force entire lazy IO value?

Question

I'm following the http-client tutorial to get a response body over a TLS connection. Since I can observe that print is called inside withResponse, why doesn't print force the entire response to the output in the following fragment?

withResponse request manager $ \response -> do
    putStrLn $ "The status code was: " ++
    body <- (responseBody response)
    print body

I need to write this instead:

response <- httpLbs request manager

putStrLn $ "The status code was: " ++
           show (statusCode $ responseStatus response)
print $ responseBody response

The body I want to print is a lazy ByteString. I'm still not sure whether I should expect print to print the entire value.

instance Show ByteString where
    showsPrec p ps r = showsPrec p (unpackChars ps) r

Print is just `putStrLn` and `show`. So what you should probably be asking is "why doesn't 'Show' fully evaluate the value?". I suspect the answer will be obvious once you look at the Show instance for whatever the body type is. Also notice the only portion of the response that would be forced is the body, not the status or other fields. — Thomas M. DuBuisson, Jan 05 '17 at 00:05
Re-reading your question, it appears you want one value, `response`, to be evaluated when calling print on another value, `body`. Is that the case? If so, why would you expect that behavior in the first place? — Thomas M. DuBuisson, Jan 05 '17 at 00:08
What happened to the `show (statusCode) ...` line in the first snippet? — user253751, Jan 05 '17 at 00:20
This doesn't have to do with laziness, it's the difference between the `Response L.ByteString` you get in the "simple" case, and the `Response BodyReader` you get in the tls case. A `BodyReader` can't be printed directly since it's an IO action. But it's an action that can be repeated, yielding a new chunk each time. It follows the familiar protocol that it is 'done' when it yields an empty bytestring. In your tls case, you are just printing the first chunk, but you need a loop to print the results as they come, until you hit an empty chunk. — Michael, Jan 05 '17 at 00:52
sevo, look at the difference between `bip` and `bop` here http://lpaste.net/8526468113670078464 You wrote `bip` which only fetches once, so to speak, and prints one nice-sized chunk - in this case the first couple chapters of the king james bible. But you want `bop` which prints chunks until it hits an empty chunk, and thus prints the whole translation. — Michael, Jan 05 '17 at 01:15
@Michael Comments are an impoverished space, which probably explains why you put your code in an lpaste instead of including it inline. Why not turn it into an answer, where you have enough space to include all the details without linking elsewhere? =) — Daniel Wagner, Jan 05 '17 at 01:59
@Michael Please make this an answer. I was hasty and not aware that "body" means "body chunk of arbitrary size" in this API. I'm still confused about the types and names in this API but you helped me to realize my mistake. — sevo, Jan 05 '17 at 23:41
The first example has a syntax error (trailing `++`). Please fix. — dfeuer, Jan 06 '17 at 16:39

Michael · Accepted Answer · 2017-01-07T00:46:16.010

This doesn't have to do with laziness, but with the difference between the Response L.ByteString you get from httpLbs in the simple case, and the Response BodyReader you get from withResponse in the TLS case.
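To see that print itself is not the culprit, here is a minimal sketch that needs only the bytestring package. The `body` value is a made-up stand-in for a response body, built from several chunks; print goes through the Show instance quoted in the question, which unpacks every chunk, so all of the text appears.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Lazy.Char8 as L

-- Hypothetical stand-in for a response body: a lazy ByteString
-- assembled from three separate strict chunks.
body :: L.ByteString
body = L.fromChunks ["In the beginning ", "God created ", "the heaven and the earth."]

main :: IO ()
main = do
  -- show walks every chunk, so the whole text is printed (in quotes).
  print body
```

So when a value really is a lazy ByteString, print does show all of it.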

You noticed that a BodyReader is an IO ByteString. But in particular it is an action that can be repeated, each time yielding the next chunk of bytes. It follows the protocol that it never sends a null bytestring except when it's at the end of file. (BodyReader might have been called ChunkGetter.) bip below is like what you wrote: after extracting the BodyReader/IO ByteString from the Response, it performs it to get the first chunk, and prints it. But it doesn't repeat the action to get more, so in this case we just see the first couple chapters of Genesis. What you need is a loop to exhaust the chunks, as in bop below, which causes the whole King James Bible to spill into the console.

{-# LANGUAGE OverloadedStrings #-} 
import Network.HTTP.Client
import Network.HTTP.Client.TLS
import qualified Data.ByteString.Char8 as B

main :: IO ()
main = bip
-- main = bop

-- Performs the BodyReader once, so only the first chunk is printed.
bip :: IO ()
bip = do
  manager <- newManager tlsManagerSettings
  request <- parseRequest "https://raw.githubusercontent.com/michaelt/kjv/master/kjv.txt"
  withResponse request manager $ \response -> do
      putStrLn "The status code was: "
      print (responseStatus response)
      chunk <- responseBody response
      B.putStrLn chunk

-- Performs the BodyReader repeatedly until it yields an empty chunk (eof).
bop :: IO ()
bop = do
  manager <- newManager tlsManagerSettings
  request <- parseRequest "https://raw.githubusercontent.com/michaelt/kjv/master/kjv.txt"
  withResponse request manager $ \response -> do
      putStrLn "The status code was: "
      print (responseStatus response)
      let loop = do
            chunk <- responseBody response
            if B.null chunk
              then return ()
              else B.putStr chunk >> loop
      loop

The loop keeps going back to get more chunks until it gets an empty string, which represents eof, so in the terminal it prints through to the end of the Apocalypse.
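The same protocol can be tried without a network connection. The following is only a sketch: `fakeReader` and `drain` are hypothetical names, not part of http-client, and the IORef-backed action merely imitates how a BodyReader hands out one chunk per call and then an empty chunk at the end.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.IORef (newIORef, atomicModifyIORef')
import qualified Data.ByteString.Char8 as B

-- Hypothetical stand-in for a BodyReader: each run of the returned action
-- yields the next chunk, and an empty chunk once the input is used up.
fakeReader :: [B.ByteString] -> IO (IO B.ByteString)
fakeReader chunks = do
  ref <- newIORef chunks
  return $ atomicModifyIORef' ref $ \remaining ->
    case remaining of
      []       -> ([], B.empty)
      (c:rest) -> (rest, c)

-- Run the action until it returns an empty chunk, collecting the results.
drain :: IO B.ByteString -> IO [B.ByteString]
drain reader = do
  chunk <- reader
  if B.null chunk
    then return []
    else (chunk :) <$> drain reader

main :: IO ()
main = do
  once <- fakeReader ["Genesis ", "Exodus ", "Revelation"]
  firstChunk <- once            -- like bip: one read, one chunk
  B.putStrLn firstChunk
  again <- fakeReader ["Genesis ", "Exodus ", "Revelation"]
  allChunks <- drain again      -- like bop: loop until the empty chunk
  B.putStrLn (B.concat allChunks)
```

The first line of output is a single chunk; the second is every chunk joined together, which is the difference between bip and bop in miniature.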

This behavior is straightforward but slightly technical. You can only work with a BodyReader by hand-written recursion. But the purpose of the http-client library is to make things like http-conduit possible. There the result of withResponse has the type Response (ConduitM i ByteString m ()). ConduitM i ByteString m () is how conduit types a byte stream; this byte stream would contain the whole file.

In the original form of the http-client/http-conduit material, the Response contained a conduit like this; the BodyReader part was later factored out into http-client so it could be used by different streaming libraries like pipes.

So to take a simple example, in the corresponding http material for the streaming and streaming-bytestring libraries, withHTTP gives you a response of type Response (ByteString IO ()). ByteString IO () is the type of a stream of bytes arising in IO, as its name suggests; ByteString Identity () would be the equivalent of a lazy bytestring (effectively a pure list of chunks). The ByteString IO () will in this case represent the whole bytestream down to the Apocalypse. So with the imports

 import qualified Data.ByteString.Streaming.HTTP as Bytes -- streaming-utils
 import qualified Data.ByteString.Streaming.Char8 as Bytes -- streaming-bytestring

the program is identical to a lazy bytestring program:

bap = do 
    manager <- newManager tlsManagerSettings
    request <- parseRequest "https://raw.githubusercontent.com/michaelt/kjv/master/kjv.txt"
    Bytes.withHTTP request manager $ \response -> do 
        putStrLn "The status code was: "
        print (responseStatus response)
        Bytes.putStrLn $ responseBody response

Indeed it is slightly simpler, since you don't have to "extract the bytes from IO":

        lazy_bytes <- responseBody response
        Lazy.putStrLn lazy_bytes

but just write

        Bytes.putStrLn $ responseBody response

That is, you just "print" them directly. If you want to view just a bit from the middle of the KJV, you can instead do what you would with a lazy bytestring, and end with:

        Bytes.putStrLn $ Bytes.take 1000 $ Bytes.drop 50000 $ responseBody response

Then you will see something about Abraham.

The withHTTP for streaming-bytestring just hides the recursive looping that we needed to use the BodyReader material from http-client directly. It's the same e.g. with the withHTTP you find in pipes-http, which represents a stream of bytestring chunks as Producer ByteString IO (), and the same with http-conduit. In all of these cases, once you have your hands on the byte stream you handle it in the ways typical of the streaming IO framework without handwritten recursion. All of them use the BodyReader from http-client to do this, and this was the main purpose of the library.
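If you want to stay with plain http-client, the library also exports that loop ready-made as brConsume, which performs a BodyReader until the empty chunk and returns all the chunks as a list. The sketch below assumes the http-client package is installed; `fakeReader` and `readWholeBody` are hypothetical helpers, and the fake reader only stands in for a real response body so the example runs offline. Unlike the streaming libraries, this holds the whole body in memory at once.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.IORef (newIORef, atomicModifyIORef')
import Network.HTTP.Client (BodyReader, brConsume)
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy.Char8 as L

-- Hypothetical offline stand-in for the BodyReader inside a Response.
-- BodyReader is a synonym for IO ByteString, so any such action fits.
fakeReader :: [B.ByteString] -> IO BodyReader
fakeReader chunks = do
  ref <- newIORef chunks
  return $ atomicModifyIORef' ref $ \remaining ->
    case remaining of
      []       -> ([], B.empty)
      (c:rest) -> (rest, c)

-- Hypothetical helper: let brConsume do the looping, then present the
-- collected chunks as one lazy ByteString.
readWholeBody :: BodyReader -> IO L.ByteString
readWholeBody reader = L.fromChunks <$> brConsume reader

main :: IO ()
main = do
  reader <- fakeReader ["In the beginning ", "... ", "Amen."]
  whole <- readWholeBody reader
  L.putStrLn whole
```

Inside withResponse the same helper would be applied as `readWholeBody (responseBody response)`, after which print or L.putStrLn sees the entire body.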
