
From RWH (http://book.realworldhaskell.org/read/extended-example-web-client-programming.html):

The HTTP library used here does not read the HTTP result lazily. As a result, it can result in the consumption of a large amount of RAM when downloading large files such as podcasts. Other libraries are available that do not have this limitation. We used this one because it is stable, easy to install, and reasonably easy to use. We suggest mini-http, available from Hackage, for serious HTTP needs.

mini-http is deprecated on Hackage. The question is simple: do you know of any package that offers an API for making HTTP requests and consuming the response body without loading it entirely into memory?

What I want is an API that offers a stream that can be transformed by iterating over it. One simple example is counting bytes in a response.

Maybe an iteratee-based API?
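For concreteness, here is a self-contained sketch of the kind of API I have in mind; every name in it is illustrative, not taken from any existing package. The consumer is handed one chunk at a time, so counting bytes never materialises the whole body:

    {-# LANGUAGE BangPatterns #-}
    import qualified Data.ByteString as B

    -- The consumer ("iteratee") is fed chunks one at a time;
    -- Nothing marks end of input.
    data Iteratee a
      = Continue (Maybe B.ByteString -> Iteratee a)
      | Done a

    -- Count the bytes of a response without ever holding the whole body.
    countBytes :: Iteratee Int
    countBytes = go 0
      where
        go !n = Continue $ \mchunk -> case mchunk of
          Just c  -> go (n + B.length c)
          Nothing -> Done n

    -- A toy producer ("enumerator") feeding chunks from a list;
    -- a real one would read them from a socket as they arrive.
    feed :: [B.ByteString] -> Iteratee a -> Maybe a
    feed _      (Done x)     = Just x
    feed []     (Continue k) = case k Nothing of
                                 Done x -> Just x
                                 _      -> Nothing
    feed (c:cs) (Continue k) = feed cs (k (Just c))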

Don Stewart
Sadache

2 Answers


You want client-side downloading of files as a stream? How about download-curl's lazy interface?

Might be fine for your needs (or with a little tweaking).
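A minimal sketch, assuming download-curl's Network.Curl.Download.Lazy.openLazyURI (which returns the body as a lazy ByteString whose chunks are read on demand); the URL is just a placeholder:

    import Network.Curl.Download.Lazy (openLazyURI)
    import qualified Data.ByteString.Lazy as L

    main :: IO ()
    main = do
      r <- openLazyURI "http://example.com/podcast.mp3"
      case r of
        Left err   -> putStrLn ("download failed: " ++ err)
        Right body -> print (L.length body)  -- forces the stream one chunk at a time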

Don Stewart
  • Since you suggest it, I suppose you mean safe lazy? – Sadache Jun 20 '10 at 20:24
  • It certainly provides a lazy stream that can be transformed by iterating over it. There's nothing inherently unsafe about streams based on chunk-wise laziness; in fact, in this case it is the perfect abstraction, in my opinion. See Duncan's views: http://stackoverflow.com/questions/2981582/haskell-lazy-i-o-and-closing-files/2984556#2984556 – Don Stewart Jun 20 '10 at 20:30
  • What about the lazy ByteString return of the basic HTTP package? – Sadache Jun 20 '10 at 21:51
  • Lazy ByteString seems to work with the HTTP package; how is it inferior to other lazy approaches? – Sadache Jun 22 '10 at 06:41

In general, there is a common problem with parsing something lazily while validating it. When you receive an HTTP response that contains a "Content-Length" header, you have to check that you actually read that many bytes before the connection is closed. That means you can't call the response valid until you have read it to the very end, so a mapping over the body would have to wait and then process the whole result.
To avoid that, a library may be less strict: check only that the headers are correct (and perhaps the first part of the data, in case it is chunked or compressed) and return a body whose length is less than or equal to "Content-Length". Alternatively, you can use your own chunk stream whose last chunk signals success or failure (see the sketch below).
Another approach is to spend CPU processing the response as you read it (e.g. inside a monad) and, when no valid data arrives for the next read, abort the computation done so far.
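Here is a minimal sketch of those two ideas combined: a chunk stream whose last element is an explicit verdict, consumed incrementally so the work aborts as soon as the stream fails. The ChunkStream type, countBytes, and the Failed constructor are all hypothetical names, not from any existing package.

    {-# LANGUAGE BangPatterns #-}
    import qualified Data.ByteString as B

    -- A chunk stream whose final element is an explicit verdict.
    data ChunkStream
      = Chunk B.ByteString (IO ChunkStream)  -- one chunk, then the rest
      | Done                                 -- body complete and valid
      | Failed String                        -- e.g. connection closed early

    -- Process the response as it is read: count bytes chunk by chunk,
    -- aborting with the error if the stream ends badly.
    countBytes :: IO ChunkStream -> IO (Either String Int)
    countBytes = go 0
      where
        go !n step = do
          s <- step
          case s of
            Chunk c rest -> go (n + B.length c) rest
            Done         -> return (Right n)
            Failed err   -> return (Left err)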

I'd also suggest looking at http-monad. I've never used it, but I hope that, with its monadic interface, it implements that last approach.

ony