I am writing an application that downloads some files over HTTP. Until recently I was using the following snippet to download a page body:

import Network.HTTP
simpleHTTP (getRequest "http://www.haskell.org/") >>= getResponseBody

It worked fine, but it could not establish connections over HTTPS. To fix this I switched to http-conduit, and I am now using the following code:

simpleHttp' :: Manager -> String -> IO (C.Response LBS.ByteString)
simpleHttp' manager url = do
     request <- parseUrl url
     runResourceT $ httpLbs request manager

It can connect over HTTPS, but a new, frustrating problem appeared. About every fifth connection fails with an exception:

getpics.hs: FailedConnectionException "i.imgur.com" 80

I am convinced that this is an http-conduit problem, because Network.HTTP worked fine on the same set of pages (excluding the HTTPS pages).

Has anybody run into this problem and found a solution, or does anyone know a better alternative to the conduit library? It should be simple, because this is a simple task that should not take more than a few lines of code.

Petr
Trismegistos
  • I have this same issue! I just thought it was the endpoints I was connecting to (stripe and postmark) until I saw this. Thanks for bringing it up. – Luke Hoersten Nov 16 '13 at 14:35
  • Some comments: 1. haskell.org is down this weekend, so the first snippet you showed will not work. 2. Fire up Wireshark and see what happens. You can watch the whole connection for HTTP; for HTTPS the details will be missing, but at least you can see whether the TCP headers go through. 3. You mention HTTPS, but the error you showed is on port 80, which is for HTTP. At any rate I tried the code and it worked for me, fetching http://google.com and https://google.com, even many times in a row. – jamshidh Nov 16 '13 at 19:48
  • My set of pages contains both HTTP and HTTPS pages, which is why the port number is 80. If I run the program with one link it never fails. It fails when I try to fetch several links in a row in a single execution. – Trismegistos Nov 18 '13 at 08:56
  • What version of http-conduit are you running? Also, I want to repeat that a good Wireshark log of the problem happening would tell me a lot. Do you know how to run Wireshark? – jamshidh Dec 22 '13 at 17:14

1 Answer

One simple alternative would be to use the curl package. It supports HTTP, HTTPS and a bunch of other alternative protocols, as well as many options to customize its behavior. The price is introducing an external dependency on libcurl, required to build the package.

Example:

import Network.Curl

main :: IO ()
main = do
  let addr = "https://google.com/" 
  -- Explicit type annotation is required for calls to curlGetResponse_.
  -- Use ByteString instead of String for higher performance:
  r <- curlGetResponse_ addr [] :: IO (CurlResponse_ [(String,String)] String)

  print $ respHeaders r
  putStr $ respBody r

Update: I tried to replicate your problem, but everything works for me. Could you post a Short, Self Contained, Compilable, Example that demonstrates the problem? My code:

import Control.Monad
import qualified Data.Conduit as C
import qualified Data.ByteString.Lazy as LBS
import Network.HTTP.Conduit

simpleHttp'' :: String -> Manager -> C.ResourceT IO (Response LBS.ByteString)
simpleHttp'' url manager = do
     request <- parseUrl url
     httpLbs request manager

main :: IO ()
main = do
  let url = "http://i.imgur.com/"
      count = 100
  rs <- withManager $ \m -> replicateM count (simpleHttp'' url m)
  mapM_ (print . responseStatus) $ rs
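
If the failures turn out to be transient, one possible workaround is to wrap each request in a small retry helper. The sketch below is only an illustration and uses nothing beyond base: `retrying` is a hypothetical helper, not part of http-conduit, and the flaky action in `main` merely simulates a connection that fails twice before succeeding. It does not explain why the connection fails in the first place.

```haskell
import Control.Concurrent (threadDelay)
import Control.Exception (SomeException, try)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- | Hypothetical helper (not part of http-conduit): run an IO action and,
-- if it throws, wait 'delayMicros' microseconds and try again, up to
-- 'retries' additional times. Returns the last exception if every attempt
-- fails. Catching SomeException is a simplification for this sketch; real
-- code would usually catch only the HTTP library's exception type.
retrying :: Int -> Int -> IO a -> IO (Either SomeException a)
retrying retries delayMicros action = go retries
  where
    go n = do
      result <- try action
      case result of
        Right value -> return (Right value)
        Left err
          | n <= 0    -> return (Left err)
          | otherwise -> threadDelay delayMicros >> go (n - 1)

main :: IO ()
main = do
  -- Simulated flaky request: fails on the first two attempts.
  attempts <- newIORef (0 :: Int)
  let flaky = do
        modifyIORef' attempts (+ 1)
        n <- readIORef attempts
        if n < 3
          then ioError (userError "simulated connection failure")
          else return n
  result <- retrying 5 1000 flaky
  case result of
    Right n  -> putStrLn ("succeeded on attempt " ++ show n)
    Left err -> putStrLn ("gave up: " ++ show err)
```

With the function from the question, the call would look something like `retrying 3 500000 (simpleHttp' manager url)`, which leaves the decision about a failed URL to the caller instead of aborting the whole program.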
Petr
  • I think this solution is Unix specific. – Trismegistos Dec 23 '13 at 08:52
  • @Trismegistos You need a non-Unix platform? Which one? Or you need your application to be portable on different platforms? Libcurl works on numerous platforms, including Windows and MacOS, so perhaps it could be made to work. – Petr Dec 23 '13 at 09:11
  • I need Unix and Windows, but really I keep comparing it to Python and how painlessly urllib works, and it frustrates me that the Haskell community advertises its tools as some divine creation yet cannot cope with the mundane problem of fetching HTTP/HTTPS pages. – Trismegistos Dec 23 '13 at 10:14
  • @Trismegistos I updated the answer - I'm not able to demonstrate the problem (neither http nor https). Could you post a complete piece of code that shows the failures? – Petr Dec 23 '13 at 12:42
  • I will isolate the relevant part of the code and post it tonight, or more probably on Wednesday. – Trismegistos Dec 23 '13 at 14:38
  • @Trismegistos It's Wednesday. Happy Christmas! – not my job Dec 25 '13 at 23:41
  • @chunksOf50 I am working on isolating the code. Lately I have reinstalled linux and http-conduit from Cabal and now http-conduit is not able to make any connection to any server which is even worse. I will try to solve and then provide code which reproduces the error. – Trismegistos Dec 26 '13 at 18:25
  • @Trismegistos Excellent. I'm going to have to award the bounty tomorrow, which means the question will fall off the "featured" tab, so any attention you can give it today is probably worth it. – not my job Dec 26 '13 at 18:37
  • @chunksOf50 (1/2): I have reinstalled http-conduit from cabal a dozen times now and I am not able to make any connection. I see two options. The first is that the new version of conduit is broken (it was updated to 2.0.3 from 1.x.y since I created this post). The second is that something is wrong with the network, because it is sluggish today, but I do not really believe that: I am able to browse the internet, and none of a dozen HTTP queries made by http-conduit worked. – Trismegistos Dec 26 '13 at 18:42
  • @chunksOf50 (2/2): If you would like to try newest http-conduit just remove your .cabal folder (back it up first) then do cabal update; cabal install http-conduit and run this program: -- start of program -- import Network.HTTP.Conduit as C -- Newline -- main = do simpleHttp "http://www.google.com" --End of program-- It does not work at all for me. It produces error: gp.hs: FailedConnectionException "www.google.com" 80 – Trismegistos Dec 26 '13 at 18:43
  • @Trismegistos I tested my above code snippet with http-conduit 2.0.0.3 several times, without a single failure. Either something is wrong with your network setup or you're using http-conduit in a different way that is prone to the error. Please post a [complete code sample](http://sscce.org/) that demonstrates the problem, this is probably the only way we can figure it out. – Petr Dec 27 '13 at 08:48
  • @PetrPudlák my post (2/2) provides a complete example. http-conduit does not work for me, while the Network.HTTP module works and curl also works, so this looks like a conduit problem. – Trismegistos Dec 27 '13 at 20:59
  • @Trismegistos I tried it and it works without any problems. Have you tried running it on another computer? Perhaps it's some kind of a bug that manifests only under certain conditions. – Petr Dec 27 '13 at 21:18