4

I'm using httplib to grab a bunch of resources from a website, and I want to do it at minimum cost, so I set the 'Connection: keep-alive' HTTP header on my requests. But I'm not sure it actually uses the same TCP connection for as many requests as the web server allows.

import httplib

i = 0
while 1:
    i += 1
    print i
    # A new HTTPConnection object is created on every iteration
    con = httplib.HTTPConnection("myweb.com")
    con.request("GET", "/x.css", headers={"Connection": "keep-alive"})
    result = con.getresponse()
    print result.reason, result.getheaders()

Is my implementation right? Does keep-alive work? Should I move 'con = httplib.HTTPConnection("myweb.com")' out of the loop?

P.S.: the web server responds correctly to keep-alive, and I'm aware of urllib3.

SamB
sia
  • 2
    @CrazyCasta: why do you think it is a duplicate? `urllib2` uses `Connection: close` i.e., one request -- one connection. `httplib` uses `HTTP/1.1` i.e., the connection may be reused by default. Related: [Persistence of urllib.request connections to a HTTP server](http://stackoverflow.com/q/9772854/4279) – jfs Jan 11 '14 at 00:37
  • If you look at the question, it's about how to do multiple HTTP requests in python. The urllib2 is somewhat misleading. If you look at the first answer it specifically relates to httplib. – CrazyCasta Jan 11 '14 at 01:41

2 Answers

10

Your example creates a new TCP connection each time through the loop, so no, it will not reuse the connection.

How about this?

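import httplib

# Create the connection once, outside the loop, so every request shares it
con = httplib.HTTPConnection("myweb.com")
while True:
    con.request("GET", "/x.css", headers={"Connection": "keep-alive"})
    result = con.getresponse()
    result.read()  # read the whole body before sending the next request
    print result.reason, result.getheaders()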

Also, if all you want is the headers, you can use the HTTP HEAD method rather than calling GET and discarding the content.
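One way to check the reuse without a packet sniffer is to count how many TCP connections a server actually accepts. The following is a minimal sketch under some assumptions: it uses Python 3, where `httplib` is named `http.client`, and it talks to a throwaway local server instead of myweb.com. The names `CountingHandler`, `start_server`, `fetch_with_one_connection` and `fetch_with_new_connections` are made up for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

BODY = b"body { color: black; }\n"


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler: records one entry per accepted TCP connection."""

    protocol_version = "HTTP/1.1"  # lets the server keep connections open
    connections = []

    def setup(self):
        # setup() runs once per connection, not once per request
        super().setup()
        CountingHandler.connections.append(self.client_address)

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/css")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)

    def do_HEAD(self):
        # Same headers as GET, but no body
        self._send_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def start_server():
    """Start the local test server on a free port and return it."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_with_one_connection(host, port, n, method="GET"):
    """Send n requests over a single HTTPConnection object."""
    con = http.client.HTTPConnection(host, port, timeout=5)
    statuses = []
    try:
        for _ in range(n):
            con.request(method, "/x.css")
            resp = con.getresponse()
            resp.read()  # finish this response before the next request
            statuses.append(resp.status)
    finally:
        con.close()
    return statuses


def fetch_with_new_connections(host, port, n):
    """Send n requests, creating a new HTTPConnection for each one."""
    statuses = []
    for _ in range(n):
        con = http.client.HTTPConnection(host, port, timeout=5)
        try:
            con.request("GET", "/x.css")
            resp = con.getresponse()
            resp.read()
            statuses.append(resp.status)
        finally:
            con.close()
    return statuses


if __name__ == "__main__":
    server = start_server()
    host, port = server.server_address
    fetch_with_one_connection(host, port, 5)
    print("shared connection object:", len(CountingHandler.connections), "TCP connection(s)")
    CountingHandler.connections.clear()
    fetch_with_new_connections(host, port, 5)
    print("new object per request:", len(CountingHandler.connections), "TCP connection(s)")
    server.shutdown()
```

Passing `method="HEAD"` to `fetch_with_one_connection` exercises the HEAD variant over the same single connection. No explicit `Connection: keep-alive` header is sent here, because HTTP/1.1 connections are persistent by default.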

Corey Goldberg
  • If we keep sending HTTP requests, will the server never close the connection, even once it reaches its timeout setting? – Jcyrss Jun 17 '19 at 08:41
-1

It certainly can't reuse the connection if you scrap the HTTPConnection object every time through the loop …
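Keeping a single HTTPConnection object has one consequence worth spelling out: each response has to be read to the end before the next response can be fetched on that connection. Here is a rough sketch of what happens otherwise, assuming Python 3 (`http.client` is the Python 3 name for `httplib`) and a throwaway local server. `CssHandler` and `second_request_works` are hypothetical names used only for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CssHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler that serves a tiny stylesheet."""

    protocol_version = "HTTP/1.1"  # keep the connection open between requests

    def do_GET(self):
        body = b"body { margin: 0; }\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def second_request_works(drain_first):
    """Return True if a second response can be fetched on the same connection."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), CssHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    con = http.client.HTTPConnection(*server.server_address, timeout=5)
    try:
        con.request("GET", "/x.css")
        first = con.getresponse()
        if drain_first:
            first.read()  # consume the body so the connection is free again
        con.request("GET", "/x.css")
        try:
            con.getresponse().read()
            return True
        except http.client.ResponseNotReady:
            # The first response is still pending on this connection.
            # Draining it now lets the second response be fetched after all.
            first.read()
            con.getresponse().read()
            return False
    finally:
        con.close()
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print("body read first:", second_request_works(drain_first=True))
    print("body left unread:", second_request_works(drain_first=False))
```

In this sketch the connection object is created once and a skipped `read()` surfaces as `http.client.ResponseNotReady`, which is why creating a new connection per request appears to "avoid" the read at the price of a new TCP handshake each time.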

SamB
  • I put the HTTPConnection creation inside the loop to avoid reading the data every time I call request/getresponse. When I put it outside the loop, I monitored the program (via Wireshark), and I'm not really sure how the program is working. – sia Jan 11 '14 at 07:38