I would like to know if there is a way in Python to "stream" HTTP requests in order to avoid loading the whole page.

What I'm currently doing to get the HTML data of a given URL is this:

import urllib2

def fetch_html(url):
    req = urllib2.Request(url)
    response = urllib2.urlopen(req)
    return response.read()  # reads the entire response body

This way I'm always loading the whole page, but since I only need a small part of it I'm using more bandwidth than I need to. If I could stop loading the page after I found a specific value or expression, or even better, if I could specify where to start and end loading (e.g. starting at character #3000 and loading until #5000), I'd save a lot of bandwidth.

Thanks in advance, tschery

tschery

1 Answer

This Stack Overflow answer shows how to do partial HTTP loading in Python. You can also use response.read(N), where N is the maximum number of bytes to return, but there is no guarantee that exactly the amount you specify is downloaded, since the underlying connection may buffer more than you read.
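As a rough illustration of both ideas, here is a minimal sketch. It assumes that "partial HTTP loading" means sending an HTTP Range header, which a server is free to ignore. The helper names (build_range_request, read_until, fetch_range) are made up for this example, and the import fallback is only there so the same code runs on Python 2 (urllib2, as in the question) and Python 3 (urllib.request).

```python
try:
    import urllib2 as urlrequest          # Python 2, as used in the question
except ImportError:
    import urllib.request as urlrequest   # Python 3 equivalent


def build_range_request(url, start, end):
    """Hypothetical helper: ask only for bytes start..end (both inclusive)."""
    return urlrequest.Request(url, headers={"Range": "bytes=%d-%d" % (start, end)})


def read_until(response, marker, chunk_size=1024):
    """Hypothetical helper: read chunk by chunk, stop once marker (bytes) is seen."""
    data = b""
    while marker not in data:
        chunk = response.read(chunk_size)
        if not chunk:  # end of body reached without finding the marker
            break
        data += chunk
    return data


def fetch_range(url, start, end):
    """Hypothetical helper: return (status code, body) for a ranged request."""
    response = urlrequest.urlopen(build_range_request(url, start, end))
    try:
        # 206 means the server honoured the Range header; 200 means it
        # ignored the header and is sending the whole page.
        return response.getcode(), response.read()
    finally:
        response.close()
```

For the numbers in the question, fetch_range(url, 3000, 5000) would request that slice, and read_until(urllib2.urlopen(url), "some value") would stop reading once the value shows up. Note that Range counts bytes, not characters, and that stopping early only saves the bandwidth for data that has not already been sent or buffered.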

Elektito