Is it possible to download a large file in chunks using httplib2? I am downloading files from a Google API, and in order to use the credentials from the Google OAuth2WebServerFlow, I am bound to use httplib2.

At the moment I am doing:

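import httplib2
from oauth2client.client import OAuth2WebServerFlow

flow = OAuth2WebServerFlow(
    client_id=XXXX,
    client_secret=XXXX,
    scope=XYZ,
    redirect_uri=XYZ
)

credentials = flow.step2_exchange(oauth_code)

# Wrap the Http object so every request carries the OAuth2 credentials
http = httplib2.Http()
http = credentials.authorize(http)

# request() returns the complete response body as a single string
resp, content = http.request(url, "GET")
with open(file_name, 'wb') as fw:
    fw.write(content)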

But the content variable can grow to more than 500 MB, because the whole response is read into memory.

Is there any way to read the response in chunks?

Martin Taleski

3 Answers

You could consider streaming_httplib2, a fork of httplib2 with exactly that change in behaviour.

"In order to use the credentials from the Google OAuth2WebServerFlow, I am bound to use httplib2."

If you need features that aren't available in httplib2, it's worth looking at how much work it would take to get your credential handling working with another HTTP library. It may be a good longer-term investment. (See, for example, "How to download large file in python with requests.py?")
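
As a rough illustration of the streaming approach with the requests library, here is a minimal Python 3 sketch. The helper name download_with_session and the 1 MiB chunk size are hypothetical, and headers is assumed to be a dict that already carries the OAuth2 Authorization header. The function takes a requests.Session-like object rather than importing requests itself, so requests must be installed separately.

```python
def download_with_session(session, url, file_name, headers=None,
                          chunk_size=1024 * 1024):
    """Stream `url` to `file_name` using a requests.Session-like object.

    Returns the number of bytes written.
    """
    written = 0
    # stream=True defers downloading the body until it is iterated over
    with session.get(url, headers=headers, stream=True) as resp:
        resp.raise_for_status()
        with open(file_name, "wb") as fw:
            for chunk in resp.iter_content(chunk_size=chunk_size):
                fw.write(chunk)
                written += len(chunk)
    return written
```

With requests installed, this could be called as download_with_session(requests.Session(), url, file_name, headers=auth_headers), so that only one chunk is held in memory at a time.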

Joe

To read the response in chunks (this works with httplib and should work with httplib2 as well):

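import httplib

conn = httplib.HTTPConnection("google.com")
conn.request("GET", "/")
r1 = conn.getresponse()

# r1.fp is the underlying file object; each next() call returns one line
try:
    print r1.fp.next()
    print r1.fp.next()
except StopIteration:
    # Raised once the response body is exhausted
    print "Exception handled!"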

Note: next() raises StopIteration once the response is exhausted, so you need to handle it.

You can avoid calling next() by iterating over the file object directly:

# Open in binary mode; the with block closes the file when the loop ends
with open("file.html", "wb") as f:
    for line in r1.fp:
        f.write(line)
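
If fixed-size chunks are preferable to line-by-line iteration, the response's read(amt) method can be called in a loop instead. The following is a minimal sketch in Python 3, where httplib is named http.client. The helper names save_in_chunks and download and the 1 MiB chunk size are illustrative assumptions, not part of the approach above.

```python
import http.client


def save_in_chunks(response, path, chunk_size=1024 * 1024):
    """Copy a file-like response to `path`, reading at most
    `chunk_size` bytes at a time. Returns the number of bytes written."""
    total = 0
    with open(path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:  # an empty result signals the end of the body
                break
            out.write(chunk)
            total += len(chunk)
    return total


def download(host, url_path, dest):
    """Fetch https://<host><url_path> and stream the body to `dest`."""
    conn = http.client.HTTPSConnection(host)
    try:
        conn.request("GET", url_path)
        return save_in_chunks(conn.getresponse(), dest)
    finally:
        conn.close()
```

Memory use stays bounded by chunk_size regardless of how large the file is.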
ForceBru

You can apply oauth2client.client.Credentials to a urllib2 request.

First, obtain the credentials object. In your case, you're using:

credentials = flow.step2_exchange(oauth_code)

Now, use that object to get the auth headers and add them to the urllib2 request:

import urllib2

req = urllib2.Request(url)
auth_headers = {}
# apply() adds the Authorization header to the dict in place
credentials.apply(auth_headers)
for k, v in auth_headers.iteritems():
    req.add_header(k, v)
resp = urllib2.urlopen(req)

Now resp is a file-like object that you can use to read the contents of the URL, for example in fixed-size pieces with resp.read(n).
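
Because resp is file-like, it can be copied to disk without holding the whole body in memory. Here is a rough Python 3 sketch, where urllib2 has become urllib.request. The helper names build_request, stream_to_file and download are hypothetical, and auth_headers stands in for the dict filled by credentials.apply().

```python
import shutil
import urllib.request


def build_request(url, auth_headers):
    """Return a Request carrying the given authorization headers."""
    req = urllib.request.Request(url)
    for name, value in auth_headers.items():
        req.add_header(name, value)
    return req


def stream_to_file(resp, file_name, chunk_size=1024 * 1024):
    """Copy a file-like response to disk, chunk_size bytes at a time."""
    with open(file_name, "wb") as fw:
        shutil.copyfileobj(resp, fw, chunk_size)


def download(url, auth_headers, file_name):
    """Open the URL with the auth headers and stream the body to file_name."""
    with urllib.request.urlopen(build_request(url, auth_headers)) as resp:
        stream_to_file(resp, file_name)
```

shutil.copyfileobj reads and writes in a loop internally, so at most one chunk is held in memory at a time.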

pgilmon