
I am currently downloading .tar.gz files from a server like so:

import http.client
import ssl

conn = http.client.HTTPSConnection(host = host,
                                   port = port,
                                   cert_file = pem,
                                   key_file = key,
                                   context = ssl.SSLContext(ssl.PROTOCOL_TLS))

conn.request('GET', url)

rsp = conn.getresponse()

fp = r"H:\path\to\new.tar.gz"

with open(fp, 'wb') as f:
    while True:
        piece = rsp.read(4096)
        if not piece:
            break
        f.write(piece)

However, I am concerned that this method is causing compression issues: sometimes the saved file is still gzipped, and other times it is not.
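One way to test that concern is to look at the first bytes of what was actually saved. The loop above writes the response body exactly as received (as far as I know, `http.client` does not decompress anything on its own), so the file on disk reflects what the server sent. Below is a minimal sketch; `sniff` and `sniff_file` are hypothetical helper names, and the check relies only on the gzip magic bytes and the `ustar` marker at offset 257 of a tar header.

```python
GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def sniff(head: bytes) -> str:
    """Classify a payload from its first bytes (pass at least 512 bytes if available)."""
    if head[:2] == GZIP_MAGIC:
        return "gzip"
    # POSIX/GNU tar headers carry the string "ustar" at byte offset 257.
    if head[257:262] == b"ustar":
        return "tar"
    return "other"


def sniff_file(path: str) -> str:
    """Classify a file on disk by reading only its first 512 bytes."""
    with open(path, "rb") as f:
        return sniff(f.read(512))
```

Calling `sniff_file(fp)` after each download should show whether the server really alternates between gzipped and plain tar data, or whether it occasionally sends something else entirely.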

Question:

What is the appropriate way using the gzip module to save a file from a socket stream?

Supporting Information:

I've done the following:

import gzip
import http.client
import io
import ssl

conn = http.client.HTTPSConnection(host = host,
                                   port = port,
                                   cert_file = pem,
                                   key_file = key,
                                   context = ssl.SSLContext(ssl.PROTOCOL_TLS))

conn.request('GET', url)

rsp = conn.getresponse()

fp = r"H:\path\to\new.tar"

f_like_obj = io.BytesIO()
f_like_obj.write(rsp.read())
f_like_obj.seek(0)
f_decomp = gzip.GzipFile(fileobj=f_like_obj, mode='rb')

with open(fp, 'wb') as f:
    f.write(f_decomp.read())

This works; however, the same file, downloaded at two separate times, will sometimes fail with:

"Not a gzipped file (b'<!')".

  • https://stackoverflow.com/questions/19602931/basic-http-file-downloading-and-saving-to-disk-in-python – Joao Vitorino Jun 28 '17 at 16:11
  • @JoaoVitorino That is with the `urllib` module. I'm looking for an `http.client` solution. –  Jun 28 '17 at 16:17
  • @JoaoVitorino Beyond that, `urllib.request.urlretrieve()` tends to raise `urllib.error.HTTPError`, specifically HTTP 500 Internal Server Error. The low-level `http.client` lets me specify all the details required to avoid that. –  Jun 28 '17 at 16:34

1 Answer

Try this:

import gzip
import http.client
import ssl

conn = http.client.HTTPSConnection(host = host,
                                   port = port,
                                   cert_file = pem,
                                   key_file = key,
                                   context = ssl.SSLContext(ssl.PROTOCOL_TLS))

conn.request('GET', url)

rsp = conn.getresponse()

fp = r"H:\path\to\new.tar"

with gzip.GzipFile(fileobj=rsp) as decomp, open(fp, 'wb') as f:
    f.write(decomp.read())
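
Note that `decomp.read()` holds the whole decompressed archive in memory. A variation that streams in chunks might look like the sketch below. `save_gunzipped` is a hypothetical helper name and the 64 KiB chunk size is an arbitrary assumption. It accepts any binary file-like object, so the `rsp` returned by `conn.getresponse()` should work as `src`.

```python
import gzip
import shutil


def save_gunzipped(src, dest_path, chunk_size=64 * 1024):
    """Stream-decompress gzip data from the binary file-like `src` into `dest_path`.

    Raises OSError (gzip.BadGzipFile on Python 3.8+) if `src` does not
    start with a gzip header.
    """
    with gzip.GzipFile(fileobj=src, mode="rb") as decomp:
        with open(dest_path, "wb") as out:
            # Copies chunk by chunk, so the archive is never fully in memory.
            shutil.copyfileobj(decomp, out, chunk_size)


# Possible usage with the response from the snippet above:
#     if rsp.status != 200:
#         raise RuntimeError(f"unexpected HTTP status {rsp.status}")
#     save_gunzipped(rsp, fp)
```

The `b'<!'` in the error quoted in the question is the first two bytes that `gzip` read in place of the gzip magic `b'\x1f\x8b'`. A body starting with `<!` looks like markup (for example `<!DOCTYPE ...>`), so the failing downloads may be error pages rather than the archive, which is why checking `rsp.status` before decompressing seems worthwhile. If decompression fails, an empty file may be left at `dest_path`, since the output file is opened before the first read.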
pstatix