I'm trying to use requests to download a large file (namely an Android CTS zip file) using the same technique as this answer. Intermittently, it fails to download the whole file, but I can't find any indication that something has gone wrong until I try to unzip the file.

CTS_URL = 'http://dl.google.com/dl/android/cts/android-cts-8.0_r14-linux_x86-x86.zip'
CTS_ZIP = 'android-cts-8.0_r14-linux_x86-x86.zip'

import requests

req = requests.get(CTS_URL, stream=True)
with open(CTS_ZIP, 'wb') as cts_zip_file:
  for chunk in req.iter_content(chunk_size=4096):
    cts_zip_file.write(chunk)

Later, when I try to unzip it, I get a BadZipFile("File is not a zip file") error because the file hasn't been fully downloaded.

import zipfile
zipfile.ZipFile(CTS_ZIP)  # fails

However, I can't get any indication from the req object that something has gone wrong: req.status_code is 200 and req.ok is True.

Does req know that something went wrong? I currently have one of these request objects in an interactive prompt so I can inspect it further.

Ryan Haining
  • You can manually check `Content-length`. If it doesn't match the file's size, something is wrong. – Sraw Oct 23 '18 at 23:50
  • It is strange, I did not get any errors with your code. – KC. Oct 24 '18 at 02:24
  • What's your download speed and latency? If the connection is slow, the `Connection` header isn't set to `Connection: Keep-Alive`, and the `Keep-Alive` header specifies a low timeout, then there is a chance that the web server closes the connection on you. – Attila Kis Oct 25 '18 at 08:35
  • I face the same problem and have no solution yet. Have you resolved it? – W. Dan Apr 09 '21 at 03:06
  • @W.Dan I checked the `Content-length` and redid the request if the file size didn't match. – Ryan Haining Apr 09 '21 at 04:36
  • @RyanHaining I finally found a workaround: https://stackoverflow.com/a/67053532/9265663 – W. Dan Apr 15 '21 at 04:28
  • This is `requests` issue [#4956](https://github.com/psf/requests/issues/4956). – Cees Timmerman Jul 23 '22 at 12:07

1 Answer

The code you shared worked for me. However, you might consider the following:

  1. As Sraw mentioned, check the Content-length value against the size of the downloaded file.
  2. Check the file you downloaded. If its size is correct, try extracting it manually in a shell, just to make sure.
  3. You can try to use urllib (example).
  4. It could also be a server problem. You can try downloading a different file to rule that out.
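
A minimal sketch of point 1, assuming the server sends a Content-Length header and does not compress the body (with a Content-Encoding such as gzip, the decoded size written to disk would not match the header). The helper names expected_length, is_complete and download_checked are made up for this example and are not part of requests:

```python
import os

import requests


def expected_length(headers):
    """Return Content-Length as an int, or None if absent or not numeric."""
    value = headers.get('Content-Length')
    return int(value) if value is not None and value.isdigit() else None


def is_complete(path, expected):
    """True if the file on disk has the advertised size.

    If no size was advertised (expected is None) there is nothing to
    compare against, so the file is accepted as-is.
    """
    return expected is None or os.path.getsize(path) == expected


def download_checked(url, path, attempts=3, chunk_size=4096):
    """Hypothetical helper: stream url to path, retrying on a short download.

    Assumes the body is not content-encoded, so the number of bytes
    written should equal Content-Length.
    """
    for _ in range(attempts):
        try:
            with requests.get(url, stream=True) as resp:
                resp.raise_for_status()
                expected = expected_length(resp.headers)
                with open(path, 'wb') as out:
                    for chunk in resp.iter_content(chunk_size=chunk_size):
                        out.write(chunk)
        except (requests.exceptions.ChunkedEncodingError,
                requests.exceptions.ConnectionError):
            continue  # connection dropped mid-transfer; try again
        if is_complete(path, expected):
            return True
    return False
```

With the names from the question, this would be called as download_checked(CTS_URL, CTS_ZIP), and a False result means every attempt came up short. As an extra check for point 2, zipfile.is_zipfile(CTS_ZIP) tells you whether the result looks like a zip archive before you try to open it.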
alican