I am using the Python 2.7 requests module to download a binary file with the following code. How can I make this code "auto-resume" the download from a partially downloaded file?

r = requests.get(self.fileurl, stream=True, verify=False, allow_redirects=True)
if r.status_code == 200:
    CHUNK_SIZE = 8192
    bytes_read = 0
    with open(FileSave, 'wb') as f:
        for chunk in r.iter_content(CHUNK_SIZE):
            f.write(chunk)
            bytes_read += len(chunk)
            # Report the bytes actually written; the last chunk may be shorter than CHUNK_SIZE.
            total_per = 100 * float(bytes_read)/float(long(audioSize)+long(videoSize))
            self.progress_updates.emit('%d\n%s' % (total_per, 'Download Progress : ' + self.size_human(bytes_read) + '/' + Total_Size))
r.close()

I would prefer to use only the requests module to achieve this, if possible.

Stacked

1 Answer

If the web server supports range requests, you can add a Range header to your request:

Range: bytes=StartPos-StopPos

You will receive the bytes from StartPos through StopPos (both positions are inclusive). If you don't know StopPos, just use:

Range: bytes=StartPos-
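As a small illustration of the two header forms, here is a sketch of a helper that builds either one. The name `range_header` is made up for this example and is not part of requests; it only produces a dict you could pass as `headers=`.

```python
def range_header(start, stop=None):
    """Build a Range header dict; positions are 0-based and inclusive."""
    if stop is None:
        # Open-ended form: from `start` to the end of the file.
        return {'Range': 'bytes=%d-' % start}
    # Bounded form: exactly stop - start + 1 bytes.
    return {'Range': 'bytes=%d-%d' % (start, stop)}


print(range_header(0, 499))  # header for the first 500 bytes
print(range_header(500))     # header for everything from byte 500 onwards
```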

So your code would be:

import requests

def resume_download(fileurl, resume_byte_pos):
    # Ask the server for everything from resume_byte_pos to the end of the file.
    resume_header = {'Range': 'bytes=%d-' % resume_byte_pos}
    return requests.get(fileurl, headers=resume_header, stream=True, verify=False, allow_redirects=True)
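To turn this into an "auto-resume", the size of the partial file on disk can serve as the start position, and the file should be opened for appending only when the server actually honours the range (status 206). The following is a rough sketch under a few assumptions: `download_with_resume` and its parameters are illustrative names, `session` is assumed to be a `requests.Session` (or any object with a compatible `get()`), and TLS verification is left at its default rather than `verify=False`.

```python
import os


def download_with_resume(session, url, path, chunk_size=8192):
    """Download url to path, continuing from any bytes already on disk.

    Returns the size of the file on disk after the transfer.
    """
    # Byte offsets are 0-based, so the next byte needed is at
    # offset == current file size.
    start = os.path.getsize(path) if os.path.exists(path) else 0
    headers = {'Range': 'bytes=%d-' % start} if start else {}

    r = session.get(url, headers=headers, stream=True)
    try:
        if r.status_code == 206:
            mode = 'ab'   # server honoured the range: append
        elif r.status_code == 200:
            mode = 'wb'   # server ignored the range and sent everything: start over
        elif r.status_code == 416:
            # Range not satisfiable. This often (not always) means the
            # local file is already complete, so nothing is written.
            return start
        else:
            r.raise_for_status()
            raise IOError('unexpected status %d' % r.status_code)
        with open(path, mode) as f:
            for chunk in r.iter_content(chunk_size):
                f.write(chunk)
    finally:
        r.close()
    return os.path.getsize(path)
```

A call might look like `download_with_resume(requests.Session(), fileurl, FileSave)`. Falling back to 'wb' on a plain 200 matters because a server without range support sends the whole file again, and appending that to the partial file would corrupt it.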
Piotr Dabkowski
  • You would also need to change the file mode from 'wb' to 'ab' (to append, otherwise you will be overwriting the already saved part). – m.kocikowski Mar 14 '15 at 03:52
  • For future reference, would resume_byte_pos be the current size of the file, or the current size of the file minus one? – Klik Sep 05 '16 at 23:43
  • @Klik Definitely not the current file size minus one: if you have downloaded 0 bytes, you don't want to start from -1 :) Indexing starts from 0, so you should send the current file size as the starting byte. – Piotr Dabkowski Sep 06 '16 at 10:58
  • For example, you request bytes 0-2000000 with the Range header. Then you check the file size with ``from pathlib import Path; path = Path(..); print(path.stat().st_size)`` and it returns 2000001 bytes. You can use this number in the Range header to request the next part, starting at 2000001. – tobiasraabe Jun 11 '18 at 10:49
  • Just as a note: [1] Simply use `f.tell()` as `resume_byte_pos` if you have already opened the file with `ab`. [2] The server might respond with different headers if you send `Range: bytes=` as above, so be aware (in my case `Content-Length` was missing when I sent this, so I needed to parse `Content-Range` to get the total length for the progress bar). – 林果皞 Feb 25 '20 at 19:20
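
Regarding the last comment, a partial response normally carries a header such as `Content-Range: bytes 1000-1999/5000`, where the number after the slash is the full file size. A minimal sketch of extracting it follows; `total_size_from_content_range` is a hypothetical helper name, not part of requests.

```python
import re


def total_size_from_content_range(value):
    """Return the full length from e.g. 'bytes 1000-1999/5000'.

    Returns None if the header is missing, malformed, or reports the
    total as unknown ('*').
    """
    m = re.match(r'\s*bytes\s+(?:\d+-\d+|\*)/(\d+|\*)\s*$', value or '')
    if not m or m.group(1) == '*':
        return None
    return int(m.group(1))
```

With a requests response `r`, this could be called as `total_size_from_content_range(r.headers.get('Content-Range'))` and used as the denominator for the progress bar when `Content-Length` is absent.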