The following code downloads a file using pycurl and displays the current progress (as text):
import pycurl
# for displaying the output text
from sys import stderr as STREAM

# replace with your own url and path variables
url = "http://speedtest.tele2.net/100MB.zip"
path = 'test_file.dat'

# use kiB's
kb = 1024

# callback function for c.XFERINFOFUNCTION
def status(download_t, download_d, upload_t, upload_d):
    STREAM.write('Downloading: {}/{} kiB ({}%)\r'.format(
        str(int(download_d/kb)),
        str(int(download_t/kb)),
        str(int(download_d/download_t*100) if download_t > 0 else 0)
    ))
    STREAM.flush()

# download file using pycurl
with open(path, 'wb') as f:
    c = pycurl.Curl()
    c.setopt(c.URL, url)
    c.setopt(c.WRITEDATA, f)
    # display progress
    c.setopt(c.NOPROGRESS, False)
    c.setopt(c.XFERINFOFUNCTION, status)
    c.perform()
    c.close()

# keeps progress on screen after download completes
print()
The output should look something like this:
Downloading: 43563/122070 kiB (35%)
An actual progress bar is also possible, but it takes more work.
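The following code uses the tqdm package to generate a progress bar. It updates in real time as the file downloads and also shows the download speed and the estimated time remaining. tqdm needs the total size up front, so the requests package is used to read the Content-Length header with a HEAD request before the download starts. Because tqdm's update() expects an increment rather than a running total, the callback has to remember the previous total; that is why the total_dl_d variable is a one-element list and not an integer, so the callback can modify it in place.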
import pycurl
# needed to get the total file size up front
import requests
# progress bar
from tqdm import tqdm

# replace with your own url and path variables
url = 'http://speedtest.tele2.net/100MB.zip'
path = 'test_file.dat'

# read the total size from a HEAD request (0 if the server does not report it);
# follow redirects, since the pycurl download below follows them too
r = requests.head(url, allow_redirects=True)
total_size = int(r.headers.get('content-length', 0))

# create a progress bar and update it manually
# (unit_divisor=1024 so the kiB/MiB labels are binary units)
with tqdm(total=total_size, unit='iB', unit_scale=True, unit_divisor=1024) as pbar:
    # store the total dl'd so far in a list (lists are mutable,
    # so the callback can update the value in place)
    total_dl_d = [0]

    def status(download_t, download_d, upload_t, upload_d, total=total_dl_d):
        # increment the progress bar by the amount received since the last call
        pbar.update(download_d - total[0])
        # update the total dl'd amount
        total[0] = download_d

    # download file using pycurl
    with open(path, 'wb') as f:
        c = pycurl.Curl()
        c.setopt(c.URL, url)
        c.setopt(c.WRITEDATA, f)
        # follow redirects:
        c.setopt(c.FOLLOWLOCATION, True)
        # custom progress bar
        c.setopt(c.NOPROGRESS, False)
        c.setopt(c.XFERINFOFUNCTION, status)
        c.perform()
        c.close()
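A note on the one-element list: a plain integer assigned inside the callback would become a local name, so the running total would not survive between calls. One possible alternative is a closure with nonlocal. This is only a sketch; make_delta_callback and on_delta are hypothetical names of my own, not part of pycurl or tqdm, and the cumulative values are simulated rather than coming from a real transfer.

```python
def make_delta_callback(on_delta):
    """Build a callback with the XFERINFOFUNCTION argument order that
    reports only the bytes received since the previous call."""
    last = 0  # cumulative byte count seen on the previous call

    def status(download_t, download_d, upload_t, upload_d):
        nonlocal last  # rebind the enclosing variable instead of creating a local
        on_delta(download_d - last)
        last = download_d

    return status


# Simulated cumulative download totals, standing in for a real transfer
deltas = []
status = make_delta_callback(deltas.append)
for downloaded in (0, 100, 250, 250, 400):
    status(400, downloaded, 0, 0)

print(deltas)  # [0, 100, 150, 0, 150]
```

With tqdm, on_delta would presumably be pbar.update, and the returned function would be passed to c.setopt(c.XFERINFOFUNCTION, ...).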
Explanation of possible causes of the issues described:
(There was no code provided in the question, so I'll have to guess a little bit about what exactly was causing the mentioned issues...)
Based on the variable name (fp, i.e. file_path)...
The file-write (WRITEDATA) issue was likely due to a file path (str) being provided instead of a file object (io.BufferedWriter).
Based on my own experience...
The XFERINFOFUNCTION callback is called repeatedly during the download. Its parameters give only the total file size and the total downloaded so far; it does not calculate the delta (difference) since the last time it was called. The issue that was described with the progress bar ("the progress bar gets to 100% within a second and the zip file has not completed its download") is likely due to the total amount downloaded being passed as the update amount when an increment is expected. If the progress bar is incremented by the running total on every call, it shows far more than has actually been downloaded, quickly exceeds 100%, and glitches.
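To make the overshoot concrete, here is a small simulation that needs no network. FakeBar is a hypothetical stand-in for a progress bar whose update() adds an increment (which is how tqdm's update() behaves), and the cumulative values are made up for illustration.

```python
class FakeBar:
    """Stand-in for a progress bar whose update() adds an increment."""

    def __init__(self, total):
        self.total = total
        self.n = 0  # amount the bar believes has been downloaded

    def update(self, increment):
        self.n += increment


total = 1000
cumulative = [100, 300, 600, 1000]  # download_d as reported on successive calls

# Buggy: pass the running total as if it were an increment
buggy = FakeBar(total)
for downloaded in cumulative:
    buggy.update(downloaded)

# Correct: pass only the difference since the previous call
fixed = FakeBar(total)
previous = 0
for downloaded in cumulative:
    fixed.update(downloaded - previous)
    previous = downloaded

print(buggy.n, fixed.n)  # 2000 1000
```

The buggy bar passes its total of 1000 on the third update (100 + 300 + 600), even though only 600 bytes have arrived at that point.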