I'm trying to download PDFs with urlretrieve(), using links stored in an .xlsx file. One column holds the links and another holds the names the downloaded files should have.
The issue is that my code seems to overwrite the same file over and over as it downloads each item in the list.
from urllib.request import urlretrieve
from urllib.error import URLError, HTTPError
import os
import xlrd
workbook = xlrd.open_workbook('file.xlsx',on_demand=True)
sheet = workbook.sheet_by_name('Sheet1')
listofvalues = sheet.col_values(21, 1)
listofnames = sheet.col_values(2, 1)
for name in listofnames:
    for value in listofvalues:
        try:
            results = 'C:\\results'
            full_file_name = os.path.join(results, str(name + ".pdf"))
            urlretrieve(value, full_file_name)
            print(str(value) + ' DOWNLOADED')
        except (HTTPError, ValueError, URLError) as e:
            print("------------------------------------")
            print(e)
            print(value)
            print("-----------------------------------")
            continue
I think it has something to do with the nested loops, but I can't find a solution.
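
The nested loops do look like the cause: for every name, the inner loop downloads every link to that same name.pdf, so each file is rewritten once per link and ends up holding the last one. A minimal sketch of one possible fix follows, assuming row i of the names column belongs to row i of the links column. The function name `download_pairs` and its `fetch` parameter are hypothetical, introduced only for this example; they are not part of urllib or xlrd.

```python
import os
from urllib.error import HTTPError, URLError
from urllib.request import urlretrieve


def download_pairs(names, urls, dest_dir, fetch=urlretrieve):
    """Save each URL under its matching name and return the paths written.

    `download_pairs` and `fetch` are hypothetical names for this sketch.
    `fetch` defaults to urlretrieve and can be swapped out for testing.
    """
    saved = []
    # zip() walks both lists in step, so each name is paired with exactly
    # one URL instead of being combined with every URL.
    for name, url in zip(names, urls):
        target = os.path.join(dest_dir, str(name) + ".pdf")
        try:
            fetch(url, target)
        except (HTTPError, URLError, ValueError) as e:
            # Report the failure and move on to the next pair.
            print("FAILED:", url, e)
            continue
        print(str(url) + " DOWNLOADED")
        saved.append(target)
    return saved
```

With the lists from the code above, this would be called as download_pairs(listofnames, listofvalues, 'C:\\results'). Note that zip() stops at the shorter of the two lists, so any unmatched trailing entries are silently skipped.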