I'm working on a script to download files and am struggling a bit with passing two variables into a function.
Imagine I have a data frame with two columns: url and index. I want to download the file for every url and save it under its index plus a suffix (1.mov, 2.mov, etc.).
import pandas as pd
import numpy as np
import os
import urllib.request
directory = 'videos/'
def download_multimedia(url, index):
    try:
        url = (url)
        filename = os.path.join(index + '.mov')
        # Download file
        fullpath = os.path.join(directory, filename)
        urllib.request.urlretrieve(url, fullpath)
    except:
        filename = np.nan
    return filename
So I tried to pass the values from the two columns to the function inside a list comprehension:
downloads = [download_multimedia(url, index) for url, index in data.videourl, data.index]
However, this gives me:
ValueError: The truth value of a RangeIndex is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
How can this be solved, i.e. how do I pass the values from each row to the function correctly?
Thanks in advance!
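
One possible direction, offered as a minimal sketch rather than a definitive fix: two columns written with a comma after `in` are not walked row by row, so they need to be paired explicitly, for example with the built-in `zip()`. The helper `pair_rows`, the `suffix` parameter, and the example.com URLs below are hypothetical names introduced for illustration; the plain list and `range` stand in for `data.videourl` and `data.index`. The sketch also assumes the index holds integers (as the `RangeIndex` in the error message suggests), which is why it is converted with `str()` before the suffix is appended.

```python
import os

directory = 'videos/'


def pair_rows(urls, indices, suffix='.mov'):
    """Pair each URL with the path its file would be saved under (hypothetical helper)."""
    # zip() walks both sequences in lockstep, yielding one (url, index) pair per row.
    # str(index) is needed because an integer cannot be concatenated with a string.
    return [
        (url, os.path.join(directory, str(index) + suffix))
        for url, index in zip(urls, indices)
    ]


# Stand-ins for data.videourl and data.index (assumed values, not from the question)
urls = ['http://example.com/a.mov', 'http://example.com/b.mov']
indices = range(2)

pairs = pair_rows(urls, indices)
for url, path in pairs:
    print(url, path)
```

Applied to the list comprehension from the question, the same pairing would presumably look like `downloads = [download_multimedia(url, index) for url, index in zip(data.videourl, data.index)]`, with `str(index)` used inside the function when building the filename.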