I'm trying to add YouTube thumbnail links to a dataframe that holds other YouTube video data from an API.

pdr['thumbnail']=[]
pdr['url'] = pdr['url'].astype('string')
for index,rows in pdr.iterrows():
    if i['videoId']:
        exp = "^.*((youtu.be\/)|(v\/)|(\/u\/\w\/)|(embed\/)|(watch\?))\??v?=?([^#&?]*).*"
        s = re.findall(exp,url)[0][-1]
        thumbnail_url = f"https://i.ytimg.com/vi/{s}/maxresdefault.jpg"
        thumbnail.append(thumbnail_url)
    else:
        thumbnail.append('nan')

I keep getting this error message:

ValueError: Length of values (0) does not match length of index (604)

Dhia Djobbi
  • Fix your sample code. You've got `for index,rows in pdr.iterrows():` but you don't use `index` or `rows` variables in the loop. You've got `i` for `i['videoId']` out of nowhere. Then you've put `re.findall(exp,url)` but `url` is not defined anywhere. And post a sample of your dataframe. For further information, see [How to Ask](https://stackoverflow.com/questions/how-to-ask), and take the [tour](https://stackoverflow.com/tour). – aneroid Aug 21 '21 at 01:43
  • I changed the 'i' to 'index'; however, it's still giving the same error message. Also, the URL variable was defined in a previous cell. – Pratham Rathi Aug 21 '21 at 02:05
  • That doesn't help us help you. See how to create a [**minimal reproducible example**](https://stackoverflow.com/help/minimal-reproducible-example). That includes pre-defined variables, default values, etc. Also, if `url` is defined earlier, then the loop provides the same thumbnail for every row with a `videoId`. It doesn't clarify the problem for us. – aneroid Aug 21 '21 at 02:24

1 Answer

Aside from my comments above, the reason you're getting the "ValueError: Length of values (0) does not match length of index (604)" error is this line:

pdr['thumbnail']=[]

This creates a new column on the dataframe and assigns it the values from the list, but the list is empty. Your dataframe has 604 rows, so any new column needs 604 values.
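A minimal sketch that reproduces the mismatch. The three-row `df` here is a hypothetical stand-in for `pdr`, so the reported index length will be 3 rather than 604:

```python
import pandas as pd

# Hypothetical stand-in for pdr; the real dataframe has 604 rows.
df = pd.DataFrame({"url": ["a", "b", "c"]})

try:
    # An empty list supplies 0 values for a 3-row index, so pandas refuses.
    df["thumbnail"] = []
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```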

Work around that by creating a regular Python list, appending all your values to it in your loop, and then assigning that list to the new column afterwards:

thumbnails = []  # normal list
for ...: # your loop here
    ...
    thumbnails.append(...)
    ...
# after the loop, assign the list to the column
pdr['thumbnail'] = thumbnails
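Filled in, the pattern might look like the sketch below. It assumes the `url` column holds each video's link (the question doesn't show the data), uses made-up sample rows, and reuses the question's regex as a raw string; `VIDEO_ID` is just an illustrative name.

```python
import re
import pandas as pd

# Hypothetical sample data; the real pdr comes from the API.
pdr = pd.DataFrame({
    "url": [
        "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
        "https://youtu.be/9bZkp7q19f0",
        "not a video link",
    ]
})

# Pattern from the question, written as a raw string. Group 7 is the video ID.
VIDEO_ID = re.compile(
    r"^.*((youtu.be\/)|(v\/)|(\/u\/\w\/)|(embed\/)|(watch\?))\??v?=?([^#&?]*).*"
)

thumbnails = []  # normal list, one entry per row
for url in pdr["url"]:
    match = VIDEO_ID.match(url)
    if match:
        thumbnails.append(f"https://i.ytimg.com/vi/{match.group(7)}/maxresdefault.jpg")
    else:
        thumbnails.append(None)  # keep the list the same length as the dataframe

# The list now has exactly len(pdr) entries, so the assignment succeeds.
pdr["thumbnail"] = thumbnails
```

Appending a placeholder in the `else` branch matters: skipping rows there would bring back the same length-mismatch error.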

By the way, avoid df.iterrows() where you can: it is slow, and vectorized column operations usually do the same job.
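As one possible alternative to row iteration, pandas string methods can build the whole column at once. This is only a sketch: the sample rows are made up, and the simplified pattern (an 11-character ID after `v=`, `youtu.be/` or `embed/`) is an assumption, not the regex from the question.

```python
import pandas as pd

# Hypothetical sample data; the real pdr comes from the API.
pdr = pd.DataFrame({
    "url": [
        "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
        "https://youtu.be/9bZkp7q19f0",
        "not a video link",
    ]
})

# Assumed, simplified pattern: capture an 11-character ID after a known prefix.
# Rows that don't match produce a missing value.
video_id = pdr["url"].str.extract(
    r"(?:v=|youtu\.be/|embed/)([\w-]{11})", expand=False
)

# String concatenation is applied to the whole column; missing IDs stay missing.
pdr["thumbnail"] = "https://i.ytimg.com/vi/" + video_id + "/maxresdefault.jpg"
```

Because the result is a Series aligned with `pdr`, there is no length to get wrong.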

aneroid