0

I have a dataset called records. A sample looks like this:

user_id movie_id genre
1       1001     action
2       1002     drama
3       1003     comedy
4       1004     drama
...     ...      ...    

I would like to iterate over records in the following way:

for user, movie, genre in records:
    print(user, movie, genre)

It first prints some rows and then shows this error:

44892 113769 comedy
44892 113769 drama
...  
------------------------------------------------ 
ValueError Traceback (most recent call last) in 
----> 1 for user, movie, genre in records:
      2     print(user, movie, genre)

ValueError: too many values to unpack (expected 3)

What is wrong, and how can I fix it?

Azamat
  • On some iteration your `records` variable must have more than 3 elements. Maybe you could discard any surplus using `for user, movie, genre in records[:3]:` – alani Jul 18 '20 at 09:44

5 Answers

0

Your variable names are different:

You called them user_id, movie_id, and genre in the dataset, but referred to them as user and movie subsequently.

Try changing it to:

for user_id, movie_id, genre in records:
    print(user_id, movie_id, genre)
Gavin Wong
  • actually I don't have the column names in the original dataset, I wrote them here for clarity. Anyway it doesn't explain why code works for some rows (prints some rows) and suddenly gives an error. – Azamat Jul 18 '20 at 09:42
0

Check your dataset. Some rows may contain an extra tab or space inside a value, so that value is split into an additional column. Such a row yields more than three values, which raises the error.

Example :

userid movieid genre

44892 113769 horror comedy

Here, if the separator is a tab or space, horror and comedy are treated as two different columns, so the row has four values instead of three.

You can also refer to the related question: "Too many values to unpack" Exception
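
A minimal sketch of this failure mode, assuming each record comes from splitting a text line on whitespace. The two sample lines are made up for illustration and are not taken from the asker's data:

```python
# Hypothetical raw lines; the second one carries two genre words.
lines = [
    "44892 113769 comedy",
    "44892 113769 horror comedy",
]

field_counts = []
for line in lines:
    fields = line.split()  # split on any run of whitespace
    field_counts.append(len(fields))
    try:
        user, movie, genre = fields
    except ValueError as exc:
        # The second line yields 4 fields, so unpacking into 3 names fails.
        print(f"{fields!r}: {exc}")
    else:
        print(user, movie, genre)
```

If the real file is tab-separated, splitting on "\t" only (instead of any whitespace) may keep a multi-word genre in one field.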

Tushar Gupta
0

I wanted to add this as a comment, but I can't add a code snippet there, so I am posting it as an answer.

for item in records:
    print(item)
    user, movie, genre = item

The offending record will be printed just before the code breaks with the ValueError.

Once you show us which record the failure occurred at, it will be easier to find a solution.

Note: if you want to skip a record that doesn't match the expected pattern, you can do this:

for item in records:
    try:
        user, movie, genre = item
    except ValueError:
        print("Failed at %s" % repr(item))
    else:
        print(item)
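
As a variation on the same idea, all offending records can be located in one pass by checking their length. This is only a sketch: the `records` list below is a hypothetical stand-in, and it assumes each record is a sequence that supports `len()`.

```python
# Hypothetical stand-in for the real `records`.
records = [
    (1, 1001, "action"),
    (2, 1002, "drama", "comedy"),  # one value too many
    (3, 1003, "comedy"),
]

# Collect (position, record) for every record that does not have exactly 3 values.
bad = [(i, rec) for i, rec in enumerate(records) if len(rec) != 3]
print(bad)
```

Unlike the try/except loop, this also reports records with fewer than three values under the same check.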
0

Is your dataset loaded as a pandas DataFrame? If so, you can do something like this:

cols = ['user_id', 'movie_id', 'genre']
# Assuming df is the data frame you have
for ind in df.index:
  user_id, movie_id, genre = df.loc[ind, cols]
  print(user_id, movie_id, genre)

If you don't have column names in the dataset, adding them yourself may be a reasonable first step.

Eeshaan
0

You can use extended unpacking to discard any extra values in a record:

for user, movie, genre, *_ in records:
    print(user, movie, genre)
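
To see what the starred name actually receives, here is a small sketch with made-up records. The name `extra` and the sample data are illustrative only:

```python
# Hypothetical records; the second one has a surplus value.
records = [
    (1, 1001, "action"),
    (44892, 113769, "horror", "comedy"),
]

kept = []
for user, movie, genre, *extra in records:
    # `extra` is always a list: empty when the record has exactly 3 values.
    kept.append((user, movie, genre, extra))
    print(user, movie, genre, extra)
```

Note that this silently drops the surplus values, and it still raises ValueError for a record with fewer than three values.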

wave