How to fix 'Column not found: score'?

Question

I used rename(columns={"user_id": "score"}, inplace=True) to rename the user_id column to score, but I get KeyError: 'Column not found: score' and I do not know how to fix it. I took the code from https://www.geeksforgeeks.org/building-recommendation-engines-using-pandas/?ref=rp. Why do so many websites give wrong code examples? Here is a small example:

import pandas as pd
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
df.rename(columns={"A": "arr", "B": "c"},inplace=True)
print(df)

This works, but the following does not:

import pandas as pd
      
# Get the column names
col_names = ['user_id', 'item_id', 'rating', 'timestamp']
  
# Load the dataset
path = 'https://media.geeksforgeeks.org/wp-content/uploads/file.tsv'
  
ratings = pd.read_csv(path, sep='\t', names=col_names)
  
# Check the head of the data
print(ratings.head())
  
# Check out all the movies and their respective IDs
movies = pd.read_csv(
    'https://media.geeksforgeeks.org/wp-content/uploads/Movie_Id_Titles.csv')
print(movies.head())
  
# We merge the data
movies_merge = pd.merge(ratings, movies, on='item_id')
movies_merge.head()
pop_movies = movies_merge.groupby("title")
pop_movies["user_id"].count().sort_values(
    ascending=False).reset_index().rename(
  columns={"user_id": "score"},inplace=True)
  
pop_movies['Rank'] = pop_movies['score'].rank(
  ascending=0, method='first')
pop_movies

The key error occurs on `pop_movies['score'].rank`, not on the rename. What exactly is the block above supposed to do? Create a ranking of movies from best to worst for each individual user? — tobias_k, Aug 05 '22 at 08:09
Please provide a minimal, reproducible example, see [How to make good reproducible pandas examples](https://stackoverflow.com/a/20159305/15873043). What did you try to solve the error? What do you see when you print/display the line containing the `rename`? — fsimonjetz, Aug 05 '22 at 08:16
@tobias_k I can't. If you use print(pop_movies['score']), the error happens. I do not know what the block is supposed to do. I am stuck on that small thing. — zzzbei, Aug 05 '22 at 08:17
@fsimonjetz Yes, I did. The DataFrame changed to a groupby object, and I do not know how to change it back. — zzzbei, Aug 05 '22 at 08:21

score 2 · Accepted Answer · answered Aug 05 '22 at 08:15

Note that movies_merge.groupby("title") does not return a df. Rather it returns a groupby object (see df.groupby):

pop_movies = movies_merge.groupby("title")
print(type(pop_movies))
<class 'pandas.core.groupby.generic.DataFrameGroupBy'>

Hence, the calculation you perform on this object produces a new df, which you first need to assign to a variable for the .rename(columns={"user_id": "score"}, inplace=True) operation to make sense:

pop_movies = pop_movies["user_id"].count().sort_values(
    ascending=False).reset_index()
print(type(pop_movies))
<class 'pandas.core.frame.DataFrame'>
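
As a side note on why the original chained call appeared to do nothing: with inplace=True, rename returns None and only modifies the temporary DataFrame produced by reset_index(), which is then thrown away. A minimal sketch with made-up data (the `toy` frame is a hypothetical stand-in for movies_merge, not the real file):

```python
import pandas as pd

# Hypothetical toy data standing in for movies_merge
toy = pd.DataFrame({
    "title": ["A", "A", "B"],
    "user_id": [1, 2, 3],
})

grouped = toy.groupby("title")

# The chain builds a temporary DataFrame, renames it in place,
# and hands back None; nothing keeps a reference to the renamed frame.
result = grouped["user_id"].count().reset_index().rename(
    columns={"user_id": "score"}, inplace=True)

print(result)                  # None
print(type(grouped).__name__)  # DataFrameGroupBy, unchanged by the chain
```

So `grouped` is still a groupby object with no 'score' column, which matches the error in the question.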

Now, the rest will work:

pop_movies.rename(
  columns={"user_id": "score"},inplace=True)
  
pop_movies['Rank'] = pop_movies['score'].rank(
  ascending=0, method='first')

print(pop_movies.head())
                       title  score  Rank
0           Star Wars (1977)    584   1.0
1             Contact (1997)    509   2.0
2               Fargo (1996)    508   3.0
3  Return of the Jedi (1983)    507   4.0
4           Liar Liar (1997)    485   5.0
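
If you prefer a single chain, one possible variant avoids inplace=True altogether and assigns the result of the whole expression. This is only a sketch on made-up data (the `toy` frame is hypothetical); it relies on Series.reset_index accepting a name argument to label the count column directly:

```python
import pandas as pd

# Hypothetical toy data standing in for movies_merge
toy = pd.DataFrame({
    "title": ["A", "A", "A", "B", "B", "C"],
    "user_id": [1, 2, 3, 1, 2, 1],
})

pop_movies = (
    toy.groupby("title")["user_id"]
    .count()                       # number of ratings per title
    .sort_values(ascending=False)  # most-rated titles first
    .reset_index(name="score")     # Series -> DataFrame, count column named "score"
)

# Rank by score, highest first; ties are broken by order of appearance
pop_movies["Rank"] = pop_movies["score"].rank(ascending=False, method="first")
print(pop_movies)
```

Because every step returns a new object, assigning the final result to pop_movies is enough and no separate rename is needed.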

That is very kind of you. If no one had helped me, this would have been a problem for me, maybe for three days. I hope next time I will find nice people just like you. — zzzbei, Aug 05 '22 at 08:47
