Here is my code to calculate the square difference between imdbRating and imdbVotes:

imdb_data['imdbVotes'] = imdb_data['imdbVotes'].astype(int)  
imdb_data['imdbRating'] = imdb_data['imdbRating'].astype(int)

imdb_data['new'] = imdb_data['imdbRating'] - imdb_data['imdbVotes']

This is the error I got with Python 3.7.0 + pandas 0.23.4:

TypeError: string indices must be integers

(imdb_data is a DataFrame and the referenced column names do exist.)

smci

1 Answer

The columns imdbRating and imdbVotes should have a float dtype. Convert them from string to float first, then do your calculation.

import pandas as pd

imdb_data = pd.read_csv('IMDB_data.csv', sep=',', encoding='ISO-8859-1')
imdb_data['imdbRating'] = pd.to_numeric(imdb_data['imdbRating'], errors='coerce', downcast='float')
imdb_data['imdbVotes'] = pd.to_numeric(imdb_data['imdbVotes'], errors='coerce', downcast='float')
imdb_data['new'] = imdb_data['imdbRating'] - imdb_data['imdbVotes']
imdb_data.head()

Output:

    Plot    Title   imdbVotes   Poster  imdbRating  Genre   imdbID  Year    Language    new
0   Despite his tarnished reputation after the eve...   The Dark Knight Rises   2679.0  http://ia.media-imdb.com/images/M/MV5BMTk4ODQz...   75.0    Action, Thriller    tt1345836   2012    English     -2604.0
1   0   0   0.0     0   0.0     0   0   0   0   0.0
2   Based on the novel written by Stephen Chbosky,...   The Perks of Being a Wallflower     1270.0  http://ia.media-imdb.com/images/M/MV5BMzIxOTQy...   71.0    Drama, Romance  tt1659337   2012    English     -1199.0
3   Mike Lane is a thirty-year old living in Tampa...   Magic Mike  2580.0  http://ia.media-imdb.com/images/M/MV5BMTQzMDMz...   51.0    Comedy, Drama   tt1915581   2012    English     -2529.0
4   When Bond's latest assignment goes gravely wro...   Skyfall     1807.0  http://ia.media-imdb.com/images/M/MV5BMjAyODkz...   68.0    Action, Thriller    tt1074638   2012    English     -1739.0
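
To see what errors='coerce' does in isolation, here is a minimal sketch on a small made-up DataFrame (the sample values, including the 'N/A' entry, are assumptions for illustration and do not come from IMDB_data.csv). It also squares the difference, since the question asks for the square difference while the code above stops at the plain difference.

```python
import pandas as pd

# Hypothetical sample data: both columns arrive as strings,
# and one rating is a non-numeric placeholder.
imdb_data = pd.DataFrame({
    "imdbRating": ["7.5", "N/A", "6.8"],
    "imdbVotes": ["2679", "1270", "1807"],
})

# errors="coerce" turns anything that cannot be parsed into NaN
# instead of raising an exception.
imdb_data["imdbRating"] = pd.to_numeric(imdb_data["imdbRating"], errors="coerce")
imdb_data["imdbVotes"] = pd.to_numeric(imdb_data["imdbVotes"], errors="coerce")

# Element-wise difference, then squared; rows with NaN stay NaN.
diff = imdb_data["imdbRating"] - imdb_data["imdbVotes"]
imdb_data["new"] = diff ** 2

print(imdb_data)
```

The row with the unparseable rating ends up as NaN in the new column, so it is worth checking imdb_data['new'].isna().sum() before relying on the result.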
Prince Francis