System: Windows 10
IDE: Anaconda / JupyterLab
Language: Python version 3.7.3
Library: pandas version 1.0.1
Data source: https://grouplens.org/datasets/movielens/
Dataset: movies.csv; ratings.csv (ml-25m.zip)
I am running into an error when building a pivot table. The merged table has over 25M records, and the pivot call keeps throwing the following error: IndexError: index 993158425 is out of bounds for axis 0 with size 993157686
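One thing that may be worth checking is how many cells the dense pivot would contain, since that is the number of distinct users times the number of distinct titles, not the number of rating records. Below is a minimal sketch of such a check. The helper name `dense_pivot_cells` and the tiny `sample` frame are hypothetical and only stand in for the merged table.

```python
import numpy as np
import pandas as pd

# Largest value a signed 32-bit integer can hold (2147483647).
INT32_MAX = np.iinfo(np.int32).max


def dense_pivot_cells(df, index, columns):
    """Return how many cells a dense pivot on (index, columns) would hold."""
    n_rows = df[index].nunique()
    n_cols = df[columns].nunique()
    # Python ints do not overflow, so this product is exact.
    return int(n_rows) * int(n_cols)


# Tiny stand-in for the merged frame (hypothetical values).
sample = pd.DataFrame({
    "userId": [1, 1, 2, 3],
    "title": ["A", "B", "A", "C"],
    "rating": [4.0, 3.5, 5.0, 2.0],
})

cells = dense_pivot_cells(sample, "userId", "title")
fits_int32 = cells <= INT32_MAX
print(cells, fits_int32)
```

If the real table had, say, 162541 users and 58958 distinct titles (assumed figures, not measured on my data), the product would be 9583092278. That is larger than 2**31 - 1, and 9583092278 modulo 2**32 is 993157686, the size reported in the error, so a 32-bit integer overflow in the reshape is a plausible explanation rather than a confirmed one.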
Steps taken so far:
- checked the data frame for NaN values and cleaned those up
- searched online for the error message and could not find anything
- tried different ways of building the pivot table: .pivot and .pivot_table
- looked at crosstab as a workaround, but it will not work here
Code:
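import pandas as pd

# Load the MovieLens files and join them on the shared movieId column
df1_movies = pd.read_csv('Data/movies.csv')
df1_ratings = pd.read_csv('Data/ratings.csv')
df1_main = pd.merge(df1_movies, df1_ratings, on='movieId')
# Build a userId x title table of ratings (this is the line that fails)
table = df1_main.pivot_table(index='userId', columns='title', values='rating')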
Error:
IndexError: index 993158425 is out of bounds for axis 0 with size 993157686
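For comparison, here is a sketch of an alternative that keeps the same per-user, per-title values in long form instead of a wide table. It relies on the fact that pivot_table aggregates with the mean by default, so a groupby mean over the same two keys gives the same values without creating a cell for every user and title combination. The function name `long_format_ratings` and the `sample` frame are hypothetical. I have not confirmed that this avoids the error on the full 25M-row table.

```python
import pandas as pd


def long_format_ratings(df, index="userId", columns="title", values="rating"):
    """Mean rating per (index, columns) pair, kept as a long Series."""
    # One entry per pair that actually occurs; missing pairs are simply absent.
    return df.groupby([index, columns], sort=False)[values].mean()


# Tiny stand-in for the merged frame (hypothetical values).
sample = pd.DataFrame({
    "userId": [1, 1, 1, 2, 3],
    "title": ["A", "A", "B", "A", "C"],
    "rating": [4.0, 5.0, 3.5, 5.0, 2.0],
})

long_ratings = long_format_ratings(sample)
print(long_ratings)
```

Calling .unstack() on the result would rebuild the wide table, so it would presumably hit the same size problem on the full data.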