I'm trying out some data analysis with pandas and ran into a problem.
Input:
import pandas as pd
path = "/PATH/TO/FILE/"
rnames = ["user_id", "movie_id", "rating", "timestamp"]
ratings = pd.read_csv(path + "ratings.csv", engine="python", sep=",", header=0, names=rnames)
mnames = ["movie_id", "title", "genres"]
movies = pd.read_csv(path + "movies.csv", engine="python", sep=",", header=0, names=mnames)
print(ratings[:5])
print(movies[:5])
print(movies.dtypes)
Output:
   user_id  movie_id  rating   timestamp
0        1        31     2.5  1260759144
1        1      1029     3.0  1260759179
2        1      1061     3.0  1260759182
3        1      1129     2.0  1260759185
4        1      1172     4.0  1260759205
   movie_id  ...                                       genres
0         1  ...  Adventure|Animation|Children|Comedy|Fantasy
1         2  ...                   Adventure|Children|Fantasy
2         3  ...                               Comedy|Romance
3         4  ...                         Comedy|Drama|Romance
4         5  ...                                       Comedy

[5 rows x 3 columns]
movie_id     int64
title       object
genres      object
dtype: object
Process finished with exit code 0
The movies.csv file is from MovieLens and looks like this:
movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji (1995),Adventure|Children|Fantasy
3,Grumpier Old Men (1995),Comedy|Romance
4,Waiting to Exhale (1995),Comedy|Drama|Romance
5,Father of the Bride Part II (1995),Comedy
As you can see, the title column isn't displayed correctly: the output shows "..." instead of the titles.
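To rule out the file itself, here is a minimal sketch that rebuilds the five rows above from an inline string. The use of io.StringIO in place of the real file is only for illustration. It checks whether the title column is actually loaded, and it temporarily lifts pandas' display limits on the assumption that the "..." is purely a printing issue. The width of 1000 is an arbitrary value I picked, not something from the data.

```python
import io
import pandas as pd

# Sample rows copied from the movies.csv excerpt above (stand-in for the real file).
csv_text = """movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji (1995),Adventure|Children|Fantasy
3,Grumpier Old Men (1995),Comedy|Romance
4,Waiting to Exhale (1995),Comedy|Drama|Romance
5,Father of the Bride Part II (1995),Comedy
"""

mnames = ["movie_id", "title", "genres"]
movies = pd.read_csv(io.StringIO(csv_text), sep=",", header=0, names=mnames)

# Check that the title column holds data even though the summary hides it.
print(movies["title"].tolist())

# Assumption: the "..." comes from pandas' display limits. Lift them for this
# one block only; 1000 is an arbitrary "wide enough" value.
with pd.option_context("display.max_columns", None, "display.width", 1000):
    full_repr = repr(movies)
print(full_repr)

# to_string() renders the whole frame regardless of the display options.
print(movies.to_string())
```

If the titles show up here, the data is loaded fine and only the console display is cutting the column out.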
Can someone help me, please? :)