Convert Pandas Dataframe to Numpy

Asked Mar 24 '17 at 10:06

Active Feb 03 '19 at 21:36

I have a DataFrame read from the MovieLens dataset. It has the following format:

   user_id  item_id  rating  timestamp
0      196      242       3  881250949
1      186      302       3  891717742
2       22      377       1  878887116
3      244       51       2  880606923
4      166      346       1  886397596

I would like to convert it to a numpy.ndarray. Here is my working code:

MyCF.train_data_matrix = numpy.zeros((n_users, n_items))
for line in MyCF.train_data.itertuples():
    MyCF.train_data_matrix[line[1] - 1, line[2] - 1] = line[3]

However, it is too slow when the DataFrame is very large. Is there an efficient function in pandas to convert my pandas.DataFrame to a numpy.array? The array should have this format:

matrix[user_id][item_id]=rating
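
One way to avoid the Python-level loop is NumPy integer-array indexing, which writes all ratings in a single assignment. The following is only a minimal sketch: it rebuilds the five sample rows shown above, assumes the ids are 1-based integers (as the `- 1` in the loop suggests), and derives `n_users` and `n_items` from the largest id, which may differ from how they are computed in the real code. `to_numpy()` needs pandas 0.24 or later; `.values` is the older equivalent.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question; the real train_data would be much larger.
train_data = pd.DataFrame({
    "user_id": [196, 186, 22, 244, 166],
    "item_id": [242, 302, 377, 51, 346],
    "rating": [3, 3, 1, 2, 1],
    "timestamp": [881250949, 891717742, 878887116, 880606923, 886397596],
})

# Assumption: ids are 1-based, so the largest id gives the matrix size.
n_users = int(train_data["user_id"].max())
n_items = int(train_data["item_id"].max())

# Shift ids to 0-based row and column positions.
rows = train_data["user_id"].to_numpy() - 1
cols = train_data["item_id"].to_numpy() - 1

# Fill every (user, item) cell in one vectorised assignment.
train_data_matrix = np.zeros((n_users, n_items))
train_data_matrix[rows, cols] = train_data["rating"].to_numpy()
```

With this layout, `train_data_matrix[user_id - 1, item_id - 1]` holds the rating, and cells without a rating stay at 0, matching the loop version.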

edited Feb 03 '19 at 21:36

cs95

asked Mar 24 '17 at 10:06

Hailin FU

[`as_matrix`](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.as_matrix.html) – PidgeyUsedGust Mar 24 '17 at 10:07
Do you rather want a dictionary, judging by the line `matrix[user_id][item_id]=rating`? Why not keep the DataFrame structure? – Colonel Beauvel Mar 24 '17 at 10:08
I need a matrix to calculate user similarity. With a matrix structure, numpy, and sklearn, a lot of work can be done quickly. Thanks for your reply. – Hailin FU Mar 24 '17 at 13:33
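
For a user-by-item matrix of this kind, a pandas-only route may also work: reshape with `DataFrame.pivot`, then convert the result to an array. This is a rough sketch rather than a tested answer. The name `ratings` and the sample rows are placeholders, the ids are assumed to be 1-based, and `pivot` raises a `ValueError` if the same (user_id, item_id) pair appears more than once.

```python
import numpy as np
import pandas as pd

# Placeholder data: the sample rows from the question.
ratings = pd.DataFrame({
    "user_id": [196, 186, 22, 244, 166],
    "item_id": [242, 302, 377, 51, 346],
    "rating": [3, 3, 1, 2, 1],
})

# Assumption: ids are 1-based and the largest id gives the matrix size.
n_users = int(ratings["user_id"].max())
n_items = int(ratings["item_id"].max())

# One row per user, one column per item; unrated cells become NaN.
wide = ratings.pivot(index="user_id", columns="item_id", values="rating")

# pivot only keeps ids that occur in the data, so reindex to the full id range.
wide = wide.reindex(index=range(1, n_users + 1), columns=range(1, n_items + 1))

# Replace missing ratings with 0 and convert to a NumPy array.
matrix = wide.fillna(0).to_numpy()
```

The reindex step matters when row `i` has to correspond to user id `i + 1`; without it, the matrix only has rows and columns for ids that actually appear.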
