I have a pandas DataFrame in Python that looks something like this:
   contest_login_count  contest_participation_count  ipn_ratio
0                    1                            1   0.000000
1                    3                            3   0.083333
2                    3                            3   0.000000
3                    3                            3   0.066667
4                    5                           13   0.102804
5                    2                            3   0.407407
6                    1                            3   0.000000
7                    1                            2   0.000000
8                   53                           91   0.264151
9                    1                            2   0.000000
Now I want to apply a function to each row of this DataFrame. The function is written like this:
def findCluster(clusterModel, data):
    return clusterModel.predict(data)
I apply this function to each row like this:
df_fil.apply(lambda x : findCluster(cluster_all,x.reshape(1,-1)),axis=1)
When I run this code, I get the following warning:
DataConversionWarning: Data with input dtype object was converted to float64.
warnings.warn(msg, DataConversionWarning)
This warning is printed once for each row. Since I have around 450K rows in my DataFrame, my computer hangs while printing all these warning messages in the IPython notebook.
To test my function, I created a dummy DataFrame and applied the same function to it, and it works fine. Here is the code for that:
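As a stopgap for the flood of messages, the standard `warnings` module can silence one warning category inside a limited scope. The sketch below is only an illustration: the `DataConversionWarning` class and the `noisy_predict` function are hypothetical stand-ins defined here, not the library's own objects, and silencing the warning hides the symptom without addressing whatever causes the dtype conversion.

```python
import warnings


class DataConversionWarning(UserWarning):
    """Hypothetical stand-in for the library's warning class."""


def noisy_predict(row):
    # Hypothetical stand-in for a predict call that warns on every row
    warnings.warn(
        "Data with input dtype object was converted to float64.",
        DataConversionWarning,
    )
    return 0


# record=True collects any warnings that get through, so the effect can be checked
with warnings.catch_warnings(record=True) as caught:
    # Ignore only this category, and only inside this block
    warnings.simplefilter("ignore", category=DataConversionWarning)
    results = [noisy_predict(row) for row in range(5)]

print(len(caught))
```

The filter is restored when the `with` block exits, so other warnings elsewhere in the notebook are unaffected.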
t = pd.DataFrame([[10.35,100.93,0.15],[10.35,100.93,0.15]])
t.apply(lambda x:findCluster(cluster_all,x.reshape(1,-1)),axis=1)
The output of this is:
   0  1  2
0  4  4  4
1  4  4  4
Can anyone suggest what I am doing wrong, or what I can change to make this warning go away?
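One thing worth checking, judging only by the warning text, is whether the columns of df_fil are stored as object dtype while the dummy frame holds plain floats. The sketch below assumes that is the case. It uses a made-up three-row frame and a hypothetical `StubClusterModel` in place of the real cluster_all (whose type is not shown above), converts the frame to float64 once, and calls predict a single time on all rows instead of once per row.

```python
import numpy as np
import pandas as pd


class StubClusterModel:
    """Hypothetical stand-in for cluster_all: picks the nearest centroid per row."""

    def __init__(self, centroids):
        self.centroids = np.asarray(centroids, dtype=float)

    def predict(self, data):
        data = np.asarray(data, dtype=float)
        # Squared distances between every row and every centroid
        diffs = data[:, None, :] - self.centroids[None, :, :]
        return (diffs ** 2).sum(axis=2).argmin(axis=1)


# Made-up frame with object-dtype columns (an assumption about df_fil)
df_fil = pd.DataFrame(
    {
        "contest_login_count": [1, 3, 53],
        "contest_participation_count": [1, 3, 91],
        "ipn_ratio": [0.0, 0.083333, 0.264151],
    },
    dtype=object,
)
print(df_fil.dtypes)  # shows whether the columns are object or numeric

# Convert once, up front, rather than once per row inside apply
numeric = df_fil.astype("float64")

cluster_all = StubClusterModel([[1.0, 1.0, 0.0], [50.0, 90.0, 0.3]])

# A single call over the whole 2-D array returns one label per row
labels = pd.Series(
    cluster_all.predict(numeric.to_numpy()),
    index=df_fil.index,
    name="cluster",
)
print(labels.tolist())
```

If the real model accepts a 2-D array the way this stand-in does, the row-wise apply and the reshape(1, -1) call become unnecessary, which should also be much faster on 450K rows.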