Data Conversion Error while applying a function to each row in pandas Python

Question

I have a pandas data frame in Python that looks something like this -

    contest_login_count  contest_participation_count  ipn_ratio
0                    1                            1   0.000000
1                    3                            3   0.083333
2                    3                            3   0.000000
3                    3                            3   0.066667
4                    5                           13   0.102804
5                    2                            3   0.407407
6                    1                            3   0.000000
7                    1                            2   0.000000
8                   53                           91   0.264151
9                    1                            2   0.000000

Now I want to apply a function to each row of this dataframe. The function is written like this -

def findCluster(clusterModel,data):
    return clusterModel.predict(data)

I apply this function to each row in this manner -

df_fil.apply(lambda x : findCluster(cluster_all,x.reshape(1,-1)),axis=1)

When I run this code, I get a warning saying -

DataConversionWarning: Data with input dtype object was converted to float64.

warnings.warn(msg, DataConversionWarning)

This warning is printed once for each row. Since I have around 450K rows in my data frame, my computer hangs while printing all these warning messages, and in an IPython notebook at that.

To test my function, I created a dummy dataframe and applied the same function to it, and that works well. Here is the code for that -

t = pd.DataFrame([[10.35,100.93,0.15],[10.35,100.93,0.15]])
t.apply(lambda x:findCluster(cluster_all,x.reshape(1,-1)),axis=1)

The output to this is -

   0  1  2
0  4  4  4
1  4  4  4

Can anyone suggest what I am doing wrong, or what I can change to make this warning go away?

What is `df_fil.info()` ? Maybe some column is not `float`. – jezrael Aug 29 '16 at 20:07
@jezrael Can you add it as an answer. This worked! :) – dragster Aug 30 '16 at 06:07

score 15 · Accepted Answer · answered Aug 30 '16 at 06:10

I think the problem is that the dtype of some column is not float.
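One way to check this is to look at the column dtypes. The snippet below is only a sketch: `df` is a small made-up frame (not the asker's data) in which `ipn_ratio` holds strings, which is one common way a column ends up with dtype object.

```python
import pandas as pd

# Made-up example frame: ipn_ratio is stored as strings (dtype object),
# while contest_login_count is a normal int64 column.
df = pd.DataFrame({
    "contest_login_count": [1, 3],
    "ipn_ratio": pd.Series(["0.000000", "0.083333"], dtype=object),
})

# Shows one dtype per column; look for any that say "object".
print(df.dtypes)

# A single row that spans an int64 column and an object column
# is itself an object-dtype Series.
row_dtype = df.iloc[0].dtype
print(row_dtype)
```

With `axis=1`, each row is handed to the function as a Series like `df.iloc[0]`, so a single object column is enough for every row to arrive as dtype object.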

You need to cast it with astype:

df['colname'] = df['colname'].astype(float)
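If you do not know in advance which column is affected, the same cast can be applied to every object column. This is a rough sketch under assumptions: `df_fil` here is a tiny stand-in built for illustration, and it assumes every object column really contains values that can be parsed as numbers.

```python
import pandas as pd

# Stand-in for df_fil (illustrative values only): ipn_ratio is object dtype.
df_fil = pd.DataFrame({
    "contest_login_count": [1, 3, 5],
    "contest_participation_count": [1, 3, 13],
    "ipn_ratio": pd.Series(["0.000000", "0.083333", "0.102804"], dtype=object),
})

# Collect the names of all columns whose dtype is object.
object_cols = [col for col in df_fil.columns if df_fil[col].dtype == object]

# Cast each of those columns to float, one at a time.
for col in object_cols:
    df_fil[col] = df_fil[col].astype(float)

# After the cast, no column should be object any more.
print(df_fil.dtypes)
```

If a column holds something that cannot be parsed as a number, `astype(float)` raises a `ValueError`, which at least tells you which column needs cleaning first. Once every column is numeric, the rows passed by `apply(..., axis=1)` are float64 rather than object.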

jezrael
