pandas newbie here. I have a DataFrame in which some values are '?', and I have successfully replaced them with 'NaN'. I would now like to replace each 'NaN' with the mean of its column, but the 'NaN' values are still there after I call fillna. I've tried the solution from the question linked below, but it does not work for me, as the output further down shows.

pandas DataFrame: replace nan values with average of columns

Code:

       df = pd.DataFrame(cancer)
       print(df)
       df['A7'] = df['A7'].replace(['?'],"NaN")
       print(df)
       # the code below is where my issue arises
       df.fillna(df.mean())
       print(df)

Before ? is replaced with NaN:

              Scn  A2  A3  A4  A5  A6  A7  A8  A9  A10  CLASS
     [.....]
     21   1054593  10   5   5   3   6   7   7  10    1      4
     22   1056784   3   1   1   1   2   1   2   1    1      2
     23   1057013   8   4   5   1   2   ?   7   3    1      4

Before NaN is replaced with mean:

              Scn  A2  A3  A4  A5  A6   A7  A8  A9  A10  CLASS
     [.....]
     21   1054593  10   5   5   3   6    7   7  10    1      4
     22   1056784   3   1   1   1   2    1   2   1    1      2
     23   1057013   8   4   5   1   2  NaN   7   3    1      4

After trying to replace NaN with the average (the output is unchanged):

              Scn  A2  A3  A4  A5  A6   A7  A8  A9  A10  CLASS
     [.....]
     21   1054593  10   5   5   3   6    7   7  10    1      4
     22   1056784   3   1   1   1   2    1   2   1    1      2
     23   1057013   8   4   5   1   2  NaN   7   3    1      4

I'm not sure what I am doing wrong.
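
For reference, two things in the code above look like probable causes: `"NaN"` is a string rather than a real missing value, and `fillna` returns a new DataFrame instead of modifying `df` in place. The following is only a minimal sketch of how that might be addressed; the small DataFrame is a hypothetical stand-in for `cancer`, and it assumes `A7` is the only column containing '?'.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the `cancer` data: a few rows and columns only.
df = pd.DataFrame({
    "Scn": [1054593, 1056784, 1057013],
    "A6": [6, 2, 2],
    "A7": ["7", "1", "?"],
})

# Use a real missing value (np.nan) instead of the string "NaN".
df["A7"] = df["A7"].replace("?", np.nan)

# The column still holds strings, so convert it to numbers before averaging.
df["A7"] = pd.to_numeric(df["A7"])

# fillna returns a new DataFrame; assign the result back to keep it.
df = df.fillna(df.mean(numeric_only=True))
print(df)
```

With this sample data the missing `A7` entry becomes 4.0, the mean of 7 and 1.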
