
I have a dataframe with very mixed columns. I am trying to set all occurrences of None or NaN to None.

I am trying the answer to this question: Use None instead of np.nan for null values in pandas DataFrame. But the accepted answer does not catch all null instances. Example:

import pandas as pd

my_array = ['1', '2', None, 4]
df = pd.DataFrame([my_array], columns=['Morning', 'Midday', 'Evening', 'Night'])
# Note: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0
df = df.append({'Midday': '10'}, ignore_index=True)

which results in

  Morning Midday Evening  Night
0       1      2    None    4.0
1     NaN     10     NaN    NaN

Applying df.where() to find and replace all null values results in:

df.where(df.notnull(), None)

  Morning Midday Evening  Night
0       1      2    None    4.0
1    None     10    None    NaN

But I want output

  Morning Midday Evening  Night
0       1      2    None    4.0
1    None     10    None    None

What am I getting wrong, or is df.where() getting it wrong?
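Inspecting the column dtypes shows what is going on: the string columns are `object`, but `Night` is numeric, so it cannot hold `None`. A minimal reproduction (a sketch; `pd.concat` stands in for `DataFrame.append`, which was removed in pandas 2.0):

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame([['1', '2', None, 4]],
                  columns=['Morning', 'Midday', 'Evening', 'Night'])
df = pd.concat([df, pd.DataFrame([{'Midday': '10'}])], ignore_index=True)

# Morning/Midday/Evening are object dtype; Night was inferred as numeric
# and became float64 when the missing second-row value was filled with NaN.
print(df.dtypes)
```

Only the float64 `Night` column refuses the `None` replacement; the `object` columns accept it.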

KingOtto

1 Answer


You need to cast to the `object` dtype first: a numeric column cannot hold `None`, so pandas coerces the fill value back to `NaN`.

out = df.astype(object).where(df.notna(), None)

which gives

  Morning Midday Evening Night
0       1      2    None   4.0
1    None     10    None  None
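A small sketch of why the cast matters, on a single Series (assuming the behavior described above: `None` is not storable in a float64 column, only in `object`):

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan])

# Without the cast, the column stays float64 and the None
# fill value is silently coerced back to NaN.
coerced = s.where(s.notna(), None)
print(coerced.dtype)        # float64

# Cast to object first and the Python None singleton survives.
fixed = s.astype(object).where(s.notna(), None)
print(fixed[1] is None)     # True
```

The trade-off: `object` columns lose vectorized numeric operations, so the cast is worth doing only when you genuinely need `None` downstream (e.g., handing values to a database driver).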
BENY
  • Great answer, works for me. But please help me understand why: both `df.where()` and `df.notna()` can be applied to numeric dtypes, so why the cast to `object`? – KingOtto Jan 21 '22 at 15:45
  • @KingOtto Correct, but the column dtype determines whether a null is stored as NaN or None – BENY Jan 21 '22 at 15:48