I have a pandas DataFrame with mixed data types. I would like to replace all null values with None (instead of default np.nan). For some reason, this appears to be nearly impossible.

In reality my DataFrame is read in from a csv, but here is a simple DataFrame with mixed data types to illustrate my problem.

import numpy as np
import pandas as pd

df = pd.DataFrame(index=[0], columns=range(5))
df.iloc[0] = [1, 'two', np.nan, 3, 4]

I can't do:

>>> df.fillna(None)
ValueError: must specify a fill method or value

nor:

>>> df[df.isnull()] = None
TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value

nor:

>>> df.replace(np.nan, None)
TypeError: cannot replace [nan] with method pad on a DataFrame

I used to have a DataFrame with only string values, so I could do:

>>> df[df == ""] = None

which worked. But now that I have mixed datatypes, it's a no go.

For reasons specific to my code, it would be helpful to use None as my null value. Is there a way to set the null values to None? Or do I have to go back through my other code and make sure I'm using np.isnan or pd.isnull everywhere?

J Jones

3 Answers

Use pd.DataFrame.where.
It keeps the df value where the condition is met and uses None otherwise:

df.where(df.notnull(), None)
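
A minimal sketch of this on an all-object frame like the one in the question (the column names `a` and `b` are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame; dtype=object mirrors the question's DataFrame
df = pd.DataFrame({"a": [1, np.nan], "b": ["two", np.nan]}, dtype=object)

# Keep each value where it is not null, put None everywhere else
result = df.where(df.notnull(), None)
```

Because the columns are object dtype, the missing cells in `result` should hold a real Python None, so a check such as `result.loc[1, "a"] is None` can be used to confirm it.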

piRSquared

Expanding on the accepted answer: when you also need to catch NaN values within numeric dtype columns, you may need to change the dtype to object first:

df.astype(object).where(df.notna(), None)

as per original reply by @BENNY
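
A short sketch of why the cast matters, assuming a frame where one column is float64 (the names `num` and `txt` are illustrative only). A float column cannot store None, so converting to object first gives `where` a dtype that can hold it:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one float column with NaN, one text column with a missing value
df = pd.DataFrame({"num": [1.5, np.nan], "txt": ["two", None]})

# Cast every column to object so None can be stored, then swap nulls for None
result = df.astype(object).where(df.notna(), None)
```

The trade-off is that `num` ends up as an object column, so numeric operations on it will be slower and may need a cast back.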

KingOtto

Found this helpful for replacing NaN values with None:

df.replace({pd.np.nan: None})
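
For reference, the same idea written with a plain `numpy` import, since the `pd.np` alias may not be available in current pandas releases. This is only a sketch on a made-up frame (the names `num` and `txt` are not from the question), and the exact result can depend on the pandas version:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a NaN in the float column
df = pd.DataFrame({"num": [1.5, np.nan], "txt": ["two", "three"]})

# Dict form: map NaN to None; the float column is expected to become object dtype
result = df.replace({np.nan: None})
```

The dict form is used here because the positional form `df.replace(np.nan, None)` is the one that raised the error shown in the question.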

apinanyogaratnam
    Note that `pd.np.nan` is deprecated, but you can also just use `float("nan")` instead of importing numpy. – alexia Oct 14 '22 at 12:49