374

I have a Pandas Dataframe as shown below:

    1    2       3
 0  a  NaN    read
 1  b    l  unread
 2  c  NaN    read

I want to remove the NaN values with an empty string so that it looks like so:

    1    2       3
 0  a   ""    read
 1  b    l  unread
 2  c   ""    read
Maven Carvalho
  • 319
  • 1
  • 5
  • 14
user1452759
  • 8,810
  • 15
  • 42
  • 58

8 Answers8

625
df = df.fillna('')

This will fill na's (e.g. NaN's) with ''.

inplace is possible but should be avoided as it makes a copy internally anyway, and it will be deprecated:

df.fillna('', inplace=True)

To fill only a single column:

df.column1 = df.column1.fillna('')

One can use df['column1'] instead of df.column1.

fantabolous
  • 21,470
  • 7
  • 54
  • 51
  • 15
    @Mithril - `df[['column1','column2']] = df[['column1','column2']].fillna('')` – elPastor Oct 12 '17 at 01:29
  • 2
    This is giving me `SettingWithCopyWarning` – jss367 Nov 11 '20 at 22:44
  • 4
    @jss367 That's not due to this code, but rather because you've earlier created a partial view of a larger df. Very good answer here https://stackoverflow.com/a/53954986/3427777 – fantabolous Jan 26 '21 at 11:54
  • I'm curious as to why `str(np.nan)` doesn't return an empty string, which would seem to me to be the logical result. I'm sure it has something to do with the inner workings of the sausage factory. Can anyone point me to a good explanation? – JJL Jun 24 '21 at 22:14
395
import numpy as np
df1 = df.replace(np.nan, '', regex=True)

This might help. It will replace all NaNs with an empty string.

Ninjakannon
  • 3,751
  • 7
  • 53
  • 76
nEO
  • 5,305
  • 3
  • 21
  • 25
168

If you are reading the dataframe from a file (say CSV or Excel) then use :

df.read_csv(path , na_filter=False)
df.read_excel(path , na_filter=False)

This will automatically consider the empty fields as empty strings ''


If you already have the dataframe

df = df.replace(np.nan, '', regex=True)
df = df.fillna('')
Natesh bhat
  • 12,274
  • 10
  • 84
  • 125
  • na_filter is not available on read_excel() http://pandas.pydata.org/pandas-docs/stable/search.html?q=na_filter&check_keywords=yes&area=default – Marjorie Roswell Jul 31 '17 at 02:39
  • i have used it in my application . It does exist but for some reason , they haven't given this argument in the docs . It works nice for me though without errors. – Natesh bhat Aug 01 '17 at 06:40
  • It works, i'm using it in parse `xl.parse('sheet_name', na_filter=False)` – Dmitrii Nov 22 '17 at 17:33
  • I trawled through so many different threads for a fix and this is the only one that worked for my CSV file. Thanks. – Deskjokey Jan 09 '22 at 09:52
10

Use a formatter, if you only want to format it so that it renders nicely when printed. Just use the df.to_string(... formatters to define custom string-formatting, without needlessly modifying your DataFrame or wasting memory:

df = pd.DataFrame({
    'A': ['a', 'b', 'c'],
    'B': [np.nan, 1, np.nan],
    'C': ['read', 'unread', 'read']})
print df.to_string(
    formatters={'B': lambda x: '' if pd.isnull(x) else '{:.0f}'.format(x)})

To get:

   A B       C
0  a      read
1  b 1  unread
2  c      read
smci
  • 32,567
  • 20
  • 113
  • 146
Steve Schulist
  • 931
  • 1
  • 11
  • 18
  • 4
    `print df.fillna('')` by itself (without doing `df = df.fillna('')`) doesn't modify the original either. Is there a speed or other advantage to using `to_string`? – fantabolous Nov 27 '18 at 03:10
  • Fair enough, `df.fillna('')` it is! – Steve Schulist Nov 28 '18 at 15:35
  • @shadowtalker: Not necessarily, it would only be the correct answer if the OP wanted to keep the df in one format (e.g. more computationally-efficient, or saving memory on unnecessary/empty/duplicate strings), yet render it visually in a more pleasing way. Without knowing more about the use-case, we can't say for sure. – smci May 24 '19 at 23:05
7

Try this,

add inplace=True

import numpy as np
df.replace(np.NaN, '', inplace=True)
Vineesh TP
  • 7,755
  • 12
  • 66
  • 130
  • This is not an empty string, `''` and `' '` are not equivalent, While the first is treated as `False`, the value used above will be treated as `True`. – suvayu Apr 28 '21 at 09:26
4

using keep_default_na=False should help you:

df = pd.read_csv(filename, keep_default_na=False)
Bendy Latortue
  • 391
  • 5
  • 6
0

If you are converting DataFrame to JSON, NaN will give error so best solution is in this use case is to replace NaN with None.
Here is how:

df1 = df.where((pd.notnull(df)), None)
taras
  • 6,566
  • 10
  • 39
  • 50
0

I tried with one column of string values with nan.

To remove the nan and fill the empty string:

df.columnname.replace(np.nan,'',regex = True)

To remove the nan and fill some values:

df.columnname.replace(np.nan,'value',regex = True)

I tried df.iloc also. but it needs the index of the column. so you need to look into the table again. simply the above method reduced one step.

Subbu VidyaSekar
  • 2,503
  • 3
  • 21
  • 39