16

I have a pandas DataFrame (created by importing a CSV file). I want to replace blank values with NaN. Some of these blank values are empty and some contain a variable number of spaces: '', ' ', '  ', etc.

Using the suggestion from this thread I have

df.replace(r'\s+', np.nan, regex=True, inplace = True)

which does replace all the strings that only contain spaces, but also replaces every string that has a space in it, which is not what I want.

How do I replace only strings with just spaces and empty strings?

Rajshekar Reddy
doctorer
  • I believe this is a duplicate question: https://stackoverflow.com/questions/13445241/replacing-blank-values-white-space-with-nan-in-pandas – Nick K9 Feb 17 '18 at 23:27
  • @NickK9 I linked to that thread in my question. The question and accepted answer replace cells containing _any_ white space. My question is about replacing cells containing _only_ white space. So, I disagree that it is a duplicate. – doctorer Feb 18 '18 at 23:51

2 Answers

19

Anchor the pattern with ^ and $ so that it matches only when the entire string is empty or consists solely of whitespace:

df.replace(r'^\s*$', np.nan, regex=True, inplace = True)
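
To illustrate the difference between the two patterns, here is a minimal sketch. The DataFrame and its column names are made up for the example and are not from the original dataset; it also returns new frames instead of using inplace=True so the two results can be compared side by side.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: empty, whitespace-only, and ordinary strings.
df = pd.DataFrame({
    "name": ["", "   ", "John Smith", " padded "],
    "city": ["Oslo", " ", "", "New York"],
})

# Unanchored pattern: a cell is replaced if whitespace occurs anywhere in it,
# so "John Smith" and "New York" are lost, while "" is left untouched.
loose = df.replace(r"\s+", np.nan, regex=True)

# Anchored pattern: a cell is replaced only if the whole string is empty
# or made up entirely of whitespace.
strict = df.replace(r"^\s*$", np.nan, regex=True)

print(loose)
print(strict)
```

With the anchored pattern, strings that merely contain spaces or have spaces around them (such as " padded ") are kept as they are.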
Zeugma
  • This would also trap strings embedded in spaces such as ' thisstring ' but it will work for my dataset. – doctorer Nov 21 '16 at 02:54
  • @Boud Shouldn't the pattern really be r'^\s*$' instead? The original answer doesn't match the empty string ''. – ihightower Jun 18 '17 at 07:15
  • @Boud Stumbled upon this answer when I was stuck, unable to replace a few empty cells with replace(). This worked perfectly, thanks. – sudheer Jul 15 '20 at 09:20
6

If you are reading a CSV file and want to convert all empty strings to NaN while reading the file, you can use the option

skipinitialspace=True

Example code

pd.read_csv('Sample.csv', skipinitialspace=True)

This removes any whitespace that appears immediately after a delimiter, so fields that contain only spaces become empty and are read as NaN.

The documentation (http://pandas.pydata.org/pandas-docs/stable/io.html) describes skipinitialspace as skipping spaces after the delimiter.

Note: This option also removes leading whitespace from valid data. If you need to retain leading whitespace for any reason, this option is not a good choice.
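
A small sketch of the effect, using an in-memory string in place of Sample.csv. The CSV contents and column names below are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for Sample.csv.
csv_text = (
    "id,name,city\n"
    "1,   ,Oslo\n"
    "2,John Smith,\n"
    "3,  Ann Lee,  \n"
)

# Default parsing: whitespace-only fields are kept as strings.
default = pd.read_csv(io.StringIO(csv_text))

# With skipinitialspace=True, spaces after each delimiter are dropped,
# so whitespace-only fields become empty and are parsed as NaN.
stripped = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

print(default)
print(stripped)
```

In this sketch the name in row 3 is read as "Ann Lee" rather than "  Ann Lee", which is the side effect described in the note above.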

Rajshekar Reddy