3

Simple question here: how do I replace all of the whitespace-only values in a column with zero?

For example:

  Name       Age
  John       12
  Mary 
  Tim        15

into

  Name       Age
  John       12
  Mary       0
  Tim        15

I've been trying something like this, but I am unsure how pandas actually reads whitespace:

 merged['Age'].replace(" ", 0).bfill()

Any ideas?

user3682157

4 Answers

4
merged['Age'] = merged['Age'].apply(lambda x: 0 if x == ' ' else x)
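
Note that the comparison above only matches a cell that is exactly one space. Here is a minimal sketch of a variant that treats any empty or whitespace-only string as blank, assuming the column may mix numbers and strings (the sample frame is made up for illustration):

```python
import pandas as pd

# Hypothetical sample data; the blank Age cell holds several spaces
merged = pd.DataFrame({"Name": ["John", "Mary", "Tim"], "Age": [12, "   ", 15]})

# Replace strings that are empty after stripping with 0; leave other values alone
merged["Age"] = merged["Age"].apply(
    lambda x: 0 if isinstance(x, str) and x.strip() == "" else x
)
print(merged["Age"].tolist())  # [12, 0, 15]
```

The isinstance check keeps numeric cells from raising AttributeError on .strip().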
paulo.filip3
3

Use the built-in method convert_objects and set the parameter convert_numeric=True:

In [12]:
# convert objects will handle multiple whitespace, this will convert them to NaN
# we then call fillna to convert those to 0
df.Age = df[['Age']].convert_objects(convert_numeric=True).fillna(0)
df
Out[12]:
   Name  Age
0  John   12
1  Mary    0
2   Tim   15
EdChum
  • This is a little bit out of the scope of the OP's request, since it will kill any non-numeric data that he might have had in there, rather than just the whitespace. – Patrick Collins Aug 30 '14 at 19:52
  • @PatrickCollins there is nothing in the original question about additional situations to handle, this is the recommended method unless it doesn't fulfill the OP's specific data – EdChum Aug 30 '14 at 19:54
  • Thank you Patrick for the heads up; however, in this particular instance I do not have any non-numerical data, so Ed's solution worked perfectly. Thank you (you have both received an upvote!) – user3682157 Aug 30 '14 at 20:30
  • df.convert_objects() has been deprecated since v0.21.0. The preferred solution is now df.Age = pd.to_numeric(df['Age']).fillna(0) – David Bridgeland Sep 16 '20 at 21:29
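
A minimal sketch of the pd.to_numeric approach from the last comment, for anyone on a recent pandas. The errors="coerce" argument is my addition: it turns anything that cannot be parsed (whitespace included) into NaN instead of raising, which also means genuinely non-numeric values become 0 after fillna. The sample frame is made up for illustration:

```python
import pandas as pd

# Hypothetical sample data; Age is read in as strings, with one blank cell
df = pd.DataFrame({"Name": ["John", "Mary", "Tim"], "Age": ["12", " ", "15"]})

# Unparseable values (such as " ") become NaN, then NaN becomes 0
df["Age"] = pd.to_numeric(df["Age"], errors="coerce").fillna(0).astype(int)
print(df["Age"].tolist())  # [12, 0, 15]
```

The final astype(int) is optional; without it the column stays float because NaN forces a float dtype during the conversion.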
1

Here's an answer modified from this, more thorough question. I'll make it a little bit more Pythonic and resolve your basestring issue.

def ws_to_zero(maybe_ws):
    try:
        if maybe_ws.isspace():
            return 0
        else:
            return maybe_ws
    except AttributeError:
        return maybe_ws

d = d.applymap(ws_to_zero)  # applymap returns a new DataFrame, so assign it back

where d is your dataframe.
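
For newer pandas versions, here is a rough sketch of the same idea. As far as I know, applymap was deprecated in pandas 2.1 in favour of DataFrame.map, so this picks whichever one is available. The isinstance check stands in for the try/except above, and the sample frame is made up for illustration:

```python
import pandas as pd

def ws_to_zero(maybe_ws):
    """Return 0 for whitespace-only strings; pass everything else through."""
    if isinstance(maybe_ws, str) and maybe_ws.isspace():
        return 0
    return maybe_ws

# Hypothetical sample data with a whitespace-only Age cell
d = pd.DataFrame({"Name": ["John", "Mary", "Tim"], "Age": [12, "  ", 15]})

# DataFrame.map exists from pandas 2.1; fall back to applymap on older versions
elementwise = d.map if hasattr(pd.DataFrame, "map") else d.applymap
d = elementwise(ws_to_zero)
print(d["Age"].tolist())  # [12, 0, 15]
```

Keep in mind that "".isspace() is False, so empty strings are left untouched by this function.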

Patrick Collins
0

If you want to use NumPy, you can use the snippet below:

import numpy as np
df['column_of_interest'] = np.where(df['column_of_interest']==' ',0,df['column_of_interest']).astype(float)

While Paulo's response is excellent, my snippet above may be useful when multiple criteria are required during advanced data manipulation.
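
As a sketch of what "multiple criteria" could look like, the condition passed to np.where can be built from several boolean masks combined with |. The two criteria here (whitespace-only or empty string) and the sample frame are assumptions chosen for illustration, and the column is assumed to hold strings:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; Age holds strings, some blank or empty
df = pd.DataFrame({"Name": ["John", "Mary", "Tim", "Ann"],
                   "Age": ["12", " ", "15", ""]})

col = df["Age"]
# Criterion 1: whitespace-only strings; criterion 2: empty strings
is_blank = col.str.isspace() | (col == "")

# Where the mask is True use 0, otherwise keep the original value
df["Age"] = np.where(is_blank, 0, col).astype(float)
print(df["Age"].tolist())  # [12.0, 0.0, 15.0, 0.0]
```

Further conditions can be chained onto the mask with | or & as needed.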

Scott Grammilo