I import some data from Excel into a DataFrame. Some of the Excel cells look blank but are not actually empty: someone pressed the spacebar in them, so they contain only whitespace. I tried to clean this up with the line below, but the cells still don't show as NaN in the DataFrame afterwards. Is there a function that can clean them?

df.columns = df.columns.str.strip()
AAP

2 Answers

I can't reply to your comment because I have no rep :(.

If I understand you correctly, you want to put a NaN value wherever a cell contains only a space?

I tried the following and it seems to work, let me know if this helps.

import pandas as pd
import numpy as np

df = pd.DataFrame({'Names': ['betty', 'chris',' ',  'steve', 'carly']})

# Name the column as well, so only that cell (not the whole row) becomes NaN
df.loc[df['Names'] == ' ', 'Names'] = np.nan

If you need to iterate over each column you can put the df.loc within a loop like the following.

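df = pd.DataFrame({'Names': ['betty', 'chris', ' ', 'steve', 'carly'],
                   'Age': ['40', ' ', '32', '44', '69']})

for col in df.columns:
    # Use .loc on the DataFrame itself; df[col].loc[...] = ... is chained
    # assignment and may not modify df
    df.loc[df[col] == ' ', col] = np.nan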
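
One caveat: comparing against ' ' only matches cells that hold exactly one space. Here is a minimal sketch of a variant that also catches several spaces or a tab. It assumes all columns hold strings, and the sample data is made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample data: cells with one space, several spaces, and a tab.
df = pd.DataFrame({
    'Names': ['betty', 'chris', '   ', 'steve', 'carly'],
    'Age': ['40', ' ', '32', '\t', '69'],
})

for col in df.columns:
    # True where the cell is empty once surrounding whitespace is stripped.
    blank = df[col].astype(str).str.strip() == ''
    df.loc[blank, col] = np.nan

# Number of NaN cells per column after cleaning.
blank_counts = df.isna().sum()
```

The boolean mask is built first and then passed to df.loc together with the column name, so the assignment writes to the original DataFrame.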
KillerToilet

Are you sure df.columns = df.columns.str.strip() is what you want? That only changes the column names. If you want to change the values inside the cells, consider replace:

# replace returns a new DataFrame, so assign the result back
df = df.replace(r'^\s+$', np.nan, regex=True)
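
To illustrate the difference between the two calls, here is a small sketch with made-up data. The column names and values are hypothetical, and using `\s*` rather than `\s+` is my own assumption, chosen so that truly empty strings are caught as well.

```python
import numpy as np
import pandas as pd

# Made-up data: a padded column name plus blank-looking cells.
df = pd.DataFrame({' Names ': ['betty', ' ', 'steve'],
                   'Age': ['40', '   ', '']})

# Strips whitespace from the column labels only; cell values are untouched.
df.columns = df.columns.str.strip()

# Replaces cells that are empty or whitespace-only with NaN.
# replace returns a new DataFrame, so keep the result.
cleaned = df.replace(r'^\s*$', np.nan, regex=True)
```

After this, `cleaned.isna()` should flag the blank-looking cells, while `df` itself still holds the original strings.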
Quang Hoang