
I am trying to drop the NaN values in my DataFrame df, but I am having difficulty dropping them from each column independently, without affecting the entire row. An example of my df can be seen below.

Advertising No Advertising
nan          7.0
71.0         nan
65.0         nan
14.0         nan
76.0         nan
nan          36.0
nan          9.0
73.0         nan
85.0         nan
17.0         nan
nan          103.0

My desired output is shown below.

Advertising No Advertising
71.0        7.0
65.0        36.0 
14.0        9.0
76.0        103.0 
73.0         
85.0                     
17.0                       

The examples given are just a snippet of the total DataFrame.

Any help would be greatly appreciated.

moe_95

1 Answer


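Use the justify function with DataFrame.dropna (justify is a custom NumPy-based helper, not part of pandas, so it has to be defined first):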

df = pd.DataFrame(justify(df.values, invalid_val=np.nan, axis=0, side='up'), 
                  index=df.index, 
                  columns=df.columns).dropna(how='all')
print(df)
    Advertising  No Advertising
0          71.0             7.0
1          65.0            36.0
2          14.0             9.0
3          76.0           103.0
4          73.0             NaN
5          85.0             NaN
6          17.0             NaN
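
For reference, justify is not shipped with pandas or NumPy, so the snippet above assumes it is already defined. Below is a minimal sketch of such a helper, modeled on the NumPy function commonly shared under that name; treat the exact body as an assumption rather than as this answer's own code. The sample frame is rebuilt from the question's data so the example runs on its own.

```python
import numpy as np
import pandas as pd

def justify(a, invalid_val=0, axis=1, side='left'):
    """Push the valid entries of a 2D array to one side along `axis`.

    Assumed helper (not a pandas/NumPy function); sketch only.
    """
    if invalid_val is np.nan:
        mask = ~np.isnan(a)
    else:
        mask = a != invalid_val
    # Sorting booleans puts False first and True last along the axis
    justified_mask = np.sort(mask, axis=axis)
    if side in ('up', 'left'):
        # Flip so the True (valid) slots come first
        justified_mask = np.flip(justified_mask, axis=axis)
    out = np.full(a.shape, invalid_val)
    if axis == 1:
        out[justified_mask] = a[mask]
    else:
        # Work on transposed views so values are filled column by column
        out.T[justified_mask.T] = a.T[mask.T]
    return out

# Sample data taken from the question
df = pd.DataFrame({
    'Advertising': [np.nan, 71.0, 65.0, 14.0, 76.0, np.nan,
                    np.nan, 73.0, 85.0, 17.0, np.nan],
    'No Advertising': [7.0, np.nan, np.nan, np.nan, np.nan, 36.0,
                       9.0, np.nan, np.nan, np.nan, 103.0],
})

out = pd.DataFrame(
    justify(df.to_numpy(), invalid_val=np.nan, axis=0, side='up'),
    index=df.index,
    columns=df.columns,
).dropna(how='all')  # drop the rows that are now entirely NaN
print(out)
```

The boolean sort is what makes this fast: no Python-level loop over columns is needed, only one masked assignment.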

Another, slower solution is to use DataFrame.apply with Series.dropna:

df = df.apply(lambda x: pd.Series(x.dropna().values))
print(df)

   Advertising  No Advertising
0         71.0             7.0
1         65.0            36.0
2         14.0             9.0
3         76.0           103.0
4         73.0             NaN
5         85.0             NaN
6         17.0             NaN
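
To try the apply approach end to end, here is a small self-contained sketch that rebuilds the question's sample data. The pd.concat variant at the end is an alternative spelling of the same per-column idea, added here for illustration; it is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({
    'Advertising': [np.nan, 71.0, 65.0, 14.0, 76.0, np.nan,
                    np.nan, 73.0, 85.0, 17.0, np.nan],
    'No Advertising': [7.0, np.nan, np.nan, np.nan, np.nan, 36.0,
                       9.0, np.nan, np.nan, np.nan, 103.0],
})

# Drop NaN per column; wrapping the values in a new Series resets the
# index to 0..n-1, so the shorter column is padded with NaN at the bottom
compact = df.apply(lambda col: pd.Series(col.dropna().to_numpy()))
print(compact)

# Alternative (illustrative): build each column separately and align them
alt = pd.concat(
    {name: df[name].dropna().reset_index(drop=True) for name in df.columns},
    axis=1,
)
print(alt)
```

Both variants keep the columns numeric; only the trailing slots of the shorter column hold NaN.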

Mixing numeric values with strings (here, empty strings) is not a good idea: if you need to process the numbers later, pandas numeric functions will fail on those columns, so it is better not to do it.

But it is possible with:

df = df.fillna('')
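
As a rough illustration of the caveat above, the sketch below shows how filling with empty strings changes a float column to object dtype, and how the numbers can be recovered with pd.to_numeric. The variable names are made up for the example. If the blanks are only needed for display, DataFrame.to_string with na_rep='' leaves the data untouched.

```python
import numpy as np
import pandas as pd

s = pd.Series([7.0, 36.0, np.nan], name='No Advertising')
print(s.dtype)   # float64, numeric methods such as sum() work

blank = s.fillna('')
print(blank.dtype)   # object, the column now mixes floats and strings

# Recover a numeric column: empty strings are coerced back to NaN
restored = pd.to_numeric(blank, errors='coerce')
print(restored.dtype)   # float64

# Display-only alternative: keep NaN in the data, hide it when printing
text = s.to_frame().to_string(na_rep='')
print(text)
```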
jezrael