2

I have a DataFrame as follows:

    Name Age
0    Tom  20
1   nick  21
2           
3  krish  19
4   jack  18
5           
6   jill  26
7   nick

The desired output is:

    Name Age
0    Tom  20
1   nick  21
3  krish  19
4   jack  18
6   jill  26
7   nick

The index should not change, and if possible I would prefer not to convert the empty strings to NaN. A row should be removed only if all of its columns contain the empty string ''.

msanford
Arteezy

3 Answers

8

You can do:

# df.eq('') compares every cell of `df` to ''
# .all(axis=1) checks whether every cell in a row is True
# ~ negates the boolean result
mask = ~df.eq('').all(axis=1)

# equivalently, using `ne` ("not equal"):
# mask = df.ne('').any(axis=1)

# mask is a boolean Series with the same length as `df`
# boolean indexing (similar to NumPy's) keeps only
# the rows where mask is True
df = df[mask]

Or in one line:

df = df[~df.eq('').all(axis=1)]
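
To check this against the data in the question, here is a minimal self-contained sketch. The frame is rebuilt by hand from the question's printout, and storing the `Age` values as strings is an assumption, since the question does not show the dtypes.

```python
import pandas as pd

# Sample data reconstructed from the question.
# Assumption: Age is stored as strings, with '' for missing values.
df = pd.DataFrame(
    {
        "Name": ["Tom", "nick", "", "krish", "jack", "", "jill", "nick"],
        "Age": ["20", "21", "", "19", "18", "", "26", ""],
    }
)

# True for rows where at least one cell is not an empty string
mask = df.ne("").any(axis=1)

# Boolean indexing keeps the original index labels
result = df[mask]

print(result.index.tolist())  # [0, 1, 3, 4, 6, 7]
```

Row 7 is kept with its empty `Age` untouched, because only rows where every column equals `''` are filtered out.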
Trenton McKinney
Quang Hoang
3

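If the empty cells were NaN, dropna would handle this directly, so we can first replace the empty strings with NaN. Here thresh=1 keeps every row that has at least one non-NaN value. Note that any remaining empty strings show up as NaN in the result: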

df.mask(df.eq('')).dropna(thresh=1)
Out[151]: 
    Name  Age
0    Tom   20
1   nick   21
3  krish   19
4   jack   18
6   jill   26
7   nick  NaN
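
If the NaN in the last row is unwanted, one possible variation (a sketch, not part of the answer as posted) is to use the NaN-converted copy only to decide which rows survive, and then select those index labels from the original frame. The sample data is rebuilt from the question, with `Age` assumed to be strings, and `how="all"` is used as the equivalent of `thresh=1` for this case.

```python
import pandas as pd

# Sample data reconstructed from the question (Age assumed to be strings)
df = pd.DataFrame(
    {
        "Name": ["Tom", "nick", "", "krish", "jack", "", "jill", "nick"],
        "Age": ["20", "21", "", "19", "18", "", "26", ""],
    }
)

# Temporary copy with '' replaced by NaN, minus rows that are entirely NaN
cleaned = df.mask(df.eq("")).dropna(how="all")

# Select the surviving labels from the original frame,
# so the remaining empty strings are not turned into NaN
result = df.loc[cleaned.index]

print(result.index.tolist())  # [0, 1, 3, 4, 6, 7]
```

This costs an extra intermediate frame, so the direct comparison with `''` is probably simpler when NaN conversion is not wanted at all.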
BENY
3

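Empty strings are interpreted as False, so removing rows that contain only empty strings is as easy as keeping rows in which at least one field is not empty (i.e. interpreted as True). Be aware that other falsy values, such as 0 or False, are treated the same way as empty strings here: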

df[df.any(axis=1)]

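or, more briefly, in older pandas versions that accept the axis positionally (newer releases may require axis=1 as a keyword):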

df[df.any(1)]
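
One thing to watch with the truthiness approach: it cannot tell an empty string from a numeric 0. The sketch below uses hypothetical data (not from the question) and spells the truthiness test out with `astype(bool)`, which is an assumption about how the shortcut treats object columns, then compares it with an explicit test against `''`.

```python
import pandas as pd

# Hypothetical data: row 1 has an empty name and a numeric age of 0
df = pd.DataFrame({"Name": ["Tom", "", ""], "Age": [20, 0, ""]})

# Truthiness test: '' and 0 both count as False
truthy = df.astype(bool).any(axis=1)

# Explicit test: only '' counts as empty
explicit = df.ne("").any(axis=1)

print(truthy.tolist())    # [True, False, False]
print(explicit.tolist())  # [True, True, False]
```

If the columns can legitimately hold 0 or False, comparing against `''` explicitly is the safer choice.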
Skippy le Grand Gourou