How to filter missing data rows using python

Question

I have a DataFrame `df`, and one of its features, `mort_acc`, has missing data. I want to filter for the rows where `mort_acc` is missing, so I tried the following:

df[df['mort_acc'].apply(lambda x:x == " ")]

It didn't work; the output was 0. So I tried a lambda that checks the length instead:

df[df['mort_acc'].apply(lambda x:len(x)<0)]

That didn't work either; this time I got the error `object of type 'float' has no len()`.

So I tried this way

df[df['mort_acc'].apply(lambda x:x == NaN)]

That raised another error: `name 'NaN' is not defined`.

Does anyone know how to do it?

Python has no built-in name `NaN`; use `pd.isna()` to check whether a value is NaN. – Epsi95, May 21 '20 at 08:00
We cannot help you properly without an example. (edit: Well, apparently this time the answerers guessed right.) Please read and apply [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – timgeb, May 21 '20 at 08:02
I don't think an example is needed here. The question is clear and got several answers, and I even included the code that I tried. Anyway, it's okay if you downvote. – Anu, May 21 '20 at 08:07

score 2 · Accepted Answer · answered May 21 '20 at 07:59

bad_values_row_mask = df['mort_acc'].isna()
df[bad_values_row_mask]

This sounds like what you want, I guess.
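
Since the question has no sample data, here is a minimal sketch of the mask approach on a made-up frame. Only the column name `mort_acc` comes from the question; the `loan_id` column and all values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name mort_acc is from the question.
df = pd.DataFrame({
    "loan_id": [1, 2, 3, 4],
    "mort_acc": [0.0, np.nan, 2.0, np.nan],
})

# Boolean mask: True where mort_acc is missing.
bad_values_row_mask = df["mort_acc"].isna()

# Indexing with the mask keeps only the rows where it is True.
missing_rows = df[bad_values_row_mask]
print(missing_rows)
```

With this data, `missing_rows` should contain the rows for `loan_id` 2 and 4.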

Joran Beasley

Oh yes, that is right. `isna` does the trick. But I am just curious why `lambda x:x == " "` didn't work? – Anu May 21 '20 at 08:01
because `numpy.nan != " "` ... you could have done `x == numpy.nan` but `isna` is cleaner I think – Joran Beasley May 21 '20 at 08:02
(also, `numpy.nan != numpy.nan`) – timgeb May 21 '20 at 08:04
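
The comparisons discussed in these comments can be checked directly. The sketch below uses a made-up Series (the values are assumptions) to show that comparing against `" "` or against `numpy.nan` never flags a missing entry, whereas `isna()` does.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])  # hypothetical values

# NaN is a float, so it is never equal to the string " ".
equals_space = s.apply(lambda x: x == " ")

# NaN is not equal to anything, including itself, so this is all False too.
equals_nan = s.apply(lambda x: x == np.nan)

# isna() is the check that actually detects the missing entry.
is_missing = s.isna()

print(equals_space.tolist())
print(equals_nan.tolist())
print(is_missing.tolist())
```

The same float-typed NaN explains the `len()` error in the question: the missing entries are floats, and floats have no length.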

score 1 · Answer 2 · answered May 21 '20 at 08:01

Python has no built-in name `NaN`; use `pd.isna()` to check whether a value is NaN.

df[df['mort_acc'].apply(lambda x:pd.isna(x))]
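
For comparison, a small sketch (with made-up data) suggesting that the element-wise `apply` version and the vectorized forms produce the same mask. `pd.isna` also accepts a whole Series, so the lambda is optional.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"mort_acc": [0.0, np.nan, 2.0]})  # hypothetical data

mask_apply = df["mort_acc"].apply(lambda x: pd.isna(x))  # one value at a time
mask_function = pd.isna(df["mort_acc"])                  # whole Series at once
mask_method = df["mort_acc"].isna()                      # method form

# All three masks agree element by element.
print((mask_apply == mask_method).all())
print((mask_function == mask_method).all())
```

On large frames the vectorized forms are generally the better choice, since they avoid calling a Python function per row.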

Epsi95

score 1 · Answer 3 · answered May 21 '20 at 08:05

This gives you the rows where the column value is NaN.

df[df.mort_acc.isnull()]
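
`isnull` is an alias of `isna`, so this is the same mask as in the other answers. In case the goal is the opposite (dropping the incomplete rows rather than selecting them), here is a sketch with made-up data using `notna` or `dropna`:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"mort_acc": [0.0, np.nan, 2.0]})  # hypothetical data

# isnull() and isna() return identical masks.
same_mask = df["mort_acc"].isnull().equals(df["mort_acc"].isna())

# Keep only the rows where mort_acc is present.
kept = df[df["mort_acc"].notna()]

# dropna restricted to that one column gives the same rows.
kept_alt = df.dropna(subset=["mort_acc"])

print(same_mask)
print(kept)
```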

Pygirl
