In pandas, how to select the rows that contain NaN?

Question

Suppose I have the following dataframe in df:

a     | b     | c
------+-------+-------
5     | 2     | 4
NaN   | 6     | 8
5     | 9     | 0
3     | 7     | 1

If I do `df.loc[df['a'] == 5]`, it correctly returns the first and third rows, but `df.loc[df['a'] == np.NaN]` returns nothing.

I think this is more a Python thing than a pandas one. Comparing `np.nan` against anything, even `np.nan == np.nan`, evaluates to False, so the question is: how should I test for `np.nan`?
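The behaviour described above can be reproduced with a short sketch. It assumes the table from the question is rebuilt by hand as `df`, and it uses the `np.nan` spelling because the `np.NaN` alias was removed in NumPy 2.0. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuild the question's table (assumed to be numeric columns a, b, c)
df = pd.DataFrame({"a": [5, np.nan, 5, 3],
                   "b": [2, 6, 9, 7],
                   "c": [4, 8, 0, 1]})

# NaN never compares equal to anything, including itself
nan_equals_itself = np.nan == np.nan

# So an equality test against NaN yields an all-False mask ...
eq_mask = df["a"] == np.nan

# ... while comparing the column with itself is True only where the value is NaN
self_neq_mask = df["a"] != df["a"]
```

Because `eq_mask` is False everywhere, `df.loc[eq_mask]` selects no rows, which matches the empty result in the question.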

The target is a little more complicated but basically you do the null checking with `df['a'].isnull()` or `pd.isnull(df['a'])`. And the selection is easy after that: `df[df['a'].isnull()]` — ayhan, Sep 14 '16 at 16:05
You can use `numpy.isnan()`, which gives you a boolean array of the same shape as the input array — dnalow, Sep 14 '16 at 16:08
In general, I'd avoid using `np.isnan` on DataFrames. It's not as robust as `pd.isnull`, which has the same functionality. For example, compare what happens when you try `np.isnan(df['a'])` with `pd.isnull(df['a'])` when `df = pd.DataFrame({'a': ['x', np.nan, 'y']})`. — root, Sep 14 '16 at 16:41
Thanks guys, I used both `isnull()` and `isnan()` and got the same results, which did what I wanted. Why didn't you post your answers as answers? — luisfer, Sep 14 '16 at 17:38
This tutorial could be helpful: https://chartio.com/resources/tutorials/how-to-check-if-any-value-is-nan-in-a-pandas-dataframe — estebanpdl, Sep 14 '16 at 21:39
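The difference between `pd.isnull` and `np.isnan` raised in the comments can be sketched roughly as follows. The frame is the one from root's comment, with `dtype=object` added here as an assumption so the column is object-typed regardless of the pandas version. The flag `isnan_failed` is a hypothetical name used only for this illustration.

```python
import numpy as np
import pandas as pd

# Column mixing strings and NaN; object dtype is forced explicitly (assumption)
df = pd.DataFrame({"a": ["x", np.nan, "y"]}, dtype=object)

# pd.isnull handles object columns and returns a boolean mask
null_mask = pd.isnull(df["a"])
selected = df[null_mask]

# np.isnan is only defined for numeric input, so it raises on object data
try:
    np.isnan(df["a"])
    isnan_failed = False
except TypeError:
    isnan_failed = True
```

On purely numeric columns the two functions give the same mask, which is consistent with luisfer's observation above.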

score 5 · Accepted Answer · answered Sep 14 '16 at 16:12

Try using isnull like so:

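    import numpy as np
    import pandas as pd

    # A single-column DataFrame; the column is labelled 0 by default
    a = [1, 2, 3, np.nan, 5, 6, 7]
    df = pd.DataFrame(a)

    # The mask is True where the value is NaN; indexing with it keeps those rows
    df[df[0].isnull()]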
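Applied to the frame from the question, the same idea might look like the sketch below. The table is rebuilt by hand here, and `isna` is used as the alias of `isnull`. The row-wise `any(axis=1)` variant is an extension that the answer itself does not show.

```python
import numpy as np
import pandas as pd

# The question's table, reconstructed manually
df = pd.DataFrame({"a": [5, np.nan, 5, 3],
                   "b": [2, 6, 9, 7],
                   "c": [4, 8, 0, 1]})

# Rows where column 'a' is NaN
rows_nan_in_a = df[df["a"].isna()]

# Rows with a NaN in any column
rows_with_any_nan = df[df.isna().any(axis=1)]

# The complement: rows with no NaN at all
rows_without_nan = df.dropna()
```

With this data, both selections return only the second row, since the single NaN sits in column `a`.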

Tommy
