Python Pandas find all rows where all values are NaN

Question

So I have a dataframe with 5 columns. I would like to pull the indices where all of the columns are NaN. I was using this code:

nan = pd.isnull(df.all)

but that just returns False, which I take to mean "no, not all values in the dataframe are null." There are thousands of entries, so I would prefer not to loop through and check each one. Thanks!

piRSquared · Accepted Answer · 2016-08-10T22:48:18.363

38

It should just be:

df.isnull().all(1)

The index can be accessed like:

df.index[df.isnull().all(1)]
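
As for why the attempt in the question returns a single False: `df.all` without parentheses is the method object itself, so `pd.isnull` is asked whether that one object is null. A minimal sketch on a made-up three-row frame (the data and variable names are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data: only row 1 is NaN in every column.
df = pd.DataFrame({"a": [1.0, np.nan, np.nan],
                   "b": [np.nan, np.nan, 2.0]})

# df.all (no parentheses) is a bound method, not a computed result,
# so pd.isnull sees one ordinary non-null object.
attempt = pd.isnull(df.all)

# Build the element-wise null mask first, then reduce across each row.
all_nan_rows = df.index[df.isnull().all(axis=1)]

print(attempt)
print(list(all_nan_rows))
```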

Demonstration

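import numpy as np
import pandas as pd

np.random.seed([3,1415])
df = pd.DataFrame(np.random.choice((1, np.nan), (10, 2)))
df

idx = df.index[df.isnull().all(axis=1)]
nans = df.loc[idx]  # .ix was removed in pandas 1.0; .loc selects by label
nans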

Timing

code

np.random.seed([3,1415])
df = pd.DataFrame(np.random.choice((1, np.nan), (10000, 5)))
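
One way to time the candidates on this frame is the standard-library `timeit` module. This is only a rough sketch: the two expressions compared are the ones posted in the answers on this page, the dictionary labels are made up, and the numbers will vary by machine and pandas version, so none are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

np.random.seed([3, 1415])
df = pd.DataFrame(np.random.choice((1, np.nan), (10000, 5)))

# Two spellings of "labels of the all-NaN rows" taken from this thread.
candidates = {
    "df.index[mask]": lambda: df.index[df.isnull().all(axis=1)],
    "df[mask].index": lambda: df[df.isnull().all(axis=1)].index,
}

runs = 100
for label, fn in candidates.items():
    seconds = timeit.timeit(fn, number=runs)
    print(f"{label}: {seconds / runs * 1e6:.1f} us per call")
```

Both expressions return the same labels; the second also builds an intermediate DataFrame of the matching rows before taking its index.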

edited Aug 10 '16 at 22:48

answered Aug 10 '16 at 22:29


Why `all(1)`? I see that's the correct answer to this problem but I can't wrap my head around it. We have a table of Trues and Falses, and we want all the *rows* where there are only (all) True values. So why look at the column axis (1) rather than the index (0)? – Jinx Jan 10 '22 at 18:19
@Jinx the `all(1)` is interesting isn't it? If you try just plain old `all()`, or more explicitly `all(axis=0)`, you'll find that Pandas calculates the value *per column*. By specifying `all(1)`, or more explicitly `all(axis=1)`, you're checking if all values are null *per row*. For more detail, see the documentation for [all](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.all.html) – Matthew Walker Feb 14 '22 at 21:15
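
A small sketch of the difference, on a made-up frame: the `axis` argument names the direction that gets collapsed, so `axis=1` reduces across the columns and leaves one result per row.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "b" is entirely NaN; rows 0 and 1 are entirely NaN.
df = pd.DataFrame({"a": [np.nan, np.nan, 1.0],
                   "b": [np.nan, np.nan, np.nan]})
mask = df.isnull()

per_column = mask.all(axis=0)  # collapses the rows: one value per column
per_row = mask.all(axis=1)     # collapses the columns: one value per row

print(per_column.tolist())  # [False, True]
print(per_row.tolist())     # [True, True, False]
```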

Alexander · Answer 2 · 2016-08-10T22:35:19.863

Assuming your dataframe is named df, you can use boolean indexing to check if all columns (axis=1) are null. Then take the index of the result.

import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.randn(5, 3))
df.iloc[-2:, :] = np.nan  # make the last two rows entirely NaN
>>> df
          0         1         2
0  1.764052  0.400157  0.978738
1  2.240893  1.867558 -0.977278
2  0.950088 -0.151357 -0.103219
3       NaN       NaN       NaN
4       NaN       NaN       NaN

nan = df[df.isnull().all(axis=1)].index

>>> nan
Int64Index([3, 4], dtype='int64')

score 0 · Answer 3 · edited May 23 '17 at 12:32


From the master himself: https://stackoverflow.com/a/14033137/6664393

# Series.nonzero() was removed in pandas 1.0; go through the NumPy array instead
nans = pd.isnull(df).all(axis=1).to_numpy().nonzero()[0]
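
Note that a nonzero-style lookup yields integer positions rather than index labels, which only coincide when the frame has the default 0..n-1 index. A minimal sketch of the difference, using a made-up frame with string labels and `np.flatnonzero` as a stand-in for the positional lookup:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with string labels; rows "y" and "z" are all NaN.
df = pd.DataFrame({"a": [1.0, np.nan, np.nan],
                   "b": [2.0, np.nan, np.nan]},
                  index=["x", "y", "z"])
mask = pd.isnull(df).all(axis=1)

positions = np.flatnonzero(mask.to_numpy())  # integer positions, for df.iloc
labels = df.index[mask]                      # index labels, for df.loc

print(positions.tolist())  # [1, 2]
print(labels.tolist())     # ['y', 'z']
```

Use `df.iloc[positions]` with the first result and `df.loc[labels]` with the second; both select the same rows.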


answered Aug 10 '16 at 22:30

user357269

