Find integer index of rows with NaN in pandas dataframe

Question

I have a pandas DataFrame like this:

                    a         b
2011-01-01 00:00:00 1.883381  -0.416629
2011-01-01 01:00:00 0.149948  -1.782170
2011-01-01 02:00:00 -0.407604 0.314168
2011-01-01 03:00:00 1.452354  NaN
2011-01-01 04:00:00 -1.224869 -0.947457
2011-01-01 05:00:00 0.498326  0.070416
2011-01-01 06:00:00 0.401665  NaN
2011-01-01 07:00:00 -0.019766 0.533641
2011-01-01 08:00:00 -1.101303 -1.408561
2011-01-01 09:00:00 1.671795  -0.764629

Is there an efficient way to find the "integer" index of rows with NaNs? In this case the desired output should be [3, 6].

If you just want to select the rows with nan, you can do `df[np.isnan(df['b'])]` — Miki Tebeka, Dec 24 '12 at 03:38
Following up from @lazy1 - instead of using `numpy`'s `isnan` you can also use `df['b'].isnull()` — jmetz, Mar 31 '15 at 20:42

score 158 · Answer 1 · answered Dec 25 '12 at 18:41

Here is a simpler solution:

inds = pd.isnull(df).any(1).nonzero()[0]

In [9]: df
Out[9]: 
          0         1
0  0.450319  0.062595
1 -0.673058  0.156073
2 -0.871179 -0.118575
3  0.594188       NaN
4 -1.017903 -0.484744
5  0.860375  0.239265
6 -0.640070       NaN
7 -0.535802  1.632932
8  0.876523 -0.153634
9 -0.686914  0.131185

In [10]: pd.isnull(df).any(1).nonzero()[0]
Out[10]: array([3, 6])
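
On recent pandas releases `Series.nonzero()` is no longer available, and newer versions expect the axis of `any` to be passed by keyword. A minimal sketch of the same idea using `np.flatnonzero`, assuming a small made-up frame (the values are illustrative, not the asker's data):

```python
import numpy as np
import pandas as pd

# Illustrative data only: NaN in rows 3 and 6 of column "b".
df = pd.DataFrame({
    "a": np.arange(10, dtype=float),
    "b": [0.1, 0.2, 0.3, np.nan, 0.5, 0.6, np.nan, 0.8, 0.9, 1.0],
})

# Boolean mask: True for every row holding at least one NaN.
row_has_nan = df.isna().any(axis=1)

# Integer positions of the True entries in the mask.
inds = np.flatnonzero(row_has_nan.to_numpy())
print(inds.tolist())  # [3, 6]
```

Converting the mask with `.to_numpy()` first keeps the result independent of the frame's index labels.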

answered Dec 25 '12 at 18:41

Wes McKinney

34

I ended up using this: `np.where(df['b'].notnull())[0]` – Dec 25 '12 at 19:16
thanks, `.nonzero()[0]` is better than `[i for i, k in enumerate(mask) if k]` .) – Winand Feb 04 '16 at 06:51
8

You could probably simplify this further: `r, _ = np.where(df.isna())` – cs95 Jan 22 '19 at 04:50
6

add `.to_numpy()` to convert in numpy array first - `pd.isnull(df).any(1).to_numpy().nonzero()` – 7bStan Nov 06 '19 at 07:21
6

AttributeError: 'Series' object has no attribute 'nonzero' – huang Jul 18 '21 at 15:17
1

For pandas version 0.25 and later, use `pd.isnull(df).any(1).to_numpy().nonzero()`, as [7bStan](https://stackoverflow.com/users/6780081/7bstan) mentioned. This will fix [Joe Huang](https://stackoverflow.com/users/3907561/joe-huang)'s problem. – wueb Nov 16 '22 at 07:40
Very simple pandas solution: `inds = df[df.isnull().any(axis=1)].index`. You can easily find the null index for a specific column or a list of columns. – Marcio Bernardo Apr 12 '23 at 14:36

score 54 · Accepted Answer · answered Dec 24 '12 at 03:02

For DataFrame df:

import numpy as np
index = df['b'].index[df['b'].apply(np.isnan)]

will give you back the index labels (here, the timestamps) that you can use to index back into df, e.g.:

df['a'].ix[index[0]]
>>> 1.452354

For the integer index:

df_index = df.index.tolist()
[df_index.index(i) for i in index]
>>> [3, 6]
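
The label-to-position lookup can also be done without a Python-level list search. A minimal sketch using `Index.get_indexer`, assuming a unique index and a made-up hourly frame (the data and variable names are illustrative, not from the question):

```python
import numpy as np
import pandas as pd

# Illustrative hourly index, built without relying on a frequency alias.
idx = pd.Timestamp("2011-01-01") + pd.to_timedelta(np.arange(10), unit="h")
df = pd.DataFrame(
    {
        "a": np.arange(10, dtype=float),
        "b": [0.1, 0.2, 0.3, np.nan, 0.5, 0.6, np.nan, 0.8, 0.9, 1.0],
    },
    index=idx,
)

# Index labels (timestamps) of the rows where column "b" is NaN.
labels = df.index[df["b"].isna().to_numpy()]

# Translate the labels into integer positions; requires a unique index.
positions = df.index.get_indexer(labels)
print(positions.tolist())  # [3, 6]
```

If only the positions are needed, computing them directly from the boolean mask avoids the round trip through labels.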

answered Dec 24 '12 at 03:02

diliop

1

As intuitive as `ix` sounds, it appears to have been [deprecated](https://stackoverflow.com/a/31593712/4288043) in favour of `loc` and `iloc` – cardamom Apr 16 '18 at 13:36

score 27 · Answer 3 · answered Jan 15 '19 at 11:23

One line solution. However it works for one column only.

df.loc[pandas.isna(df["b"]), :].index

answered Jan 15 '19 at 11:23

Vasyl Vaskivskyi

This is what I was looking for. I made it into a list by wrapping it in a `list(...)` just like this:`list(df.loc[pandas.isna(df["b"]), :].index)` – Daniel Butler Jul 02 '20 at 17:50

score 12 · Answer 4 · answered Sep 07 '17 at 14:49

And just in case you want to find the coordinates of NaN across all the columns instead (supposing they are all numeric), here you go:

df = pd.DataFrame([[0,1,3,4,np.nan,2],[3,5,6,np.nan,3,3]])

df
   0  1  2    3    4  5
0  0  1  3  4.0  NaN  2
1  3  5  6  NaN  3.0  3

np.where(np.asanyarray(np.isnan(df)))
(array([0, 1]), array([4, 3]))
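
The two arrays returned above are row positions and column positions. A minimal sketch of pairing them up and mapping the column positions back to column labels, reusing the same example frame (`DataFrame.isna` is swapped in for `np.isnan` here so that non-numeric columns would also be handled):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[0, 1, 3, 4, np.nan, 2], [3, 5, 6, np.nan, 3, 3]])

# Row and column positions of every NaN cell, in row-major order.
rows, cols = np.where(df.isna().to_numpy())

# Pair each row position with the label of the matching column.
coords = list(zip(rows.tolist(), df.columns[cols].tolist()))
print(coords)  # [(0, 4), (1, 3)]
```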

score 11 · Answer 5 · edited May 03 '19 at 20:43

I don't know if this is too late, but you can use np.where to find the indices of NaN values like this:

indices = list(np.where(df['b'].isna())[0])

edited May 03 '19 at 20:43

Gursewak Singh

answered Sep 11 '18 at 13:07

naturesenshi

score 6 · Answer 6 · answered May 03 '19 at 21:34

In case you have a datetime index and want the index values themselves:

df.loc[pd.isnull(df).any(axis=1), :].index.values

answered May 03 '19 at 21:34

Amirkhm

score 5 · Answer 7 · answered Aug 28 '19 at 18:02

Here are tests for a few methods:

%timeit np.where(np.isnan(df['b']))[0]
%timeit pd.isnull(df['b']).nonzero()[0]
%timeit np.where(df['b'].isna())[0]
%timeit df.loc[pd.isna(df['b']), :].index

And their corresponding timings:

333 µs ± 9.95 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
280 µs ± 220 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
313 µs ± 128 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
6.84 ms ± 1.59 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

It would appear that pd.isnull(df['b']).nonzero()[0] wins the day in terms of timing, but the top three methods all have comparable performance.
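
A comparison like this can be reproduced outside IPython with the standard `timeit` module. The following is only a sketch under stated assumptions: the test frame is made up, `np.flatnonzero` stands in for the `.nonzero()` call that newer pandas no longer offers on a Series, and absolute timings will differ by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

# Made-up test frame: 10,000 rows with NaN in every 10th row of column "b".
values = np.arange(10_000, dtype=float)
values[::10] = np.nan
df = pd.DataFrame({"a": np.arange(10_000, dtype=float), "b": values})

candidates = {
    "np.where + np.isnan": lambda: np.where(np.isnan(df["b"]))[0],
    "np.where + isna": lambda: np.where(df["b"].isna())[0],
    "np.flatnonzero + isna": lambda: np.flatnonzero(df["b"].isna().to_numpy()),
}

# Run each candidate once so the results can be checked for agreement.
results = {name: fn() for name, fn in candidates.items()}

# Time each candidate over a fixed number of calls.
n_calls = 200
for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=n_calls)
    print(f"{name}: {seconds / n_calls * 1e6:.1f} µs per call")
```

Checking that all candidates return the same positions before comparing timings guards against benchmarking an expression that computes something different.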

score 3 · Answer 8 · answered Dec 18 '19 at 15:03

Another simple solution is list(np.where(df['b'].isnull())[0])

answered Dec 18 '19 at 15:03

karthikeyan

score 3 · Answer 9 · answered Jun 16 '21 at 03:12

This will give you the index values for nan in every column:

df.loc[pd.isna(df).any(axis=1), :].index

answered Jun 16 '21 at 03:12

Xpie

This creates a new data frame with all rows containing NaN values, then returns its index – Nixon Kosgei Mar 21 '22 at 01:05

score 1 · Answer 10 · edited May 10 '18 at 04:22

Here is another simpler take:

df = pd.DataFrame([[0,1,3,4,np.nan,2],[3,5,6,np.nan,3,3]])

inds = np.asarray(df.isnull()).nonzero()

(array([0, 1], dtype=int64), array([4, 3], dtype=int64))

edited May 10 '18 at 04:22

Ryan Schaefer

answered May 03 '18 at 17:14

nonya beeswax

murthy10 · Answer 11 · 2018-10-04T16:07:33.503

I was looking for all indexes of rows with NaN values.
My working solution:

def get_nan_indexes(data_frame):
    # Collect the index labels of every NaN, column by column.
    indexes = []
    for column in data_frame:
        index = data_frame[column].index[data_frame[column].apply(np.isnan)]
        indexes.extend(index)
    # Map the unique labels back to sorted integer positions.
    df_index = data_frame.index.tolist()
    return sorted(df_index.index(i) for i in set(indexes))

score 0 · Answer 12 · edited May 20 '19 at 14:40

Let the dataframe be named df, and let the column of interest (i.e. the column in which we are trying to find nulls) be 'b'. Then the following snippet prints the integer position of each null in that column:

    # Compute the mask once instead of on every iteration.
    mask = df['b'].isnull()
    for i in range(df.shape[0]):
        if mask.iloc[i]:
            print(i)

edited May 20 '19 at 14:40

nassim

answered May 20 '19 at 11:33

Stone Austin

score 0 · Answer 13 · answered Dec 02 '21 at 11:27

    # Collect the index labels of the rows where column "b" is NaN.
    index_nan = []
    for index, bool_v in df["b"].isna().items():
        if bool_v:
            index_nan.append(index)
    print(index_nan)

answered Dec 02 '21 at 11:27

KRUNALg

score 0 · Answer 14 · answered Apr 10 '23 at 05:30

A quick solution to the question is:

# Find the integer index of nulls
nan_idx = np.where(df['column_name'].isnull())[0]

# Find actual index of the nan's
nan_idx = df.iloc[nan_idx].index

answered Apr 10 '23 at 05:30

Mainland

score 0 · Answer 15 · answered Apr 12 '23 at 14:42

Easy solution:

# Find the index of rows that contain a null in any column

indx = df[df.isnull().any(axis=1)].index

# Find the index of nulls of a column or a group of columns

indx_A = df[df['A'].isnull()].index

col_list = ['A','B','C']

indx_col_list = df[df[col_list].isnull().any(axis=1)].index

answered Apr 12 '23 at 14:42

Marcio Bernardo

Archie · Answer 16 · 2023-06-20T09:52:43.767

0

A DataFrame object has a built-in method isna() these days, which means you could also solve it as follows:

In case one NaN value is sufficient to return the index:

index_na = df.index[df.isna().any(axis=1)]

In case all of them have to be NaN:

index_na = df.index[df.isna().all(axis=1)]

To return the numeric index for the first case:

index_na_num = np.where(df.isna().any(axis=1))[0]
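
The difference between the "any" and "all" cases is easiest to see on a small frame. A minimal sketch, assuming made-up data in which row 1 has a single NaN and row 3 is entirely NaN:

```python
import numpy as np
import pandas as pd

# Illustrative frame: row 1 has one NaN, row 3 is NaN in every column.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, np.nan],
    "b": [1.0, np.nan, 3.0, np.nan],
})

mask = df.isna()

# Positions of rows with at least one NaN.
any_pos = np.flatnonzero(mask.any(axis=1).to_numpy())
# Positions of rows where every value is NaN.
all_pos = np.flatnonzero(mask.all(axis=1).to_numpy())

print(any_pos.tolist())  # [1, 3]
print(all_pos.tolist())  # [3]
```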

edited Jun 20 '23 at 09:52

answered Jun 20 '23 at 09:46

Archie
