Removing NaNs in numpy arrays

Question

I have two numpy arrays that contain NaNs:

A = np.array([np.nan,   2,   np.nan,   3,   4])
B = np.array([   1  ,   2,     3   ,   4,  np.nan])

Is there a smart way, using numpy, to remove the NaNs in both arrays and also remove whatever is at the corresponding index in the other array? The result should look like this:

A = array([  2,   3, ])
B = array([  2,   4, ])

are you also using `pandas`? (there's [`dropna`](http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.dropna.html#pandas.DataFrame.dropna) that does what you want with `DataFrame` objects) — Kos, Mar 18 '15 at 11:18
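For readers who do have pandas available, the `dropna` route mentioned in the comment above could look roughly like the sketch below. The column names `"A"` and `"B"` and the variable names `clean`, `A_clean` and `B_clean` are arbitrary choices for illustration, not part of the original comment.

```python
import numpy as np
import pandas as pd

A = np.array([np.nan, 2, np.nan, 3, 4])
B = np.array([1, 2, 3, 4, np.nan])

# Put the arrays side by side so that each row is one index position.
df = pd.DataFrame({"A": A, "B": B})

# By default dropna() removes every row that has a NaN in any column.
clean = df.dropna()

# Convert the surviving rows back to plain numpy arrays.
A_clean = clean["A"].to_numpy()
B_clean = clean["B"].to_numpy()
```

This pulls in a whole extra dependency, so it probably only makes sense if the data is headed for a `DataFrame` anyway.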

EdChum · Accepted Answer · 2015-03-18T12:04:13.817

What you could do is add the two arrays together: the sum is NaN at every position where either array has a NaN. Use the result to build an index of the non-NaN positions, then use that index to select from your original numpy arrays:

In [193]:

A = np.array([np.nan,   2,   np.nan,   3,   4])
B = np.array([   1  ,   2,     3   ,   4,  np.nan])
idx = np.where(~np.isnan(A+B))
idx
print(A[idx])
print(B[idx])
[ 2.  3.]
[ 2.  4.]

output from A+B:

In [194]:

A+B
Out[194]:
array([ nan,   4.,  nan,   7.,  nan])

EDIT

As @Oliver W. has correctly pointed out, the np.where is unnecessary as np.isnan will produce a boolean index that you can use to index into the arrays:

In [199]:

A = np.array([np.nan,   2,   np.nan,   3,   4])
B = np.array([   1  ,   2,     3   ,   4,  np.nan])
idx = (~np.isnan(A+B))
print(A[idx])
print(B[idx])
[ 2.  3.]
[ 2.  4.]
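If this is needed in more than one place, the boolean-mask version could be wrapped in a small function. The following is only a sketch: `drop_nan_pairs` is a hypothetical helper name, not a numpy function, and the shape check is an added assumption about how mismatched inputs should be handled.

```python
import numpy as np

def drop_nan_pairs(a, b):
    """Keep only the positions where neither a nor b is NaN.

    Hypothetical helper for illustration; not part of numpy.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("a and b must have the same shape")
    # The sum is NaN wherever either input is NaN, so one isnan call covers both.
    keep = ~np.isnan(a + b)
    return a[keep], b[keep]

A = np.array([np.nan, 2, np.nan, 3, 4])
B = np.array([1, 2, 3, 4, np.nan])
A_clean, B_clean = drop_nan_pairs(A, B)
```

Boolean indexing returns new arrays, so `A` and `B` themselves are left unchanged.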

@OliverW. Hmm. yes you are correct, I was originally thinking that I needed the integer indices but yes it's unnecessary here, I'll update my answer — EdChum, Mar 18 '15 at 12:02

score 8 · Answer 2 · answered Mar 18 '15 at 11:22

A[~(np.isnan(A) | np.isnan(B))]

B[~(np.isnan(A) | np.isnan(B))]
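A slightly expanded sketch of the same idea builds the mask once and reuses it. It also shows one case where this explicit mask and a sum-based mask (`~np.isnan(A+B)`) can disagree: infinities of opposite sign at the same index add up to NaN. The arrays `C` and `D` are made-up data for illustration only.

```python
import numpy as np

A = np.array([np.nan, 2, np.nan, 3, 4])
B = np.array([1, 2, 3, 4, np.nan])

# Build the mask once and reuse it for both arrays.
keep = ~(np.isnan(A) | np.isnan(B))
A_clean, B_clean = A[keep], B[keep]

# Made-up data: opposite infinities at index 0, a NaN at index 2.
C = np.array([np.inf, 1.0, np.nan])
D = np.array([-np.inf, 2.0, 3.0])

# inf + (-inf) is NaN, so the sum-based mask also drops index 0.
with np.errstate(invalid="ignore"):  # silence the "invalid value" warning
    mask_sum = ~np.isnan(C + D)

# The explicit mask only drops positions that are NaN in an input.
mask_or = ~(np.isnan(C) | np.isnan(D))
```

Here `mask_sum` keeps only index 1, while `mask_or` keeps indices 0 and 1. For data without infinities the two masks should agree.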

FuzzyDuck
