4

I want to delete nan from a numpy array. Let's say my numpy array contains:

np_array = ["123","pqr","123",nan,"avb", nan]

Expected output:

["123","pqr","123","avb"]

If I do this in pandas with dropna(), it deletes the whole row, which I don't want. I just want to delete the nan values and reduce the array size.

Is there any possible way to do so?

MSeifert
Rohan Nagalkar
  • These proposed duplicates only work for numerical arrays (so these are not really duplicates)! – MSeifert Feb 22 '17 at 10:35
  • I am assuming you would not like to delete the rows either? – AsheKetchum Feb 22 '17 at 13:52
  • Have you looked at pandas.fillna()? In general, if a row is considered an observation, we would try to conserve the entire row and not only erase the nan values within the row. – AsheKetchum Feb 22 '17 at 13:54

4 Answers

2

You can't use np.isnan because the NaNs are stored as the string "nan" in your array, but you can use boolean indexing and compare against that string:

>>> import numpy as np
>>> np_array = np.array(["123","pqr","123",np.nan,"avb", np.nan])
>>> np_array[np_array != 'nan']
array(['123', 'pqr', '123', 'avb'], 
      dtype='<U4')
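
The error reported in the comments further down suggests the real array may have dtype object, in which case the NaNs are still floats rather than the string "nan". A minimal sketch that covers both cases, assuming the array is one-dimensional; the helper name drop_nan_mixed is hypothetical and not part of NumPy:

```python
import math
import numpy as np

def drop_nan_mixed(arr):
    """Return a copy of a 1-D array without NaN entries.

    Hypothetical helper: treats both a float NaN and the literal
    string 'nan' as missing, so a genuine 'nan' string is dropped too.
    """
    arr = np.asarray(arr)
    keep = np.array(
        [not ((isinstance(x, float) and math.isnan(x)) or x == "nan") for x in arr],
        dtype=bool,
    )
    return arr[keep]

# Object array: the NaNs are real floats here.
obj_array = np.array(["123", "pqr", "123", np.nan, "avb", np.nan], dtype=object)
# String array: NumPy has already turned the NaNs into 'nan'.
str_array = np.array(["123", "pqr", "123", np.nan, "avb", np.nan])

clean_obj = drop_nan_mixed(obj_array)
clean_str = drop_nan_mixed(str_array)
```

The mask is built with a Python-level loop, so this is meant for modest array sizes rather than as a fast path.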
MSeifert
1

isnan() should do the trick. Working minimal example on how to do it:

>>> import numpy as np
>>> np_array = np.array([1,2,3,np.nan,4])
>>> np_array
array([  1.,   2.,   3.,  nan,   4.])
>>> np_array = np_array[~np.isnan(np_array)]
>>> np_array
array([ 1.,  2.,  3.,  4.])
Christian W.
  • Does not work. Error: TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''. Its dtype is object. – Rohan Nagalkar Feb 22 '17 at 09:45
  • Check the `dtype` of your array. `isnan` raises that error for object arrays. If there's no reason to have an object array, you can convert it with `arr = arr.astype(np.float64)` (or whatever dtype you want) and then `isnan` will work. If you do need objects, use MSeifert's answer above. – Daniel F Feb 22 '17 at 10:03
0

Try this

np_clean = [x for x in np_array if str(x) != 'nan']

This removes the nan entries and returns a plain Python list rather than a numpy array.
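
If an array is needed afterwards, the list can be wrapped in np.array again. A short sketch, assuming an object array like the one in the question:

```python
import numpy as np

np_array = np.array(["123", "pqr", "123", np.nan, "avb", np.nan], dtype=object)

# str(np.nan) is 'nan', so this drops float NaNs as well as the string 'nan'.
np_clean = np.array([x for x in np_array if str(x) != 'nan'])
```

Since only strings remain, the rebuilt array gets a string dtype instead of object.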

pushpendra chauhan
0

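This works for numerical arrays. Note that np.isfinite also drops inf values, and that in Python 3 filter returns an iterator, so it has to be wrapped in list():

list(filter(np.isfinite, np.array([1,2,3,np.nan])))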

>>>[1.0, 2.0, 3.0]
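
For comparison, the same filtering can be done with a boolean mask, which stays inside NumPy and returns an array instead of a list. A minimal sketch with illustrative variable names, showing the difference between np.isfinite and np.isnan:

```python
import numpy as np

arr = np.array([1, 2, 3, np.nan, np.inf])

finite_only = arr[np.isfinite(arr)]  # drops both nan and inf
no_nan = arr[~np.isnan(arr)]         # drops nan only, keeps inf
```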
AsheKetchum