How to delete numpy nan from a list of strings in Python?

Question

I have a list of strings

x = ['A', 'B', nan, 'D']

and want to remove the nan.

I tried:

x = x[~numpy.isnan(x)]

But that only works if it contains numbers. How do we solve this for strings in Python 3+?

@JoshLee The `nan` object from the numpy module, which the OP is using. I changed it to numpy so that future askers can find the question easily. – Mazdak, Mar 23 '17 at 18:14

Mazdak · Accepted Answer · 2017-03-23T15:37:54.767

6

If you have a numpy array, you can simply check that the item is not the string `'nan'`. If you have a list, you can check identity with `is` and `np.nan`, since it's a singleton object.

In [25]: x = np.array(['A', 'B', np.nan, 'D'])

In [26]: x
Out[26]: 
array(['A', 'B', 'nan', 'D'], 
      dtype='<U3')

In [27]: x[x != 'nan']
Out[27]: 
array(['A', 'B', 'D'], 
      dtype='<U3')


In [28]: x = ['A', 'B', np.nan, 'D']

In [30]: [i for i in x if i is not np.nan]
Out[30]: ['A', 'B', 'D']

Or, as a functional approach in case you have a Python list:

In [34]: from operator import is_not

In [35]: from functools import partial

In [37]: f = partial(is_not, np.nan)

In [38]: x = ['A', 'B', np.nan, 'D']

In [39]: list(filter(f, x))
Out[39]: ['A', 'B', 'D']
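
The identity check works here because the list holds the very `np.nan` object. As the comments below discuss, NaN values created elsewhere are distinct objects, so a value-based test may be safer. A minimal sketch, assuming the list contains only strings and plain floats; the extra `float('nan')` entry and the variable names are added here purely for illustration:

```python
import numpy as np

# float('nan') is a NaN that is a different object from np.nan.
x = ['A', 'B', np.nan, 'D', float('nan')]

# Identity only matches the exact np.nan object, so the second NaN survives.
by_identity = [i for i in x if i is not np.nan]

# NaN is the only float that is not equal to itself; strings always equal themselves.
by_self_equality = [i for i in x if i == i]

print(len(by_identity))    # 4: 'A', 'B', 'D' and the float('nan') object
print(by_self_equality)    # ['A', 'B', 'D']
```

The self-equality test relies on each element comparing to a plain boolean, so it would not suit a list whose items are themselves arrays.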

edited Mar 23 '17 at 15:37

answered Mar 23 '17 at 15:32

Mazdak


aggregate things like: `[i for i in x if not i in ['nan', np.nan]]`, +1 otherwise – Colonel Beauvel Mar 23 '17 at 15:34
@ColonelBeauvel Yeah, that's a good idea if you don't know what kind of data structure you're dealing with. – Mazdak Mar 23 '17 at 15:38
NaN is not a singleton. – Josh Lee Mar 23 '17 at 17:55
@JoshLee Why? I think that as long as you can't create different instances of a particular object, it can be referred to as a singleton. Is there anything special about `np.nan`? – Mazdak Mar 23 '17 at 18:10
`np.nan` is just some floating point constant. You wouldn't compare `is math.pi` either, for the same reason. – Josh Lee Mar 23 '17 at 18:16
@JoshLee Well, IMHO, it doesn't make any difference. Almost everything in Python is an object, even the code, and once something is an object it can be a singleton or a regular object (AFAIK), like the integers between -5 and 256 or other singletons in Python that get cached in memory instead of having multiple instances with different ids. – Mazdak Mar 23 '17 at 18:41

score 3 · Answer 2 · answered Mar 23 '17 at 15:28

You can use `math.isnan` and a good old list comprehension.

Something like this would do the trick:

import math
x = [y for y in x if not math.isnan(y)]

answered Mar 23 '17 at 15:28

Horia Coman


Did you try `math.isnan('A')`? Test on the OP's `x`? – hpaulj Mar 23 '17 at 16:51
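
As the comment above hints, `math.isnan` raises a `TypeError` when given a string, so the comprehension fails on the OP's list. A minimal sketch of one way to guard the call, assuming the NaN values are floats (`np.nan` is a Python `float`); the helper name `drop_nan` is made up for this example:

```python
import math

def drop_nan(items):
    """Return items without float NaN values; non-floats are kept as-is."""
    # isinstance short-circuits, so math.isnan is never called on a string.
    return [y for y in items if not (isinstance(y, float) and math.isnan(y))]

x = ['A', 'B', float('nan'), 'D']  # float('nan') stands in for np.nan here
print(drop_nan(x))  # ['A', 'B', 'D']
```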

score 1 · Answer 3 · answered Mar 23 '17 at 15:30

You may want to avoid `np.nan` with strings and use `None` instead; but if you do have `nan`, you could do this:

import numpy as np

[i for i in x if i is not np.nan]
# ['A', 'B', 'D']
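
For the `None` suggestion, a minimal sketch, assuming the missing entry is stored as `None` rather than `np.nan` (the variable name `cleaned` is illustrative). Since `None` is a true singleton, the identity test is reliable here:

```python
x = ['A', 'B', None, 'D']

# None is a real singleton, so "is not None" matches every missing entry.
cleaned = [i for i in x if i is not None]
print(cleaned)  # ['A', 'B', 'D']
```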

answered Mar 23 '17 at 15:30

Psidom


NaN is not a singleton. – Josh Lee Mar 23 '17 at 17:56
@JoshLee I didn't say it's a singleton. I just said it might be better to use None instead of `nan` in string cases, which will be converted to a string `nan` but `None` stays as `None`. – Psidom Mar 23 '17 at 18:04
You're comparing with `is`. This will fail. – Josh Lee Mar 23 '17 at 18:05
@JoshLee I didn't get what you mean. It works for this case. – Psidom Mar 23 '17 at 18:12

score 1 · Answer 4 · answered Mar 23 '17 at 15:32

You could also try this:

[s for s in x if str(s) != 'nan']

Or, convert everything to str at the beginning:

[s for s in map(str, x) if s != 'nan']

Both approaches yield ['A', 'B', 'D'].
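
One caveat worth checking before relying on the string comparison: it cannot tell a NaN apart from a genuine string `'nan'`. A small sketch with a made-up list that contains both:

```python
x = ['A', 'nan', float('nan'), 'D']  # 'nan' here is a real string value

# str(float('nan')) is 'nan', so both the NaN and the literal string are removed.
kept = [s for s in x if str(s) != 'nan']
print(kept)  # ['A', 'D']
```

The `map(str, x)` variant additionally turns every remaining element into a string, which only matters if the list may hold non-string values.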

answered Mar 23 '17 at 15:32

blacksite

