How to get a logical mask of two lists if the lists contain np.NaN

Question

Logical OR and logical AND do not seem to work when the arrays contain np.NaN. Here is a simple example with two ndarrays filled with np.NaN, where the masks come out wrong:

import numpy as np
m = 10
l1, l2 = np.array([np.NaN] * m), np.array([np.NaN] * m)
l1[3] = 5
l2[3] = 5
l1[5] = 6
l2[5] = 6
l2[7] = 7
l1[8] = 8

mask1 = (l1 != np.NaN) & (l2 != np.NaN)
mask0 = (l1 == np.NaN) | (l2 == np.NaN)
print("Lists:")
print(l1)
print(l2)
print()
print("Masks:")
print(mask1)
print(mask0)

It prints:

Lists:
[nan nan nan  5. nan  6. nan nan  8. nan]
[nan nan nan  5. nan  6. nan  7. nan nan]

Masks:
[ True  True  True  True  True  True  True  True  True  True] # not true
[False False False False False False False False False False] # not true

I expected:

Masks:
[False False False  True False  True False False False False]
[ True  True  True False  True False  True  True  True  True]

As a test I changed np.NaN to None, and this fixed the problem with the logical operations. But earlier in my code I compute the items of the lists and have to compare them to a value, and then I get a TypeError:

TypeError: '<' not supported between instances of 'NoneType' and 'int'

How to change all np.NaN to None?

Don't confuse a list with a numpy array. Keep the distinction clear in your mind and in your writing. Also, `np.nan` is a special float value with unique equality properties, while `None` is a unique Python object. Pay attention to the `dtype` if your array holds either. And the string 'nan' is different again. – hpaulj, May 16 '23 at 20:57
Yes, I should pay attention to this difference. I was trying to rewrite a piece of code from pandas to numpy. Now I am wondering whether the comparison to np.NaN was a programming trick or just my mistake. Comparing a float or an int to np.NaN with == is always False. – luki, May 16 '23 at 22:16
`None < 4` would produce the last error. You didn't show exactly how you replaced and compared, but there isn't much you can do with `None`. `None` isn't a number. – hpaulj, May 17 '23 at 04:04

score 1 · Answer 1 · answered May 16 '23 at 16:55

You cannot use == and != to test for np.nan values, because nan is not equal to nan.

Here is some deep explanation.

You have to use np.isnan() in the algorithm instead:

mask1 = (~np.isnan(l1)) & (~np.isnan(l2))
mask0 = (np.isnan(l1)) | (np.isnan(l2))
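
As a self-contained check, the sketch below rebuilds the two arrays from the question and applies the np.isnan masks. It uses the lowercase spelling np.nan, since the np.NaN alias was removed in NumPy 2.0; building the arrays with np.full is just one convenient choice, not part of the original answer.

```python
import numpy as np

# Rebuild the arrays from the question.
m = 10
l1 = np.full(m, np.nan)
l2 = np.full(m, np.nan)
l1[[3, 5, 8]] = [5, 6, 8]
l2[[3, 5, 7]] = [5, 6, 7]

# True where both arrays hold a real number.
mask1 = ~np.isnan(l1) & ~np.isnan(l2)
# True where at least one array holds NaN.
mask0 = np.isnan(l1) | np.isnan(l2)

print(mask1)
print(mask0)
```

Only positions 3 and 5 are True in mask1, and mask0 is its exact complement, which matches the masks the question expected.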

answered May 16 '23 at 16:55

Glauco

Thank you. I checked and it works. I am wondering which method is faster: Glauco's or @AkashDataScience's. As a beginner I prefer your solution, because in a few months I will still understand what is going on. AkashDataScience's solution is also intriguing. A plus for the link to the deep explanation. – luki May 16 '23 at 21:04
Is it correct to say that we can calculate mask1 as mask1 = (~np.isnan(l1)) & (~np.isnan(l2)) and mask0 as mask0 = ~mask1? If @AkashDataScience did it this way, it should work here too. – luki May 18 '23 at 12:17
Logically they are the same, right. – Glauco May 18 '23 at 12:19

score 1 · Answer 2 · answered May 16 '23 at 17:24

I think it's simple.

mask1 = l1 == l2   # True only where both values are present and equal (nan never equals nan)
mask2 = ~mask1     # everything else

answered May 16 '23 at 17:24

AkashDataScience

I see that Boolean algebra works on an array comparison. But is it possible to return a bool mask for every logical operation? What you show here looks like a logical AND and a logical NAND for two lists. Do you have a solution for the OR, NOR, XOR and XNOR operations on two lists? – luki May 16 '23 at 21:39
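
One way to read the question about the other logical operations: build a "valid" mask per array with np.isnan, then combine the two masks with NumPy's elementwise Boolean operators. The sketch below assumes that "valid" means "not NaN"; the names v1, v2 and mask_* are illustrative only.

```python
import numpy as np

nan = np.nan
l1 = np.array([nan, nan, nan, 5, nan, 6, nan, nan, 8, nan])
l2 = np.array([nan, nan, nan, 5, nan, 6, nan, 7, nan, nan])

# "Valid" here means "holds a real number, not NaN".
v1 = ~np.isnan(l1)
v2 = ~np.isnan(l2)

mask_and = v1 & v2       # both valid
mask_or = v1 | v2        # at least one valid
mask_xor = v1 ^ v2       # exactly one valid
mask_nand = ~mask_and    # not both valid
mask_nor = ~mask_or      # neither valid
mask_xnor = ~mask_xor    # both valid or both NaN

print(np.flatnonzero(mask_and))  # positions where both are valid
print(np.flatnonzero(mask_xor))  # positions where exactly one is valid
```

Unlike l1 == l2, these masks depend only on whether a value is present, not on whether the two values happen to be equal.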

hpaulj · Answer 3 · 2023-05-17T05:12:35.360

Let's explore the alternatives in more detail.

Your array (l2 is similar):

In [18]: l1
Out[18]: array([nan, nan, nan,  5., nan,  6., nan, nan,  8., nan])

In [19]: l1.dtype
Out[19]: dtype('float64')

Using the isnan test:

In [20]: np.isnan(l1)
Out[20]: 
array([ True,  True,  True, False,  True, False,  True,  True, False,
        True])

and combining them:

In [21]: np.isnan(l1)|np.isnan(l2)
Out[21]: 
array([ True,  True,  True, False,  True, False,  True,  True,  True,
        True])

Since nan propagates through arithmetic, we could also test after combining the arrays:

In [25]: l1+l2
Out[25]: array([nan, nan, nan, 10., nan, 12., nan, nan, nan, nan])
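
A minimal sketch of the "test after" idea: because NaN plus anything is NaN, a single np.isnan call on the sum flags every position where either input is NaN. This assumes the non-NaN values are finite, since inf + (-inf) also produces NaN.

```python
import numpy as np

nan = np.nan
l1 = np.array([nan, nan, nan, 5, nan, 6, nan, nan, 8, nan])
l2 = np.array([nan, nan, nan, 5, nan, 6, nan, 7, nan, nan])

# One isnan call on the sum instead of two calls joined with |.
either_nan = np.isnan(l1 + l2)
both_valid = ~either_nan

# Same result as testing each array separately.
print(np.array_equal(either_nan, np.isnan(l1) | np.isnan(l2)))
```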

l1==l2 is the equivalent of testing l1-l2 for 0s.

In [27]: l1-l2
Out[27]: array([nan, nan, nan,  0., nan,  0., nan, nan, nan, nan])

In [28]: l1==l2
Out[28]: 
array([False, False, False,  True, False,  True, False, False, False,
       False])

I could run some timeit tests, but for such small arrays they won't tell us a lot. np.isnan is a numpy compiled function and fast enough.
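
For anyone who wants to measure it anyway, here is a rough timing sketch on larger arrays. The array size, the NaN fraction and the function names are arbitrary assumptions, and the numbers will vary by machine, so no timings are quoted here.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
n = 200_000  # arbitrary size, chosen only for illustration

# Same underlying values in both arrays, NaN in different places,
# so that a == b is True exactly where both values are present.
a = rng.random(n)
b = a.copy()
a[rng.random(n) < 0.5] = np.nan
b[rng.random(n) < 0.5] = np.nan


def isnan_mask():
    return ~np.isnan(a) & ~np.isnan(b)


def equality_mask():
    return a == b


repeats = 5
for fn in (isnan_mask, equality_mask):
    seconds = timeit.timeit(fn, number=repeats)
    print(f"{fn.__name__}: {seconds / repeats * 1e3:.3f} ms per call")
```

The two functions return the same mask only because the arrays were built to share values; on real data the equality test answers a different question.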

The unique thing about nan is that it never equals itself. Hence your all-True and all-False arrays.

In [36]: l1==np.nan
Out[36]: 
array([False, False, False, False, False, False, False, False, False,
       False])
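
The self-inequality of NaN can be seen directly, and it also explains the masks in the question: == against np.nan is False everywhere and != is True everywhere. A short sketch (the variable names are illustrative):

```python
import numpy as np

nan = np.nan
l1 = np.array([nan, nan, nan, 5, nan, 6, nan, nan, 8, nan])

print(np.nan == np.nan)  # False: NaN never equals anything, itself included

never_equal = l1 == np.nan   # False at every position
always_unequal = l1 != np.nan  # True at every position

# Comparing the array with itself is True only where the value is NaN,
# so it gives the same mask as np.isnan.
self_test = l1 != l1
print(np.array_equal(self_test, np.isnan(l1)))
```

np.isnan states the intent more clearly than the l1 != l1 trick, so it is usually the better choice.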

Making your arrays with None produces an object dtype array (not float):

In [38]: l11
Out[38]: array([None, None, None, 5, None, 6, None, None, 8, None], dtype=object)

None is different in that there is only one None, so is and == tests are True, but size comparisons are meaningless:

In [51]: None is None
Out[51]: True

In [52]: None==None
Out[52]: True

In [53]: None==3
Out[53]: False

In [54]: None<3
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[54], line 1
----> 1 None<3

TypeError: '<' not supported between instances of 'NoneType' and 'int'

np.isnan can't be run on this l11 object dtype array, since it requires floats.
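
A sketch of moving between the two representations, assuming the object array holds only None and numbers. Converting to float turns None into nan, so np.isnan and size comparisons work again; np.where goes the other way if None is really needed. The names as_float and back_to_none are illustrative.

```python
import numpy as np

l11 = np.array([None, None, None, 5, None, 6, None, None, 8, None], dtype=object)

# np.isnan has no loop for object dtype, so this raises TypeError.
isnan_failed = False
try:
    np.isnan(l11)
except TypeError:
    isnan_failed = True

# Elementwise comparison with None does work on an object array.
none_mask = l11 == None  # noqa: E711 (elementwise on purpose)

# Converting to float maps None to nan.
as_float = l11.astype(float)
nan_mask = np.isnan(as_float)

# Size comparisons no longer raise; they are simply False at nan.
small = as_float < 7

# Going the other way: replace nan with None (result has object dtype).
back_to_none = np.where(np.isnan(as_float), None, as_float)
print(back_to_none)
```

Keeping the data as float with nan is usually more practical, because numeric comparisons keep working without special handling.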

I believe pandas has some broader tests than np.isnan, but I don't have it installed on this computer.
