
The future warning happens when you do something like this:

>>> numpy.asarray([1,2,3,None]) == None

Which currently returns False, but I understand will return an array containing [False,False,False,True] in a future version of Numpy.

As discussed on the numpy discussion list, the way around this is to test a is None.

What confuses me is this behaviour of the in keyword with a 1D array compared to a list:

>>> None in [1,2,3,None]
True
>>> None in numpy.asarray([1,2,3,None])
__main__:1: FutureWarning: comparison to 'None' will result in an elementwise 
    object comparison in the future
False
>>> 1 in numpy.asarray([1,2,3,None])
True

EDIT (see comments) - There are really two different questions:

  1. Why does this cause a FutureWarning - what will the future behaviour of None in numpy.asarray(...) be compared to what it is now?
  2. Why the difference in behaviour of in from a list; can I test if my array contains None without converting it to a list or using a for loop?

Numpy version is 1.9.1, Python 3.4.1

denis
szmoore
  • Why would your array contain `None`? Look at the `dtype` of an example with `None`. Is that what you want? Are you, by any chance, confusing `None` with `np.nan`? – hpaulj Feb 05 '15 at 06:06
  • I am using data that could contain `None`. I know that `None` is not `np.nan`. My exact situation is more complicated than my example but that's not really relevant to the question. Looking at the `dtype` does work though. Thanks. Incidentally, `np.nan in np.asarray([1,2,3,np.nan])` will also return `False`. So maybe my question should be about `in` and numpy arrays in general. – szmoore Feb 05 '15 at 06:17
  • Although, `dtype` will still be `object` if you take a slice from a matrix which has `None` in it, even if that slice doesn't have a `None` in it. – szmoore Feb 05 '15 at 06:22
  • The question of `in` for arrays came up in another question recently. It appears to work, but may actually be acting on `list(array...)`. I don't think it is the right test for arrays. – hpaulj Feb 05 '15 at 08:15
  • Look at the code for `np.in1d`. That seems to be the preferred array tester. But it still has issues when it comes to testing `None` and `nan`. – hpaulj Feb 05 '15 at 08:21
  • It's clear that the `FutureWarning` and the `in` behaviour are probably different issues, although I'm also curious as to why `None in array` causes a `FutureWarning` as well as `array == None`. – szmoore Feb 05 '15 at 08:34
  • @matches `np.nan in np.asarray([1,2,3,np.nan])` returns `False` because `np.nan != np.nan` (see [here](http://stackoverflow.com/a/1573715/1461210) for more details). What does a value of `None` *mean* in the context of your data? If you're using `None` to represent missing data, it would make much more sense to use `np.nan`, or better yet, switch to [masked arrays](http://docs.scipy.org/doc/numpy/reference/maskedarray.html). – ali_m Feb 05 '15 at 10:01
  • Also, in future versions of numpy your example `np.array([1, 2, 3, 4]) == None` would return `np.array([False, False, False, False])`. The new behaviour will be consistent with how element-wise comparisons to scalar numerical values currently work, e.g. `np.array([1, 2, 3, 4]) == 1` returns `np.array([True, False, False, False])` – ali_m Feb 05 '15 at 10:29
  • `np.nan != np.nan` and `np.nan in np.array([1,2,3,np.nan])` returns `False`... but `np.nan in [1,2,3,np.nan]` seems to return `True`. Strange. – szmoore Feb 05 '15 at 13:16
  • masked arrays do look like they will solve my problem, however I still think a more detailed explanation of the behaviour of `in` on arrays might be of interest to someone, so I won't delete the question. – szmoore Feb 05 '15 at 13:18

1 Answer


> The future warning happens when you do something like this:
>
>     numpy.asarray([1,2,3,4]) == None
>
> Which currently returns False, but I understand will return an array containing [False,False,False,True] in a future version of Numpy.

As I mentioned in the comments, your example is incorrect. For the array [1, 2, 3, 4], future versions of numpy would return [False, False, False, False], i.e. False for each element in the array that is not equal to None. This is more consistent with how element-wise comparisons to other scalar values currently work, e.g.:

In [1]: np.array([1, 2, 3, 4]) == 1
Out[1]: array([ True, False, False, False], dtype=bool)

In [2]: np.array(['a', 'b', 'c', 'd']) == 'b'
Out[2]: array([False,  True, False, False], dtype=bool)

> What confuses me is this behaviour of the in keyword with a 1D array compared to a list

When you test x in y, you are calling y.__contains__(x). When y is a list, __contains__ basically does something along the lines of this:

def list_contains(y, x):
    # illustrative name; identity is checked before equality for each item
    for item in y:
        if (item is x) or (item == x):
            return True
    return False
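
To make the identity-before-equality order visible, here is a minimal sketch using a made-up class (`NeverEqual` is a hypothetical helper, not part of any library) whose instances compare unequal to everything, including themselves:

```python
class NeverEqual:
    """Hypothetical helper: an object whose == always returns False."""

    def __eq__(self, other):
        return False

    # Defining __eq__ disables the default hash, so restore it explicitly.
    __hash__ = object.__hash__


x = NeverEqual()

# The same object is found because `item is x` succeeds before == is tried.
same_object_found = x in [1, 2, x]

# A different instance is not found: identity fails and == always returns False.
other_object_found = NeverEqual() in [1, 2, x]

print(same_object_found, other_object_found)
```

If list membership relied on `==` alone, both checks would be False.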

As far as I can tell, np.ndarray.__contains__(x) performs the equivalent of this:

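import numpy as np

def ndarray_contains(y, x):
    # illustrative name; compare every element first, then reduce the result
    return bool(np.any(y == x))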

That is to say it tests element-wise equality over the whole array first (y == x would be a boolean array the size of y). Since in your case you are testing whether y == None, this will raise the FutureWarning for the reasons given above.
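
If you need to know whether an object array actually contains None, one option that avoids `==` altogether is to apply an identity test to each element. The following is only a sketch: `contains_none` is a name I made up, and it assumes the array holds Python objects (dtype object), as in your example.

```python
import numpy as np

# Element-wise identity test; np.frompyfunc returns an object-dtype result.
_is_none = np.frompyfunc(lambda v: v is None, 1, 1)


def contains_none(arr):
    """Hypothetical helper: True if any element of arr is the None object."""
    mask = _is_none(arr).astype(bool)  # convert the object array to a boolean mask
    return bool(mask.any())


with_none = np.asarray([1, 2, 3, None])
without_none = np.asarray([1, 2, 3, 4])

print(contains_none(with_none))
print(contains_none(without_none))
```

Because the test uses `is` rather than `==`, it does not depend on how a given numpy version handles comparison to None.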

In the comments you also wanted to know why

np.nan in np.array([1, 2, 3, np.nan])

returns False, but

np.nan in [1, 2, 3, np.nan]

returns True. The first part is easily explained by the fact that np.nan != np.nan (NaN compares unequal to everything, including itself; see the link in my comment on the question for the rationale behind this). To understand why the second case returns True, remember that list.__contains__() first checks for identity (is) before checking equality (==). Since np.nan is np.nan, the second case will return True.
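
The two membership results can be compared side by side in a short sketch. The last line uses np.isnan, which is the usual way to look for NaN in a float array (the variable names are only for illustration):

```python
import numpy as np

arr = np.array([1, 2, 3, np.nan])

# List membership: the identity check finds the very same np.nan object.
in_list = np.nan in [1, 2, 3, np.nan]

# Array membership: only element-wise == is used, and nan == nan is False.
in_array = np.nan in arr

# Testing for NaN explicitly avoids relying on equality at all.
has_nan = bool(np.isnan(arr).any())

print(in_list, in_array, has_nan)
```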

Community
ali_m
  • Sorry! I didn't notice my example was supposed to be `[1,2,3,None] == None` not `[1,2,3,4] == None` - I have corrected the question. – szmoore Feb 09 '15 at 00:46