I'm trying to check whether a number is in an array generated by numpy's arange function. When I print the array, I see the values I'm looking for, but when I use the in keyword, it returns False for some of them.

Example:

d = np.arange(2.9, 3.7, 0.1)

>>> print d
[ 2.9  3.   3.1  3.2  3.3  3.4  3.5  3.6  3.7]
>>> 3.1 in d
True
>>> 3.6 in d
False
>>> 3.3 in d
False
>>> 3.2 in d
True

Any explanation for this? Thanks in advance!

Drewness
azdonald

3 Answers

Floating-point numbers have finite precision. The number 3.6 cannot be represented exactly in IEEE floating point because the fractional part is stored in binary rather than decimal. The 3.6 in the array is therefore a slightly different number, very close to 3.6 but not equal to the literal 3.6 you are testing for membership. For instance, if I do

>>> import numpy
>>> d = numpy.arange(0, 1, 0.1)
>>> 0.3 in d
False
>>> 3 * 0.1 in d
True
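To see why the two tests above disagree, it can help to print the exact values the floats are stored as. The following is a minimal sketch using only the standard library's decimal module; the variable names are illustrative and not part of the original answer.

```python
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, with no rounding,
# so it reveals what the float really holds.
exact_literal = Decimal(0.3)       # the literal 0.3
exact_product = Decimal(3 * 0.1)   # the product 3 * 0.1

print(exact_literal)    # slightly below 0.3
print(exact_product)    # slightly above 0.3
print(0.3 == 3 * 0.1)   # False: the two floats are different numbers
```

The literal and the product land on neighbouring floats on either side of the decimal value 0.3, which is why only one of them matches the array element.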

I know that's confusing and annoying, but that's just the way floating point numbers work. The way to get around this is to try to use integers instead. What are you trying to do that requires testing for inclusion of a float in an array? Can you instead solve it using integers? In your specific case, you could do this by multiplying everything by 10 and converting to integers.
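As a rough sketch of the integer approach for the array in the question, one could store tenths instead of floats. The name tenths and the end point 38 are assumptions chosen to mirror the printed array (2.9 through 3.7); adjust them to the range you actually need.

```python
import numpy as np

# Work in tenths: 29 stands for 2.9, 36 for 3.6, and so on.
tenths = np.arange(29, 38)  # integers 29, 30, ..., 37

# Integer membership tests are exact.
print(36 in tenths)  # True
print(33 in tenths)  # True
print(38 in tenths)  # False

# Convert back to floats only where the real values are needed.
values = tenths / 10
```

Because integers in this range are represented exactly, the in test behaves as expected; the division by 10 is deferred until the float values are actually required.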

Zhehao Mao

Consider Zhehao Mao's answer and whether you really need this. That said, if you really, really need to do this, consider using numpy's isclose function followed by numpy's any.

As explained here and documented here
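A minimal sketch of that combination might look like the following. The helper name approx_in is hypothetical; the tolerance defaults are the ones np.isclose itself uses.

```python
import numpy as np

d = np.arange(2.9, 3.7, 0.1)

def approx_in(value, array, rtol=1e-05, atol=1e-08):
    """Hypothetical helper: True if any element is within tolerance of value."""
    # np.isclose returns a boolean array; .any() collapses it to one answer.
    return bool(np.isclose(array, value, rtol=rtol, atol=atol).any())

print(approx_in(3.6, d))   # True
print(approx_in(3.3, d))   # True
print(approx_in(3.65, d))  # False: 0.05 away from the nearest element
```

Note that this scans the whole array on every test, so it is best suited to arrays of moderate size.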

Miquel

Due to floating point approximations, exact equality is ill-defined here. You can use a tolerance similar to the relative and absolute tolerance used in np.isclose(..., rtol=..., atol=...).

Here is an example encapsulating the test into a function. Note that the very last value of np.arange should be smaller than stop, but could be very close to stop due to floating point approximations. Note that this function doesn't generate all the intermediate values, which helps memory consumption when the arrays would be very large.

import numpy as np

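def is_in_arange(val, start, stop, step=1.0, rtol=1e-05, atol=1e-08):
    # Only values in the half-open interval [start, stop) can belong to the range.
    if val >= start and val < stop:
        # Nearest grid point start + k * step, computed without building the array.
        closest = start + np.round((val - start) / step) * step
        return np.isclose(val, closest, rtol=rtol, atol=atol)
    return False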

print(is_in_arange(3.6, 2.9, 3.7, 0.1))
JohanC