
Basically, I want to check which of two np.arange() ranges a variable falls into:

import numpy as np

x = 22.03
first = np.arange(18.5, 24.99, 0.01)
second = np.arange(25.0, 29.99, 0.01)

if x in first:
    print("x is in first")
elif x in second:
    print("x is in second")

I expect to see "x is in first", but nothing is printed to the terminal. If I add a final else: branch, that branch is the one that runs.

I am using numpy because I want a range of floats; the built-in range() function doesn't support floats.

Neither membership test succeeds. Why is that?

petezurich
Neo
  • Please refer to ["In" operator for numpy arrays?](https://stackoverflow.com/questions/39452843/in-operator-for-numpy-arrays) – Suthiro Feb 10 '22 at 19:30
  • Did you see [this answer](https://stackoverflow.com/a/42999216/3281097) in another thread? It suggests using `np.linspace` instead of `np.arange` when using small non-integer steps, as in your case – aaossa Feb 10 '22 at 19:33
  • `if np.isclose(first, x).any(): print("x is in first")` – Michael Szczesny Feb 10 '22 at 19:35
  • The problem is that `22.03` cannot be represented exactly in binary. It's an infinitely repeating "decimal", so what you get is an approximation. The same applies to `0.01`. `18.5` can be represented exactly, but as you add the increment one by one, you accumulate more and more rounding errors. So, you get to something close to `22.03`, but not as close as the constant `22.03`. That's why `isclose` is a better option. – Tim Roberts Feb 10 '22 at 19:50
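
The representation point in the last comment can be checked with the standard library alone. This is a minimal sketch, not part of the original thread: it uses `decimal.Decimal` to expose the exact value stored for each float literal, and the variable names are chosen only for illustration.

```python
from decimal import Decimal

# Decimal(float) exposes the exact binary value stored for a float.
exact_x = Decimal(22.03)
exact_step = Decimal(0.01)
exact_start = Decimal(18.5)

print(exact_x)      # not exactly 22.03
print(exact_step)   # not exactly 0.01
print(exact_start)  # exactly 18.5, since 18.5 = 37 / 2

# Comparing against the decimal literals confirms which values are exact.
print(exact_x == Decimal("22.03"))     # False
print(exact_step == Decimal("0.01"))   # False
print(exact_start == Decimal("18.5"))  # True
```

Only `18.5` survives the round trip unchanged, which is why an array built from `18.5` and `0.01` need not contain a value bit-for-bit equal to the literal `22.03`.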

2 Answers


Your array:

In [254]: first = np.arange(18.5, 24.99, 0.01)
In [255]: first.shape
Out[255]: (649,)
In [256]: first[:10]
Out[256]: 
array([18.5 , 18.51, 18.52, 18.53, 18.54, 18.55, 18.56, 18.57, 18.58,
       18.59])
In [257]: x=22.03

x isn't "found":

In [258]: x in first
Out[258]: False

Let's look for a close match:

In [259]: np.nonzero(np.isclose(x,first))
Out[259]: (array([353]),)
In [260]: first[353]
Out[260]: 22.030000000000552

The closest match is still slightly off because of accumulated floating-point error. `in` and `==` tests on floating-point values are not reliable.
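
A tolerance-based membership test can replace `in` for the original two-range check. The following is a rough sketch rather than a definitive implementation: the helper name `contains_close` and the tolerance `atol=1e-8` are assumptions made for illustration, not something from the question.

```python
import numpy as np


def contains_close(values, x, atol=1e-8):
    """Return True if any element of `values` is within `atol` of `x`.

    Hypothetical helper; rtol=0.0 makes the comparison purely absolute.
    """
    return bool(np.isclose(values, x, rtol=0.0, atol=atol).any())


x = 22.03
first = np.arange(18.5, 24.99, 0.01)
second = np.arange(25.0, 29.99, 0.01)

if contains_close(first, x):
    print("x is in first")
elif contains_close(second, x):
    print("x is in second")
else:
    print("x is in neither range")
```

The tolerance only has to be much smaller than the step (0.01) and much larger than the accumulated error (around 1e-13 here). If the goal is only to learn which interval `x` falls in, comparing against the endpoints, for example `18.5 <= x < 24.99`, avoids building the arrays at all.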

The np.arange documentation recommends np.linspace when using non-integer steps. The resulting values are a closer match to our expectations:

In [264]: first1 = np.linspace(18.5,24.98,len(first))
In [265]: np.nonzero(np.isclose(x,first1))
Out[265]: (array([353]),)
In [266]: first1[353]
Out[266]: 22.03
In [267]: x in first1
Out[267]: True

There is still potential for a float mismatch, so an exact match from linspace should not be relied on.
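
One way to combine the two ideas is to build the grid with `np.linspace` from explicit endpoints and still test membership with a tolerance. This is a sketch under stated assumptions: the point count is derived from `start`, `stop`, and `step` instead of `len(first)`, and those three names are introduced here only for illustration.

```python
import numpy as np

start, stop, step = 18.5, 24.98, 0.01

# Number of grid points including both endpoints. round() guards against
# (stop - start) / step landing just above or below a whole number.
num = int(round((stop - start) / step)) + 1

first1 = np.linspace(start, stop, num)
print(num)          # 649, the same length as the arange version
print(first1[-1])   # 24.98, linspace includes the stop value

# Even with the cleaner grid, a tolerance test is the safer check.
x = 22.03
print(np.isclose(first1, x).any())  # True
```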

To better see the full precision of the floats, let's display the arrays as lists:

In [268]: first[:10].tolist()
Out[268]: 
[18.5,
 18.51,
 18.520000000000003,
 18.530000000000005,
 18.540000000000006,
 18.550000000000008,
 18.56000000000001,
 18.57000000000001,
 18.580000000000013,
 18.590000000000014]
In [269]: first1[:10].tolist()
Out[269]: [18.5, 18.51, 18.52, 18.53, 18.54, 18.55, 18.56, 18.57, 18.58, 18.59]
hpaulj
import numpy as np

x = 22.03
first = np.arange(18.5, 24.99, 0.01)
second = np.arange(25.0, 29.99, 0.01)

type(first)
Out[13]: numpy.ndarray

# convert to a list, rounding each value to 2 decimals so that `in` can match exactly
first = [round(n, 2) for n in first]

x in first
Out[16]: True
Goran B.
  • Although this seems to solve the problem at hand, that's just an accident. `round` is a dangerous function for floating point values. The result is STILL an approximation, and you have lost potential information. As a general rule, you should keep your data with its natural precision until the point where you need to show it to a human, and only then do the rounding. – Tim Roberts Feb 10 '22 at 19:55
  • You're correct, Tim – Goran B. Feb 10 '22 at 21:59