Care needs to be taken when checking floating point numbers for equality; the comparison should usually be done with a tolerance in mind, using e.g. numpy.allclose.

Question 1: Is it safe to check for the occurrence of a specific floating point number using the "in" keyword (or are there similar keywords/functions for this purpose)? Example:

if myFloatNumber in myListOfFloats:
  print('Found it!')
else:
  print('Sorry, no luck.')

Question 2: If not, what would be a neat and tidy solution?

cglacet
fromGiants
  • Yeaass, why do you get a bad vibe for it? – DirtyBit Mar 19 '19 at 10:44
  • I think the remark is valid if you compute `myFloatNumber` with a different formula than for the elements of `myListOfFloats`. If you get these numbers with the exact same code then all errors will be the same and you won't have any problem. – cglacet Mar 19 '19 at 10:45
  • Or perhaps a sample I/O would do it – DirtyBit Mar 19 '19 at 10:46
  • An example would be when myListOfFloats is read from a data file produced with e.g. Fortran or C++, and myFloatNumber is calculated in Python. – fromGiants Mar 19 '19 at 10:51
  • Yes, you have to be careful with this, for instance `3.4 in [3.45, 5] -> False`. One option is to cut all floats in the list down to the length of the float in question using the `decimal` module, and then check whether it is in the list – yatu Mar 19 '19 at 10:53
  • Tidy solution: `np.isclose(, ).any()`. You can pass tolerances to `isclose` if needed. – Paul Panzer Mar 19 '19 at 11:36

2 Answers

If you don't compute your floats in the same place or with the exact same equation, then you might have false negatives with this code (because of rounding errors). For example:

>>> 0.1 + 0.2 in [0.6/2, 0.3]  # We may want this to be True
False
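To see where the mismatch comes from, it can help to print the values involved. This is only an illustrative sketch; the variable names a and b are not from the original post:

```python
a = 0.1 + 0.2   # computed one way
b = 0.6 / 2     # computed another way

print(repr(a))             # '0.30000000000000004': one rounding step away from 0.3
print(repr(b))             # '0.3'
print(a == 0.3, b == 0.3)  # False True
print(abs(a - 0.3))        # the tiny gap that makes the "in" test fail
```

The "in" test compares with ==, so a difference in the last bit is enough to produce a miss.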

In this case, we can write a custom "in" function that treats nearly equal values as a match (here it may be better/faster to use numpy.isclose instead of numpy.allclose):

import numpy as np 

def close_to_any(a, floats, **kwargs):
  return np.any(np.isclose(a, floats, **kwargs))

There is an important note in the documentation:

Warning The default atol is not appropriate for comparing numbers that are much smaller than one (see Notes). [...] if the expected values are significantly smaller than one, it can result in false positives.

The note adds that, unlike math.isclose's abs_tol, numpy's atol does not default to zero (the default is 1e-8). If you need a custom tolerance when using close_to_any, pass rtol and/or atol as keyword arguments; they are forwarded to numpy. In the end, your existing code would translate to this:

if close_to_any(myFloatNumber, myListOfFloats):
  print('Found it!')
else:
  print('Sorry, no luck.')

You can also pass options, e.g. close_to_any(myFloatNumber, myListOfFloats, atol=1e-12). Note that 1e-12 is arbitrary; don't use this value unless you have a good reason to.
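The warning about the default atol quoted above can be illustrated without numpy. The sketch below re-implements the comparison that the numpy.isclose documentation describes, |a - b| <= atol + rtol * |b|, with numpy's default tolerances. The helper name numpy_style_isclose is hypothetical and is not part of numpy:

```python
def numpy_style_isclose(a, b, rtol=1e-5, atol=1e-8):
    # Hypothetical helper: mirrors the formula documented for numpy.isclose,
    # using numpy's default rtol and atol.
    return abs(a - b) <= atol + rtol * abs(b)

# 2e-10 is twice 1e-10, yet the default atol swallows the difference.
print(numpy_style_isclose(1e-10, 2e-10))            # True (a false positive)

# With atol=0 only the relative tolerance counts, and the values differ.
print(numpy_style_isclose(1e-10, 2e-10, atol=0.0))  # False

# The rounding error from the first example is still absorbed by rtol alone.
print(numpy_style_isclose(0.1 + 0.2, 0.3, atol=0.0))  # True
```

So if your data contains values much smaller than one, passing atol=0 (or another deliberately chosen value) to close_to_any is probably the safer choice.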

Coming back to the rounding error we observed in the first example, this would give:

>>> close_to_any(0.1 + 0.2, [0.6/2, 0.3])
True
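If you would rather not depend on numpy, a similar check can be sketched with the standard library's math.isclose and any. The name close_to_any_stdlib is made up for this example; note that math.isclose uses different defaults (rel_tol=1e-9, abs_tol=0.0), so results near zero differ from the numpy version:

```python
import math

def close_to_any_stdlib(a, floats, rel_tol=1e-9, abs_tol=0.0):
    # Hypothetical stdlib counterpart of close_to_any: True as soon as
    # one element of floats is close to a.
    return any(math.isclose(a, x, rel_tol=rel_tol, abs_tol=abs_tol) for x in floats)

print(close_to_any_stdlib(0.1 + 0.2, [0.6 / 2, 0.3]))   # True
# With abs_tol=0.0, nothing except 0.0 itself is "close" to zero.
print(close_to_any_stdlib(0.0, [1e-12]))                # False
print(close_to_any_stdlib(0.0, [1e-12], abs_tol=1e-9))  # True
```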
cglacet
  • Thank you for the code example. I wonder why there is no official keyword / numpy function for this already... – fromGiants Mar 19 '19 at 10:55
  • I think that it would make sense for such a function to exist, but on the other hand if you look at my edit, this is basically a combination of `np.allclose` and `any`. I tried to find it in the documentation but I think it doesn't exist. – cglacet Mar 19 '19 at 11:21

Q1: Depends on how you are going to implement this. But as others mentioned, with floats it's not such a good idea to use the "in" operator.

Q2: Do you have any restrictions performance-wise? Will myListOfFloats be sorted?

If it is a sorted list of float values and if you need to do it as fast as you possibly can, you can implement a binary search algorithm.

If the data is not sorted, depending on the ratio between number of queries you will be making and the size of the data, you might want to sort the data and keep it sorted.
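A minimal sketch of the sorted approach, assuming an absolute tolerance and using the standard bisect module. The function name contains_close and the tolerance value are illustrative choices, not part of the question:

```python
import bisect

def contains_close(sorted_floats, target, tol=1e-9):
    # Binary search for the first element >= target - tol.
    i = bisect.bisect_left(sorted_floats, target - tol)
    # A match exists only if that element is also <= target + tol.
    return i < len(sorted_floats) and sorted_floats[i] <= target + tol

# Sort once, then run as many lookups as needed.
data = sorted([0.6 / 2, 1.5, 0.75])
print(contains_close(data, 0.1 + 0.2))  # True
print(contains_close(data, 0.31))       # False
```

Each lookup costs O(log n) instead of O(n); sorting costs O(n log n) once, so this only pays off when there are many queries relative to the size of the data.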

If you don't have any requirements on performance and speed, you can use the following example as a basis:

def inrng(number1, number2, prec):
   # True if the two numbers differ by less than prec (absolute tolerance)
   return abs(number1 - number2) < prec


precision = 0.001
for i in myListOfFloats:
   if inrng(i, myInputNumber, precision):
      pass  # do stuff