While testing mpi4py's comm.reduce() and comm.Reduce() methods in Python 2.7.3, I encountered the following behaviour:

  • sometimes subtracting two complex numbers (type 'numpy.complex128', which are the output of some parallel calculation) that appear identical when printed on the screen produces a non-zero result

  • comparing them with == occasionally yields False.

Example:

print z1, z2, z1-z2
(0.268870295763-0.268490433604j) (0.268870295763-0.268490433604j) 0j
print z1 == z2
True

but then

print z1, z2, z1-z2
(0.226804302192-0.242683516175j) (0.226804302192-0.242683516175j) (-2.77555756156e-17+5.55111512313e-17j)
print z1 == z2
False

I figured this had something to do with the finite precision of floats, so I resorted to checking whether abs(z1-z2) is bigger than 1e-16. It never was, which is what one would expect if reduce() and Reduce() are equivalent. (EDIT: this is actually not a good way to check for equality. See "What is the best way to compare floats for almost-equality in Python?")

I was wondering if there's a more straightforward way to compare complex numbers in python.

Also, why does this behaviour arise? After all, a float (and as far as I know a complex is basically a tuple of two floats) is stored on the machine in binary, as a sequence of bits. Isn't it true that if the two numbers are represented by the same sequence in binary, the difference should be zero and the comparison with == should yield True?

EDIT: OK, I found "What is the best way to compare floats for almost-equality in Python?", which basically boils down to the same thing.

But then the last part of the question remains: Why do floats work like that if in binary they are all basically represented by integers?

the.real.gruycho
  • Just because they are close enough to look the same for 12 decimal places doesn't make them the same number. You are not looking at the "sequence of bits"; only a decimal approximation. – khelwood Mar 07 '16 at 14:46
  • @khelwood, good point. I also realised that just now. See the edit. – the.real.gruycho Mar 07 '16 at 14:49
  • http://stackoverflow.com/questions/1089018/why-cant-decimal-numbers-be-represented-exactly-in-binary has a very nice discussion about floating point issues. – mtrw Mar 07 '16 at 14:56

1 Answer

float values computed by logically equivalent steps won't always have the same binary representation, because float has finite precision and every operation rounds its result. The same steps carried out in a different order can accumulate slightly different rounding errors and leave you with slightly different results.
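As a small illustration of this (a sketch with plain Python floats, not mpi4py; the values 0.1, 0.2 and 0.3 are only illustrative), the same three numbers summed with a different grouping give results that look identical at 12 significant digits but are not the same float. Whether reduce() and Reduce() actually combine partial results in different orders is an assumption here, not something verified:

```python
# Floating-point addition is not associative: the grouping changes the rounding.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

# Python 2's print showed about 12 significant digits; '%.12g' mimics that.
left_short = '%.12g' % left
right_short = '%.12g' % right

print(left_short, right_short)   # the two strings look the same
print(left == right)             # yet the floats compare unequal
print(left - right)              # a difference on the order of 1e-16
print(left.hex(), right.hex())   # the exact stored values, which differ
```

float.hex() shows the value that is really stored, which is the "sequence of bits" view that the decimal printout hides.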

Usually, the way to check for "close" equality with floats when you know the values are small or in a narrow range is to do something like:

if abs(a - b) < 1e-9:  # Substitute your own threshold for equality
    pass  # treat a and b as equal here

Whether that's appropriate for your complex values is problem specific; you may need to check for closeness of the real and imaginary components independently.
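A rough sketch of the two absolute checks for complex values follows. The helper names close_abs and close_components and the default tolerance are made up for this example. The two checks are not interchangeable: with the same threshold, both components can pass while the modulus of the difference does not.

```python
def close_abs(a, b, tol=1e-9):
    """Absolute check on the modulus of the complex difference."""
    return abs(a - b) < tol


def close_components(a, b, tol=1e-9):
    """Absolute check on the real and imaginary parts separately."""
    return abs(a.real - b.real) < tol and abs(a.imag - b.imag) < tol


# Values taken from the printout in the question.
z1 = complex(0.226804302192, -0.242683516175)
z2 = z1 + complex(-2.77555756156e-17, 5.55111512313e-17)

print(close_abs(z1, z2), close_components(z1, z2))

# Each component differs by 0.8e-9, but the modulus is about 1.13e-9.
w = complex(0.8e-9, 0.8e-9)
print(close_components(0j, w), close_abs(0j, w))
```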

If you could use Python 3.5, it provides cmath.isclose to simplify this (and allow for scaled "closeness", not just absolute closeness), but on 2.7, it's probably easier to fudge it as I demonstrated above, or you can borrow the "equivalent" code given by the cmath.isclose docs:

 abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

That equivalent code scales based on a relative tolerance, so if your values can span the whole range of complex types, you'll want to use something like that (where rel_tol and abs_tol are chosen by you appropriate to your problem set).
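For 2.7, that expression could be wrapped in a small helper along these lines. This is only a sketch: the name isclose_compat is hypothetical, the defaults mirror the documented defaults of cmath.isclose (rel_tol=1e-09, abs_tol=0.0), and the special handling of infinities and NaN that the real function does is left out.

```python
def isclose_compat(a, b, rel_tol=1e-9, abs_tol=0.0):
    """Approximate comparison for floats or complex numbers.

    Uses the expression from the cmath.isclose docs; infinities and
    NaN are not handled here.
    """
    if a == b:  # exact equality needs no tolerance
        return True
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)


# Values taken from the printout in the question.
z1 = complex(0.226804302192, -0.242683516175)
z2 = z1 + complex(-2.77555756156e-17, 5.55111512313e-17)
print(isclose_compat(z1, z2))

# Two tiny but clearly different numbers: an absolute threshold of 1e-9
# calls them equal, the relative test does not.
print(abs(3.9e-100 - 2.7e-100) < 1e-9)
print(isclose_compat(3.9e-100, 2.7e-100))

# Near zero a relative tolerance alone is very strict, so pass abs_tol.
print(isclose_compat(0.0, 1e-12), isclose_compat(0.0, 1e-12, abs_tol=1e-9))
```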

ShadowRanger
  • I think using something like abs(a-b) < 1e-9 is quite dangerous. Imagine subtracting 3.9e-100 and 2.7e-100. They are quite different, but according to the above they would be considered equal. The second method you suggest is much better. – the.real.gruycho Mar 07 '16 at 15:03
  • @the.real.gruycho: Agreed. It's problem space dependent; if you know the numbers are smallish, then an absolute threshold is fine, but if the numbers could be anywhere in the representable range, you need a relative threshold. I've updated the answer to make that a bit more clear. – ShadowRanger Mar 07 '16 at 15:04