3

Using Python 3, why does the following print True?

a = 2/3
b = 4/6
print(a == b)

I have an algorithm that requires sorting a list of numbers, each of the form x/y where x and y are integers (y != 0).

I was concerned that the numerical precision of the division would result in instability and arbitrary ordering in cases such as the one above. This is an example of the relevant comments. But, as per the example, and for larger integers as well, it does not seem to be an issue.

Does Python remove the common factor of 2 from the numerator and denominator of b, and retain information that a and b are not just floats?

Jake
  • Sorry, I couldn't understand your question. Isn't it right!? What is there to be concerned about here? – Keerthana Prabhakaran Oct 23 '17 at 03:11
  • I think you are talking about different things: identity and equality, 2/3 equals 4/6, thus `2/3==4/6` returns `True`. Now, `2/3 is 4/6` returns `False` as they are different elements. Assign a variable to each and compare their `id()` and you shall see. More info: https://stackoverflow.com/questions/2239737/is-it-better-to-use-is-or-for-number-comparison-in-python – tokenizer_fsj Oct 23 '17 at 03:21
  • 1
    @tokenizer_fsj: OP is asking about floating point accuracy, not object identity. They're already using the correct comparison operator, so I don't see why you feel the need to correct them anyway. – Kevin Oct 23 '17 at 03:22

3 Answers

4

Python follows the IEEE 754 floating point specification.* (64-bit) IEEE floats are essentially a form of base 2 scientific notation, broken down as follows:

  • One bit for the sign (positive or negative)
  • 53 bits for the mantissa or significand, including the implied leading one.
  • 11 bits for the exponent.
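As a rough illustration of the layout above, the following sketch pulls the three fields out of a float. The helper name `float_fields` is made up for this example, and it assumes the usual 64-bit IEEE 754 double; note that only 52 of the 53 significand bits are actually stored, since the leading one is implied.

```python
import struct

def float_fields(x):
    """Split a 64-bit float into (sign, biased exponent, stored fraction bits).

    Hypothetical helper for illustration only.
    """
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                    # 1 bit
    exponent = (bits >> 52) & 0x7FF      # 11 bits, stored with a bias of 1023
    fraction = bits & ((1 << 52) - 1)    # 52 stored bits; the leading 1 is implied
    return sign, exponent, fraction

# Both quotients end up with exactly the same bit pattern.
print(float_fields(2 / 3))
print(float_fields(4 / 6))
```

If the two printed tuples match, the two floats are bit-for-bit identical, which is why `==` returns True in the question.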

Multiplying or dividing a floating point value by two, or any power of two, only affects the exponent, and not the mantissa.** As a result, it is normally a fairly "stable" operation by itself, so 2/3 should yield the same result as 4/6. However, IEEE floats still have the following problems:

  • Most operations are not associative (e.g. (a * b) * c != a * (b * c) in the general case).
  • More complicated operations are not required to be correctly rounded (however, as Tim Peters points out, division certainly is not a "more complicated" operation and will be correctly rounded).***
  • Intermediate results are always rounded to 53 bits.
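A small sketch of two of the points above, assuming ordinary IEEE 754 doubles: `math.frexp` shows that scaling by a power of two leaves the mantissa alone, and a well-known addition example (chosen here for illustration; the same caveat applies to multiplication) shows the loss of associativity.

```python
import math

x = 2 / 3

# math.frexp returns (m, e) with x == m * 2**e and 0.5 <= abs(m) < 1.
m1, e1 = math.frexp(x)
m2, e2 = math.frexp(x * 8)   # scale by a power of two

same_mantissa = (m1 == m2)   # the mantissa is unchanged
exponent_shift = e2 - e1     # only the exponent moved, by log2(8)
print(same_mantissa, exponent_shift)

# Grouping matters once rounding is involved.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right, left, right)
```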

You should be prepared to handle these issues and assume that most mathematically-equivalent floating point expressions will not result in identical values. In Python specifically, you can use math.isclose() to estimate whether two floats are "close enough" to be "probably the same value."
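For instance, a minimal sketch of `math.isclose()` in use. The tolerances shown are just the defaults plus one arbitrary `abs_tol`; appropriate values depend on your data.

```python
import math

a = 0.1 + 0.2
b = 0.3

exact_equal = (a == b)             # exact comparison is too strict here
close = math.isclose(a, b)         # default rel_tol=1e-09, abs_tol=0.0
print(exact_equal, close)

# A relative tolerance alone never matches against zero,
# so comparisons near zero need an explicit abs_tol.
near_zero_default = math.isclose(1e-12, 0.0)
near_zero_abs = math.isclose(1e-12, 0.0, abs_tol=1e-9)
print(near_zero_default, near_zero_abs)
```

Note that a tolerance-based comparison is not transitive, so it is not a drop-in replacement for `==` inside a sort key.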


* Actually, this is a lie. Python follows C's double, which nearly always follows IEEE 754 in some fashion, but might deviate from it on sufficiently exotic architectures. In such cases the C standard provides few or no guarantees, so you will have to look to your architecture or compiler's floating point documentation.

** Provided the exponent does not overflow or underflow. If it does, then you will typically land on an appropriately-signed infinity or zero, respectively, or you might underflow to a denormal number depending on architecture and/or how Python was compiled.

*** The exact set of "more complicated" operations varies somewhat because IEEE 754 made a lot of operations optional while still demanding precision. As a result, it is seldom obvious whether a given operation conforms to IEEE 754 or only conforms to the notoriously lax C standard. In some cases, an operation may conform to no standard whatsoever.

Kevin
  • Thanks Kevin, that's just what I wanted to know. In reality it's just lucky that for simple enough cases they remain equal but any reliance on this will lead to brittle code in general. – Jake Oct 23 '17 at 03:28
3

Just noting that so long as integers x and y are exactly representable as Python floats, x / y is - on all current machines - the correctly rounded value of the infinitely precise quotient. That's what the IEEE 754 floating-point standard requires, and all current machines support that.

So the important part in your specific example isn't that the numerator and denominator in b = 4/6 have a factor of (specifically!) 2 in common, it's that (a) they have some factor in common; and, (b) 4 and 6 are both exactly representable as Python floats.

So, for example, it's guaranteed that

(2 * 9892837) / (3 * 9892837) == 2 / 3

is also true, because the infinitely precise value of (2 * 9892837) / (3 * 9892837) is the same as the infinitely precise value of 2/3, and IEEE 754 division acts as if the infinitely precise quotient were computed. You can replace 9892837 with any other non-zero integer in that, provided that the products remain exactly representable as Python floats.
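A quick empirical check of this claim, as a sketch rather than a proof. The function name `scaled_quotients_agree` and the sampling ranges are made up for this example; the guard uses the fact that every integer of magnitude up to 2**53 is exactly representable as a 64-bit float.

```python
import random

def scaled_quotients_agree(x, y, n):
    """Check (n*x)/(n*y) == x/y for integers whose products fit exactly in a float.

    Hypothetical helper for illustration only.
    """
    # Integers with magnitude up to 2**53 are exactly representable as floats.
    assert abs(n * x) <= 2**53 and abs(n * y) <= 2**53
    return (n * x) / (n * y) == x / y

random.seed(0)  # fixed seed so the run is repeatable
trials = [
    (random.randint(1, 10**6), random.randint(1, 10**6), random.randint(1, 10**6))
    for _ in range(10_000)
]
all_agree = all(scaled_quotients_agree(x, y, n) for x, y, n in trials)
print(all_agree)
```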

Tim Peters
  • 2 and 2.0 really have nothing to do with why this example works. `(n*x)/(n*y) == x/y` for any integers such that `n*y` isn't 0, and `n*x` and `n*y` are both exactly representable as Python floats. – Tim Peters Oct 23 '17 at 03:43
  • Yeah, I realized that. I was confusing your "infinitely precise quotient" with the infinitely precise integers that are going into the divisor and dividend. – Kevin Oct 23 '17 at 03:46
0

2/3 is the same as 4/6: (2/3)*(2/2) = 4/6, since 2/2 = 1, the identity element. The response is correct.

Python Jim
  • This does not answer the actual question being asked. The question being asked is about the precision and possible instability between the two expressions. It's certainly equivalent in theory, but can you guarantee that this is the same for these kinds of expressions on your CPU? Kevin's answer above actually answers the question at hand. – rayryeng Oct 23 '17 at 03:31