0

And the breaking example would be this:

>> (1000000000000000000 + 1e6 - 1) == (1000000000000000000 + 1e6)
True
>> (1000000000000000000 + 1000000 - 1) == (1000000000000000000 + 1000000)
False

I would love it if someone could shed some light on which mechanism in Python causes this. My guess is that it has something to do with 64-bit and 32-bit integers, and with the fact that scientific notation actually produces a float.

michaelgbj

3 Answers

1

TL;DR

It is because of floating-point math. A float has only a finite number of significant bits, so a value that needs more bits than that is rounded to the nearest representable value and the extra bits are cut off. As a result, several nearby numbers are mapped to the same float.


Python integers are unbounded, but 1e6 is a float, so the integer is converted to a float before the addition is performed. The result of the sum is therefore a float.
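The type promotion can be checked directly. This is only a minimal sketch; the variable names `big`, `mixed` and `exact` are made up for illustration:

```python
big = 1000000000000000000      # an int, exact at any size

mixed = big + 1e6              # 1e6 is a float literal, so big is converted to float
exact = big + 1000000          # int + int stays an int

print(type(1e6).__name__)      # float
print(type(mixed).__name__)    # float
print(type(exact).__name__)    # int

print(mixed - 1 == mixed)      # True: the -1 is lost when the float result is rounded
print(exact - 1 == exact)      # False: integer arithmetic is exact
```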

Python floats are double-precision floating-point numbers, which represent values in binary: a binary fraction whose integer part is one, followed by 52 fractional bits, multiplied by a power of two whose exponent is stored in 11 bits. I won't go into the details here; you can check them at the linked page if you want.

My point is that integers between 0 and 2**53 = 9007199254740992 are represented exactly in double precision, meaning that if you convert such an integer to a float and back, you get exactly the same integer. Numbers between 2**53 and 2**54 are rounded to the nearest even number, meaning that if you convert an odd number to a float and back, you get a neighbouring even number.
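The round-trip claim is easy to test. A small sketch, assuming the usual IEEE 754 doubles that CPython uses on common platforms (the names `limit` and `odd` are illustrative only):

```python
limit = 2**53                  # 9007199254740992

# Integers up to 2**53 survive an int -> float -> int round trip.
print(int(float(limit - 1)) == limit - 1)   # True
print(int(float(limit)) == limit)           # True

# Just above 2**53, odd integers have no exact float representation.
odd = limit + 1
print(int(float(odd)) == odd)               # False
print(int(float(odd)))                      # 9007199254740992

# Even integers in this range are still exact.
print(int(float(limit + 2)) == limit + 2)   # True
```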

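For numbers between 2**54 and 2**55 the result is rounded to the nearest multiple of 4, and so on, with the spacing doubling at each power of two. Your number is 1000000000000000000 + 1e6 == 1.000000000001e+18, which is greater than 2**59 = 576460752303423488 (and less than 2**60), so it is rounded to the nearest multiple of 2**(59 - 52) = 128. Subtracting 1 is lost in that rounding, but subtracting more than half of the spacing is not: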

(1000000000000000000 + 1e6 - 65) != (1000000000000000000 + 1e6)

The above expression is True.
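To inspect the spacing of doubles at this magnitude yourself, one option is `math.ulp`, which assumes Python 3.9 or newer. A minimal sketch (the name `x` is just for illustration):

```python
import math

x = 1000000000000000000 + 1e6   # a float near 1e18

print(math.ulp(x))    # 128.0: gap between x and the next representable double

print(x - 1 == x)     # True: far less than half the gap, rounds back to x
print(x - 63 == x)    # True: still less than half the gap
print(x - 65 == x)    # False: more than half the gap, rounds down to x - 128
print(x - (x - 65))   # 128.0
```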

Ξένη Γήινος
0

I found that this behavior is due to floating-point arithmetic rather than to the size of the numbers (see the similar question). As people mention there, to avoid such issues you can use integers or decimal, depending on what you need to do. Hope this helps.
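As a rough sketch of those two options (the variable names are illustrative, and which option fits depends on your use case):

```python
from decimal import Decimal

# Option 1: stay in integers by writing 10**6 instead of the float literal 1e6.
a = 1000000000000000000 + 10**6
print(type(a).__name__)   # int
print(a - 1 == a)         # False

# Option 2: use Decimal, built from an int or a string rather than from a float.
b = Decimal(1000000000000000000) + Decimal("1e6")
print(b - 1 == b)         # False
```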

Cristian
0

The int type in Python has an upper limit determined by the platform's memory and the version of Python you are using.

You can use specialized libraries like GMPY2, mpmath, or SymPy.

  • Python has arbitrary precision integers. Clearly there's *some* limit: for example you can't store an int on the order 2**16,000,000,000 if you have less than 2GB of memory. But that isn't at all relevant for the size of numbers discussed in the question. – slothrop Jul 11 '23 at 19:03