Does anyone understand why Python scientific notation fails when dealing with very large integers?

Question

The breaking example is this:

    >>> (1000000000000000000 + 1e6 - 1) == (1000000000000000000 + 1e6)
    True
    >>> (1000000000000000000 + 1000000 - 1) == (1000000000000000000 + 1000000)
    False

I would love it if someone could shed some light on which mechanism in Python causes this bug. My guess is that it has something to do with 64-bit and 32-bit integers, and with the fact that scientific notation actually produces a float.

python does not use 64 or 32 bit integers, it uses numbers with base 2^31 — rioV8, Jul 11 '23 at 16:37
Obligatory, [Is floating point math broken?](https://stackoverflow.com/questions/588004/is-floating-point-math-broken) — Woodford, Jul 11 '23 at 17:28
@Barmar that works for `1e6`, but it's not so good for (say) `1e30`. In that case, `10**30` gives the correct int, but `int(1e30)` gives 1000000000000000019884624838656. — slothrop, Jul 11 '23 at 17:41

score 1 · Accepted Answer · answered Jul 11 '23 at 18:06

TL;DR

It is because of floating-point math. A float has only a finite number of fractional bits, while representing every value exactly would require infinitely many, so each number is rounded to the nearest representable value and the extra bits are cut off. As a result, multiple numbers are mapped to the same float.

Python integers are unbounded, but `1e6` is a float, so the integer is converted to a float before the operation is carried out. The result of the sum is a float.
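The conversion can be seen by inspecting the types involved. This is a minimal sketch; the variable names are illustrative and not from the question.

```python
big = 1000000000000000000        # an int: exact, arbitrary precision

float_sum = big + 1e6            # 1e6 is a float literal, so the result is a float
int_sum = big + 10**6            # 10**6 is an int, so the arithmetic stays exact

print(type(float_sum).__name__)  # float
print(type(int_sum).__name__)    # int

# Converting the float result back to int reveals the value actually stored.
print(int(float_sum))            # 1000000000000999936
print(int_sum)                   # 1000000000001000000
```

The float result is already 64 away from the exact integer sum, before any subtraction happens.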

Python uses double-precision floating point, which represents numbers in binary: a binary fraction whose integer part is one, with 52 fractional places, multiplied by a power of two whose exponent is stored in 11 bits (so there are fewer than 2048 possible exponents). I won't go into the details here; you can check them at the linked page if you want.
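These parameters can be checked from within Python. The sketch below assumes a platform where `float` is an IEEE 754 double, which is the case on mainstream CPython builds; the variable names are only for illustration.

```python
import sys

# One implicit leading bit plus 52 stored fractional bits.
mantissa_bits = sys.float_info.mant_dig
print(mantissa_bits)             # 53

# float.hex() shows the binary fraction and the power of two directly.
hex_repr = (1e6).hex()
print(hex_repr)                  # 0x1.e848000000000p+19
```

The hex form reads as "1.e848 (in hexadecimal) times 2 to the 19th", and the 13 hex digits after the point are the 52 fractional bits.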

My point is that integers between 0 and 2⁵³ = 9007199254740992 are represented exactly in double precision, meaning that if you convert the integer to a float and then back, you get exactly the same integer. Numbers between 2⁵³ and 2⁵⁴ are rounded to the nearest even number, meaning that if you convert an odd number to a float and back you get an adjacent even number.
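The round trip described here can be tried directly. This is a small sketch; `survives_round_trip` is a hypothetical helper name, not a standard function.

```python
limit = 2**53  # 9007199254740992

def survives_round_trip(n: int) -> bool:
    """Return True if converting n to float and back gives n again."""
    return int(float(n)) == n

print(survives_round_trip(limit - 1))  # True
print(survives_round_trip(limit))      # True
print(survives_round_trip(limit + 1))  # False: odd, lands on an even neighbour
print(survives_round_trip(limit + 2))  # True
print(int(float(limit + 1)))           # 9007199254740992
```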

Numbers between 2⁵⁴ and 2⁵⁵ are rounded to the nearest multiple of 4, and so on, with the spacing doubling at each power of two. Your number is 1000000000000000000 + 1e6 == 1.000000000001e+18, which lies between 2⁵⁹ = 576460752303423488 and 2⁶⁰, so it is rounded to the nearest multiple of 2⁵⁹⁻⁵² = 128.

    (1000000000000000000 + 1e6 - 65) != (1000000000000000000 + 1e6)

The above expression evaluates to `True`.
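The spacing near this value can also be measured rather than derived. The sketch below uses `math.ulp`, which needs Python 3.9 or later, and assumes IEEE 754 doubles. Since 2⁵⁹ ≤ x < 2⁶⁰, the reported gap is 2⁵⁹⁻⁵² = 128, so a subtraction only changes the result once it passes the halfway point of 64.

```python
import math

x = 1000000000000000000 + 1e6   # the float from the question

# Gap between x and the next larger representable float.
gap = math.ulp(x)
print(gap)                      # 128.0

# Small subtractions round back to x ...
print(x - 1 == x)               # True
print(x - 63 == x)              # True
# ... while going past the halfway point lands on the neighbouring float.
print(x - 65 == x)              # False
print(int(x) - int(x - 65))     # 128
```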

score 0 · Answer 2 · answered Jul 11 '23 at 17:29

I found that this behavior is due to floating-point arithmetic rather than to the numbers being large (see: Similar question). As people mention there, to avoid these issues you can use integers or decimals, depending on what you need to do. Hope this helps.
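As a rough illustration of both suggestions, the sketch below redoes the comparison from the question with plain integers and with the standard-library `decimal` module. The variable names are illustrative, and the last line repeats the `int(1e30)` caveat from the comments on the question.

```python
from decimal import Decimal

# Option 1: keep everything as int by writing 10**6 instead of 1e6.
a = 1000000000000000000 + 10**6
print(a - 1 == a)               # False, as expected

# Option 2: use Decimal, built from ints or strings rather than floats.
b = Decimal(1000000000000000000) + Decimal("1e6")
print(b - 1 == b)               # False, as expected

# int(1e30) goes through a float first, so it differs from 10**30.
print(int(1e30) == 10**30)      # False
```

Note that `Decimal` works with a configurable number of significant digits (28 by default), so it is exact here but not for arbitrarily long values.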

Cristian


score 0 · Answer 3 · answered Jul 11 '23 at 18:16

The int type in Python has an upper limit determined by the platform's memory and the version of Python you are using.

You can use specialized libraries like GMPY2, mpmath, or SymPy.

Cătălin George Feștilă


Python has arbitrary precision integers. Clearly there's *some* limit: for example you can't store an int on the order 2**16,000,000,000 if you have less than 2GB of memory. But that isn't at all relevant for the size of numbers discussed in the question. – slothrop Jul 11 '23 at 19:03
I understand now. – Cătălin George Feștilă Jul 11 '23 at 20:33
