I was trying to process some rather large numbers in Python and came across an overflow error. I decided to investigate a little more and came across an inequality I cannot explain. When I evaluate 10^26 I get:

>>> 10**26
100000000000000000000000000

Which is perfectly logical. However when I evaluate 10e26 and convert it to an int I get:

>>> int(10e26)
1000000000000000013287555072

Why is this? Do I not understand the e notation properly? (From what I know 10e26 is 10*10^26 as seen in this answer: 10e notation used with variables?)

10^26 is way past the max integer size, so I was also wondering whether Python has any mechanism that would let me work with numbers in scientific notation (without writing out all those zeros) so that I can compute with numbers past the max size.

A.D
  • There is no max integer size in Python; `int` can represent an arbitrarily large integer, subject only to the available memory to store the value. `10**26` (since both operands are `int` literals) evaluates to an `int` as well. `10e26`, on the other hand, is a `float` literal, and `float` is limited in which integer values it can precisely represent. – chepner Jan 24 '19 at 18:33
  • I'm not clear on why this happens despite being familiar with that dupe target. Why is the syntax handling this differently? This could be done internally with only integers? – roganjosh Jan 24 '19 at 18:33
  • Nm, it's answered in the question the OP linked. `10e26` is a float literal. Knowing that is sufficient to understand the dupe. – roganjosh Jan 24 '19 at 18:35
  • The normal floating point inaccuracy would make sense, but wouldn't the interpreter be a little smarter in its conversion to int? Or does that simply mean that 10.0 in float is represented by 10.00000000000000013287555072? – A.D Jan 24 '19 at 18:38
  • @A.D. 10.0 can be represented precisely as a floating-point value, because there is enough precision in the mantissa to store a small integer exactly. `10e26`, however, is too large to be stored exactly in the mantissa, and so has to be stored as a smaller approximate mantissa with an appropriate exponent. – chepner Jan 24 '19 at 18:42
  • Folks, please do the math before marking something as a duplicate. `10e26` is not equal to `10**26` because the former represents 10 times 10^26, which is 10^27, and the latter represents 10^26. This difference has nothing to do with floating-point. – Eric Postpischil Jan 24 '19 at 19:00
  • Furthermore, if somebody had asked about `1e26 == 10^26`, knowing about the floating-point format is insufficient; one also has to know how Python evaluates expressions (such as which arithmetics are used for `1e26` and for `10^26` and what format or algorithm is used for the comparison), so a general floating-point question is inadequate. [The question marked as an original](https://stackoverflow.com/questions/588004/is-floating-point-math-broken) does not address these adequately. – Eric Postpischil Jan 24 '19 at 19:00

2 Answers

The short answer is that 10e26 and 10**26 do not represent identical values.

10**26, with both operands being int values, evaluates to an int. As int represents integers with arbitrary precision, its value is exactly 10^26 as intended.

10e26, on the other hand, is a float literal, and as such the resulting value is subject to the limited precision of the float type on your machine. The result of int(10e26) is the integer value of the float closest to the real number 10^27.
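
To see the two types side by side, here is a minimal sketch. It assumes CPython with IEEE 754 double-precision floats (the usual case); the variable names are only illustrative.

```python
# Assumes IEEE 754 double-precision floats, which is what CPython
# uses on common platforms.
power = 10**26      # int ** int stays an int, so the value is exact
literal = 10e26     # e notation always produces a float

print(type(power).__name__, power)
print(type(literal).__name__, int(literal))

# The float literal is the double nearest to 10^27, not 10^27 itself,
# so converting it to int exposes the rounding error.
error = int(literal) - 10**27
print(error)
```

The last line prints how far the stored float is from 10^27, which is the "extra" 13287555072 seen in the question.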

chepner
  • Ah alright, that makes more sense and 10**27.0 does evaluate to 10e26. I'll accept that ! – A.D Jan 24 '19 at 18:41

10e26 represents ten times ten to the power of 26, which is 10^27.

10**26 represents ten to the power of 26, 10^26.

Obviously, these are different, so 10e26 == 10**26 is false.
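
A quick way to check the reading of the e notation is to compare the literals directly. This is only a sketch; the use of `decimal.Decimal` to get an exact ratio is my own choice for illustration, not something the question relies on.

```python
from decimal import Decimal

# 10e26 means 10 * 10^26, so it is the same float as 1e27.
same_float = 10e26 == 1e27
# Compared with the exact integer 10^26 it is off by a factor of ten.
differs = 10e26 == 10**26

# Decimal parses the same notation without binary rounding, which makes
# the factor of ten visible exactly.
exact_ratio = Decimal("10e26") / Decimal(10**26)

print(same_float, differs, exact_ratio == 10)
```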

However, if we correct the mistake so we compare 1e26 and 10**26 by evaluating 1e26 == 10**26, we get false for a different reason:

  • 1e26 is evaluated in a limited-precision floating-point format, producing 100000000000000004764729344 in most implementations. (Python is not strict about the floating-point format.) 100000000000000004764729344 is the closest one can get to 10^26 using 53 significant bits.
  • 10**26 is evaluated with integer arithmetic, producing 100000000000000000000000000.
  • Comparing them reports they are different.

(I am uncertain of Python semantics, but I presume it converts the floating-point value to an extended-precision integer for the comparison. If we instead convert the integer to floating-point, with float(10**26) == 1e26, the conversion of 100000000000000000000000000 to float produces the same value, 100000000000000004764729344, and the comparison returns true.)
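
The two comparisons described above can be reproduced with a short sketch, again assuming IEEE 754 doubles; the variable names are illustrative.

```python
as_float = 1e26      # nearest double to 10^26
as_int = 10**26      # exact integer

# Mixed int/float comparison: the values are compared exactly,
# so the rounding error in the float makes them unequal.
mixed_equal = as_float == as_int

# Converting the int to float first rounds it to the same double.
float_equal = float(as_int) == as_float

print(mixed_equal, float_equal)
print(int(as_float))            # exact value stored in the float
print(int(as_float) - as_int)   # rounding error of the literal
```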

Eric Postpischil
  • Note that it is not necessary to convert the floating point to large integer: it might be faster to check if the int can be converted to float exactly. First check that highBit does not exceed maximum float exponent (case of overflow), and if not, then compute how many bits span the integer significand (highBitRank - lowBitRank + 1) and check if it exceeds float precision (24 or 53 bits for standard IEEE). See method isAnExactFloat in http://source.squeak.org/trunk/Kernel-nice.853.diff. I did not check Python implementation though, but could be same spirit. – aka.nice Jan 25 '19 at 08:30
  • On Python semantics: the actual details are messy, but yes, Python jumps through hoops to compare the exact values correctly. The current code (for CPython, at least), starts here: https://github.com/python/cpython/blob/62c35a8a8ff5854ed470b1c16a7a14f3bb80368c/Objects/floatobject.c#L332 – Mark Dickinson Jan 25 '19 at 08:30
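
The exactness check described in the first comment above can be sketched in Python as follows. This is not CPython's comparison code; `is_exact_float` is a hypothetical helper name, and the limits are read from `sys.float_info` rather than hard-coded.

```python
import sys


def is_exact_float(n: int) -> bool:
    """Return True if the int n converts to float without rounding."""
    if n == 0:
        return True
    m = abs(n)
    high = m.bit_length()          # 1-based position of the highest set bit
    low = (m & -m).bit_length()    # 1-based position of the lowest set bit
    if high > sys.float_info.max_exp:
        return False               # magnitude too large for a finite float
    # The bits between the highest and lowest set bit must fit in the
    # significand (53 bits for IEEE 754 doubles).
    return high - low + 1 <= sys.float_info.mant_dig


print(is_exact_float(10**26))      # 10^26 itself
print(is_exact_float(int(1e26)))   # the value the float actually stores
```

Under this check 10^26 needs more significand bits than a double has, while the integer obtained from `int(1e26)` fits, which matches the behavior discussed in the answer.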