The documentation about numeric types states that:

Python fully supports mixed arithmetic: when a binary arithmetic operator has operands of different numeric types, the operand with the “narrower” type is widened to that of the other, where integer is narrower than floating point, which is narrower than complex. Comparisons between numbers of mixed type use the same rule.

This is supported by the following behavior:

>>> int.__eq__(1, 1.0)
NotImplemented
>>> float.__eq__(1.0, 1)
True

However, for large integers something else seems to happen: they don't compare equal to their float counterpart unless explicitly converted to float:

>>> n = 3**64
>>> float(n) == n
False
>>> float(n) == float(n)
True

On the other hand, for powers of 2, this doesn't seem to be a problem:

>>> n = 2**512
>>> float(n) == n
True

Since the documentation implies that int is "widened" (I assume converted or cast?) to float, I'd expect float(n) == n and float(n) == float(n) to give the same result, but the example above with n = 3**64 suggests otherwise. So what rules does Python use to compare int to float (or mixed numeric types in general)?


Tested with CPython 3.7.3 from Anaconda and PyPy 7.3.0 (Python 3.6.9).

a_guest
  • 4
    You're probably just hitting float precision limits and `float(3**64)` is *not exactly equal* to `int(3**64)`. – deceze Jan 31 '20 at 14:35
  • 2
    `float(2**512)`, on the other hand, *can* be represented precisely, despite being much larger, *because* it is a power of 2. The mantissa needs only 1 bit for full precision, and the exponent only needs 9. – chepner Jan 31 '20 at 14:38
  • 3
    I understand all of this behavior but the documentation states that the "narrower" type, in this case `int`, would be widened to the other, in this case `float`. So there shouldn't be any issue with precision loss due to round tripping; I'd expect `float(n) == n` to be internally handled such that the r.h.s. is converted to `float`, i.e. to be similar to `float(n) == float(n)`. – a_guest Jan 31 '20 at 14:42
  • 3
    If I recall correctly, Python, or at least some implementations of it, goes to pains to compare values exactly. When `3**64` is compared to `float(n)`, it is not converted to `float`. Rather, the exact mathematical value of `3**64` is compared to the exact mathematical value of `float(n)`. Since they differ, the result is `False`. When you convert `3**64` to `float`, it converts to a value representable in the `float` type, introducing some error. The wording in the documentation is unfortunate in saying that `float` is “wider” than `int`. Generally, it cannot represent all `int` values… – Eric Postpischil Jan 31 '20 at 14:44
  • 4
    https://github.com/python/cpython/blob/master/Objects/floatobject.c#L382 appears to be responsible for float-and-int comparison. It's certainly doing more than just converting the int to a float and comparing, but my C-reading abilities are not strong enough to know what exactly it is doing. – Kevin Jan 31 '20 at 14:44
  • 1
    … and so it is not clear whether an implementation that, in effect, converts to a truly wider format that can represent all `int` values and all `float` values therefore conforms to this documentation. – Eric Postpischil Jan 31 '20 at 14:45
  • "I understand all of this behavior but the the documentation states that the "narrower" type, in this case int, would be widened to the other, in this case float." Python has infinite-precision integers, so int absolutely is not narrower than float. – Masklinn Jan 31 '20 at 14:55
  • @Masklinn According to the quoted documentation snippet, it is… – deceze Jan 31 '20 at 14:57
  • @Masklinn The documentation also states that *"... where integer is narrower than floating point, which is narrower than complex."*. One might disagree that this is the case on a more general level but within their definition of "wide" and "narrow" this is (or at least should be) consistent. – a_guest Jan 31 '20 at 14:57
  • @Kevin If my analysis is correct it should follow [this branch](https://github.com/python/cpython/blob/58a4054760bffbb20aff90290dd0f3554f7bea42/Objects/floatobject.c#L431) but from there on it's not completely clear what happens. I'll need more time to analyze this in more detail. – a_guest Jan 31 '20 at 14:58
  • "@Masklinn According to the quoted documentation snippet, it is…" I guess the wording is misleading, though it leads to the funny result that `factorial(512) + 1.0` complains that int is too large. – Masklinn Jan 31 '20 at 15:00
  • @EricPostpischil When you say "exact mathematical value" do you mean it compares the corresponding base 10 value and tries to extract that from the `float` object? – a_guest Jan 31 '20 at 15:02
  • @Masklinn Since for Python 3 the only limit on `int` is the memory in your computer while `float` is bound to 64 bits, I think the meaning of the terms "wide" and "narrow" shouldn't be taken too literally, but rather should be considered within the documentation's own definition of them, which is "integer is narrower than floating point, which is narrower than complex". So following that definition `int` should really be converted to `float`, or at least the results should be the same as if it were, but the actual behavior seems to be different. – a_guest Jan 31 '20 at 15:06
  • @a_guest: Base-10 numerals are irrelevant. The fact that twelve is “12” in decimal and “14” in octal is just a matter of representation; decimal 12 is equal to octal 14. The result of `3**64` is 3433683820292512484657849089281. The result of `float(3**64)` is 3433683820292512441173561835520. They are not equal, so `3**64 == float(3**64)` is false. – Eric Postpischil Jan 31 '20 at 15:07
  • @EricPostpischil So you mean that in those cases the behavior is similar as if the `float` was converted to `int` (rather than the other way round)? Looking at the [source code](https://github.com/python/cpython/blob/58a4054760bffbb20aff90290dd0f3554f7bea42/Objects/floatobject.c#L412) a `float` conversion seems to happen if the `int` uses less than or equal to 48 bits. Indeed checking for `n = 3**33` and `n = 3**34` it seems to have to do with round tripping `int(float(n)) == n`. – a_guest Jan 31 '20 at 15:17
  • @a_guest: Converting an integer to floating-point can lose information because the floating-point type does not have enough bits to represent the integer, as you saw with `float(3**64)`. Converting from floating-point to integer can lose information because the integer does not represent fractional parts. The comparison code, linked to from another comment above, effectively compares **the exact mathematical values**, as far as I know. It does not need to do a complete conversion to do this. If `n` is an integer and `f` is a `float` that is not a NaN, then they each represent a specific number,… – Eric Postpischil Jan 31 '20 at 15:31
  • … and `n == f` evaluates to `True` if and only if they represent the same number. – Eric Postpischil Jan 31 '20 at 15:32
  • @EricPostpischil So if I understand correctly, the "exact mathematical value" of `3**34` (which is `16677181699666569`) is just `16677181699666569.0` even though that can't be represented in `float` exactly, since `float(3**34)` is `16677181699666568.0` which is also its exact mathematical value. The conversion `int(float(3**34))` consequently yields `16677181699666568`, i.e. `3**34 - 1`. However by doing bitwise analysis of the involved numbers one can access these exact values without the need to convert to either data type. Is this what happens? – a_guest Jan 31 '20 at 15:53
  • 1
    @a_guest: Yes. The analysis can be bitwise or mixed things. E.g., check whether the floating-point number has a fraction, by examining its bits. If so, it does not equal the integer. Check whether it is out of integer bounds. If so, it does not equal the integer. Otherwise, convert it to an integer exactly, then compare. – Eric Postpischil Jan 31 '20 at 16:04
  • 1
    @a_guest: The exact mathematical value of `3**34` is 16677181699666569. `16677181699666569` and `16677181699666569.0` are numerals which also represent that number. (Numerals are strings of characters that represent numbers. They are not actually numbers, any more than the word “fox” is a fox.) – Eric Postpischil Jan 31 '20 at 16:04
  • 1
    slightly related for those interested in how? more than what? https://stackoverflow.com/questions/58734034/how-to-properly-compare-an-integer-and-a-floating-point-value – aka.nice Jan 31 '20 at 16:43
  • @Kevin I am not sure if this is how Python actually checks numeric equality, but: `float` objects in Python have a built-in `.as_integer_ratio` method, and so do `int` objects. That is, Python "understands" that both floats and ints represent rational numbers--and it is pretty straightforward to check equality of rationals, assuming that you can do exact integer arithmetic. (The integer ratio `a/b` should be equal to the integer ratio `c/d`, if and only if the integers `a*d` and `b*c` are equal.) Python can of course perform exact integer multiplication on arbitrary-size integers. – mathmandan Aug 24 '23 at 21:57

1 Answer

The language specification on value comparisons contains the following paragraph:

Numbers of built-in numeric types (Numeric Types — int, float, complex) and of the standard library types fractions.Fraction and decimal.Decimal can be compared within and across their types, with the restriction that complex numbers do not support order comparison. Within the limits of the types involved, they compare mathematically (algorithmically) correct without loss of precision.

This means that when two numbers of different types are compared, the actual (mathematical) numbers represented by these objects are compared. For example, the numeral 16677181699666569.0 (which is 3**34) represents the number 16677181699666569, and even though in "float-space" there is no difference between this number and 16677181699666568.0 (3**34 - 1), they do represent different numbers. Because a float is, on typical platforms, an IEEE 754 double with a 53-bit significand, while 3**34 needs 54 bits, float(3**34) is rounded and stored as 16677181699666568; it therefore represents a different number than the integer 16677181699666569. That is why float(3**34) != 3**34: the comparison is performed without loss of precision.
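
To make "comparing the represented numbers" concrete, here is a minimal sketch of one way an exact int/float equality check could work, following the steps outlined in the comments (reject non-finite values, reject floats with a fractional part, then compare as integers). `exact_eq` is a hypothetical helper written for illustration; it is not CPython's actual implementation, which is the C code linked in the comments and works differently.

```python
import math


def exact_eq(n: int, f: float) -> bool:
    """Hypothetical helper: compare an int and a float by exact value."""
    # NaN and the infinities are not equal to any integer.
    if not math.isfinite(f):
        return False
    # A float with a fractional part cannot equal an integer.
    if not f.is_integer():
        return False
    # int(f) is exact for an integral float, so this is an
    # arbitrary-precision integer comparison with no rounding.
    return int(f) == n


n = 3**34
print(exact_eq(n, float(n)))      # same answer as float(n) == n
print(exact_eq(n - 1, float(n)))  # float(n) stores the value n - 1
print(int(float(n)) - n)          # rounding error introduced by float(n)
```

The point of the sketch is that no int is ever rounded to float: the only conversion is float to int, which is exact once the float is known to be integral.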

This property is important because it keeps equality transitive across numeric types. If an int-to-float comparison behaved as if the int object were first converted to a float object, transitivity would break:

>>> class Float(float):
...     def __eq__(self, other):
...         return super().__eq__(float(other))
... 
>>> a = 3**34 - 1
>>> b = Float(3**34)
>>> c = 3**34
>>> a == b
True
>>> b == c
True
>>> a == c  # transitivity demands that this holds true
False

The built-in float.__eq__, on the other hand, compares the represented mathematical numbers and doesn't violate that requirement:

>>> a = 3**34 - 1
>>> b = float(3**34)
>>> c = 3**34
>>> a == b
True
>>> b == c
False
>>> a == c
False
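
The two sessions above can also be checked mechanically. The following is a small sketch, not part of any library: `NaiveFloat` and `eq_is_transitive` are hypothetical names introduced here. It tests every ordered triple of a list for the rule "x == y and y == z implies x == z".

```python
from itertools import permutations


class NaiveFloat(float):
    """Hypothetical float that converts the other operand before comparing."""

    def __eq__(self, other):
        return super().__eq__(float(other))


def eq_is_transitive(values):
    """Return True if x == y and y == z imply x == z for all triples."""
    return all(
        x == z
        for x, y, z in permutations(values, 3)
        if x == y and y == z
    )


n = 3**34
print(eq_is_transitive([n - 1, float(n), n]))       # built-in comparison
print(eq_is_transitive([n - 1, NaiveFloat(n), n]))  # convert-then-compare
```

With the built-in float no triple violates the rule, whereas the conversion-based subclass fails on the triple (n - 1, NaiveFloat(n), n).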

Missing transitivity also affects sorting. With a conversion-based Float subclass, sorting leaves the following list unchanged, since every pair of adjacent elements appears to be equal:

>>> class Float(float):
...     def __lt__(self, other):
...         return super().__lt__(float(other))
...     def __eq__(self, other):
...         return super().__eq__(float(other))
... 
>>> numbers = [3**34, Float(3**34), 3**34 - 1]
>>> sorted(numbers) == numbers
True

With the built-in float, on the other hand, sorting reverses the list (up to the relative order of 3**34 - 1 and float(3**34), which compare equal):

>>> numbers = [3**34, float(3**34), 3**34 - 1]
>>> sorted(numbers) == numbers[::-1]
True
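
The specification paragraph quoted at the top also names fractions.Fraction and decimal.Decimal. As a rough illustration (the variable names are just for this example), both types convert a float's stored value exactly and compare against int and float without rounding:

```python
from decimal import Decimal
from fractions import Fraction

n = 3**34
f = float(n)  # stores the value n - 1

# Fraction(f) and Decimal(f) take over the float's stored value exactly.
print(Fraction(f) == n - 1, Fraction(f) == n)
print(Decimal(f) == n - 1, Decimal(f) == n)

# Cross-type comparisons against the float itself are exact as well.
print(Fraction(n) == f, Fraction(n - 1) == f)
print(Decimal(n) == f, Decimal(n - 1) == f)

# The float nearest to 0.1 is not the rational number 1/10.
print(Fraction(1, 10) == 0.1, Decimal("0.1") == 0.1)
```

Each of the first four lines prints one True and one False, in the position that matches the stored value n - 1; the last line prints False twice.
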
a_guest