12

I came across a confusing problem when unit testing a module. The module casts values, and I want to compare these values.

There is a difference between comparing with == and with is (I'm partly aware of the difference):

>>> 0.0 is 0.0
True   # as expected
>>> float(0.0) is 0.0
True   # as expected

As expected till now, but here is my "problem":

>>> float(0) is 0.0
False
>>> float(0) is float(0)
False

Why? At least the last one is really confusing to me. The internal representation of float(0) and float(0.0) should be equal. Comparison with == is working as expected.

Dimitris Fasarakis Hilliard
Günther Jena
  • 2
    Related: http://stackoverflow.com/questions/132988/is-there-a-difference-between-and-is-in-python – Elazar Aug 08 '16 at 17:21
  • 2
    Your question deserves an answer, but if you came across this problem in real code, the code is probably erroneous and should be fixed. There is (almost) no reason to test reference identity between floats in such a way. – Elazar Aug 08 '16 at 17:24
  • 1
    The strange thing is, that although I can reproduce this, all of `id(0.0)`, `id(float(0.0))` and `id(float(0))` return the same value. ... That is, the value is the same if I execute those one after the other in the interactive shell, but if I do `id(float(0.0)), id(float(0))` (as a tuple) then the ids differ. Any explanation? – tobias_k Aug 08 '16 at 17:25
  • Are you sure you understand `is`? Why did you expect `0.0 is 0.0` to be true, as that is actually *surprising* to some. – Martijn Pieters Aug 08 '16 at 17:26
  • @MartijnPieters, @Elazar: I have to admit I was not aware that `is` compares ids. So yes, `float(1.0) is 1.0` is actually surprising ;-) – Günther Jena Aug 08 '16 at 17:30
  • 1
    @Elazar: Thx for adding cpython tag as it seems very cpython related and nothing you should depend on (as you already mentioned) – Günther Jena Aug 08 '16 at 17:39
  • Can I cast an automatic reopen vote? This question is *absolutely* not a duplicate of the related question, and it is definitely not given any answer there. – Elazar Aug 08 '16 at 19:01
  • 3
    @tobias_k: two reasons: immutable literals in code are stored as constants with the code object (so `0.0 is 0.0` produces just one object that is reused), Python reuses memory (so `id(0.0)` followed by another `id(someobject)` could easily produce the same id, since the previous one has been garbage collected) and producing a tuple can't re-use memory locations since you still need *all* objects to be part of that tuple. – Martijn Pieters Aug 08 '16 at 19:26

2 Answers

25

This has to do with how is works. It compares references rather than values: it returns True only if both operands refer to the same object.

In this case, they are different instances: float(0) and float(0) have the same value (they compare equal with ==) but are distinct objects as far as Python is concerned. Separately, the CPython implementation caches small integers as singleton objects in the range [x | x ∈ ℤ ∧ -5 ≤ x ≤ 256]; floats are not cached this way. For the floats in question:

>>> 0.0 is 0.0
True
>>> float(0) is float(0)  # Not the same reference, unique instances.
False
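A minimal sketch of the value/identity distinction, assuming CPython; the variable names are only illustrative. Both floats are bound to names so they stay alive, which makes comparing their ids meaningful:

```python
# Two floats built at runtime from an int; the names keep both objects alive.
a = float(0)
b = float(0)

values_equal = a == b          # compares values
same_object = a is b           # compares identity
ids_equal = id(a) == id(b)     # what `is` amounts to for two live objects

print(values_equal)            # True
print(same_object, ids_equal)  # always agree with each other; False on CPython
```

For comparing cast results in a unit test, == is the comparison that matches the intent.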

In this example we can demonstrate the integer caching principle:

>>> a = 256
>>> b = 256
>>> a is b
True
>>> a = 257
>>> b = 257
>>> a is b
False
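The transcript above depends on each line being compiled separately at the interactive prompt. As a hedged sketch that should behave the same in a script on CPython, the integers can be built at runtime so the compiler cannot merge them as constants (the variable names are illustrative):

```python
# Build the ints at runtime so they are not merged as code-object constants.
small_a, small_b = int("256"), int("256")
large_a, large_b = int("257"), int("257")

small_shared = small_a is small_b  # expected True on CPython (cached small int)
large_shared = large_a is large_b  # expected False on CPython (two fresh objects)

print(small_shared, large_shared)
print(small_a == small_b, large_a == large_b)  # equality holds either way
```

The cache is a CPython implementation detail, so the identity results here are not something to rely on.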

Now, if a float is passed to float(), that same object is simply returned (short-circuited), since there's no need to instantiate a new float from an existing float:

>>> 0.0 is 0.0
True
>>> float(0.0) is float(0.0)
True

This can be demonstrated further by using int() also:

>>> int(256.0) is int(256.0)  # Same reference, cached.
True
>>> int(257.0) is int(257.0)  # Different references are returned, not cached.
False
>>> 257 is 257  # Same reference.
True
>>> 257.0 is 257.0  # Same reference. As @Martijn Pieters pointed out.
True

However, the result of is also depends on the scope in which it is executed (which is beyond the scope of this question/explanation); please refer to @Jim's fantastic explanation of code objects. Even Python's documentation includes a note on this behavior:

[7] Due to automatic garbage-collection, free lists, and the dynamic nature of descriptors, you may notice seemingly unusual behaviour in certain uses of the is operator, like those involving comparisons between instance methods, or constants. Check their documentation for more info.

Community
ospahiu
  • `float(17.0) is float(17.0)` returns `True` but `float(17.0) is float('17.0')` returns `False` -> is `float( some_value )` just returning the original float so no new instance is made (in case some_value is a float...)? – Günther Jena Aug 08 '16 at 17:21
  • @Jim answered this – Günther Jena Aug 08 '16 at 17:22
  • 1
    CPython doesn't cache floats. You're seeing a combination of constant folding and the `float` constructor passing floats through directly for the cases where `is` gives `True`. – user2357112 Aug 08 '16 at 17:24
  • 3
    *CPython implementation also caches integers/floats as singleton objects in this range -> [x >= -5 | x <= 256 ]*. No, only `int` objects are interned. What is happening here is that the *literals* are stored as constants with the code object, and there is no point in storing two `0.0` values in the same code object constants array. – Martijn Pieters Aug 08 '16 at 17:24
  • 1
    Try out `compile('0.0 is 0.0', '', 'exec').co_consts`; you'll find *one* `0.0` object, and `None`. – Martijn Pieters Aug 08 '16 at 17:27
  • @user2357112 Yes, you are correct, updated my answer accordingly. – ospahiu Aug 08 '16 at 17:28
11

If a float object is supplied to float(), CPython* just returns it without making a new object.

This can be seen in PyNumber_Float (which is eventually called from float_new) where the object o passed in is checked with PyFloat_CheckExact; if True, it just increases its reference count and returns it:

if (PyFloat_CheckExact(o)) {
    Py_INCREF(o);
    return o;
}

As a result, the id of the object stays the same. So the expression

>>> float(0.0) is float(0.0) 

reduces to:

>>> 0.0 is 0.0

But why does that equal True? Well, CPython has some small optimizations.

In this case, it uses the same object for the two occurrences of 0.0 in your command because they are part of the same code object (short disclaimer: they're on the same logical line); so the is test will succeed.

This can be further corroborated if you execute float(0.0) on separate lines at the interactive prompt, where each line is compiled as its own code object, and then check for identity:

>>> a = float(0.0)
>>> b = float(0.0)  # compiled separately from the line above
>>> a is b
False
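The constant sharing can be inspected directly with compile(), along the lines of the suggestion in the comments. This is a sketch assuming CPython; the source strings and the "<demo>" filename are made up for illustration:

```python
# One source string compiles to one code object with one constants tuple.
code = compile("x = 0.0\ny = 0.0\n", "<demo>", "exec")
float_consts = [c for c in code.co_consts if isinstance(c, float)]
print(len(float_consts))  # 1 on CPython: both literals share one constant

namespace = {}
exec(code, namespace)
shared = namespace["x"] is namespace["y"]

# Compiling the assignments separately, as the interactive prompt does for
# separate lines, gives each code object its own constants.
ns_a, ns_b = {}, {}
exec(compile("a = 0.0\n", "<demo>", "exec"), ns_a)
exec(compile("b = 0.0\n", "<demo>", "exec"), ns_b)
separate = ns_a["a"] is ns_b["b"]

print(shared, separate)  # shared is True on CPython; separate is typically False
```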

On the other hand, if an int (or a str) is supplied, CPython will create a new float object from it and return that. For this, it uses PyFloat_FromDouble and PyFloat_FromString respectively.

The effect is that the returned objects have different ids (which is what is compares):

# Python uses the same object representing 0 to the calls to float
# but float returns new float objects when supplied with ints
# Thereby, the result will be False
float(0) is float(0) 
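Both cases can be put side by side in a short sketch, again assuming CPython. MyFloat is a hypothetical subclass, added only to show that the pass-through applies to exact float instances:

```python
class MyFloat(float):
    """Hypothetical subclass used only for this illustration."""

x = 0.0

passed_through = float(x) is x   # exact float: the same object comes back
from_int = float(0)              # new float built from an int
from_str = float("0.0")          # new float built from a str
from_sub = float(MyFloat(0.0))   # subclass instance: converted to a plain float

print(passed_through)                # True on CPython
print(from_int is x, from_str is x)  # new objects, so both False on CPython
print(from_int == x == from_str)     # True: the values are equal
print(type(from_sub) is float)       # True
```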

*Note: All of the behavior described above applies to the C implementation of Python, i.e. CPython. Other implementations might behave differently. In short, don't depend on it.

Community
Dimitris Fasarakis Hilliard