id() of numpy.float64 objects are the same, even if their values differ?

Question

I get the following:

import numpy
print id(numpy.float64(100)) ==  id(numpy.float64(10))
print numpy.float64(100) == numpy.float64(10)

gives:

True
False

Note that if I create the two float64 objects and then compare them then it appears to work as expected:

a = numpy.float64(10)
b = numpy.float64(100)
print a==b, id(a)==id(b)

gives:

False False

Based on https://docs.python.org/2/reference/datamodel.html, shouldn't the ids of two objects always differ if their values differ? How can the ids match when the values are different?

Is this some sort of bug in numpy?

This is fascinating. Same deal in Python 3: typing `id(numpy.float64(*))` into the interpreter, where `*` is a bunch of random numbers always prints the same thing. — Mad Physicist, Jan 06 '16 at 18:09

Alex Riley · Accepted Answer · 2016-01-06T18:18:23.453

This looks like a quirk of memory reuse rather than a NumPy bug.

The line

id(numpy.float64(100)) == id(numpy.float64(10))

first creates a float numpy.float64(100) and then calls the id function on it. The object's memory is then freed immediately because no references to it remain; in CPython this is done by reference counting, not by the cyclic garbage collector in the gc module. The memory slot is free to be reused by any new object that is created.

When numpy.float64(10) is created, it can end up in the same memory location. Since CPython's id returns the object's memory address, the two ids then compare equal.
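
The order of events can be made visible with a small stand-in class. This is only a sketch: Tracked is a hypothetical pure-Python class (not numpy.float64), the logged order relies on CPython's reference counting, and whether the two ids actually coincide depends on the allocator, so that result is printed rather than assumed.

```python
events = []

class Tracked:
    """Hypothetical stand-in for numpy.float64 that logs its own lifetime."""

    def __init__(self, value):
        self.value = value
        events.append("created %r" % value)

    def __del__(self):
        events.append("destroyed %r" % self.value)

# Same shape as the expression in the question: each temporary object
# loses its last reference as soon as id() has returned.
same_id = id(Tracked(100)) == id(Tracked(10))

print(events)
print("ids equal:", same_id)  # often True on CPython, but not guaranteed
```

On CPython the log should show the first object being destroyed before the second one is created, which is what leaves its memory slot available for reuse.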

This chain of events is perhaps clearer when you look at the bytecode:

>>> import dis
>>> dis.dis('id(numpy.float64(100)) ==  id(numpy.float64(10))')
  0 LOAD_NAME                0 (id)
  3 LOAD_NAME                1 (numpy)
  6 LOAD_ATTR                2 (float64)
  9 LOAD_CONST               0 (100)
 12 CALL_FUNCTION            1 (1 positional, 0 keyword pair) # call numpy.float64(100)
 15 CALL_FUNCTION            1 (1 positional, 0 keyword pair) # get id of object

 # reference count of numpy.float64(100) drops to zero and its memory is freed

 18 LOAD_NAME                0 (id)                   
 21 LOAD_NAME                1 (numpy)
 24 LOAD_ATTR                2 (float64)
 27 LOAD_CONST               1 (10)
 30 CALL_FUNCTION            1 (1 positional, 0 keyword pair) # call numpy.float64(10)
 33 CALL_FUNCTION            1 (1 positional, 0 keyword pair) # get id of object

 36 COMPARE_OP               2 (==) # compare the two ids 
 39 RETURN_VALUE

Good point, I agree that this is most likely the reason for this behaviour. Thinking a bit further, I tried to validate this by turning off gc with "gc.disable()" but this still gave the same behaviour, and even "gc.set_debug(gc.DEBUG_SAVEALL)" wasn't enough to keep the first float64 from being collected. Is there some other garbage collector that could be turned off? — Daniel Abel, Jan 06 '16 at 18:13
Some other garbage collector: yes. That can be turned off: no. The gc module controls the cyclic garbage collector that runs on top of the regular reference-counting based gc; there's no way to turn off the basic reference-counting based gc. — Mark Dickinson, Jan 06 '16 at 18:21
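
The point made in these comments can be checked with a weak reference. A minimal sketch, assuming CPython; Box is a hypothetical placeholder class, used here only because a weak reference needs an object type that supports one.

```python
import gc
import weakref

class Box:
    """Hypothetical placeholder object that supports weak references."""

was_enabled = gc.isenabled()
gc.disable()  # switches off only the cyclic collector
try:
    obj = Box()
    ref = weakref.ref(obj)
    alive_before = ref() is not None
    del obj  # drop the last strong reference
    alive_after = ref() is not None
finally:
    if was_enabled:
        gc.enable()

print(alive_before, alive_after)
```

With the cyclic collector disabled, the object is still reclaimed as soon as its last reference disappears, so the weak reference goes dead immediately after the del statement.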

score 4 · Answer 2 · answered Jan 06 '16 at 18:10

Imagine that you put a book on a shelf, and then someone notices that you're not using it and takes the book off the shelf to free some space. You then move to put a different book on the shelf at a convenient location.

If you suddenly realize that you had two different books at the very same location, do you think there's a bug in reality? ;-)

After you call id on numpy.float64(100), you have no way to refer to that float object you made, and so the interpreter is perfectly free to reuse that id.

rafaelc · Answer 3 · 2016-01-06T18:17:41.180

According to the docs, the id() function is described like this:

id(object)

Return the “identity” of an object. This is an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.

So this can happen, but it is not guaranteed to. Because the two objects do not exist at the same time, they can end up with the same id.

However, in this case

a = numpy.float64(10)
b = numpy.float64(100)
print a==b, id(a)==id(b)

They do exist simultaneously, so they cannot have the same id.
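
The same guarantee can be used to get distinct ids for the one-liner from the question: keep both objects referenced while their ids are compared. A sketch using built-in floats instead of numpy.float64 (an assumption made so that it runs without NumPy installed); the reasoning about lifetimes is the same.

```python
# Hold references to both objects so that their lifetimes overlap.
a = float(10)
b = float(100)
print(a == b, id(a) == id(b))  # False False

# The same effect in a single expression: the list keeps both objects alive.
pair = [float(100), float(10)]
print(id(pair[0]) == id(pair[1]))  # False
```

Because both objects are alive when id is called, the documentation's uniqueness guarantee applies and the ids must differ.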
