I've been experimenting with sys.getrefcount() and noticed something strange. Why does this script output 108 instead of the expected 2 (a single reference plus the argument to getrefcount())?
import sys
x = 1
print(sys.getrefcount(x)) # 108
The exact numbers will depend on the details of your installation. For me:
Python 3.6.1 (v3.6.1:69c0db5, Mar 21 2017, 18:41:36) [MSC v.1900 64 bit (AMD64)]
on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.getrefcount(1)
101
>>> x = 1
>>> sys.getrefcount(x)
102
>>> sys.getrefcount(1)
102
>>> sys.getrefcount(2)
77
>>> sys.getrefcount(3)
32
>>> x = 256
>>> y = 256
>>> x is y
True
>>> x = 256
>>> y = 257
>>> x is y
False
>>> sys.getrefcount(255)
4
>>> sys.getrefcount(256)
19
>>> sys.getrefcount(257)
3
>>> x = -5
>>> y = -5
>>> x is y
True
>>> x = -6
>>> y = -6
>>> x is y
False
>>> sys.getrefcount(-5)
3
>>> sys.getrefcount(-6)
2
The example is constructed to highlight several points.
It is not x that has a reference count, but 1. Variable names don't have references; they are the references. The things referred to are the values: the chunks of memory under the hood that store the information needed to represent concepts like 1.
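A minimal sketch of that point, using a fresh object() rather than an integer so that nothing else in the interpreter refers to it. The names obj and alias are made up for illustration; the absolute counts can vary between versions, so only the difference is looked at:

```python
import sys

obj = object()                    # a fresh object, so nothing else refers to it
before = sys.getrefcount(obj)
alias = obj                       # a second name bound to the same object
after = sys.getrefcount(obj)

print(alias is obj)               # both names refer to one object
print(after - before)             # how many references the new name added
```

Binding another name does not copy anything; it only adds one more reference to the object that already exists.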
Even if it made sense to talk about reference counts of variables, sys.getrefcount, like any other function, receives a value as its argument, not a variable. Even if you write sys.getrefcount(x), there is no way for the internals of sys.getrefcount to know about x.
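The same thing can be seen with an ordinary function. This is only a sketch; received is a hypothetical helper, not part of sys:

```python
def received(value):
    # Only the object arrives here; the caller's variable name is not passed along.
    return id(value)

x = object()
y = x                             # a second name for the same object
same = received(x) == received(y)
print(same)
```

Calling the function through x or through y hands it the same object, and nothing inside the function can tell which name the caller used.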
Different integer values will have different reference counts. This is because the interpreter sets up a lot of internal state ahead of time, and that state stores numeric values, producing references to the objects representing those values.
The internal logic of the reference C implementation of Python ensures that, for small integers, the same object is used to represent a given value everywhere it's used: when the value 1 results from a computation, the existing 1 object is returned instead of creating a new object representing the same value. The range appears to be -5 to 256 inclusive, which matches my recollection, but do not rely on this behaviour; it is at best not useful for any real purpose and generally a really good way to introduce bugs and confusion into your code. Values inside this range therefore have at least one extra reference, because something has to keep track of those objects in order to reuse them. They may, of course, have many more references; evidently the number 256 is more important than the number 255 for the internal implementation details.
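A rough way to probe the caching without literals getting in the way is to build the values at run time. This is a sketch that assumes CPython; same_object is a hypothetical helper, and the results are an implementation detail that other interpreters or versions may not share:

```python
def same_object(text):
    # Build the value twice at run time and ask whether both results are one object.
    first = int(text)
    second = int(text)
    return first is second

for text in ("-6", "-5", "256", "257"):
    print(text, same_object(text))
```

Parsing from a string avoids the compiler merging identical literals, which can otherwise make x is y come out True for reasons unrelated to the small-integer cache.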
However, even numbers outside that range will report having two or three references. This is, I am pretty sure, something to do with how the garbage collector works (handwaving wildly). I have no idea why positive integers appear to report one more reference than negative ones.
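One contributor that can be checked directly, without claiming it accounts for the exact counts above: in CPython, a literal in compiled source is kept in the code object's co_consts tuple, which holds a reference to it for as long as the code object is alive. A small sketch (the source string and the "<example>" filename are just illustrations):

```python
code = compile("x = 257", "<example>", "exec")

# The literal 257 is kept in the code object's constants tuple.
has_literal = 257 in code.co_consts

print(code.co_consts)
print(has_literal)
```

The call to sys.getrefcount also holds a temporary reference to its argument while it runs, so a freshly written literal is never seen with a count of just one.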