
The original problem I am dealing with is outlined here. I would like to ask an additional question (about Python reference counting) related to that problem.

Let's say that I have the following script:

from bitarray import bitarray
from array import array

list1=[bitarray('00011'), bitarray('00010'), bitarray('11011')]
list2=[array('i',[0,0,0,0]),array('i',[1,1,1,1]),array('i',[2,2,2,2])]

def calculate(l1,l2):
    result1=l1[0]&l1[1]&l1[2]
    result2=l2[0][0]+l2[1][1]+l2[2][2]
    return result1, result2

print(calculate(list1, list2))

Does the reference count of list1, list2, or any of the objects in either list change at some point when I call calculate(list1, list2)?

Just to clarify: I am not asking whether the reference count will be the same before and after calling calculate(list1, list2). I am asking whether the reference count changes at any point during the execution of calculate(list1, list2).

FableBlaze

1 Answer


The reference counts of list1 and list2 themselves don't change; they are just variables, and thus string keys in a locals() namespace.

The Python list objects that these two variables point to are a different matter: their reference counts do change when they are passed to a function. During the call, two new variables (l1 and l2) refer to those lists, increasing the count; when the function returns, those variables are cleaned up and the ref count goes down again.
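To see the effect of the parameter, here is a minimal sketch using sys.getrefcount(). The names data, measure and with_parameter are made up for this illustration. The absolute numbers are CPython-specific and vary between versions, so the sketch only compares counts taken the same way.

```python
import sys

data = [1, 2, 3]

def measure():
    # Always reads the list through the global name, so every
    # measurement carries the same bookkeeping overhead.
    return sys.getrefcount(data)

def with_parameter(lst):
    # While this frame is alive, the parameter `lst` is an extra
    # reference to the same list object.
    return measure()

before = measure()
during = with_parameter(data)
after = measure()

print(during > before)   # the count is higher while the call is running
print(after == before)   # and back to the old value once it has returned
```

On CPython both lines should print True; how much higher the count gets during the call depends on the interpreter version.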

Inside the calculate() function, you are accessing items of these two lists (l1[0], etc.). Item access could use the __getitem__ method of objects; methods are created on the fly when accessed, and hold a reference to the instance and the underlying function. For a list, that is another reference to the list object, and another temporary reference count increase. Once the method has been called and has returned its value, the method is discarded again (nothing is referencing it) and the ref count for the list drops again.
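The extra reference held by a method object can be observed when the method is looked up explicitly. This is only a sketch of that explicit lookup, not of what l1[0] does internally; the names data, measure and method are made up for the illustration.

```python
import sys

data = [1, 2, 3]

def measure():
    # Same access path for every measurement, so the counts are comparable.
    return sys.getrefcount(data)

before = measure()
method = data.__getitem__          # explicit lookup creates a method object
holds_instance = method.__self__ is data
during = measure()
del method                         # nothing references the method any more
after = measure()

print(holds_instance)              # the method object points back at the list
print(during - before)             # one extra reference while it exists
print(after == before)
```

On CPython this should print True, 1 and True: the method object keeps the list alive for as long as the method itself exists.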

As delnan rightly points out in the comments, for list subscription the BINARY_SUBSCR opcode optimizes access (provided the index is an integer) and no method is created in that specific case.
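You can check this with the dis module. The sketch below assumes Python 3 on CPython (dis.get_instructions is not available in Python 2), and first_item is a made-up helper. Opcode names differ between versions, and newer releases may name the subscript opcode differently, so the only thing the code checks is that no instruction looks up the name __getitem__.

```python
import dis

def first_item(l1):
    return l1[0]

instructions = list(dis.get_instructions(first_item))

# Print the opcodes so you can see the load of l1, the load of the
# constant index and the subscript operation itself.
for ins in instructions:
    print(ins.opname, ins.argrepr)

# A bound method would need an attribute lookup of "__getitem__".
looks_up_getitem = any(ins.argval == "__getitem__" for ins in instructions)
print(looks_up_getitem)  # False
```

The load of l1 onto the evaluation stack is itself one of the places where the interpreter may adjust the list's reference count.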

The Python interpreter, when handling bytecode and values on the stack, is increasing and decreasing reference counts all the time, though. Take a look through the Python bytecode evaluation loop and count the number of Py_INCREF and Py_DECREF occurrences to get an idea of how common this is.

Martijn Pieters
  • +1 There may also be (even shorter-lived) references during the calls to `__getitem__` inside `calculate`, or in preparation of that call (`LOAD_FAST 0 (l1); LOAD_CONST 0; BINARY_SUBSCR`). –  Jan 03 '13 at 11:07
  • Do I understand correctly that even the line `result1=l1[0]&l1[1]&l1[2]` will modify the reference counts of related objects? – FableBlaze Jan 03 '13 at 11:10
  • @anti666: Yes, ref counts can go up and down rapidly. – Martijn Pieters Jan 03 '13 at 11:11
  • Nitpick: As special method lookup happens on the class, and indexing has its own opcode, `l1[0]` etc. probably don't create a bound method object. They still put a temporary reference on the stack though. (And as always, this is all CPython-specific.) –  Jan 03 '13 at 11:25
  • @delnan: The `BINARY_SUBSCR` will use `__getitem__` in all cases but a list access with an integer. :-) It's special cased, see [ceval.c](http://hg.python.org/cpython/file/f26c91bf61bf/Python/ceval.c#l1373). – Martijn Pieters Jan 03 '13 at 11:29
  • Even better, but unrelated to my point ;) It never creates a bound method in the general/slow path either (it calls `PyObject_GetItem`, which calls `PySequence_GetItem` for non-mappings, which uses `s->ob_type->tp_as_sequence->sq_item`), but it always puts an additional reference on the stack (before even reaching `BINARY_SUBSCR`), which requires a refcount change. –  Jan 03 '13 at 11:41
  • @delnan: `PyObject_GetItem()` will use the `__getitem__` method if one is there to use, of course. I expanded the answer to include information about the bytecode evaluation loop and its stack. – Martijn Pieters Jan 03 '13 at 11:42
  • Yes, but still not my point. Your third paragraph speaks of **bound methods** for `__getitem__`, but those aren't created by `BINARY_SUBSCR`. `PyObject_GetItem` always directly calls a C-level function pointer stored in the type object, without creating a bound method object for it. It's a valid point for many other cases, and it may be useful to mention that among many other seemingly innocuous things that modify the refcount, but as it stands, it's misleading. –  Jan 03 '13 at 11:44
  • @delnan: you're making me search now... if a custom object has a `__getitem__` method, I believe that `PyObject_GetItem` ultimately will have to create a method object to call that. – Martijn Pieters Jan 03 '13 at 11:47