I'm just trying to understand how to deal with the reference counts when using the Python C API.

I want to call a Python function in C++, like this:

PyObject* script;
PyObject* scriptRun;
PyObject* scriptResult;

// import module
script = PyImport_ImportModule("pythonScript");
// get function objects
scriptRun = PyObject_GetAttrString(script, "run");
// call function without/empty arguments
scriptResult = PyObject_CallFunctionObjArgs(scriptRun, NULL);

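if (scriptResult == NULL) {
    cout << "scriptResult  = null" << endl;
    PyErr_Print();  // report the pending Python exception, if any
} else {
    cout << "scriptResult  != null" << endl;
    // only read the count when the call succeeded; Py_REFCNT reads ob_refcnt
    cout << "print reference count: " << Py_REFCNT(scriptResult) << endl;
}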

The Python code in pythonScript.py is very simple:

def run():
    return 1

The documentation of "PyObject_CallFunctionObjArgs" says that the return value is a new reference, so I would expect "scriptResult" to have a reference count of 1. However, the output is:

scriptResult  != null
print reference count: 72

Furthermore, I would expect a memory leak if I did this in a loop without decreasing the reference count. However, this does not seem to happen.

Could someone help me understand?

Kind regards!

user1774143
  • A follow-up question: Thanks to @ecatmur and @KayZhu I now understand why there is no memory leak. Nevertheless, if I run this code in a long loop my whole OS crashes anyway. The reference count of `1` increases on every iteration, but I do not see why this should cause a system failure. – user1774143 Oct 25 '12 at 14:05
  • Are you looping until `ob_refcnt` cycles back through 0? The reference count of 1 fluctuates a lot. When you wrap around past 0, normal operations could `Py_DECREF` to 0 and cause `int` 1 to get deallocated, followed quickly by a segfault. Try it with a less common interned `int` such as 13. – Eryk Sun Oct 25 '12 at 16:39
  • At least, I was not even looping until 'sys.maxint' (which is '9223372036854775807' on my system). Today it seems I cannot reproduce the error and I guess I should stop trying to shoot my working desktop down. Thanks for your help! – user1774143 Oct 26 '12 at 08:42

2 Answers

The confusion is that small integers (also True, False, None, single-character strings, etc.) are interned (see "is" operator behaves unexpectedly with integers), which means that wherever they are used or obtained in a program, the runtime will try to reuse the same object instance:

>>> 1 is 1
True
>>> 1 + 1 is 2
True
>>> 1000 + 1 is 1001
False

This means that when you write return 1, you're returning an already existing int object instance with (as you've seen) a considerable reference count. Because the same instance is used elsewhere and stays alive anyway, failing to release your reference (with Py_DECREF) won't result in a memory leak.

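If you change your script to return object() then you will see an initial reference count of 1, and a memory leak if you never call Py_DECREF on the result. A literal such as 1001 is not a shared small integer, but the function's code object still holds it as a constant, so it starts at a count of 2 rather than 1.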
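The same contrast can be seen from pure Python, without the C API. The following is only a rough sketch: run_small and run_fresh are made-up names, and the exact numbers depend on the interpreter (from CPython 3.12 on, small integers are immortal and report a very large count), so the comparison matters more than the absolute values.

```python
import sys


def run_small():
    return 1          # small int: a shared, cached object


def run_fresh():
    return object()   # a brand-new object on every call


# sys.getrefcount also counts its own temporary argument reference,
# so compare the two numbers rather than reading them as absolutes.
small_count = sys.getrefcount(run_small())
fresh_count = sys.getrefcount(run_fresh())

print("refcount of 1:       ", small_count)
print("refcount of object():", fresh_count)
```

The shared integer reports a far higher count than the fresh object, which matches what the C++ code prints for scriptResult.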

ecatmur
  • Thank you for the quick reply, I did not know about the small integer thing! However, if I change the return value to 1001, or some other funny value, I still get a reference count of 2 instead of 1. – user1774143 Oct 25 '12 at 13:28
  • okay I see now that for returning object() the reference count is in fact 1! – user1774143 Oct 25 '12 at 13:40

ecatmur is right: small integers and some strings are interned in CPython, so try it with a plain object() instance instead.

A simple demo with gc:

import gc


def run():
    return 1

s = run()
# print(...) with a single argument works on both Python 2 and Python 3
print(len(gc.get_referrers(s)))  # prints a rather big number, 41 in my case

obj = object()
print(len(gc.get_referrers(obj)))  # prints 1

lst = [obj]
print(len(gc.get_referrers(obj)))  # prints 2

lst = []
print(len(gc.get_referrers(obj)))  # prints 1 again

A bit more: when CPython creates a new object, it calls the C macro _Py_NewReference to initialize the reference count to 1. It then uses Py_INCREF(op) and Py_DECREF(op) to increment and decrement the reference count.
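The increments and decrements can be observed indirectly from Python with sys.getrefcount. This is a minimal sketch with illustrative variable names: it assumes CPython, where storing an object in a list takes one reference and emptying the list releases it.

```python
import sys

obj = object()
base = sys.getrefcount(obj)         # baseline (includes the temporary argument)

lst = [obj]                         # the list takes one reference to obj
after_add = sys.getrefcount(obj)

lst.clear()                         # the list releases its reference
after_clear = sys.getrefcount(obj)

print(base, after_add, after_clear)
```

The middle value is one higher than the other two, mirroring the 1, 2, 1 sequence of referrers in the gc demo.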

K Z
  • Thank you! However if I replace s by a number like 123456.7565 I still get 1 referrer. If I do it in my python code called by my C++ code I still get 2. Any idea why this might be? – user1774143 Oct 25 '12 at 13:47
  • @user1774143: Are you returning `123456.7565` from the function `run()`? If so, the 2nd reference is the code object's tuple of constants: `run.__code__.co_consts`. – Eryk Sun Oct 25 '12 at 13:54
  • Ah I understand. In this tuple all constants within the method are stored so I have one additional reference here. Thanks! – user1774143 Oct 25 '12 at 14:02
  • @user1774143 it is also referenced by the tuple: (None, 123456.7565) – K Z Oct 25 '12 at 14:06
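The extra reference discussed in these comments can be checked directly. As a small sketch, assuming CPython (the exact contents of co_consts vary between versions), the float returned by run() is the very object stored in the function's constants tuple:

```python
def run():
    return 123456.7565


value = run()
consts = run.__code__.co_consts

# Identity check: the returned float is the object held in co_consts,
# which is why its reference count starts above 1.
held_by_code_object = any(c is value for c in consts)

print(consts)
print(held_by_code_object)
```

So even if the C++ caller never releases its reference, the float stays alive as long as the function does; the count simply drifts upward.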