
I've been trying to track down an issue with a Python extension module generated from C++ (using SWIG).

The following information was retrieved by running python3-dbg (Python 3.8) under gdb.

Python crashes with this error:

Fatal Python error: deallocating None
Python runtime state: initialized

The offending line of Python code on the call stack is:

value = self.get(name)

Note that the value of name should be a valid string. Nothing here looks like it could cause a problem, but the crash usually happens while processing the same set of data. This line of code is executed hundreds of times before the fault.

The first call to none_dealloc in the stack trace is here:

#26 0x000000000046ef6e in none_dealloc (ignore=<optimized out>) at ../Objects/object.c:1585
#27 0x00000000004706da in _Py_Dealloc (op=<optimized out>) at ../Objects/object.c:2215
#28 0x00000000004470e8 in _Py_DECREF (op=<optimized out>, lineno=430, filename=0x699244 "../Objects/frameobject.c") at ../Include/object.h:478
#29 frame_dealloc (f=0x25938d0) at ../Objects/frameobject.c:430
#30 0x00000000004706da in _Py_Dealloc (op=op@entry=0x25938d0) at ../Objects/object.c:2215
#31 0x00000000004e02e6 in _Py_DECREF (op=0x25938d0, lineno=4314, filename=0x6e3343 "../Python/ceval.c") at ../Include/object.h:478
#32 _PyEval_EvalCodeWithName (_co=0x7fdc678c0520, globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=2, kwnames=0x0, kwargs=0x7fdc6468cf90, kwcount=<optimized out>, kwstep=1, defs=0x7fdc678bf6f8, defcount=1, kwdefs=0x0, closure=0x0, name=0x7fdc67d0c270, qualname=0x7fdc678c2280) at ../Python/ceval.c:4314

This is confusing to me, since the CPython code initiating the dealloc at frameobject.c:430 in Python 3.8 is:

/* Kill all local variables */
valuestack = f->f_valuestack;
for (p = f->f_localsplus; p < valuestack; p++)
    Py_CLEAR(*p);

The definition of Py_CLEAR checks for NULL before trying to decrease the reference count and potentially deallocate the pointer. How could this cause any sort of fault? When I look at the value of op or _py_tmp in the call stack, it reads as "optimized out".

#define Py_CLEAR(op)                            \
    do {                                        \
        PyObject *_py_tmp = _PyObject_CAST(op); \
        if (_py_tmp != NULL) {                  \
            (op) = NULL;                        \
            Py_DECREF(_py_tmp);                 \
        }                                       \
    } while (0)

What does this error mean? What should I look for?

Tails86

1 Answer


None is a special, global singleton object in Python. The "deallocating None" error means that the reference count on None has reached 0 and the interpreter is about to deallocate (delete) it. That must never happen, so CPython aborts with a fatal error.
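
The count in question can be inspected from the Python side with `sys.getrefcount`. This is only a minimal sketch, assuming CPython; the variable name `count` is chosen here for illustration. Before Python 3.12 the value is normally large, because None is referenced from many places. From 3.12 on, None is immortal and the reported value is a fixed large number that does not track real references.

```python
import sys

# Reference count of the None singleton as seen by this interpreter.
# The result is generally one higher than expected, because the call
# itself holds a temporary reference to its argument.
count = sys.getrefcount(None)
print(count)
```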

It most likely means that a function in your imported C/C++ extension module returns Py_None without first calling Py_INCREF(Py_None). The error then surfaces elsewhere, on a seemingly random, innocuous line of code, when some holder releases its reference to None, because the reference count on None is now one too few. For more information, see this question: Why should Py_INCREF(Py_None) be required before returning Py_None in C?
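
To narrow down which wrapped function is dropping the reference, it can help to compare None's reference count before and after calling a suspect repeatedly. The following is a rough diagnostic sketch, not part of SWIG or CPython; `none_refcount_delta` and `well_behaved` are hypothetical names introduced here. It assumes a single-threaded run on a CPython older than 3.12 (from 3.12 on, None is immortal and the delta will always read 0).

```python
import sys


def none_refcount_delta(func, *args, repeat=100, **kwargs):
    """Return the net change in None's refcount after `repeat` calls to func."""
    before = sys.getrefcount(None)
    for _ in range(repeat):
        func(*args, **kwargs)  # the result is discarded right away
    after = sys.getrefcount(None)
    return after - before


def well_behaved():
    # Pure-Python stand-in for a correctly written function returning None.
    return None


print(none_refcount_delta(well_behaved))
```

A correct function should give 0. A result of roughly `-repeat` suggests the function releases one reference to None per call, which matches a missing Py_INCREF. Keep `repeat` small, since a buggy function really does lower the count on every call. For the code in the question this would be something like `none_refcount_delta(obj.get, name)`, where `obj` stands for whatever object `self` is at the crash site.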

Tails86
  • (This is the advice I would have given myself, knowing what I know now.) – Tails86 Oct 15 '21 at 17:46
  • @Tails86: As someone who knows, this is 99% likely to be the case (the alternatives are weird memory corruption issues involving out of bounds pointers rewriting stuff randomly). Reference count hygiene in CPython is as important as precisely matching your `malloc`s and `free`s in regular C; add an extra `INCREF` or miss a required `DECREF` and you've leaked memory; miss a required `INCREF` or add an extra `DECREF` and you'll eventually hit a use-after-free error (reference count drops to 0 when at least one other holder exists, and things explode when they try to access it). – ShadowRanger Oct 15 '21 at 18:54
  • You got lucky in this case; the mistake you made was with `None` which shouts when it would be deallocated (only the other singletons like `True`, `False`, `NotImplemented`, and `Ellipsis` are likely to check this for you). As a side-note, when you want to return `None` and don't already have an owned reference to it, you can [use the `Py_RETURN_NONE` macro](https://docs.python.org/3/c-api/none.html#c.Py_RETURN_NONE) to increment the reference count and return it as a single line (it's still doing `Py_INCREF` then `return`, but it looks like a single logical operation). – ShadowRanger Oct 15 '21 at 18:56
  • @ShadowRanger Out of bounds writing is actually what I spent days scouring my code for! Thanks for the tip on Py_RETURN_NONE. I'll keep that in mind! – Tails86 Oct 15 '21 at 19:00