
Assume we have the type Noddy as defined in the tutorial on writing C extension modules for Python. Now we want to create a derived type that overrides only the __new__() method of Noddy.

Currently I use the following approach (error checking stripped for readability):

PyTypeObject *BrownNoddyType =
    (PyTypeObject *)PyType_Type.tp_alloc(&PyType_Type, 0);
BrownNoddyType->tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE;
BrownNoddyType->tp_name = "noddy.BrownNoddy";
BrownNoddyType->tp_doc = "BrownNoddy objects";
BrownNoddyType->tp_base = &NoddyType;
BrownNoddyType->tp_new = BrownNoddy_new;
PyType_Ready(BrownNoddyType);

This works, but I'm not sure if it is The Right Way To Do It. I would have expected that I have to set the Py_TPFLAGS_HEAPTYPE flag, too, because I dynamically allocate the type object on the heap, but doing so leads to a segfault in the interpreter.

I also thought about explicitly calling type() using PyObject_Call() or similar, but I discarded the idea. I would need to wrap the function BrownNoddy_new() in a Python function object and create a dictionary mapping __new__ to this function object, which seems silly.

What is the best way to go about this? Is my approach correct? Is there an interface function I missed?

Update

There are two threads on a related topic on the python-dev mailing list (1) (2). From these threads and a few experiments I deduce that I shouldn't set Py_TPFLAGS_HEAPTYPE unless the type is allocated by a call to type(). There are different recommendations in these threads whether it is better to allocate the type manually or to call type(). I'd be happy with the latter if only I knew what the recommended way to wrap the C function that is supposed to go in the tp_new slot is. For regular methods this step would be easy -- I could just use PyDescr_NewMethod() to get a suitable wrapper object. I don't know how to create such a wrapper object for my __new__() method, though -- maybe I need the undocumented function PyCFunction_New() to create such a wrapper object.

Sven Marnach
  • As far as I know, that is the way to do it :( But I'm not sure. I think so because the requirement to override only the __new__ method is kind of peculiar. – zchenah Dec 09 '11 at 11:36
  • @CHENZhao: In my use case, the base type is wrapping a C++ class with virtual member functions. The derived types only need to override `__new__()` to allocate a different C++ class. Methods do not need to be overridden since they call the virtual member function. Note that I have since solved this problem with a completely different design using template techniques. The original question still remains, though. – Sven Marnach Dec 09 '11 at 15:04
  • If you find no answer here maybe you can ask python-dev mailing list and come back here with the answer – Xavier Combelle Dec 15 '11 at 12:38
  • @XavierCombelle: The python-dev mailing list is meant to coordinate the development of Python itself. It's not meant for users asking questions. – Sven Marnach Dec 15 '11 at 15:08
  • @SvenMarnach other possibility python-list@python.org – Xavier Combelle Dec 15 '11 at 16:28
  • @Sven Marnach, you have determined that this works, as it looks like it ought to. What you want to know is: will future changes to CPython break it? To answer that you need to ask on python-dev and get a ruling from the BDFL. – Ben Jan 03 '12 at 16:16
  • @Ben: My question is not whether this will work in future versions of CPython. My questions are whether my approach is really correct and whether there is a better way to do it. (Didn't I write this in the post above?) – Sven Marnach Jan 06 '12 at 15:38
  • what does the update in the title suggest? I can't figure that out :S – 0xc0de Jan 11 '12 at 12:05
  • @0xc0de: No idea, I rolled it back. I also wonder how a new user was able to make this edit. Thanks for pointing this out! – Sven Marnach Jan 11 '12 at 12:10
  • Did you ever get it fully working? I tried to do the same but somehow the GC tries to clean-up my dynamically allocated classes and ends up with an assertion failure: python: Objects/typeobject.c:2683: type_traverse: Assertion `type->tp_flags & Py_TPFLAGS_HEAPTYPE' failed. Though it does not cause any problem with nondebug-builds of Python (same issue here: https://bugs.launchpad.net/meliae/+bug/893461), I suspect the right way to do this is to define a metaclass in C and define the dynamic classes as its instances. – subhacom Apr 20 '12 at 10:13
  • @subhacom: I completely redesigned my approach in a way that this was no longer needed. I'm pretty sure the right approach is to call `type()` just as you would do from Python. The only point I'm not sure about is what kind of wrapper object to use for `__new__()`. It would be easy enough to try if `PyCFunction_New()` does the trick, as this is my best guess. Unfortunately, this function is undocumented, but this seems to be an oversight. – Sven Marnach Apr 20 '12 at 13:21
  • @SvenMarnach Based on your question, I continued and asked folks at Python CAPI-SIG for canonical means of dynamic construction of extension types. You may find it useful too http://mail.python.org/pipermail/capi-sig/2012-May/000465.html – mloskot May 10 '12 at 15:11

3 Answers


I encountered the same problem when I was modifying an extension to be compatible with Python 3, and found this page when I was trying to solve it.

I did eventually solve it by reading the source code for the Python interpreter, PEP 0384 and the documentation for the C-API.

Setting the Py_TPFLAGS_HEAPTYPE flag tells the interpreter to treat your PyTypeObject as a PyHeapTypeObject, which contains additional members that must also be allocated and initialised. At some point the interpreter accesses these extra members and, if you have left them unallocated or uninitialised, the result is a segfault.

Python 3.2 introduced the C structures PyType_Slot and PyType_Spec and the C function PyType_FromSpec that simplify the creation of dynamic types. In a nutshell, you use PyType_Slot and PyType_Spec to specify the tp_* members of the PyTypeObject and then call PyType_FromSpec to do the dirty work of allocating and initialising the memory.

From PEP 0384, we have:

typedef struct{
  int slot;    /* slot id, see below */
  void *pfunc; /* function pointer */
} PyType_Slot;

typedef struct{
  const char* name;
  int basicsize;
  int itemsize;
  int flags;
  PyType_Slot *slots; /* terminated by slot==0. */
} PyType_Spec;

PyObject* PyType_FromSpec(PyType_Spec*);

(The above isn't a literal copy from PEP 0384, which also includes const char *doc as a member of PyType_Spec. But that member doesn't appear in the source code.)

To use these in the original example, assume we have a C structure, BrownNoddy, that extends the C structure for the base class Noddy. Then we would have:

PyType_Slot slots[] = {
    { Py_tp_doc, (void *)"BrownNoddy objects" },
    { Py_tp_base, (void *)&NoddyType },
    { Py_tp_new, (void *)BrownNoddy_new },
    { 0, NULL },  /* sentinel: slot == 0 terminates the array */
};
PyType_Spec spec = { "noddy.BrownNoddy", sizeof(BrownNoddy), 0,
                      Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, slots };
/* Returns a new reference, or NULL with an exception set on failure. */
PyTypeObject *BrownNoddyType = (PyTypeObject *)PyType_FromSpec(&spec);

This should do everything in the original code, including calling PyType_Ready, plus what is necessary for creating a dynamic type, including setting Py_TPFLAGS_HEAPTYPE, and allocating and initialising the extra memory for a PyHeapTypeObject.

I hope that's helpful.

Andy Wood

I apologize up front if this answer is terrible, but you can find an implementation of this idea in PythonQt, in particular I think the following files might be useful references:

This fragment from PythonQtClassWrapper_init jumps out at me as being somewhat interesting:

static int PythonQtClassWrapper_init(PythonQtClassWrapper* self, PyObject* args, PyObject* kwds)
{
  // call the default type init
  if (PyType_Type.tp_init((PyObject *)self, args, kwds) < 0) {
    return -1;
  }

  // if we have no CPP class information, try our base class
  if (!self->classInfo()) {
    PyTypeObject*  superType = ((PyTypeObject *)self)->tp_base;

    if (!superType || (superType->ob_type != &PythonQtClassWrapper_Type)) {
      PyErr_Format(PyExc_TypeError, "type %s is not derived from PythonQtClassWrapper", ((PyTypeObject*)self)->tp_name);
      return -1;
    }

    // take the class info from the superType
    self->_classInfo = ((PythonQtClassWrapper*)superType)->classInfo();
  }

  return 0;
}

It's worth noting that PythonQt does use a wrapper generator, so it's not exactly in line with what you're asking for, but personally I think trying to outsmart the vtable isn't the best design. There are many different C++ wrapper generators for Python, and people use them for good reason: they're documented, and there are examples floating around in search results and on Stack Overflow. If you hand-roll a solution that nobody has seen before, it will be that much harder for others to debug if they run into problems. Even if it's closed-source, the next guy who has to maintain it will be scratching his head, and you'll have to explain it to every new person who comes along.

Once you get a code generator working, all you need to do is maintain the underlying C++ code; you don't have to update or modify your extension code by hand. (Which is probably not too far away from the template solution you went with.)

The proposed solution is an example of the kind of type-safety break that the newly introduced PyCapsule offers a bit more protection against (when used as directed).

So, while it's possible, implementing derived classes this way might not be the best long-term choice. I would rather wrap the code and let the vtable do what it does best; when the new guy has questions, you can just point him at the documentation for whatever solution fits best.

This is just my opinion though. :D

synthesizerpatel
  • Sorry for taking so much time to comment on your answer. I did not yet find the time to carefully look through all the linked QT source files. Unfortunately I fail to see how the example code you provided deals with the specific problems I had -- how do I allocate memory for the type correctly? Is it possible to have a dynamically allocated type garbage collected? etc. – Sven Marnach Jan 18 '12 at 12:42
  • I did evaluate a few of the C++ wrapper generators -- in particular SWIG, boost.python and PyCXX. While the least versatile of these, PyCXX came closest to what I needed, but I figured that writing this myself from scratch would be the best option in my specific situation. (I won't explain my reasons in detail here.) – Sven Marnach Jan 18 '12 at 12:44

One way to try to understand how to do this is to create a version of it using SWIG. See what it produces and whether it matches your approach or does things differently. From what I can tell, the people who write SWIG have an in-depth understanding of extending Python. It can't hurt to see how they do things at any rate, and it may help you understand this problem.

Demolishun
  • Thanks for your answer. As far as I am aware, SWIG does not generate types dynamically, but rather uses the static approach described in the tutorial (see the link at the beginning of my post). boost.python does, in a way, create types dynamically, but it uses a rather complex technique that is not applicable in my case for various reasons, one of them being that I'd like to avoid static variables since my library is header-only. – Sven Marnach Jan 09 '12 at 15:32
  • Ahh, yes, I totally missed the dynamic part of the title. – Demolishun Jan 10 '12 at 07:10