4

I'm writing Python 3 extensions in C++ and I'm trying to find a way to check whether a PyObject is related to the type (struct) that defines its instance layout. I'm only interested in fixed-size PyObject, not PyVarObject. The instance layout is defined by a struct with a well-defined layout: a mandatory PyObject header followed by (optional) user-defined members.

Below is an example of a PyObject extension based on the well-known Noddy example in Defining New Types:

// Noddy struct specifies PyObject instance layout
struct Noddy {
    PyObject_HEAD
    int number;
};

// type object corresponding to Noddy instance layout
PyTypeObject NoddyType = {
    PyVarObject_HEAD_INIT(NULL, 0) /* Python 3: object header and ob_size */
    "noddy.Noddy",             /*tp_name*/
    sizeof(Noddy),             /*tp_basicsize*/
    0,                         /*tp_itemsize*/
    ...
    Noddy_new,                 /* tp_new */
};

Note that Noddy is a type, a compile-time entity, whereas NoddyType is an object that exists in memory at run time. The only obvious relation between Noddy and NoddyType seems to be the value of sizeof(Noddy) stored in the tp_basicsize member.

The hand-written inheritance implemented in the Python C API defines rules that allow casting between a PyObject and the type that declares the instance layout of that particular PyObject:

PyObject* Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    // When a Python object is a Noddy instance,
    // its PyObject* pointer can be safely cast to Noddy*
    Noddy *self = reinterpret_cast<Noddy*>(type->tp_alloc(type, 0));
    if (self == NULL)
        return NULL; // allocation failed; the exception is already set

    self->number = 0; // initialise Noddy members

    return reinterpret_cast<PyObject*>(self);
}

In some contexts, such as the various slot functions, it is safe to assume that a Python object is a Noddy and to cast without any checks. Sometimes, however, the cast is needed in other situations, and then it feels like a blind conversion:

void foo(PyObject* obj)
{
    // How to perform safety checks?
    Noddy* noddy = reinterpret_cast<Noddy*>(obj);
    ...
}

It is possible to check sizeof(Noddy) == Py_TYPE(obj)->tp_basicsize, but that is not a sufficient solution, for two reasons:

1) A user may derive from Noddy:

class BabyNoddy(Noddy):
    pass

If obj in foo points to an instance of BabyNoddy, Py_TYPE(obj)->tp_basicsize may be different. It is nevertheless still safe to use reinterpret_cast<Noddy*>(obj) to get a pointer to the Noddy part of the instance layout.

2) There can be another struct that declares an instance layout of the same size as Noddy:

struct NeverSeenNoddy {
    PyObject_HEAD
    short word1;
    short word2;
};

In fact, at the C language level, the NeverSeenNoddy struct is compatible with the NoddyType type object: it fits into NoddyType. So the cast could be perfectly fine.

So, my big question is this:

Is there any Python policy which could be used to determine if a PyObject is compatible with the Noddy instance layout?

Is there any way to check whether a PyObject* points to the object part that is embedded in a Noddy?

If there is no such policy, is any hack possible?

EDIT: There are a few questions which seem to be similar, but in my opinion they are different to the one I have asked. For example: Accessing the underlying struct of a PyObject

EDIT2: In order to understand why I marked Sven Marnach's response as the answer, see comments below that answer.

mloskot

2 Answers

5

In Python, you can check whether obj is of type Noddy or a derived type with the test isinstance(obj, Noddy). The C-API test for whether some PyObject *obj is of type NoddyType or a derived type is basically the same: use PyObject_IsInstance(), which returns 1 for a match, 0 otherwise, and -1 on error:

PyObject_IsInstance(obj, reinterpret_cast<PyObject*>(&NoddyType))

As for your second question, there is no way to achieve this, and if you think you need this, your design has severe shortcomings. It would be better to derive NeverSeenNoddyType from NoddyType in the first place -- then the above check will also recognize an object of the derived type as an instance of NoddyType.

Sven Marnach
  • Thanks for the answer, but I know PyObject_IsInstance (and all other API working with types) and they don't provide solution to my question. As you have indicated, they check against NoddyType (run-time object), but I'm asking about checking if obj is related to C++ type Noddy (see struct in my example) - struct Noddy is PyObject extension, defines instance layout of PyObject. – mloskot Dec 11 '11 at 01:22
  • @mloskot: Maybe my answer is not very clear. You essentially ask two questions, marked with 1) and 2) in your post. `PyObject_IsInstance()` is the answer to 1). Regarding 2), this seems a design problem to me. If you want to do something like this, make sure all types participating have a common base type. This is my answer to question 2). So how does this not answer your question? (Note that I'm aware that 1) and 2) don't actually enumerate questions but rather problematic cases in your post.) – Sven Marnach Dec 11 '11 at 01:26
  • +1 as your answer does help. These are issues I also discuss, but my question actually is "Any way to check if `PyObject*` points to the object part which is embedded in the Noddy". An analogy from the C++ world would be: is there any static-type property in `PyObject` that mimics `struct PyObject { typedef Noddy base_type; };` I have realised I expect too much from PyObject, unless I implement custom compile-time machinery like traits for PyObject. Thanks! – mloskot Dec 12 '11 at 11:25
  • @mloskot: I just noticed that I misread a part of your question. You actually want to avoid `NeverSeenNoddy` being mistaken for a `Noddy`. I thought you wanted to reinterpret something completely unrelated to `Noddy` as a `Noddy`, hence my talk about "design problems" etc. Having realised this, I don't see what question actually remains. Why doesn't `PyObject_IsInstance()` do the job? – Sven Marnach Dec 12 '11 at 13:24
  • Generally, `PyObject_IsInstance` does the job, but I wanted to be able to tell whether a `PyObject` is a `Noddy` in a situation and scope where `NoddyType` is not accessible. Noddy can be declared in a header included in a number of .cpp files, but the `NoddyType` name is local to a particular translation unit. So, I brainstormed whether it is possible to tell that a `PyObject` is related to `Noddy` without access to the `NoddyType` object. p.s. no idea why (at)sven-marnach does not link to your username, sorry. – mloskot Dec 12 '11 at 13:33
  • Why not simply add an `extern PyTypeObject NoddyType` to the header? If you need it in other translation units, just export it. I'm seriously confused now. – Sven Marnach Dec 12 '11 at 13:43
  • @mloskot: Note that if you comment on my posts, you don't need an @... -- I will be notified anyway because it's my post. – Sven Marnach Dec 12 '11 at 13:47
  • Sven, yes indeed. The extern (or a function returning the PyTypeObject for each instance layout type) is what I do now. I was wondering if there is any other way to achieve it, so I asked this question. Because your answer comes to similar conclusions, and thus confirms the solution I have worked out so far, I mark it as the answer. Thanks – mloskot Dec 12 '11 at 17:52
1

Because every object starts with PyObject_HEAD, it is always safe to access the fields defined by this header. One of the fields is ob_type (usually accessed using the Py_TYPE macro). If this points to NoddyType or any other type derived from NoddyType (which is what PyObject_IsInstance tells you), then you can assume the object's layout is that of struct Noddy.

In other words, an object is compatible with Noddy instance layout if its Py_TYPE points to NoddyType or any of its subclasses.
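
The rule above can be sketched without Python.h as a simplified model. Everything here (`MockType`, `MockObject`, `is_subtype`, `as_noddy`) is a hypothetical stand-in, not the CPython API: the `base` pointer plays the role of `tp_base`, and the model only covers single inheritance, whereas CPython normally consults the type's MRO.

```cpp
#include <cstddef>

// Hypothetical stand-in for PyTypeObject; not the real CPython struct.
struct MockType {
    const char* name;
    const MockType* base;  // plays the role of tp_base
};

// Hypothetical stand-in for the PyObject header.
struct MockObject {
    std::ptrdiff_t refcount;
    const MockType* type;  // plays the role of ob_type / Py_TYPE(obj)
};

// Instance layout: header first, then user-defined members.
struct Noddy {
    MockObject head;
    int number;
};

// Type objects used by this model.
static const MockType NoddyType = {"Noddy", nullptr};
static const MockType BabyNoddyType = {"BabyNoddy", &NoddyType};
static const MockType OtherType = {"Other", nullptr};

// True if `type` is `target` or derives from it (walks the base chain).
inline bool is_subtype(const MockType* type, const MockType* target) {
    for (; type != nullptr; type = type->base) {
        if (type == target) {
            return true;
        }
    }
    return false;
}

// Checked cast: nullptr unless the object's type is NoddyType or a subtype.
// The cast is valid only because the header is the first member of Noddy.
inline Noddy* as_noddy(MockObject* obj) {
    if (obj == nullptr || !is_subtype(obj->type, &NoddyType)) {
        return nullptr;
    }
    return reinterpret_cast<Noddy*>(obj);
}
```

Given `Noddy n{{1, &NoddyType}, 7};`, the call `as_noddy(&n.head)` returns a usable pointer, while an object whose type is `OtherType` yields `nullptr`. In a real extension the same decision would be made with `PyObject_IsInstance` or `PyObject_TypeCheck` against the real type object.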

In the second question, the cast wouldn't be fine. The layouts of Noddy and NeverSeenNoddy are different, even though the size might be the same.
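
A small sketch of why equal size does not mean equal layout. `MockHead` is a hypothetical stand-in for `PyObject_HEAD`, and `reinterpret_bytes` is an illustrative helper; the sketch assumes a platform where `int` is 4 bytes and `short` is 2 bytes, which the `static_assert` enforces indirectly.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for PyObject_HEAD; not the real CPython header.
struct MockHead {
    std::ptrdiff_t refcount;
    void* type;
};

struct Noddy {
    MockHead head;
    int number;
};

struct NeverSeenNoddy {
    MockHead head;
    short word1;
    short word2;
};

// Copies the raw bytes of a Noddy into a NeverSeenNoddy, which mimics
// what a blind cast between the two layouts would end up reading.
inline NeverSeenNoddy reinterpret_bytes(const Noddy& n) {
    static_assert(sizeof(Noddy) == sizeof(NeverSeenNoddy),
                  "this sketch assumes both layouts have the same size");
    NeverSeenNoddy out;
    std::memcpy(&out, &n, sizeof out);
    return out;
}
```

Starting from a `Noddy` whose `number` is `0x00010002`, the copy has the halves of that integer in `word1` and `word2` (in an order that depends on endianness). Neither field carries the meaning the `Noddy` code intended, even though `tp_basicsize` would be identical.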

Assuming that NeverSeenNoddy is the layout of a NeverSeenNoddy_Type type, you should never cast to NeverSeenNoddy if PyObject_IsInstance(obj, &NeverSeenNoddy_Type) is false.

If you want to have two C-level types with common fields, you should derive both types from a common base that has only the common fields in its instance layout.

The subtypes should then include the base layout at the top of their layouts:

struct SubNoddy {
    // No PyObject_HEAD because it's already in Noddy
    Noddy noddy;
    int extra_field;
};

Then, if PyObject_IsInstance(obj, &SubNoddy_Type) returns true, you can cast to SubNoddy and access the extra_field field. If PyObject_IsInstance(obj, &NoddyType) returns true, you can cast to Noddy and access the common fields.
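
The embedding rule can be checked at compile time. This is a minimal sketch, assuming a hypothetical `MockHead` in place of `PyObject_HEAD` and an illustrative helper `base_view`. It shows that a pointer to a `SubNoddy` is also a valid pointer to its leading `Noddy`, because the base layout sits at offset 0.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for PyObject_HEAD; not the real CPython header.
struct MockHead {
    std::ptrdiff_t refcount;
    void* type;
};

struct Noddy {
    MockHead head;
    int number;
};

struct SubNoddy {
    Noddy noddy;      // base layout first, so it starts at offset 0
    int extra_field;
};

// The cast below relies on both properties holding.
static_assert(std::is_standard_layout<SubNoddy>::value,
              "SubNoddy must be a standard-layout type");
static_assert(offsetof(SubNoddy, noddy) == 0,
              "the base layout must be the first member");

// Views a SubNoddy through its base layout.
inline Noddy* base_view(SubNoddy* sub) {
    return reinterpret_cast<Noddy*>(sub);
}
```

If a field were ever placed before `noddy`, the second `static_assert` would stop the build, which is a cheap guard against silently breaking the layout contract.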

yak
  • +1: like Sven Marnach's answer, it helps me realise the mistakes in my assumptions about PyObject and type objects. – mloskot Dec 12 '11 at 11:26