Several Python structures seem to need a sentinel (probably in order to know when to "stop"). But why do some, like arrays of PyMethodDef, have a sentinel element initialized with multiple NULLs?

For example zip:

static PyMethodDef zip_methods[] = {
    {"__reduce__",   (PyCFunction)zip_reduce,   METH_NOARGS, reduce_doc},
    {NULL,           NULL}           /* sentinel */
};

Why does the last PyMethodDef in the array (the sentinel) have two NULLs? Why not just one? Or, given that the __reduce__ entry has 4 fields, why not 4 NULLs in the sentinel element?

MSeifert
  • @Olaf Why remove the C tag? It's C code I'm investigating here after all. – MSeifert Apr 12 '17 at 14:08
  • "_I realize that c needs a sentinel to know when it has to "stop"_" - That's wrong and no requirement of the language, nor used for most arrays in C. – too honest for this site Apr 12 '17 at 14:09
  • No, you are investigating a Python implementation with a wrong prerequisite. – too honest for this site Apr 12 '17 at 14:10
  • C'mon @Olaf, this is about defining and initialising a C-array. I agree that the title not explicitly tells this, but still ... – alk Apr 12 '17 at 14:10
  • This has nothing to do with the C language, but the Python framework for modules. And there is no `NULL` in the array. – too honest for this site Apr 12 '17 at 14:12
  • @Olaf question is about _CPython_ implementation, which **is** implemented in C... While post introduction may contain some invalid presumptions, actual _question_ asked does not deserve such reaction (IMHO). – Łukasz Rogalski Apr 12 '17 at 14:15
  • @Olaf I may be mistaken but `static PyMethodDef zip_methods[]` creates a C array of `PyMethodDef` structs and the last one (sentinel) contains two NULLs. I think this has more to do with `C` or the `python-C-API` than with python itself, right? However I tried to re-formulate the question, I hope it's clearer now. – MSeifert Apr 12 '17 at 14:20
  • @MSeifert: (Ok, I'm fine with the `python-c-api` tag.) Yes, but the `NULL`s are not elements of the array, but of an _element of the array_ (very important difference). So, there are no "multiple terminators" (as they are typically called). Why both fields must be `NULL` (and why no designated initialisers are used) depends on how they are processed. Reading the Python source code would be a good start. – too honest for this site Apr 12 '17 at 14:39
  • @Olaf "Why both fields must be NULL (and why no designated initialisers are used) depends on how they are processed. Reading the Python source code would be a good start." - That's not really helpful. If I knew what and where to look for in the source code I wouldn't have asked the question. – MSeifert Apr 12 '17 at 15:41
  • It is the best advice, given your question does not show any effort on your own. What is the problem? The CPython sources are free and easily grep-able. OTOH, you could just keep the pattern; it is not clear what your problem is anyway. As far as I'm concerned, omit the second null pointer constant; that will not change anything. – too honest for this site Apr 12 '17 at 15:45
  • The question is more aimed to understand the pattern that seems to be reused everywhere. If you have a suggestion what to `grep` for that would definitely help. – MSeifert Apr 12 '17 at 15:48
  • There's at least [one example](https://github.com/python/cpython/blob/65c5b096ac2c6608d296f1603cd4792086108c95/Python/import.c#L3368) of a single-element sentinel in Python 2.7 (gone in 3.x). The [docs](https://docs.python.org/2.7/extending/extending.html#the-module-s-method-table-and-initialization-function) ([Py3](https://docs.python.org/3/extending/extending.html#the-module-s-method-table-and-initialization-function)) want you to use `{NULL, NULL, 0, NULL}`, which happens basically nowhere. Of course: [`{}`](http://stackoverflow.com/questions/30359255/python-sentinel-in-c-extension) – dhke Apr 12 '17 at 16:13

1 Answer

I don't think the sentinel needs more than one NULL, for two reasons:

1) In the Python source code it only checks the name against NULL.

As far as I'm aware, PyMethodDef arrays are used in two places: when attaching methods to a type, and when attaching methods to a module.

To find the relevant bit of code, start by noting that all types go through PyType_Ready and most modules go through PyModule_Create, so start the search there. PyModule_Create forwards to PyModule_Create2. In PyType_Ready the methods get dealt with by the internal function add_methods. In PyModule_Create2 there is a call to PyModule_AddFunctions, which is actually a public function if you want to do low-level stuff yourself, and which in turn calls the internal function _add_methods_to_object.

Both of these internal functions have a for loop to loop over the methods and add them to the relevant dictionary. In both cases the condition to continue looping is meth->ml_name!=NULL.

Therefore, at least currently, only the name is checked.
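The stop condition described above can be sketched without Python at all. This is only an illustration, not the CPython code: `method_def`, `noop`, `count_methods` and the two tables are hypothetical names, and the struct merely imitates the four-field shape of PyMethodDef.

```c
#include <stddef.h>

/* Hypothetical stand-in for PyMethodDef: same four-field shape, but
   ml_meth is a plain function pointer so no Python.h is needed. */
typedef struct {
    const char *ml_name;
    void (*ml_meth)(void);
    int ml_flags;
    const char *ml_doc;
} method_def;

static void noop(void) {}

/* Walk the table the way the loops described above do: stop at the
   first entry whose name is NULL and never look at the other fields. */
static size_t count_methods(const method_def *meth)
{
    size_t n = 0;
    for (; meth->ml_name != NULL; meth++) {
        n++;
    }
    return n;
}

static const method_def one_null[] = {
    {"__reduce__", noop, 0, "doc"},
    {NULL}                      /* sentinel: only the name is given */
};

static const method_def four_nulls[] = {
    {"__reduce__", noop, 0, "doc"},
    {NULL, NULL, 0, NULL}       /* sentinel: every field spelled out */
};
```

Calling `count_methods` on either table stops at the same place, because only `ml_name` takes part in the loop condition.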

2) In both C and C++ partial initialization guarantees that the remaining fields are zero/default initialized. Therefore just initializing the first field of the sentinel to NULL ensures that all the other fields are zeroed as well. You can even just use {} (standard in C++; in C an empty initializer list is a compiler extension before C23).

(As a side note, Python uses this automatic zero initialization a lot with the large structs it defines, for example PyTypeObject which is huge and which you rarely bother filling in completely.)
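A small sketch of that guarantee, using a hypothetical four-field struct `entry` shaped like PyMethodDef (the names `entry`, `is_all_zero` and `sentinels_match` are made up for illustration). Local variables are used on purpose, since objects with static storage would be zeroed regardless of the initializer.

```c
#include <stddef.h>

/* Hypothetical four-field struct, shaped like PyMethodDef. */
typedef struct {
    const char *name;
    void (*func)(void);
    int flags;
    const char *doc;
} entry;

/* Returns 1 if every field of e is zero/NULL, else 0. */
static int is_all_zero(const entry *e)
{
    return e->name == NULL && e->func == NULL
        && e->flags == 0 && e->doc == NULL;
}

/* Build three sentinels with one, two and four explicit initializers.
   The fields left out of the braces are implicitly zero-initialized,
   so all three end up identical. */
static int sentinels_match(void)
{
    entry one  = {NULL};
    entry two  = {NULL, NULL};
    entry four = {NULL, NULL, 0, NULL};
    return is_all_zero(&one) && is_all_zero(&two) && is_all_zero(&four);
}
```

Under these assumptions `sentinels_match()` returns 1, which is why the number of NULLs written in the sentinel is a matter of style.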

After writing this answer I found that this had already been discussed.


So in summary: Python only checks the ml_name (although that's an implementation detail, so I guess it could change in future if they find a use for a NULL name with a non-NULL method), and C automatically zeros the rest of the sentinel anyway. I don't know why the convention appears to be to set two fields, but there's something to be said for following convention.

DavidW