-6

I was reading about the implementation of Python's built-in functions when I came across this C implementation of the len function:

static PyObject *
builtin_len(PyObject *module, PyObject *obj)
/*[clinic end generated code: output=fa7a270d314dfb6c input=bc55598da9e9c9b5]*/
{
    Py_ssize_t res;

    res = PyObject_Size(obj);
    if (res < 0) {
        assert(PyErr_Occurred());
        return NULL;
    }
    return PyLong_FromSsize_t(res);
}

I am not able to understand what is going on in this code. I have no idea how C works. Can someone explain what this code is doing?

I got the code from https://github.com/python/cpython/blob/master/Python/bltinmodule.c

Edit: I was just curious how len function is so fast and stumbled across this code. I just want to know why function PyObject_Size is used to check size of object is zero and then PyLong_FromSsize_t to return the actual size.

Sim101011
  • 4
    *"I have no idea how C works"* - then learn it first? SO is not a tutorial service – UnholySheep Dec 31 '18 at 13:41
  • 4
    "I have no idea how C works" Start by at least getting a basic understanding of C syntax. No one can teach you C within an SO answer – DeepSpace Dec 31 '18 at 13:42
  • I was just curious how len function is so fast and stumbled across this code. I am not asking to get a C tutorial. I just wanted to know why function PyObject_Size is used to check size of object is zero and then PyLong_FromSsize_t to return the actual size – Sim101011 Dec 31 '18 at 13:47
  • 2
    Did you check what those functions do? Do you understand how the Python type system is implemented in CPython? (Hint: you cannot return a `Py_ssize_t` to Python) – UnholySheep Dec 31 '18 at 13:50
  • I am not a programmer, I just use python to dabble into data science, machine learning for fun. My intent was not to become a python or CPython expert. I was not aware that I have to be an expert to ask a question on SO. I was just curious. And yes I tried to find implementation of these functions but was not able to. – Sim101011 Dec 31 '18 at 13:59
  • The internals of the CPython implementation *are* an expert domain - unless you plan to become an expert on it (or are implementing your own programming language) you won't be able to learn much from glancing at isolated code segments. And explaining these internals is too broad for a SO question – UnholySheep Dec 31 '18 at 14:06
  • I referred to this answer https://stackoverflow.com/a/20302670/4652493, and thought that explanation might be as simple as this. But thanks, from now on I will just limit myself to asking questions about errors I am getting and not get into territory of experts. Thanks. – Sim101011 Dec 31 '18 at 14:13

1 Answer

8

There is nothing special about this function. Functions written in C, especially those that do not call back into Python code, are usually much faster than equivalent ones written in Python.

I am specifically assuming here that the reader knows how C works; otherwise the explanation would fill a book.

builtin_len is the function that is called when len(foo) is executed in Python code. The PyObject *obj argument references the object passed to len (foo), and PyObject *module holds a reference to the module that contains builtin_len, i.e. the builtins module.
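The link between the builtin and its containing module can be observed from Python itself. This is a small sketch that assumes CPython, where a builtin function object exposes the object passed as its first C argument through the `__self__` attribute; the variable names are only illustrative.

```python
import builtins

# CPython-specific: a builtin function object keeps a reference to the
# module it belongs to; that object is what the C function receives as
# its first argument.
owner = len.__self__
kind = type(len).__name__

print(owner is builtins)
print(kind)
```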

Each container in Python must have a length between 0 and the maximum value allowed by Py_ssize_t. PyObject_Size(obj) is a function that gets the size of the given object through its obj->ob_type->tp_as_sequence->sq_length slot or, failing that, its obj->ob_type->tp_as_mapping->mp_length slot. On error, an exception is set for the current thread and a negative number (-1) is returned.
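Seen from the Python side, those slots are what a class fills in when it defines `__len__`. The sketch below uses hypothetical classes (`Shelf`, `Negative`, `TooBig`) and a hypothetical helper `error_name` to show a normal length lookup and the errors that come back when the length is out of range or the object has no length at all.

```python
import sys


class Shelf:
    """Hypothetical container; defining __len__ fills the type's length slot."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)


class Negative:
    """Hypothetical class that reports an invalid (negative) length."""

    def __len__(self):
        return -1


class TooBig:
    """Hypothetical class whose length does not fit in Py_ssize_t."""

    def __len__(self):
        return sys.maxsize + 1


def error_name(obj):
    """Return the name of the exception len(obj) raises, or None."""
    try:
        len(obj)
    except Exception as exc:
        return type(exc).__name__
    return None


shelf_len = len(Shelf("abc"))
negative_error = error_name(Negative())
too_big_error = error_name(TooBig())
no_len_error = error_name(42)

print(shelf_len, negative_error, too_big_error, no_len_error)
```

In each failing case the C side sees a negative return value with an exception set, which is exactly the branch that `builtin_len` checks for.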

The return NULL; signals to the caller that an exception has occurred and that it must act accordingly. If the caller is a function call instruction in Python bytecode, the exception will be raised in the Python code. If the caller is C code, it will typically behave like this function, returning NULL or another invalid value to its own caller; alternatively, it can clear the exception or replace it with another one.
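The error convention can be watched from Python by calling `PyObject_Size` directly through `ctypes.pythonapi`. This is a sketch that assumes CPython; `ctypes.pythonapi` checks the error indicator after each call and raises the pending exception. The helpers `size_or_zero` and `size_or_lookup_error` are hypothetical and only mirror the "clear" and "replace" options a C caller has.

```python
import ctypes

# Bind the real C function. Assumes a CPython interpreter.
py_object_size = ctypes.pythonapi.PyObject_Size
py_object_size.argtypes = [ctypes.py_object]
py_object_size.restype = ctypes.c_ssize_t

# Success: the C function returns a non-negative Py_ssize_t.
ok_size = py_object_size([10, 20, 30])


def size_or_zero(obj):
    """Hypothetical caller that clears the error and carries on."""
    try:
        return py_object_size(obj)
    except TypeError:
        return 0


def size_or_lookup_error(obj):
    """Hypothetical caller that replaces the error with another one."""
    try:
        return py_object_size(obj)
    except TypeError as exc:
        raise LookupError("object has no size") from exc


print(ok_size, size_or_zero(42))
```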

Otherwise, if res is greater than or equal to 0, the Py_ssize_t res, which is of a C integer type, is converted to a Python int object, either by returning an existing int object or by constructing a new one. The Python int object is called PyLong in CPython 3 for historical reasons. PyLong_FromSsize_t() is one of many conversion functions; this one can convert any value of type Py_ssize_t to a Python int with the same value. The reference to this object, like references to all other objects, is kept as a pointer to the (semi-opaque) PyObject structure, and this pointer is returned.
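The conversion step can also be tried in isolation. The following sketch, again assuming CPython and `ctypes.pythonapi`, calls `PyLong_FromSsize_t` with a C integer and receives an ordinary Python int back; the variable names are illustrative.

```python
import ctypes
import sys

# Bind the real C function. Assumes a CPython interpreter.
from_ssize_t = ctypes.pythonapi.PyLong_FromSsize_t
from_ssize_t.argtypes = [ctypes.c_ssize_t]
from_ssize_t.restype = ctypes.py_object  # the returned PyObject* becomes a Python object

small = from_ssize_t(3)
# sys.maxsize is the largest value a Py_ssize_t can hold.
large = from_ssize_t(sys.maxsize)

print(type(small).__name__, small, large == sys.maxsize)
```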

The assert(PyErr_Occurred()); is an assertion that is in effect in debug builds of Python only. It asserts that when PyObject_Size returns a negative number, signifying that an exception was thrown, the exception has also been properly set; if it has not been set, the assertion aborts the entire CPython process outright. It isn't in effect in release builds of Python because "asserts never fail".
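To find out whether a given interpreter is a debug build, where such C assertions are normally compiled in, something like the following can be used. It is a sketch that relies on `sys.gettotalrefcount` existing only in debug builds of CPython; the `Py_DEBUG` config variable may be `None` on some platforms, so it is printed for information only.

```python
import sys
import sysconfig

# sys.gettotalrefcount is only present in debug builds of CPython.
is_debug_build = hasattr(sys, "gettotalrefcount")

# Build-time flag; may be None where the variable is not recorded.
py_debug_flag = sysconfig.get_config_var("Py_DEBUG")

print("debug build:", is_debug_build)
print("Py_DEBUG:", py_debug_flag)
```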