Calling C from Python: passing list of numpy pointers

Question

I have a variable number of numpy arrays, which I'd like to pass to a C function. I managed to pass each individual array (using <ndarray>.ctypes.data_as(c_void_p)), but the number of arrays may vary a lot.

I thought I could pass all of these "pointers" in a list and use the PyList_GetItem() function in the C code. It works like a charm, except that the values of all elements are not the pointers I usually get when they are passed as function arguments.

For example, if I have:

from numpy import array
from ctypes import py_object

a1 = array([1., 2., 3.8])
a2 = array([222.3, 33.5])

values = [a1, a2]

my_cfunc(py_object(values))

And my C code looks like :

void my_cfunc(PyObject *values)
{
    int i, n;

    n = PyObject_Length(values);
    for(i = 0; i < n; i++)
    {
        unsigned long long *pointer;
        pointer = (unsigned long long *)PyList_GetItem(values, i);
        printf("value 0 : %f\n", *pointer);
    }
}

The printed values are all 0.0000

I have tried a lot of different solutions, using ctypes.byref(), ctypes.pointer(), etc. But I can't seem to be able to retrieve the real pointer values. I even have the impression the values converted by c_void_p() are truncated to 32 bits...

While there is plenty of documentation about passing numpy pointers to C, I haven't seen anything about ctypes values inside a Python list (I admit this may seem strange...).

Any clue ?

This is probably because `PyList_GetItem` returns you a `PyObject*` which is the ndarray itself, to get underlying data you need to apply [`PyArray_DATA`](http://docs.scipy.org/doc/numpy/reference/c-api.array.html#PyArray_DATA) from `numpy.h`. — immerrr, Oct 15 '14 at 17:39
Unfortunately, I can't use Cython since this is one of a few hundreds modules, imported in Python web handler (handler.py/uwsgi). But I'll keep an eye on Cython :-) — dcexcal, Oct 17 '14 at 13:31
@immerrr: Thank you... Your comment put me back on the right track, as I was getting dragged away by wrestling inefficiently with 'ctypes'... — dcexcal, Oct 17 '14 at 13:35

score 6 · Answer 1 · edited Jun 20 '20 at 09:12

After a few hours spent reading many pages of documentation and digging through the numpy include files, I finally understood exactly how it works. Since I spent a great deal of time searching for these explanations, I'm writing them up here to save others the same effort.

To restate the question:

How to transfer a list of numpy arrays, from Python to C

(I also assume you know how to compile, link and import your C module in Python)

Passing a NumPy array from Python to C is rather simple, as long as it is passed as an argument to a C function. You just need to do something like this in Python:

from numpy import array
from ctypes import c_long, c_void_p

values = array([1.0, 2.2, 3.3, 4.4, 5.5])

my_c_func(values.ctypes.data_as(c_void_p), c_long(values.size))

And the C code could look like :

void my_c_func(double *values, long size)
{
    long i;
    for (i = 0; i < size; i++)
        printf("%ld : %.10f\n", i, values[i]);
}
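The pointer-plus-size convention above can be checked from Python alone. The following is only a sketch: it uses the standard library's array.array as a stand-in for an ndarray (an assumption made so that nothing beyond the standard library is needed), and it reads the buffer back through ctypes the way the C function would through its double * argument.

```python
import array
import ctypes

# Stand-in for a NumPy array: a contiguous buffer of C doubles.
values = array.array("d", [1.0, 2.2, 3.3, 4.4, 5.5])

# buffer_info() returns (address, element_count); with NumPy the
# equivalent address would come from values.ctypes.data.
address, size = values.buffer_info()

# What the C side does with `double *values`: reinterpret the raw
# address as a pointer to double and index it.
as_doubles = ctypes.cast(address, ctypes.POINTER(ctypes.c_double))
read_back = [as_doubles[i] for i in range(size)]
print(read_back)
```

The address is only valid while the array object is alive and not resized, which is the same constraint that applies when the pointer is handed to real C code.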

That's simple... but what if I have a variable number of arrays? Of course, I could use one of the techniques that parse the function's argument list (there are many examples on Stack Overflow), but I'd like to do something different.

I'd like to store all my arrays in a list and pass this list to the C function, and let the C code handle all the arrays.

In fact, it's extremely simple, easy and coherent... once you understand how it's done! There is just one fact to remember:

Any member of a list/tuple/dictionary is a Python object... on the C side of the code !

You can't expect to pass a raw pointer directly, as I initially (and wrongly) thought. Once said, it sounds very simple :-) So let's write some Python code:

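from numpy import array
from ctypes import py_object

# Use a list (not a tuple): the C code below calls PyList_GetItem()
my_list = [array([1.0, 2.2, 3.3, 4.4, 5.5]),
           array([2.9, 3.8, 4.7, 5.6])]

# Assumption: my_c_func comes from a library loaded with ctypes.PyDLL,
# so the GIL stays held while the C code calls the Python C API
my_c_func(py_object(my_list))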

Well, you don't need to change anything in the list, but you need to specify that you are passing the list as a PyObject argument.

And here is how all this is accessed in C.

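#include <Python.h>
#include <numpy/arrayobject.h>

void my_c_func(PyObject *list)
{
    Py_ssize_t i, n_arrays;

    // Get the number of elements in the list
    n_arrays = PyObject_Length(list);

    for (i = 0; i < n_arrays; i++)
    {
        PyArrayObject *elem;
        double *pd;

        // PyList_GetItem() returns a borrowed PyObject *
        elem = (PyArrayObject *)PyList_GetItem(list, i);
        // Pointer to the first element of the array's data buffer
        pd = (double *)PyArray_DATA(elem);
        printf("Value 0 : %.10f\n", *pd);
    }
}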

Explanation :

The list is received as a pointer to a PyObject
We get the number of arrays from the list by using the PyObject_Length() function.
PyList_GetItem() always returns a PyObject * (a borrowed reference to the list element, here the ndarray itself)
We retrieve the pointer to the array of data by using the PyArray_DATA() macro.
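The first three steps can be reproduced without compiling anything, by calling the same C API functions through ctypes.pythonapi. This is only an illustrative sketch: plain Python lists stand in for the ndarrays (an assumption, so that NumPy is not required), and the comparison with id() relies on CPython, where id(obj) is the object's address.

```python
import ctypes

# PyDLL handle on the running interpreter; the GIL stays held during calls.
api = ctypes.pythonapi

api.PyObject_Length.argtypes = [ctypes.py_object]
api.PyObject_Length.restype = ctypes.c_ssize_t

api.PyList_GetItem.argtypes = [ctypes.py_object, ctypes.c_ssize_t]
# Ask for the raw address: PyList_GetItem returns a borrowed reference,
# and c_void_p keeps ctypes from touching the reference count.
api.PyList_GetItem.restype = ctypes.c_void_p

my_list = [[1.0, 2.2, 3.3], [2.9, 3.8]]  # plain lists stand in for ndarrays

n_items = api.PyObject_Length(my_list)
addresses = [api.PyList_GetItem(my_list, i) for i in range(n_items)]

# Each returned address is the PyObject * of the element itself,
# not a pointer to its numeric data.
elements_are_objects = all(
    addr == id(obj) for addr, obj in zip(addresses, my_list)
)
print(n_items, elements_are_objects)
```

This is why dereferencing the result of PyList_GetItem() as if it were a double * prints garbage: the pointer leads to the object header, and the data pointer has to be fetched from the object in a further step.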

PyList_GetItem() returns a PyObject *, not a PyArrayObject *, but if you look in Python.h and ndarraytypes.h, you'll find that both types are defined as (I've expanded the macros!):

typedef struct _object {
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
} PyObject;

The PyArrayObject is declared in exactly the same way, so the two are interchangeable at this level. The ob_type member is accessible through both and points to the type object, which contains everything needed to manipulate any generic Python object. I admit that I used one of its members during my investigation: tp_name is a string holding the name of the object's type in clear text, and believe me, it helped! This is how I discovered what each list element actually contained.
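The same investigation can be sketched from Python by overlaying ctypes structures on live objects. The names ObjectHead, TypeHead and type_name_of are hypothetical helpers introduced for this illustration. The layout is assumed to be that of a standard CPython build (not a debug or free-threaded build), and reading object memory this way is for exploration only.

```python
import ctypes


class ObjectHead(ctypes.Structure):
    # Mirrors the PyObject layout quoted above.
    _fields_ = [("ob_refcnt", ctypes.c_ssize_t),
                ("ob_type", ctypes.c_void_p)]


class TypeHead(ctypes.Structure):
    # Start of a type object: variable-size object header, then tp_name.
    _fields_ = [("ob_refcnt", ctypes.c_ssize_t),
                ("ob_type", ctypes.c_void_p),
                ("ob_size", ctypes.c_ssize_t),
                ("tp_name", ctypes.c_char_p)]


def type_name_of(obj):
    """Follow obj -> ob_type -> tp_name the way C code would."""
    head = ObjectHead.from_address(id(obj))
    return TypeHead.from_address(head.ob_type).tp_name.decode()


print(type_name_of([1, 2]), type_name_of(3.5))
```

In C, the equivalent check is printing Py_TYPE(obj)->tp_name for each list element, which shows at once whether an element is an ndarray or something else.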

While these structures don't contain anything else, how is it that we can access the pointer of this ndarray object ? Simply using object macros... which use an extended structure, allowing the compiler to know how to access the additional object's elements, behind the ob_type pointer. The PyArray_DATA() macro is defined as :

#define PyArray_DATA(obj) ((void *)((PyArrayObject_fields *)(obj))->data)

There, it casts the PyArrayObject * to a PyArrayObject_fields *, and this latter structure is simply (simplified and macros expanded!):

typedef struct tagPyArrayObject_fields {
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
    char *data;
    int nd;
    npy_intp *dimensions;
    npy_intp *strides;
    PyObject *base;
    PyArray_Descr *descr;
    int flags;
    PyObject *weakreflist;
} PyArrayObject_fields;

As you can see, the first two elements of the structure are the same as in PyObject and PyArrayObject, but the additional elements can be addressed using this definition. It is tempting to access these elements directly, but that is a bad and dangerous practice which is strongly discouraged. Use the macros instead, and don't rely on the details of these structures. I just thought you might be interested in some of the internals.

Note that all PyArrayObject macros are documented in http://docs.scipy.org/doc/numpy/reference/c-api.array.html

For instance, the size of a PyArrayObject can be obtained using the macro PyArray_SIZE(PyArrayObject *)

Finally, it's very simple and logical, once you know it :-)

In my case I was iterating a list of numpy arrays, and without the `.ctypes.data_as(c_void_p)` it would send the same memory address to C for each element. Thanks for the solution! — Addison Klinke, Jun 03 '21 at 14:22
Hi dcexcal, I am getting the following error. *** stack smashing detected ***: terminated — Prince Patel, Feb 12 '22 at 02:37
Wow, never got this one :-) Are you sure about the values of your pointers in C ? It might be that you are stepping out of your arrays... The compiler uses a 'canary' mechanism to detect wrong access behavior. This could be disabled with the option '-fno-stack-protector'. However, the problem will not go away with this option ! Are you playing a lot with your pointers and accessing locations with your own computation ? This might be the cause. There is a good, and extensive, explanation on : https://stackoverflow.com/questions/1345670/stack-smashing-detected — dcexcal, Feb 14 '22 at 11:09
I have to admit that I haven't played with C/Python interaction for a long time (since 2014, in fact). Python has changed a lot since 2.7 and we are now at 3.11, which is likely to have different bindings. I should probably look at https://nanobind.readthedocs.io/en/latest/why.html for the future. I will certainly work on C/Python bindings in the near future (compaction of multiple lists of numbers), and I'll update this post if needed. — dcexcal, Mar 30 '23 at 19:47
