After a few hours spent reading many pages of documentation and digging in numpy include files, I've finally managed to understand exactly how it works. Since I've spent a great amount of time searching for these exact explanations, I'm providing the following text as a way to avoid anyone to waste its time.
I repeat the question :
How to transfer a list of numpy arrays, from Python to C
(I also assume you know how to compile, link and import your C module in Python)
Passing a Numpy array from Python to C is rather simple, as long as it's going to be passed as an argument in a C function. You just need to do something like this in Python
from numpy import array
from ctypes import c_long
values = array([1.0, 2.2, 3.3, 4.4, 5.5])
my_c_func(values.ctypes.data_as(c_void_p), c_long(values.size))
And the C code could look like :
void my_c_func(double *value, long size)
{
int i;
for (i = 0; i < size; i++)
printf("%ld : %.10f\n", i, values[i]);
}
That's simple... but what if I have a variable number of arrays ? Of course, I could use the techniques which parses the function's argument list (many examples in Stackoverflow), but I'd like to do something different.
I'd like to store all my arrays in a list and pass this list to the C function, and let the C code handle all the arrays.
In fact, it's extremely simple, easy et coherent... once you understand how it's done ! There is simply one very simple fact to remember :
Any member of a list/tuple/dictionary is a Python object... on the C side of the code !
You can't expect to directly pass a pointer as I initially, and wrongly, thought. Once said, it sounds very simple :-) Though, let's write some Python code :
from numpy import array
my_list = (array([1.0, 2.2, 3.3, 4.4, 5.5]),
array([2.9, 3.8. 4.7, 5.6]))
my_c_func(py_object(my_list))
Well, you don't need to change anything in the list, but you need to specify that you are passing the list as a PyObject argument.
And here is the how all this is being accessed in C.
void my_c_func(PyObject *list)
{
int i, n_arrays;
// Get the number of elements in the list
n_arrays = PyObject_Length(list);
for (i = 0; i LT n_arrays; i++)
{
PyArrayObject *elem;
double *pd;
elem = PyList_GetItem(list,
i);
pd = PyArray_DATA(elem);
printf("Value 0 : %.10f\n", *pd);
}
}
Explanation :
- The list is received as a pointer to a PyObject
- We get the number of array from the list by using the PyObject_Length() function.
PyList_GetItem()
always return a PyObject (in fact a void *
)
- We retrieve the pointer to the array of data by using the
PyArray_DATA()
macro.
Normally, PyList_GetItem()
returns a PyObject *, but, if you look in the Python.h and ndarraytypes.h, you'll find that they are both defined as (I've expanded the macros !):
typedef struct _object {
Py_ssize_t ob_refcnt;
struct _typeobject *ob_type;
} PyObject;
And the PyArrayObject... is exactly the same. Though, it's perfectly interchangeable at this level. The content of ob_type is accessible for both objects and contain everything which is needed to manipulate any generic Python object. I admit that I've used one of its member during my investigations. The struct member tp_name is the string containing the name of the object... in clear text; and believe me, it helped ! This is how I discovered what each list element was containing.
While these structures don't contain anything else, how is it that we can access the pointer of this ndarray object ? Simply using object macros... which use an extended structure, allowing the compiler to know how to access the additional object's elements, behind the ob_type pointer. The PyArray_DATA()
macro is defined as :
#define PyArray_DATA(obj) ((void *)((PyArrayObject_fields *)(obj))->data)
There, it's casting the PyArayObject *
as a PyArrayObject_fields *
and this latest structure is simply (simplified and macros expanded !) :
typedef struct tagPyArrayObject_fields {
Py_ssize_t ob_refcnt;
struct _typeobject *ob_type;
char *data;
int nd;
npy_intp *dimensions;
npy_intp *strides;
PyObject *base;
PyArray_Descr *descr;
int flags;
PyObject *weakreflist;
} PyArrayObject_fields;
As you can see, the first two element of the structure are the same as a PyObject and PyArrayObject, but additional elements can be addressed using this definition. It is tempting to directly access these elements, but it's a very bad and dangerous practice which is more than strongly discouraged. You must rather use the macros and don't bother with the details and elements in all these structures. I just thought you might be interested by some internals.
Note that all PyArrayObject macros are documented in http://docs.scipy.org/doc/numpy/reference/c-api.array.html
For instance, the size of a PyArrayObject can be obtained using the macro PyArray_SIZE(PyArrayObject *)
Finally, it's very simple and logical, once you know it :-)