
I am trying to pass a Cython NumPy array to a C struct. This works sometimes, but crashes if the array becomes too 'large'. It is not actually that large (fewer than 100k double entries), and I cannot figure out where this limit comes from. A few code snippets (full code attached).

Building on Win7, Cython 0.28.4, Python 3.6, mingw64

I tried this with different array lengths. The latest thing I discovered is that it always crashes if the array length is larger than 2**16 - 512 entries, but I have no clue why.

The Cython file:

# cy_file.pyx
import numpy as np
cimport numpy
# ...
cdef public struct Container:
    double* np_array
# ...
cdef public void create_np_array(Container *container):
    cdef numpy.ndarray[numpy.float_t, ndim=1, mode='c'] np_array
    # just create a NumPy array
    longarray = np.linspace(1.0, 5.0, 100000)
    np_array = np.ascontiguousarray(np.array(longarray), dtype=float)
    container.np_array = <double*> np_array.data

And the C file:

//c_file.c
#include "Python.h"
#include "cy_file.h"

struct Container container;
//...
Py_Initialize();
PyInit_cy_file();
// call the Cython function that fills up the NumPy array
create_np_array(&container);
// shut down the Python interpreter
Py_Finalize();

// *** here comes the crash if the array longarray is 'too long' ***
double first = container.np_array[0];

Can anyone give me a hint about what goes wrong here, or how to debug it?

Thanks and cheers, Tim


2 Answers


The memory is owned by the NumPy array and is freed when the array's reference count drops to zero, which most likely happens at the end of create_np_array. Failing that, Py_Finalize() attempts to free all remaining Python objects.

Your attempt to access this memory is always invalid - the fact that it only fails for arrays of certain sizes is just "luck".

There isn't any one good solution, but here are some suggestions:

  1. Use Py_INCREF to manually increase the reference count of the owning NumPy array (and then decrease it again manually when you are done with the memory it holds) so that it is not destroyed at the end of the function. Make sure all your memory accesses happen before Py_Finalize (see the first sketch after this list).

  2. Handle the memory allocation in your C code and use PyArray_SimpleNewFromData to create a NumPy array that does not own its data (see the second sketch after this list). You must be careful that the NumPy array does not outlive the memory.
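
A minimal Cython sketch of option 1, assuming the same Container struct and array as in the question; the matching Py_DECREF is not shown and must be issued once the C side is done with the buffer, before Py_Finalize:

# cy_file.pyx -- option 1: keep the array alive with an extra reference
import numpy as np
cimport numpy
from cpython.ref cimport Py_INCREF

cdef public void create_np_array(Container *container):
    cdef numpy.ndarray[numpy.float_t, ndim=1, mode='c'] np_array
    np_array = np.ascontiguousarray(np.linspace(1.0, 5.0, 100000), dtype=float)
    Py_INCREF(np_array)  # refcount stays above zero after this function
                         # returns, so the buffer is not freed behind C's back
    container.np_array = <double*> np_array.data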
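
And a minimal sketch of option 2, where the buffer is malloc'd on the C side and only wrapped, not owned, by NumPy; the function name fill_np_array and its signature are made up for illustration:

# cy_file.pyx -- option 2: C owns the buffer, NumPy merely wraps it
import numpy as np
cimport numpy as cnp

cnp.import_array()  # must run before any NumPy C-API call

cdef public void fill_np_array(double* data, cnp.npy_intp n):
    # view onto the C-owned buffer: no copy, no ownership
    cdef cnp.ndarray arr = cnp.PyArray_SimpleNewFromData(
        1, &n, cnp.NPY_FLOAT64, <void*> data)
    arr[:] = np.linspace(1.0, 5.0, n)  # fill the buffer through the view

On the C side you would malloc(n * sizeof(double)) before the call and free the buffer only after the last access; the hazard then inverts, since the NumPy view must not be used after the free.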

– DavidW
  • Thanks for pointing towards the right direction for me. I found some additional discussions on stackoverflow regarding `PyArray_SimpleNewFromData`, this one was particularly helpful: [link](https://stackoverflow.com/questions/27829946/extend-python-with-c-return-numpy-array-gives-garbage) – ToolTim Jan 17 '19 at 08:10
  • Another comment with helpful insight to the topic can be found here: [link](http://gael-varoquaux.info/programming/cython-example-of-exposing-c-computed-arrays-in-python-without-data-copies.html) – ToolTim Mar 06 '19 at 16:49

OK, I finally found the answer with the helpful hints:

The idea is best described in a blog post, including a complete code sample on GitHub (the post linked in the comments above), so I just refer to that post here; it answers the question in great detail. A short sketch of the idea follows.
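
In short, the pattern is: allocate in C, wrap the buffer with PyArray_SimpleNewFromData, and attach a small wrapper object as the array's base so the buffer is freed when the array is garbage-collected. A minimal sketch under those assumptions (all names here are illustrative, not taken from the post, and cnp.set_array_base is assumed to be available in your Cython version):

# cy_expose.pyx -- illustrative sketch of the pattern from the linked post
cimport numpy as cnp
from libc.stdlib cimport free

cnp.import_array()

cdef class ArrayWrapper:
    # frees the C buffer once the last NumPy view of it is gone
    cdef void* ptr
    def __dealloc__(self):
        free(self.ptr)

cdef object wrap_c_buffer(double* data, cnp.npy_intp n):
    cdef cnp.ndarray arr = cnp.PyArray_SimpleNewFromData(
        1, &n, cnp.NPY_FLOAT64, <void*> data)
    cdef ArrayWrapper wrapper = ArrayWrapper()
    wrapper.ptr = <void*> data
    cnp.set_array_base(arr, wrapper)  # the array keeps the wrapper alive
    return arr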

Thanks!

– ToolTim
  • This is a [link-only answer](https://meta.stackexchange.com/q/225370/284827) at the moment. Could you edit it in such a way that if the link dies, your answer is still useful for others? – Wai Ha Lee Mar 06 '19 at 17:03