
In Cython, how do I create an ndarray object with defined properties without allocating memory for its contents?

My problem is that I want to call a function that requires an ndarray, but my data is in a plain C array. Due to some restrictions I cannot switch to using an ndarray directly.

Code segment to illustrate my intention:

cdef:
    ndarray[npy_uint64] tmp_buffer
    uint64_t * my_buffer

tmp_buffer = np.empty(my_buffer_size, dtype='uint64')

my_buffer = <uint64_t *> malloc(my_buffer_size * sizeof(uint64_t))
(... do something with my_buffer that cannot be done with a ndarray ...)

tmp_buffer.data = my_buffer
some_func(tmp_buffer)

This seems inefficient, since memory is allocated and zero-filled for tmp_buffer but never used. How do I avoid this?

ARF
  • It's worse than inefficient, it's a memory leak. Numpy will try to free the memory it allocated by calling `free(tmp_buffer.data)`, but of course that'll free my_buffer instead. – Bi Rico Mar 04 '15 at 22:55
  • you can create a Cython memory view and return it as a Numpy array doing `return np.asarray(mymemoryview)` – Saullo G. P. Castro Mar 04 '15 at 23:18
  • I encountered that memory deallocation issue. If I included a `free(my_buffer)` statement I got a 'double free' error message. – hpaulj Mar 04 '15 at 23:33

1 Answer


Efficiency aside, does this sort of assignment compile?

np.empty does not zero-fill its buffer. np.zeros does, and even then the zeroing is done 'on the fly'.

"Why the performance difference between numpy.zeros and numpy.zeros_like?" explores how empty, zeros and zeros_like are implemented.

I'm just a beginner with Cython, but to get that assignment to compile I have to cast the pointer:

tmp_buffer.data = <char *>my_buffer

How about going the other way, making my_buffer the allocated data of tmp_buffer?

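# assumes both `import numpy as np` and `cimport numpy as np`
cdef np.ndarray array1 = np.empty(bsize, dtype=np.intc)  # np.intc matches C int
cdef int *data = <int *> array1.data  # NumPy still owns this memory
cdef Py_ssize_t i
for i in range(bsize):
    data[i] = bsize - i  # write through the raw pointer (np.empty leaves it uninitialized)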

http://gael-varoquaux.info/programming/cython-example-of-exposing-c-computed-arrays-in-python-without-data-copies.html suggests using np.PyArray_SimpleNewFromData to create an array from an existing data buffer.
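A minimal sketch of that approach, in case it helps. It assumes NumPy's C API is cimported under the alias `cnp`, and the function name `wrap_buffer_demo` is made up for illustration. The array created this way does not own the memory, so NumPy will not free it; the caller has to keep the buffer alive for as long as the array is in use and free it afterwards.

```cython
import numpy as np
cimport numpy as cnp
from libc.stdint cimport uint64_t
from libc.stdlib cimport malloc, free

cnp.import_array()  # must be called once before using the NumPy C API

def wrap_buffer_demo(Py_ssize_t n):
    """Hypothetical helper: wrap a malloc'ed buffer as an ndarray, no copy."""
    cdef cnp.npy_intp dims[1]
    cdef uint64_t *my_buffer
    cdef Py_ssize_t i

    my_buffer = <uint64_t *> malloc(n * sizeof(uint64_t))
    if my_buffer == NULL:
        raise MemoryError()
    try:
        # Work on the raw C buffer first.
        for i in range(n):
            my_buffer[i] = i

        # Create an ndarray header that points at my_buffer.
        # No data is allocated or copied; the array does not own the memory.
        dims[0] = n
        arr = cnp.PyArray_SimpleNewFromData(1, dims, cnp.NPY_UINT64,
                                            <void *> my_buffer)

        # Stand-in for some_func(tmp_buffer) from the question.
        return int(arr.sum())
    finally:
        # arr must not be used after this point.
        free(my_buffer)
```

If the array has to outlive the function, the buffer's lifetime needs to be tied to the array in some other way (the linked article does this with a small owner class), which this sketch deliberately leaves out.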

For memoryviews, see http://docs.cython.org/src/userguide/memoryviews.html
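A rough sketch of the memoryview route mentioned in the comments, under the assumption that the buffer is contiguous and at least one element long. The name `memoryview_demo` is made up for illustration. Casting the pointer to `<uint64_t[:n]>` gives a Cython array that points at the existing memory, and `np.asarray` exposes it to NumPy without copying.

```cython
import numpy as np
from libc.stdint cimport uint64_t
from libc.stdlib cimport malloc, free

def memoryview_demo(Py_ssize_t n):
    """Hypothetical helper: view a malloc'ed buffer through a typed memoryview."""
    cdef uint64_t *my_buffer
    cdef uint64_t[::1] view
    cdef Py_ssize_t i

    if n <= 0:
        raise ValueError("n must be positive")
    my_buffer = <uint64_t *> malloc(n * sizeof(uint64_t))
    if my_buffer == NULL:
        raise MemoryError()
    try:
        for i in range(n):
            my_buffer[i] = 2 * i

        # Pointer-to-array coercion: the view refers to my_buffer directly
        # and, by default, does not free it.
        view = <uint64_t[:n]> my_buffer

        # np.asarray goes through the buffer protocol, so no data is copied.
        arr = np.asarray(view)

        # Stand-in for some_func(tmp_buffer) from the question.
        return int(arr.sum())
    finally:
        # Neither view nor arr may be used after this point.
        free(my_buffer)
```

As with any non-owning wrapper, the C buffer has to stay valid for as long as the view or the array is in use.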

hpaulj