I have a C++ function that returns a vector of floats. How can I convert this vector to a NumPy array without copying? Currently I do this:

cdef np.ndarray arr = np.ascontiguousarray(cpp_vector, dtype=np.float)
return arr

but this is very slow on large vectors (I assume a copy is being made).

0x1337
  • The problem is: can you ensure that cpp_vector stays alive long enough? Otherwise you will get dangling pointers in the NumPy array. – ead Jan 09 '20 at 14:47
  • related https://stackoverflow.com/a/55959886/5769463 – ead Jan 09 '20 at 14:47
  • Note that the C++ standard does not guarantee that the IEEE 754 format is used, even if that is the case with most (all?) compilers. Shouldn't that be a problem here? – Damien Jan 09 '20 at 14:55
  • Once you have the buffer interface, you can use https://docs.scipy.org/doc/numpy-1.17.0/reference/generated/numpy.frombuffer.html to get a NumPy array without copying. If the buffer interface is too much, you can slightly adapt the memory-nanny approach from the first link to use std::vector. Btw, with std::move (C++11) or std::swap (also C++98) you can transfer ownership of the data in a std::vector. – ead Jan 09 '20 at 21:04
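
As a rough Python-level illustration of the np.frombuffer suggestion: a minimal sketch, assuming an array.array of 32-bit floats as a stand-in for the vector's storage (the names storage, view and copied are made up for this example). It only shows that np.frombuffer wraps existing memory, whereas np.array copies it.

```python
import array

import numpy as np

# Hypothetical stand-in for the vector's storage: four contiguous 32-bit floats.
storage = array.array("f", [1.0, 2.0, 3.0, 4.0])

# np.frombuffer wraps the existing memory through the buffer protocol.
view = np.frombuffer(storage, dtype=np.float32)

# np.array makes an independent copy by default.
copied = np.array(storage, dtype=np.float32)

# A write through the view is visible in the original storage ...
view[0] = 42.0
# ... while a write to the copy is not.
copied[1] = -1.0

print(storage[0], storage[1])
print(np.shares_memory(view, copied))
```

In Cython the same call would need an object that exposes the buffer protocol over the vector's data, which is what the linked answer is about.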

1 Answer

Casting the vector's data pointer to a typed memoryview of floats and handing that to NumPy should do the trick.

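# View the vector's storage as a 1-D contiguous memoryview of C floats (no copy).
# Caution: cpp_vector must outlive arr and every array built on top of it.
cdef float[::1] arr = <float [:cpp_vector.size()]>cpp_vector.data()
return arr

# arr is a typed memoryview. To turn it into a NumPy array (again without copying):
np_arr = np.asarray(arr)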

The [::1] notation declares a contiguous typed memoryview; the Cython documentation on typed memoryviews has more examples. np.asarray then turns the memoryview into a NumPy array (this is covered in another SO answer). The idea is to tell Cython to interpret that memory with a predefined element type, which avoids any copying. Quoting the section of the docs named "Coercion to NumPy":

Memoryview (and array) objects can be coerced to a NumPy ndarray, without having to copy the data. You can e.g. do:

cimport numpy as np
import numpy as np

numpy_array = np.asarray(<np.float_t[:10, :10]> my_pointer)

Of course, you are not restricted to using NumPy’s type (such as np.float_ here), you can use any usable type.

Source: https://cython.readthedocs.io/en/latest/src/userguide/memoryviews.html#coercion-to-numpy
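
Whatever type is used, its size has to match the element type that is actually stored. The following is a minimal sketch of why, assuming a C float is 4 bytes (true on common platforms, but not guaranteed by the standard) and using a hypothetical array.array as a stand-in for the vector's memory. It is a Python-level analogy, not Cython code.

```python
import array

import numpy as np

# Hypothetical stand-in for vector<float>: four values, 4 bytes each.
storage = array.array("f", [1.0, 2.0, 3.0, 4.0])

# Matching element type: four values, read back unchanged.
as_float32 = np.frombuffer(storage, dtype=np.float32)

# Mismatched element type: the same 16 bytes are read as two 8-byte doubles,
# so both the length and the values are wrong.
as_float64 = np.frombuffer(storage, dtype=np.float64)

print(as_float32.size, as_float64.size)
```

Cython tends to reject such a mismatch at compile time rather than silently reinterpreting the bytes, so for a vector of C floats the memoryview type would be float (or np.float32_t) rather than a 64-bit type.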

ibarrond
  • I get `Pointer base type does not match cython.array base type`. I think it is because vector has plain C `float`, but I'm trying to convert it to `np.float_t`. How to fix this? – 0x1337 Jan 09 '20 at 14:52
  • Plain C `float` is probably equivalent to `np.float32` (32 bits), not `np.float_` (64 bits). Play a bit with the output type and you should get it right. – ibarrond Jan 09 '20 at 14:57
  • Note: please modify or comment my answer when you get it right! We all want to learn how to do it. – ibarrond Jan 09 '20 at 15:02
  • I tried `cdef float[::1] arr = cpp_vector.data(); return arr`, it returns `` (it's OK). If I do `return np.asarray(arr)` I get `Segmentation fault`... – 0x1337 Jan 09 '20 at 15:06
  • what about `return np.asarray( cpp_vector)` – ibarrond Jan 09 '20 at 15:09
  • or use `np.asarray` outside of the function (once you have your MemoryView). – ibarrond Jan 09 '20 at 15:16
  • `cdef float[::1] arr = cpp_vector.data()` works for getting MW, but when I wrap it by `np.asarray` (outside) I got Segmentation fault. – 0x1337 Jan 09 '20 at 15:29
  • This has memory management issues - it doesn't link the lifetime of the vector to that of the numpy array – DavidW Jan 09 '20 at 15:47
  • You are right! Any clue on how to solve it? – ibarrond Jan 09 '20 at 15:58
  • @ibarrond The vector needs to be held by a Python object. It could be in a `cdef class` (like in the duplicate I've suggested) or maybe it could be allocated using `new`, held by a [Cython array, then freed with `callback_free_data`](https://cython.readthedocs.io/en/latest/src/userguide/memoryviews.html#cython-arrays) - although I haven't actually tested the latter approach. There's probably other options too – DavidW Jan 09 '20 at 16:04
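
The lifetime point can be illustrated at the Python level: a minimal sketch, assuming a hypothetical Storage class (an array.array subclass) in the role of the Python object that owns the data. When NumPy obtains the memory through the buffer protocol, the resulting array keeps a reference back to the owner; a raw pointer cast gives NumPy no such owner to hold on to.

```python
import array
import weakref

import numpy as np


class Storage(array.array):
    """Hypothetical owner of the data, standing in for a cdef class wrapper."""


def make_view():
    owner = Storage("f", [1.0, 2.0, 3.0])
    # The array is created through the buffer protocol, so it holds a
    # reference chain back to `owner`.
    return np.frombuffer(owner, dtype=np.float32), weakref.ref(owner)


view, owner_ref = make_view()

# `owner` went out of scope inside make_view, yet it is still alive
# because `view` references it.
print(owner_ref() is not None)
print(view.tolist())
```

A `cdef class` that holds the vector and implements the buffer protocol would play the role of Storage here; that part is not shown and would need to be written and tested in Cython.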