I have a simple C++ class that holds a C array together with its shape for convenience:
template <typename T> class GpuBuffer {
public:
    GpuBuffer(mydim4 shape = {1, 1, 1, 1}) : data(0) {
        resize(shape);
    }
    inline operator T*() { return data; }
    inline T* operator ->() { return data; }
    inline T& operator[](const idx_t& at) { return data[at]; }
    // …
    mydim4 shape;
    T *data = 0;
};
using BUF = GpuBuffer<DT_CUDA>;
I am using SWIG to generate Python bindings for this class. That works well, but I am struggling to convert the contained array to a NumPy array.
Currently, I use this code in my flambeau.i:
%pythoncode %{
import numpy

def as_numpy(buf):
    e = numpy.empty(buf.length(), dtype="float32")
    for i in range(0, buf.length()):
        e[i] = buf[i]
    e = e.reshape((buf.shape.batches, buf.shape.channels, buf.shape.height, buf.shape.width))
    return e
%}
I can then call flambeau.as_numpy(buffer) and get a NumPy array back.
It works, but it is, of course, excruciatingly slow, since every single element is copied through a separate Python-level call into the wrapper.
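To make the goal concrete, here is a minimal sketch of what a bulk conversion could look like on the Python side. It rests on several assumptions that are not part of the interface above: that the wrapper can somehow hand out the raw address of `data` as an integer (the `address` parameter is hypothetical), that the memory is readable from the host, and that `DT_CUDA` is a 32-bit float, as the `dtype="float32"` in the loop version suggests. A plain ctypes array stands in for the C++ buffer so the snippet runs without the SWIG module.

```python
import ctypes
import numpy


def as_numpy_from_address(address, shape, copy=True):
    """Build a float32 NumPy array from raw memory starting at `address`.

    `address` is a hypothetical integer pointer value; the SWIG wrapper
    shown above does not expose one. The memory must be host-readable.
    """
    # Reinterpret the integer address as a float* for ctypes.
    ptr = ctypes.cast(address, ctypes.POINTER(ctypes.c_float))
    # Create an array that views the memory directly, without a Python loop.
    view = numpy.ctypeslib.as_array(ptr, shape=tuple(shape))
    # A copy is independent of the lifetime of the underlying buffer;
    # a view is only valid for as long as that memory stays allocated.
    return view.copy() if copy else view


# Stand-in for the C++ buffer: 24 floats holding 0, 1, ..., 23.
backing = (ctypes.c_float * 24)(*range(24))
arr = as_numpy_from_address(ctypes.addressof(backing), (1, 2, 3, 4))
```

Returning a copy by default sidesteps the ownership question: the NumPy array never refers to memory that the C++ object might free or reallocate in `resize`. The `copy=False` variant avoids the extra copy, but it then becomes the caller's job to keep the buffer alive.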
How do I best go about the other direction (copying a NumPy array into a GpuBuffer)? Do I use typemaps? How would I do that? How can I make sure there won't be memory leaks?
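For the NumPy-to-buffer direction, the following is a rough sketch of a single bulk copy, under the same assumptions as before: the destination `address` and its `capacity` in elements are hypothetical values that the wrapper would have to provide, the destination is host-writable, and the element type is a 32-bit float. A ctypes array again stands in for the C++ buffer.

```python
import ctypes
import numpy


def copy_into_address(array, address, capacity):
    """Copy `array` as contiguous float32 data into memory at `address`.

    `address` (integer pointer) and `capacity` (number of elements) are
    hypothetical; the wrapper above exposes neither. Returns the number
    of elements written.
    """
    # Ensure C-contiguous float32 layout; this converts or copies if needed.
    src = numpy.ascontiguousarray(array, dtype=numpy.float32)
    if src.size > capacity:
        raise ValueError("array does not fit into the destination buffer")
    # One bulk copy instead of one Python-level assignment per element.
    ctypes.memmove(address, src.ctypes.data, src.nbytes)
    return src.size


# Stand-in for the destination buffer: room for 6 floats.
dest = (ctypes.c_float * 6)()
written = copy_into_address(numpy.arange(6).reshape(2, 3),
                            ctypes.addressof(dest), 6)
```

Because the data is copied into memory that the C++ object already owns, nothing new is allocated on the C++ side, so this particular path cannot leak. The temporary `src` array is released by Python as usual.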