1

I wrote a little C++ Script and used pybind11 to make the C++ function available in python. When called from python, the C++ function takes about 4 seconds to terminate. The C++ functions returns a large array of length 54.346.383. Out of curiosity, I modified the C++ function and returned a different array of length 7373 without changing anything else in the code. Now the C++ function terminates in 1 second. So as I understand this the transfer of an object from C++ to Python becomes a huge bottleneck as size of the object increases.

Is there a smarter approach to handle this issue? Maybe working with pointers? (I am completely new to C++ and pybind11)

#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
#include <pybind11/stl.h>
#include <vector>
#include <numeric>

namespace py = pybind11;

std::vector<double> isoCdf_seq(std::vector<double> array_w, std::vector<double> W, std::vector<double>  Y, std::vector<int>  posY, std::vector<double>  array_y) {

std::vector<double> CDF;
CDF.reserve(m * mY);

// some code

return CDF;

evva
  • 11
  • 2

1 Answers1

3

It is constructing a Python list of floats which has a lot of overhead. I suggest using a NumPy array on the Python side, which is explained here: returning numpy arrays via pybind11

That way, you can allocate the array memory once, and Python can reference it as a NumPy array without allocating 54 million tiny objects and references to them.

John Zwinck
  • 239,568
  • 38
  • 324
  • 436
  • Perfect and one quick follow up question: How about the other way around, when using numpy arrays from python as funtion input in C++. Can this also result in a lot of overhead? Is it for example faster to transfer C-contiguous numpy array instead of regular numpy arrays? (In C++ I then transform np array to a C++ object using std::memcpy) – evva Aug 22 '20 at 16:50
  • @evva: You should try to pass the NumPy array from Python to C++ and then access its memory directly, memcpying it (or copying into a vector, same thing) only if you really need a copy. Whether C++ receives the array as the full NumPy object, a Python buffer or memoryview, or a pointer to the data plus the size, anything that avoids serializing it into a new storage format will help. Still, nothing will be as slow as serializing into a Python list. – John Zwinck Aug 23 '20 at 06:06