I am relatively new to binding Python and C++, so I apologize if my approach is not ideal.
I am trying to write a C++ extension for a Python code base, using pybind11 for the bindings. On the C++ side I want to work with std::vector from the standard library, while passing NumPy arrays from the Python side.
My C++ code is the following:
template<typename OffsetT>
void unique_nodes(
    const std::vector<OffsetT> &PC, const std::vector<OffsetT> &PCO,
    std::vector<OffsetT> &PointConnectivity_singular,
    std::vector<OffsetT> &PointConnectivity_singular_O) {
    OffsetT ivert = 0;
    const auto nc = PCO.size()-1;
    auto begin_PC = PC.begin();
    auto end_PC = PC.end();
    for (auto icell = 0; icell < nc; icell++) {
        PointConnectivity_singular_O[icell]=ivert;
        std::vector<OffsetT> Cell (begin_PC+PCO[icell],begin_PC+PCO[icell+1]);
        auto size = RemoveDuplicatesKeepOrder(Cell);
        for (auto &elem:Cell){
            *(PointConnectivity_singular.begin()+ivert) = elem;
            ivert++;
        }
    }
    PointConnectivity_singular.shrink_to_fit();
    PointConnectivity_singular_O[nc]=ivert;
}
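As a reference point for the `shrink_to_fit` call at the end of the function above: in the standard library, `shrink_to_fit` is a non-binding request to reduce `capacity()` to `size()`, and it never changes `size()`. The sketch below illustrates the difference from `resize`, which is what actually drops an unused tail. The helper name `fill_and_trim` is hypothetical and not part of the code above; it only mimics the "oversize, fill, trim" pattern on a standalone vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: fill the first `used` slots of an oversized vector,
// then trim the vector to the slots that were actually written.
std::vector<std::int64_t> fill_and_trim(std::size_t oversize, std::size_t used) {
    assert(used <= oversize);
    std::vector<std::int64_t> out(oversize, 0);
    for (std::size_t i = 0; i < used; ++i) {
        out[i] = static_cast<std::int64_t>(i) + 1;
    }
    out.shrink_to_fit();             // may reduce capacity, never the size
    assert(out.size() == oversize);  // the unused tail is still there
    out.resize(used);                // this is what removes the unused tail
    return out;
}
```

With these assumptions, `fill_and_trim(10, 3)` returns a vector holding only the three written values, whereas after `shrink_to_fit` alone the vector would still have ten elements.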
The RemoveDuplicatesKeepOrder function is taken from this question and is reproduced here:
template<typename T>
size_t RemoveDuplicatesKeepOrder(std::vector<T>& vec)
{
    std::unordered_set<T> seen;
    auto newEnd = std::remove_if(vec.begin(), vec.end(), [&seen](const T& value)
    {
        if (seen.find(value) != std::end(seen))
            return true;
        seen.insert(value);
        return false;
    });
    vec.erase(newEnd, vec.end());
    return vec.size();
}
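To make the behaviour of this helper concrete, here is a minimal self-contained sketch of the same idea. It is a variant, not the original function: the name `dedupe_keep_order` is hypothetical, and it relies on `std::unordered_set::insert` returning a pair whose `second` member is `false` when the value was already present, which avoids the separate `find` call.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical variant of RemoveDuplicatesKeepOrder: keeps the first
// occurrence of each value, preserves the original order, returns the new size.
template <typename T>
std::size_t dedupe_keep_order(std::vector<T>& vec) {
    std::unordered_set<T> seen;
    auto new_end = std::remove_if(vec.begin(), vec.end(),
        [&seen](const T& value) {
            // insert().second is false when `value` was already in the set,
            // so the element is flagged for removal only on repeats.
            return !seen.insert(value).second;
        });
    vec.erase(new_end, vec.end());
    return vec.size();
}
```

For example, applying it to a vector holding `3, 1, 3, 2, 1` leaves `3, 1, 2` and returns 3.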
I do the binding in the "classical" way, as described in the pybind11 documentation:
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>
#include <pybind11/numpy.h>
#include "parsing_nodes.h"
#include "node_to_cells_Types.h"

namespace py = pybind11;

PYBIND11_MODULE(parsing_nodes, m) {
    m.doc() = "Parsing nodes from the NGon and NFaces";
    m.def(
        "RemoveDuplicatesKeepOrder",
        [](std::vector<n2c_OffsetT> &vec) {
            RemoveDuplicatesKeepOrder<n2c_OffsetT>(vec);
            return std::make_tuple(vec);
        });
    m.def(
        "unique_nodes",
        [](const std::vector<n2c_OffsetT> &PC, const std::vector<n2c_OffsetT> &PCO,
           std::vector<n2c_OffsetT> &PointConnectivity_singular,
           std::vector<n2c_OffsetT> &PointConnectivity_singular_O) {
            unique_nodes<n2c_OffsetT>(PC,PCO,PointConnectivity_singular,PointConnectivity_singular_O);
            return std::make_tuple(PointConnectivity_singular,PointConnectivity_singular_O);
        });
}
The compilation passes smoothly (using CMake and gcc). I then call the method from the Python side like this (oversizing my NumPy arrays, given that I apply shrink_to_fit afterwards):
import numpy as np
from parsing_nodes import unique_nodes

nvert = 5000
nb_cells = 10000
PointConnectivity = np.zeros((nvert*12, ), dtype=np.int64)
PointConnectivityO = np.zeros((nb_cells+1, ), dtype=np.int64)
PointConnectivity_singular = np.zeros((4*nvert, ), dtype=np.int64)
PointConnectivity_singular_O = np.zeros((nb_cells+1, ), dtype=np.int64)
PCU, PCOU = unique_nodes(PC, PCO, PointConnectivity_singular, PointConnectivity_singular_O)
I obtain a segfault at a certain point. In particular, the error seems related to the size of the vectors:
Error in `python': corrupted size vs. prev_size: 0x0000000003631960
So my questions are:
- Am I doing the binding correctly (in terms of the data structures, NumPy arrays on one side and std::vector on the other)?
- Is there a way to avoid the copy in memory (that is, an opaque way to modify the NumPy array in place)? I have seen the buffer protocol in the documentation, but I have not understood how to apply it to my case, where I want to use standard library methods on my vectors (which are a different data structure from NumPy arrays).
- Is the shrink_to_fit call valid here, or does it invalidate the pointer in some way, so that it should be avoided?
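To make the second question more concrete, the following is a minimal sketch of what a copy-free variant could look like on the C++ side, assuming the data arrive as contiguous 1-D 64-bit integer buffers described by a pointer and a length. The function name `unique_nodes_raw` and its signature are hypothetical, and how the pointers would be obtained from the NumPy arrays is deliberately left out. The explicit bounds checks are an addition: writing past the end of the output would be one possible source of heap corruption such as the error quoted above, although that is not confirmed here.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <unordered_set>

// Hypothetical pointer-based variant of unique_nodes. Each (pointer, length)
// pair stands in for a contiguous 1-D int64 buffer.
// Returns the number of entries written to `out`.
std::size_t unique_nodes_raw(const std::int64_t* pc, std::size_t pc_len,
                             const std::int64_t* pco, std::size_t pco_len,
                             std::int64_t* out, std::size_t out_len,
                             std::int64_t* out_offsets, std::size_t out_offsets_len) {
    if (pco_len == 0 || out_offsets_len < pco_len) {
        throw std::invalid_argument("offset arrays are too small");
    }
    const std::size_t nc = pco_len - 1;  // number of cells
    std::size_t ivert = 0;               // next free slot in `out`
    std::unordered_set<std::int64_t> seen;
    for (std::size_t icell = 0; icell < nc; ++icell) {
        out_offsets[icell] = static_cast<std::int64_t>(ivert);
        seen.clear();  // duplicates are removed per cell, not globally
        const auto first = static_cast<std::size_t>(pco[icell]);
        const auto last = static_cast<std::size_t>(pco[icell + 1]);
        if (first > last || last > pc_len) {
            throw std::out_of_range("inconsistent offsets in PCO");
        }
        for (std::size_t k = first; k < last; ++k) {
            if (seen.insert(pc[k]).second) {  // first occurrence in this cell
                if (ivert >= out_len) {
                    throw std::out_of_range("output buffer too small");
                }
                out[ivert++] = pc[k];
            }
        }
    }
    out_offsets[nc] = static_cast<std::int64_t>(ivert);
    return ivert;
}
```

Because nothing is resized in this sketch, the caller would use the returned count to decide how much of the oversized output is meaningful, instead of relying on shrink_to_fit.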