duality of vector-numpy array operation between C++ and Python

Question

I am relatively new to binding Python and C++, so I apologize if my practice is not ideal. I am trying to write a C++ extension for some Python code, using pybind11 for the binding. On the C++ side I want to use std::vector from the standard library, while passing NumPy arrays from the Python side.

My piece of code in C++ is the following:

template<typename OffsetT> void unique_nodes(
                const std::vector<OffsetT> &PC, const std::vector<OffsetT> &PCO,
                std::vector<OffsetT> &PointConnectivity_singular, std::vector<OffsetT> &PointConnectivity_singular_O){
    OffsetT ivert = 0;
    const auto nc = PCO.size()-1;
    auto begin_PC = PC.begin();
    auto end_PC = PC.end();
    for (auto icell = 0; icell < nc; icell++) {
        PointConnectivity_singular_O[icell]=ivert;
        std::vector<OffsetT> Cell (begin_PC+PCO[icell],begin_PC+PCO[icell+1]);
        auto size = RemoveDuplicatesKeepOrder(Cell);
        for (auto &elem:Cell){
                *(PointConnectivity_singular.begin()+ivert) = elem;
                ivert++;
        }
    }
    PointConnectivity_singular.shrink_to_fit();
    PointConnectivity_singular_O[nc]=ivert;
}

The RemoveDuplicatesKeepOrder function is taken from this question and reproduced here:

template<typename T>
size_t RemoveDuplicatesKeepOrder(std::vector<T>& vec)
{
    std::unordered_set<T> seen;

    auto newEnd = std::remove_if(vec.begin(), vec.end(), [&seen](const T& value)
    {
        if (seen.find(value) != std::end(seen))
            return true;

        seen.insert(value);
        return false;
    });

    vec.erase(newEnd, vec.end());

    return vec.size();
}
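
To make the helper's behaviour concrete, here is a minimal standalone check. The helper is copied as quoted above; the function name dedup_example and its input values are hypothetical and only serve as an illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Helper as quoted in the question: drops repeated values, keeps first occurrences in order.
template<typename T>
std::size_t RemoveDuplicatesKeepOrder(std::vector<T>& vec)
{
    std::unordered_set<T> seen;

    auto newEnd = std::remove_if(vec.begin(), vec.end(), [&seen](const T& value)
    {
        if (seen.find(value) != std::end(seen))
            return true;   // already seen: mark for removal

        seen.insert(value);
        return false;      // first occurrence: keep
    });

    vec.erase(newEnd, vec.end());

    return vec.size();
}

// Hypothetical check (not part of the original post): one cell with repeated nodes.
inline std::vector<std::int64_t> dedup_example()
{
    std::vector<std::int64_t> cell = {3, 1, 3, 2, 1};
    const std::size_t n = RemoveDuplicatesKeepOrder(cell);
    assert(n == cell.size());  // the returned value is the new size
    return cell;               // first occurrences only, original order kept
}
```

The vector is shortened in place, so the caller sees both the new content and the new size.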

I do the binding in the "classical" way, as described in the pybind11 documentation:

#include <pybind11/pybind11.h>
#include <pybind11/stl.h>
#include <pybind11/numpy.h>

#include "parsing_nodes.h"
#include "node_to_cells_Types.h"

namespace py = pybind11;

PYBIND11_MODULE(parsing_nodes, m) {
  m.doc() = "Parsing nodes from the NGon and NFaces";
  m.def(
      "RemoveDuplicatesKeepOrder",
      [](std::vector<n2c_OffsetT> &vec) {
        RemoveDuplicatesKeepOrder<n2c_OffsetT>(vec);
        return std::make_tuple(vec);
      });
  m.def("unique_nodes",
      []( const std::vector<n2c_OffsetT> &PC, const std::vector<n2c_OffsetT> &PCO,
                std::vector<n2c_OffsetT> &PointConnectivity_singular, std::vector<n2c_OffsetT> &PointConnectivity_singular_O
              ) {unique_nodes<n2c_OffsetT>(PC,PCO,PointConnectivity_singular,PointConnectivity_singular_O);
        return std::make_tuple(PointConnectivity_singular,PointConnectivity_singular_O);
      });
}

The compilation passes smoothly (using CMake and gcc). I then call the function from the Python side like this (oversizing my NumPy arrays, since I apply shrink_to_fit afterwards):

import numpy as np
from parsing_nodes import unique_nodes

nvert = 5000
nb_cells = 10000
PointConnectivity = np.zeros((nvert*12, ), dtype=np.int64)
PointConnectivityO = np.zeros((nb_cells+1, ), dtype=np.int64)
# Output buffers, deliberately oversized
PointConnectivity_singular = np.zeros((4*nvert, ), dtype=np.int64)
PointConnectivity_singular_O = np.zeros((nb_cells+1, ), dtype=np.int64)
PCU, PCOU = unique_nodes(PointConnectivity, PointConnectivityO, PointConnectivity_singular, PointConnectivity_singular_O)

The program crashes at some point, with a memory error that appears to be related to the size of the vectors:

 Error in `python': corrupted size vs. prev_size: 0x0000000003631960
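
A message like this usually points to a write outside an allocated block. In unique_nodes, the write through `*(PointConnectivity_singular.begin()+ivert)` is not bounds-checked, so an output vector that is too small would corrupt the heap silently. The following is a minimal debugging sketch, not the original function: the name unique_nodes_checked, the helper overflow_is_detected, and the sample data are all hypothetical. It replaces the unchecked writes with at(), which throws std::out_of_range instead.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <unordered_set>
#include <vector>

// Hypothetical variant of unique_nodes: same idea, but every write is bounds-checked.
template<typename OffsetT>
void unique_nodes_checked(const std::vector<OffsetT>& PC,
                          const std::vector<OffsetT>& PCO,
                          std::vector<OffsetT>& out,
                          std::vector<OffsetT>& out_offsets)
{
    if (PCO.empty())
        throw std::invalid_argument("PCO must hold at least one offset");

    const std::size_t nc = PCO.size() - 1;
    std::size_t ivert = 0;
    for (std::size_t icell = 0; icell < nc; ++icell) {
        out_offsets.at(icell) = static_cast<OffsetT>(ivert);
        const auto first = static_cast<std::size_t>(PCO[icell]);
        const auto last  = static_cast<std::size_t>(PCO[icell + 1]);
        assert(first <= last);  // offsets are assumed to be non-decreasing

        std::unordered_set<OffsetT> seen;  // nodes already emitted for this cell
        for (std::size_t i = first; i < last; ++i) {
            const OffsetT node = PC.at(i);
            if (seen.insert(node).second) {  // true only for a first occurrence
                out.at(ivert) = node;        // throws if 'out' is too small
                ++ivert;
            }
        }
    }
    out_offsets.at(nc) = static_cast<OffsetT>(ivert);
}

// Hypothetical test data: 3 cells with 6 unique entries in total, output sized for only 4.
inline bool overflow_is_detected()
{
    const std::vector<long long> PC  = {1, 2, 2, 3, 4, 4, 4, 5, 6};
    const std::vector<long long> PCO = {0, 4, 7, 9};
    std::vector<long long> out(4);
    std::vector<long long> off(PCO.size());
    try {
        unique_nodes_checked(PC, PCO, out, off);
    } catch (const std::out_of_range&) {
        return true;   // the overflow is reported instead of corrupting memory
    }
    return false;
}
```

If the checked version throws on the real data, the output buffer (4*nvert entries here, against nvert*12 input entries) is too small for the number of unique nodes.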

So my questions are:

Am I doing the binding correctly (in terms of the NumPy and std data structures)?
Is there a way to avoid the copy in memory (so that the NumPy array is modified in place, in an opaque way)? I have seen the buffer protocol in the documentation, but I have not understood how to apply it to my case, where I want to use standard library methods on my vectors (which are a different data structure from a NumPy array).
Is shrink_to_fit allowed here, or does it invalidate the pointers in some way, so that it should be avoided?
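
Some background on the last two points. With pybind11/stl.h, a NumPy array passed to a std::vector parameter is converted by copying, so the C++ code works on a temporary vector and not on the NumPy buffer. Separately, shrink_to_fit is only a non-binding request to release unused capacity: it never changes size(), and it may reallocate, which invalidates iterators and pointers into the vector. One possible alternative is sketched below, under the assumption that returning new vectors is acceptable. The name unique_nodes_owned is hypothetical. The outputs grow with push_back, so the caller does not have to guess their size.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical alternative to unique_nodes: the function owns and returns its outputs.
template<typename OffsetT>
std::pair<std::vector<OffsetT>, std::vector<OffsetT>>
unique_nodes_owned(const std::vector<OffsetT>& PC, const std::vector<OffsetT>& PCO)
{
    std::vector<OffsetT> conn;     // de-duplicated connectivity
    std::vector<OffsetT> offsets;  // start of each cell in 'conn', plus a final end marker
    if (PCO.empty())
        return std::make_pair(std::move(conn), std::move(offsets));

    const std::size_t nc = PCO.size() - 1;
    conn.reserve(PC.size());       // upper bound: no duplicates at all
    offsets.reserve(nc + 1);

    for (std::size_t icell = 0; icell < nc; ++icell) {
        offsets.push_back(static_cast<OffsetT>(conn.size()));
        const auto first = static_cast<std::size_t>(PCO[icell]);
        const auto last  = static_cast<std::size_t>(PCO[icell + 1]);
        assert(first <= last && last <= PC.size());  // offsets assumed valid

        std::unordered_set<OffsetT> seen;  // nodes already emitted for this cell
        for (std::size_t i = first; i < last; ++i) {
            if (seen.insert(PC[i]).second)   // true only for a first occurrence
                conn.push_back(PC[i]);
        }
    }
    offsets.push_back(static_cast<OffsetT>(conn.size()));

    // Requests that the capacity reserved above be trimmed; size() stays the same.
    conn.shrink_to_fit();
    return std::make_pair(std::move(conn), std::move(offsets));
}
```

In this form shrink_to_fit is harmless, because no iterator or pointer into conn is kept across the call, and the returned vectors have exactly as many entries as were written.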
