0

I usually code in python, but due to combinatorial explosion, I decided to use C++ for a problem. My idea is to save numpy array in a npy file, then I will feed my C++ function that npy file. Since I have limited understanding of C++ pointers and reference, I can't wrap my head around how to get the value of a void function. A void function is not returning anything, so I can't use usual assignment operation. How to assign the new (&mat_out) to my desired variable say A. Or another way of doing it by returning the value (like the second commented function), but in this approach I'm having type mismatch. My main question is how to make either one of the approach work. But for knowledge perspective, I'd also like to know how to make use of the new object created within a void function.

#include <iostream>
#include <Eigen/Dense>
#include "cnpy.h"
#include "opencv2/highgui.hpp"

using namespace std;
using namespace Eigen;
using namespace cv;

void cnpy2eigen(string data_fname, Eigen::MatrixXd& mat_out){
    cnpy::NpyArray npy_data = cnpy::npy_load(data_fname);
    // double* ptr = npy_data.data<double>();
    int data_row = npy_data.shape[0];
    int data_col = npy_data.shape[1];
    double* ptr = static_cast<double *>(malloc(data_row * data_col * sizeof(double)));
    memcpy(ptr, npy_data.data<double>(), data_row * data_col * sizeof(double));
    cv::Mat dmat = cv::Mat(cv::Size(data_col, data_row), CV_64F, ptr); // CV_64F is equivalent double
    new (&mat_out) Eigen::Map<Eigen::Matrix<double,Eigen::Dynamic,Eigen::Dynamic>>(reinterpret_cast<double *>(dmat.data), data_col, data_row);
}

/*
Eigen::Map<Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic> > * cnpy2eigen(string data_fname){
    cnpy::NpyArray npy_data = cnpy::npy_load(data_fname);
    // double* ptr = npy_data.data<double>();
    int data_row = npy_data.shape[0];
    int data_col = npy_data.shape[1];
    double* ptr = static_cast<double *>(malloc(data_row * data_col * sizeof(double)));
    memcpy(ptr, npy_data.data<double>(), data_row * data_col * sizeof(double));
    cv::Mat dmat = cv::Mat(cv::Size(data_col, data_row), CV_64F, ptr); // CV_64F is equivalent double
    return new (&mat_out) Eigen::Map<Eigen::Matrix<double,Eigen::Dynamic,Eigen::Dynamic>>(reinterpret_cast<double *>(dmat.data), data_col, data_row);
}
*/

int main()
{
   Eigen::MatrixXd& A = cnpy2eigen("/Users/osmanmamun/Play_Ground/C++_practice/bayesframe/X_data.npy", Eigen::MatrixXd& A);
   cout << "Here is the matrix A:\n" << A << endl;
}

Disclaimer: I got the cnpy2eigen function from a blog.

Osman Mamun
  • 2,864
  • 1
  • 16
  • 22
  • 2
    `new (&mat_out) Eigen::Map<...>(...);` should be `mat_out = Eigen::Map<...>(...);` and `Eigen::Map<...> * cnpy2eigen(string data_fname){ ... return new (&mat_out) Eigen::Map<...>(...); }` should be `Eigen::Map<...> cnpy2eigen(string data_fname){ ... return Eigen::Map<...>(...); }` – Remy Lebeau Jun 08 '20 at 23:52
  • 3
    For your personal records, the `new (&mat_out) blah blah blah` is [Placement `new`](https://en.cppreference.com/w/cpp/language/new#Placement_new). Not something you want to use until you are very familiar with C++ and even then something you almost never use. Whoever wrote that blog is either solving a very interesting problem or trying to be too smart and probably coding above their ability. – user4581301 Jun 09 '20 at 00:02

1 Answers1

1

There is so many things wrong with the code you posted

#include <iostream>
#include <Eigen/Dense>          // minor remark: Often you just need <Eigen/Core> here
#include "cnpy.h"
#include "opencv2/highgui.hpp"  // Do you actually need OpenCV here?

using namespace std;
using namespace Eigen;
using namespace cv;

(Almost) never put using namespace X; in global scope! [1]

void cnpy2eigen(string data_fname, Eigen::MatrixXd& mat_out){ // minor remark: Better pass `std::string` by const reference
    cnpy::NpyArray npy_data = cnpy::npy_load(data_fname);
    // double* ptr = npy_data.data<double>();
    int data_row = npy_data.shape[0];
    int data_col = npy_data.shape[1];

    double* ptr = static_cast<double *>(malloc(data_row * data_col * sizeof(double)));

You are malloc-ing data here without a matching free. First of all, standard "low-level" C++ would use new and delete (better use std::vector, though). But you can actually use the memory-allocation of Eigen instead.

    memcpy(ptr, npy_data.data<double>(), data_row * data_col * sizeof(double));

Standard C++ would use, e.g., std::copy (this will internally also transform to a memcpy for POD types, though). But there is actually no need to manually copy here.

    cv::Mat dmat = cv::Mat(cv::Size(data_col, data_row), CV_64F, ptr); // CV_64F is equivalent double

Are you using dmat anywhere? If not, remove it. Also note that OpenCV uses row-major storage (I assume numpy uses column-major storage?)

    new (&mat_out) Eigen::Map<Eigen::Matrix<double,Eigen::Dynamic,Eigen::Dynamic>>(reinterpret_cast<double *>(dmat.data), data_col, data_row);

Here you are doing a placement-new of an Eigen::Map on the memory location of an Eigen::Matrix. This might partially work, since both types accidentally share some of their memory layout (but Eigen::Map has additional member variables). Also, Eigen::Matrix owns its memory, while Eigen::Map does not, i.e., when the Eigen::Matrix gets destructed, it will deallocate memory it does not own. Generally, avoid placement-new unless you know what you are doing -- especially with non-matching types. (Actually, you should barely ever require new in user code)

}

Here is a fixed/cleaned up version:

void cnpy2eigen(std::string const& data_fname, Eigen::MatrixXd& mat_out){
    cnpy::NpyArray npy_data = cnpy::npy_load(data_fname);
    double* ptr = npy_data.data<double>();
    int data_row = npy_data.shape[0];
    int data_col = npy_data.shape[1];
    mat_out = Eigen::MatrixXd::Map(ptr, data_row, data_col); // assuming NumPy is column major
}

Call it like this:

int main()
{
   Eigen::MatrixXd A;
   cnpy2eigen("filename", A);
   std::cout << "Here is the matrix A:\n" << A << '\n';
}

And if you prefer to return by value, this should work:

Eigen::MatrixXd cnpy2eigen(std::string const& data_fname){
    cnpy::NpyArray npy_data = cnpy::npy_load(data_fname);
    double* ptr = npy_data.data<double>();
    int data_row = npy_data.shape[0];
    int data_col = npy_data.shape[1];
    return Eigen::MatrixXd::Map(ptr, data_row, data_col); // assuming NumPy is column major
}

Call it like this:

int main()
{
   Eigen::MatrixXd A = cnpy2eigen("filename");
   std::cout << "Here is the matrix A:\n" << A << '\n';
}

If you want to avoid the copy, you'd need to keep the npy_data object alive and directly use the Eigen::Map (I don't think there is a way to transfer ownership from cnpy to Eigen).

chtz
  • 17,329
  • 4
  • 26
  • 56
  • Thanks for the detailed answer. My takeaway for future C++ coding is never pollute the global namespace. – Osman Mamun Jun 09 '20 at 11:54
  • `Here is the A(0, 0): 4.43772e-06. The matrix m is of size 836x32`. It seems like the size is correct as the np array but the elements are off. The 1st element is 0.41135383. I think it's not loading data, it's just showing the value residing in the allocated space. How to fix it? – Osman Mamun Jun 09 '20 at 12:12
  • Can you ` std::cout << ptr[0];` (inside either `cnpy2eigen` function)? -- Your first takeaway should probably be "don't use `new`" (until you really have to). – chtz Jun 09 '20 at 12:54
  • It's the same value that's printed as A(0, 0). So the problem is in cnpy::npy_load() function – Osman Mamun Jun 09 '20 at 13:00
  • Maybe it is actually `float` data? (I have no idea how `cnpy` works or how it would handle that.) – chtz Jun 09 '20 at 13:16
  • Yep, you're right. Changing everything to float worked! – Osman Mamun Jun 09 '20 at 13:35