
I am trying to get a numpy array from Python into C++ and pass it back to a Python function, using Pybind11. First, I access a dictionary of numpy arrays, and then I pass each of those arrays to a Python function which simply computes the sum of the elements in the array. The part of my code which is causing me trouble is:

    py::array item = list_items.attr("__getitem__")(i);
    // equivalent to list_items.__getitem__(i) in Python

    // the code runs if the following line is removed
    boost_mod.attr("sum_of_dict")(item).cast<int>();
    // calls the function sum_of_dict in the module boost_mod with a 1D numpy array as argument;
    // this function computes the sum of the elements in the numpy array, which are all integers

I think I have a conversion problem and I don't understand how to get past it. I have looked at this documentation link, but it didn't help. I have also looked at the different cast possibilities and at other posts like this one or this other one, but they do not apply to my problem because they do not use pybind. Any help or hint about converting and using my numpy array object would be highly appreciated. Thank you very much.


The full code is the following:

C++ code

#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
#include <chrono>
#include <thread>
#include <Python.h>
namespace py = pybind11;
py::module boost_mod = py::module::import("boost_mod");

int parallelize(py::dict list_items, py::object q_in, py::object q_out){

    unsigned int sum = 0;
    unsigned int len = list_items.attr("__len__")().cast<int>();
    for( unsigned int i = 0; i < len ; i++){

        //PROBLEM IS HERE//
        py::array item = list_items.attr("__getitem__")(i);
        sum+= boost_mod.attr("sum_of_dict")(item).cast<int>();
        std::this_thread::sleep_for(std::chrono::milliseconds(5));

    }
    return sum;
}

PYBIND11_MODULE(cpp_parallel, m) {

    m.doc() = "pybind11 plugin";
    m.def("parallelize", &parallelize,
      "a function");
}

Python module (boost_mod) which contains the function

import numpy as np
import cpp_parallel as cpp

class child(object):
    def __init__(self, index):
        self.nb = index
        self.total = None

    def sum(self, i, j):
        return self.nb + self.nb

    def create_dict(self):
        self.list_items = {}
        for i in range(self.nb):
            lth = np.random.randint(1,10)
            a = np.random.binomial(size=lth, n=1, p =0.6)
            self.list_items[i] = a
        return self.list_items

    def sum_of_dict(self, element):  # function acting like an eval function
        a = np.sum(element)
        return a

    def sub(self, q_in, q_out):
        # calls the C++ function parallelize
        return cpp.parallelize(self.list_items, q_in, q_out)

Python Code

from multiprocessing import Process, Queue
import time
import boost_mod as test
import cpp_parallel as cpp

q_in = Queue()
q_out = Queue()

q_in.put(True)

dict_evaluated = test.child(1000)
dict_evaluated.create_dict()
dict_evaluated.sub(q_in, q_out)
Joachim
  • What exactly is the error? In the line `boost_mod.attr("sum_of_dict")(item).cast<int>();`, the code attempts to get a method of class child from the module global dict, which of course cannot work. You'll need to instantiate a child object or pass it into your parallelize function. – Wim Lavrijsen Nov 03 '19 at 18:02
  • @WimLavrijsen You are absolutely right! I had not paid attention to this detail. I was focused on initiating a numpy array. Thanks a lot, really. – Joachim Nov 03 '19 at 19:09

1 Answer

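The problem was not the numpy array conversion. `sum_of_dict` is a method of the `child` class, so it cannot be looked up on the `boost_mod` module itself; it has to be called on a `child` instance, which I now pass into `parallelize`. Credit goes to Wim Lavrijsen. The corrected code is the following: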

#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
#include <omp.h>
#include <chrono>
#include <thread>
#include <Python.h>
namespace py = pybind11;
py::module boost_mod = py::module::import("boost_mod");


int parallelize(py::object child, py::object q_in, py::object q_out){

    unsigned int sum = 0;
    py::object items = child.attr("list_items");
    unsigned int len = items.attr("__len__")().cast<int>();
    #pragma omp simd reduction (+:sum)
    for( unsigned int i = 0; i < len ; i++){

        py::array item = items.attr("__getitem__")(i);
        sum += child.attr("sum_of_dict")(item).cast<int>();
        // Queue.get() blocks until the Python side has put an item in q_in
        bool accord = q_in.attr("get")().cast<bool>();
        if (accord) {
          // report the running total back to Python
          q_out.attr("put")(sum);
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(5));
    }
    return sum;
}

PYBIND11_MODULE(cpp_parallel, m) {

    m.doc() = "pybind11 example plugin";

    m.def("parallelize", &parallelize,
      "the function which parallelizes the evaluation");
}
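
One thing to watch in the loop above: `q_in.attr("get")()` is called with no arguments, and `Queue.get()` blocks by default until an item is available. If the driver only puts a single `True` into `q_in`, as the script in the question does, the second iteration would wait indefinitely. Below is a minimal pure-Python sketch of the same handshake with a timeout. The function name `drain_with_handshake` and the `timeout` parameter are hypothetical, and `queue.Queue` is used here only as a stand-in that offers the same `get`/`put` calls as `multiprocessing.Queue`.

```python
import queue


def drain_with_handshake(values, q_in, q_out, timeout=0.01):
    """Sum `values`, publishing the running total whenever q_in yields True.

    Hypothetical helper: mirrors the C++ loop, but gives up waiting on q_in
    after `timeout` seconds instead of blocking forever.
    """
    total = 0
    for v in values:
        total += v
        try:
            accord = q_in.get(timeout=timeout)
        except queue.Empty:
            # Nothing was put by the other side: skip the report this round.
            accord = False
        if accord:
            q_out.put(total)
    return total


q_in, q_out = queue.Queue(), queue.Queue()
q_in.put(True)  # a single token, as in the question's driver script

result = drain_with_handshake([1, 2, 3], q_in, q_out)
first_report = q_out.get_nowait()
```

With one token in `q_in`, only the first running total is reported and the loop still finishes. The same idea could be applied on the C++ side by passing a timeout to `get` and catching the resulting exception, but that is left as an assumption rather than shown here.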

And child class:

from multiprocessing import Process, Event, Lock, Queue, Pipe
import time
import numpy as np
import cpp_parallel as cpp

class child(object):
    def __init__(self, index):
        self.nb = index
        self.total = None

    def sum(self, i, j):
        return self.nb + self.nb

    def create_dict(self):
        self.list_items = {}
        for i in range(self.nb):
            lth = np.random.randint(1,10)
            a = np.random.binomial(size=lth, n=1, p =0.6)
            self.list_items[i] = a

    def sum_of_dict(self, element):  # function acting like an eval function
        a = np.sum(element)
        return a

    def sub(self, q_in, q_out):
        return cpp.parallelize(self, q_in, q_out)
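
To see why the original call failed without compiling anything, here is a rough pure-Python sketch of the two lookups. `fake_mod`, `Child`, and the two helper functions are hypothetical stand-ins (plain lists replace the numpy arrays so the snippet has no dependencies): the first helper mimics `boost_mod.attr("sum_of_dict")(item)`, the second mimics `child.attr("sum_of_dict")(item)` on an instance.

```python
import types


class Child:
    """Hypothetical stand-in for the `child` class defined in boost_mod."""

    def __init__(self, items):
        self.list_items = items

    def sum_of_dict(self, element):
        return sum(element)


# Hypothetical stand-in for the imported module: it holds the class,
# but the class's methods are not module-level attributes.
fake_mod = types.ModuleType("fake_boost_mod")
fake_mod.child = Child


def call_on_module(mod, element):
    """Mimics boost_mod.attr("sum_of_dict")(item): lookup on the module."""
    try:
        return getattr(mod, "sum_of_dict")(element)
    except AttributeError as exc:
        return f"AttributeError: {exc}"


def call_on_instance(obj):
    """Mimics the corrected loop: lookup on a child instance."""
    total = 0
    for i in range(len(obj.list_items)):
        total += obj.sum_of_dict(obj.list_items[i])
    return total


instance = Child({0: [1, 0, 1], 1: [1, 1]})
module_result = call_on_module(fake_mod, [1, 0, 1])
instance_result = call_on_instance(instance)
print(module_result)
print(instance_result)
```

The module lookup fails with an `AttributeError` before the array is ever used, which is why the problem looked like a conversion issue but was not one.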
Joachim