Boost numpy example does not work

Question

I tried reproducing some of the examples described here, but I ran into the following problem with the code below, which I wrote by copy-pasting the relevant parts of the linked page.

#include <boost/python.hpp>
#include <boost/python/numpy.hpp>
#include <iostream>

using namespace std;

namespace p = boost::python;
namespace np = boost::python::numpy;

np::ndarray test()
{
    int data[] = {1,2,3,4,5};
    p::tuple shape = p::make_tuple(5);
    p::tuple stride = p::make_tuple(sizeof(int));
    p::object own;
    np::dtype dt = np::dtype::get_builtin<int>();
    np::ndarray array = np::from_data(data, dt, shape,stride,own);
    std::cout << "Selective multidimensional array :: "<<std::endl
        << p::extract<char const *>(p::str(array)) << std::endl ;
    return array;
}


BOOST_PYTHON_MODULE(test_module)
{
    using namespace boost::python;

    // Initialize numpy
    Py_Initialize();
    boost::python::numpy::initialize();

    def("test", test);
}

When I compile as a shared library and load the module in python,

import test_module as test
print(test.test())

it seems that the ndarray is created properly by the C++ code, but the version that Python receives is rubbish. The first line below is printed from C++ and the second from Python:

[1 2 3 4 5]
[2121031184      32554 2130927769      32554          0]

What could be the cause of such a difference?

score 2 · Accepted Answer · edited Jun 20 '20 at 09:12

I had the same issue this week. I solved it by allocating the data dynamically:

np::ndarray test(){
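    // Requires #include <cstdlib>. C++ needs an explicit cast from void*.
    // Note: nothing frees this buffer, because the owner passed below is None.
    int *data = static_cast<int*>(std::malloc(sizeof(int) * 5));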
    for (int i=0; i < 5; ++i){
        data[i] = i + 1;
    }
    p::tuple shape = p::make_tuple(5);
    p::tuple stride = p::make_tuple(sizeof(int));
    p::object own;
    np::dtype dt = np::dtype::get_builtin<int>();
    np::ndarray array = np::from_data(data, dt, shape, stride, own);
    return array;
}

I think the difference, as explained in this answer (https://stackoverflow.com/a/36322044/4637693), is the following:

The difference between declaring an array as
int array[n];
and
int* array = malloc(n * sizeof(int));
In the first version, you are declaring an object with automatic storage duration. This means that the array lives only as long as the function that calls it exists. In the second version, you are getting memory with dynamic storage duration, which means that it will exist until it is explicitly deallocated with free.
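
The storage-duration point in the quote can be seen without Boost at all. The following is only a minimal sketch in plain C++; the helper names make_heap_buffer and sum_buffer are hypothetical and do not come from the question or from Boost. It shows a buffer with dynamic storage duration that stays valid after the function that created it has returned:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: returns a heap buffer holding 1..n.
// The buffer has dynamic storage duration, so it remains valid after
// this function returns, until the unique_ptr releases it.
std::unique_ptr<int[]> make_heap_buffer(std::size_t n)
{
    assert(n > 0);
    std::unique_ptr<int[]> buf(new int[n]);
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = static_cast<int>(i) + 1;
    }
    return buf;
}

// Hypothetical helper: sums the first n elements of a buffer.
int sum_buffer(const int* data, std::size_t n)
{
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

Calling sum_buffer(buf.get(), 5) on the result of make_heap_buffer(5) is well defined because the caller still owns the memory. Returning a pointer to a local int data[5] instead would leave the caller with a dangling pointer, which is what happens to the ndarray in the question.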

I will take more time in the next weeks to see if this works for a matrix too.

EDIT

Alternatively, you can use a dynamic structure from Boost, such as a list:

np::ndarray test(){
    boost::python::list my_list;
    for (int i=0; i < 5; ++i){
        my_list.append(i + 1);
    }
    np::ndarray array = np::from_object(my_list);
    return array;
}

This also works for a matrix, for example:

np::ndarray test(){
    //This will use a list of tuples
    boost::python::list my_list;
    for (int i=0; i < 5; ++i){
        my_list.append(boost::python::make_tuple(i + 1, i, i-1));
    }
    //Just convert the list to a NumPy array.
    np::ndarray array = np::from_object(my_list);
    return array;
}

I assume (for the moment) that using the Boost functions avoids these memory problems, since the array is then built from a Python object rather than from a raw C++ buffer.

Thanks, I reached the same conclusion while experimenting. Would you by any chance know whether python will destroy the dynamic array automatically when it goes out of scope? — Rastapopoulos, Aug 21 '18 at 10:28
Just checked the Python documentation and I found: To avoid memory corruption, extension writers should never try to operate on Python objects with the functions exported by the C library: malloc(), calloc(), realloc() and free(). This will result in mixed calls between the C allocator and the Python memory manager with fatal consequences. You can try to use PyMem_Malloc or another function from Python. Another solution could be to use the dynamic structures from boost, like boost::python::list. This can lead to an easier implementation, I will edit my answer to add an example. — Carlos Ramírez, Aug 21 '18 at 14:32
Thank you very much for taking the time to write this, it is much appreciated. — Rastapopoulos, Aug 21 '18 at 15:21

score 0 · Answer 2 · answered Feb 22 '19 at 14:17

Copying the array before returning it solved the problem. The good news is that np::ndarray has a copy() method, which returns a new array that owns its own data and therefore no longer refers to the local buffer. Thus, you should add

np::ndarray new_array = array.copy();

before the return statement, and return new_array instead of array.
