
I have a C++ function that receives an int array, and I'm writing a Python wrapper for it with ctypes and NumPy. Here is a minimal example:

copy.cpp

#include <vector>

extern "C" std::vector<int>* copy_vec(int* array, int size){    
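    std::vector<int>* vec = new std::vector<int>();
    vec->reserve(size);  // reserve capacity only; vector<int>(size) would pre-fill size zeros before the push_backs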
    for (int i=0; i<size; i++){
        vec->push_back(array[i]);
    }
    return vec;
}

copy.py

import ctypes as ct
import numpy as np

INT_POINTER = ct.POINTER(ct.c_int)

_lib = ct.cdll.LoadLibrary('./libcopy.dll')
_lib.copy_vec.argtypes = [INT_POINTER, ct.c_int]

def copy(nums):
    size = len(nums)
    nums_c = np.array(nums).ctypes.data_as(INT_POINTER)
    vector = _lib.copy_vec(nums_c, size)
    return vector

array =[12]*1000000
copy(array)

This produces the following error message:

---------------------------------------------------------------------------
WindowsError                              Traceback (most recent call last)
<ipython-input-2-752101759a61> in <module>()
      1 array =[12]*1000000
----> 2 copy(array)

<ipython-input-1-f18316d64ae3> in copy(nums)
     10     size = len(nums)
     11     nums_c = np.array(nums).ctypes.data_as(INT_POINTER)
---> 12     vector = _lib.copy_vec(nums_c, size)
     13 
     14     return vector

WindowsError: exception: access violation reading 0x08724020

This code works for small arrays such as array = [12]*100, but fails with an access violation for large ones.

Guilherme de Lazari

1 Answer


After a long time, I found the problem.

I created an array with np.array(nums) and then took a ctypes pointer to its data with .ctypes.data_as(INT_POINTER), all in one expression. Since no reference to the NumPy array is kept, the array is a temporary that can be freed as soon as that expression finishes, which leaves the pointer dangling. The fix is to keep a reference to the array in Python for as long as the pointer is in use:

nums_a = np.array(nums)
nums_c = nums_a.ctypes.data_as(INT_POINTER)

The full function would be:

def copy(nums):
    size = len(nums)
    nums_a = np.array(nums)
    nums_c = nums_a.ctypes.data_as(INT_POINTER)  
    vector = _lib.copy_vec(nums_c, size)

    return vector
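
A slightly more defensive variant of the same idea, as a sketch only. The helper name as_int_pointer is hypothetical and not part of the code above. It returns both the array and the pointer so the caller holds the reference, and it forces the dtype to np.intc so the element size matches a C int on any platform (a plain np.array(nums) may pick a 64-bit integer type on some systems). Newer NumPy releases document that the pointer returned by data_as keeps a reference to the array, but holding the array explicitly does not depend on that.

```python
import ctypes as ct
import numpy as np

INT_POINTER = ct.POINTER(ct.c_int)

def as_int_pointer(nums):
    """Hypothetical helper: return (array, pointer).

    The caller must keep `array` alive for as long as `pointer` is used.
    """
    # np.intc is NumPy's alias for the platform's C int.
    arr = np.ascontiguousarray(nums, dtype=np.intc)
    ptr = arr.ctypes.data_as(INT_POINTER)
    return arr, ptr

arr, ptr = as_int_pointer([12] * 1000)
# `arr` is still referenced here, so the buffer behind `ptr` is valid.
print(ptr[0], ptr[len(arr) - 1])
```

Inside copy, this would be used as nums_a, nums_c = as_int_pointer(nums) before calling _lib.copy_vec(nums_c, size). Separately, ctypes assumes a C int return value unless restype is set, so for a function that returns a pointer, such as copy_vec, it is probably also worth setting _lib.copy_vec.restype = ct.c_void_p.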

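For small arrays the call probably works only by luck: the freed memory usually still belongs to the process, so the C++ code can read the stale data without faulting. The buffer of a big array is more likely to be handed back to the operating system as soon as the array is freed, so reading through the dangling pointer causes the access violation.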
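
The lifetime issue can be observed directly with a weak reference. This is a small sketch that assumes CPython, where reference counting frees an object as soon as its last reference disappears. The function name address_of_temporary is made up for illustration. It uses arr.ctypes.data, which is a plain integer address and holds no reference to the array.

```python
import weakref
import numpy as np

def address_of_temporary(nums):
    arr = np.array(nums)
    ref = weakref.ref(arr)      # a weak reference does not keep the array alive
    address = arr.ctypes.data   # plain integer address of the data buffer
    return ref, address         # `arr` itself goes out of scope here

ref, address = address_of_temporary([12] * 1000)

# Under CPython the array has already been freed, so `address` is dangling.
array_is_gone = ref() is None
print(array_is_gone)
```

Passing such an address to C code is the same mistake as in the original copy function: the number is still there, but the memory it names no longer belongs to the array.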

More details are in the documentation for numpy.ndarray.ctypes.

Guilherme de Lazari

If you're at the beginning of the code-writing and can still change the way you interoperate between Python and C++, I'd strongly suggest that you use a C++ library called [PyBind](https://github.com/pybind). It is extremely simple to set up: install it from pip, and you can easily pass NumPy arrays or even custom types. I answered a question with a simple working example and explanation [here](https://stackoverflow.com/questions/49582252/pybind-numpy-access-2d-nd-arrays/49693704#49693704). – Christian Apr 13 '18 at 07:55