Numpy Memmap Ctypes Access

Question

I'm trying to use a very large numpy array using numpy memmap, accessing each element as a ctypes Structure.

import ctypes
from ctypes import Structure, c_uint32

import numpy as np

class My_Structure(Structure):
    # Seven bitfields packed into one 32-bit word (3+2+2+9+12+2+2 = 32 bits).
    _fields_ = [('field1', c_uint32, 3),
                ('field2', c_uint32, 2),
                ('field3', c_uint32, 2),
                ('field4', c_uint32, 9),
                ('field5', c_uint32, 12),
                ('field6', c_uint32, 2),
                ('field7', c_uint32, 2)]

    def __str__(self):
        return f'MyStruct -- f1{self.field1} f2{self.field2} f3{self.field3} f4{self.field4} f5{self.field5} f6{self.field6} f7{self.field7}'

    def __eq__(self, other):
        # Equal only if every bitfield matches.
        for field in self._fields_:
            if getattr(self, field[0]) != getattr(other, field[0]):
                return False
        return True

_big_array = np.memmap(filename = 'big_file.data',
                       dtype = 'uint32',
                       mode = 'w+',
                       shape = 1000000
                       )

# Reinterpret the mapped uint32 buffer as an array of My_Structure.
big_array = _big_array.ctypes.data_as(ctypes.POINTER(My_Structure))

big_array[0].field1 = 5
...

This seems to work correctly, but I'm getting a fault on a 64-bit Windows machine where python.exe simply stops. In Event Viewer, I see that the faulting module name is _ctypes.pyd and the exception code is 0xc0000005, which I believe is an access violation.

I don't seem to be getting the same error on Linux, though my testing has not been thorough.

My questions are:

1. Does my access look correct, i.e. am I using numpy.memmap.ctypes.data_as correctly?
2. Does the fact that I have methods (__str__ and __eq__) defined on My_Structure change its size, i.e. can it still be used in the array as a uint32?
3. Is there anything that you think might cause this behavior, particularly considering the differences between Windows and Linux?

EDIT:

Using ctypes.addressof and ctypes.sizeof on big_array elements, it looks like the __str__ and __eq__ methods do not affect the size of My_Structure.
I added some asserts before my accesses to big_array and found that I was attempting to access big_array[-1], which explains the access violation and the crash.
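The two checks described in this edit can be sketched roughly as follows. This is only an illustration, not the asker's actual code: the `checked` helper and its name are hypothetical, and `__str__` is shortened to keep the example small.

```python
import ctypes


class My_Structure(ctypes.Structure):
    # Same bitfield layout as in the question (32 bits in total).
    _fields_ = [("field1", ctypes.c_uint32, 3),
                ("field2", ctypes.c_uint32, 2),
                ("field3", ctypes.c_uint32, 2),
                ("field4", ctypes.c_uint32, 9),
                ("field5", ctypes.c_uint32, 12),
                ("field6", ctypes.c_uint32, 2),
                ("field7", ctypes.c_uint32, 2)]

    def __str__(self):
        return f"MyStruct -- f1{self.field1}"


# Methods are stored on the class, not in each instance's memory,
# so they do not change the size of the structure.
struct_size = ctypes.sizeof(My_Structure)
word_size = ctypes.sizeof(ctypes.c_uint32)


def checked(index, length):
    """Hypothetical guard for raw-pointer indexing.

    A ctypes pointer does no bounds checking and does not wrap negative
    indices, so both cases are rejected before the pointer is touched.
    """
    if not 0 <= index < length:
        raise IndexError(f"index {index} out of range for length {length}")
    return index
```

With such a guard, an access would be written as `big_array[checked(i, 1000000)]`, so that an index of -1 raises an IndexError in Python instead of reading memory in front of the mapping.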

Which leaves question 1 open: it looks like my code is technically correct, but I'm wondering whether there is a better way to access the numpy array than through a ctypes pointer, so that I still get the benefits of a numpy array (out-of-bounds errors, negative index wrapping, etc.). Daniel below suggested using a structured numpy array, but is it possible to do bitfield access with this?

Hi Daniel, the main reason is that my Structure uses bitfields of various lengths. My understanding is that the smallest dtype in a structured array is a byte. — sheridp, Dec 14 '17 at 18:44

score 0 · Accepted Answer · answered Nov 19 '18 at 08:10

You can cast to ctypes at the last step, not the first step:

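_big_array[0, ...].ctypes.data_as(ctypes.POINTER(My_Structure)).contents.field1 = 5

Note that ... is needed to keep the result as a 0d array, so that the .ctypes attribute exists (indexing with a plain 0 would return a numpy scalar, which has no .ctypes). The .contents attribute dereferences the pointer to reach the structure's fields.

Now of course, negative indexing will work just fine:

_big_array[-1, ...].ctypes.data_as(ctypes.POINTER(My_Structure)).contents.field1 = 5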
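As a fuller sketch of the same idea, the cast can be wrapped in a small helper so that NumPy does the indexing and ctypes only sees an address that has already been validated. The helper name `element`, the function `demo`, the file name `demo.data` and the reduced shape of 1000 are all assumptions made for illustration. The pointer is dereferenced with `.contents` to reach the structure's fields.

```python
import ctypes
import os
import tempfile

import numpy as np


class My_Structure(ctypes.Structure):
    # Same bitfield layout as in the question (32 bits in total).
    _fields_ = [("field1", ctypes.c_uint32, 3),
                ("field2", ctypes.c_uint32, 2),
                ("field3", ctypes.c_uint32, 2),
                ("field4", ctypes.c_uint32, 9),
                ("field5", ctypes.c_uint32, 12),
                ("field6", ctypes.c_uint32, 2),
                ("field7", ctypes.c_uint32, 2)]


def element(arr, index):
    """Hypothetical helper: view arr[index] as a My_Structure.

    NumPy performs the indexing, so out-of-range indices raise IndexError
    and negative indices wrap before any raw pointer is created.
    The result is only valid while arr itself is alive.
    """
    item = arr[index, ...]  # 0-d array view, so .ctypes is available
    return item.ctypes.data_as(ctypes.POINTER(My_Structure)).contents


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        arr = np.memmap(os.path.join(tmp, "demo.data"),
                        dtype="uint32", mode="w+", shape=1000)
        element(arr, 0).field1 = 5    # first element
        element(arr, -1).field5 = 7   # last element, via NumPy's wrapping
        first = element(arr, 0).field1
        last = element(arr, 999).field5
        try:
            element(arr, 1000)        # one past the end
            out_of_range_caught = False
        except IndexError:
            out_of_range_caught = True
        arr.flush()
        del arr                       # release the mapping before cleanup
        return first, last, out_of_range_caught
```

Calling `demo()` writes through both a positive and a negative index and reports whether the out-of-range access was rejected by NumPy rather than reaching ctypes.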

Daniel below suggested using a structured numpy array, but is it possible to do bitfield access with this?

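No. The smallest field in a structured dtype is one byte, so bitfields narrower than a byte cannot be expressed that way.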
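If staying entirely within NumPy is preferred, the fields can instead be read and written with explicit shifts and masks on the uint32 values. The sketch below is an alternative to the ctypes cast, not part of the original answer. It assumes the first declared field occupies the lowest bits of each word, which is what ctypes typically produces on little-endian machines but is not guaranteed everywhere. `LAYOUT`, `get_field` and `set_field` are hypothetical names.

```python
import numpy as np

# name -> (bit offset, bit width); ASSUMES field1 sits in the lowest bits.
LAYOUT = {
    "field1": (0, 3),
    "field2": (3, 2),
    "field3": (5, 2),
    "field4": (7, 9),
    "field5": (16, 12),
    "field6": (28, 2),
    "field7": (30, 2),
}


def get_field(words, name):
    """Extract one bitfield from every uint32 in words (vectorised)."""
    shift, width = LAYOUT[name]
    mask = np.uint32((1 << width) - 1)
    return (words >> np.uint32(shift)) & mask


def set_field(words, index, name, value):
    """Overwrite one bitfield of words[index], leaving the other bits intact."""
    shift, width = LAYOUT[name]
    mask = (1 << width) - 1
    if not 0 <= value <= mask:
        raise ValueError(f"{value} does not fit in {width} bits")
    old = int(words[index])  # NumPy checks bounds and wraps negative indices
    words[index] = (old & ~(mask << shift) & 0xFFFFFFFF) | (value << shift)


# A plain array stands in for the memmap here; the calls are the same.
words = np.zeros(4, dtype=np.uint32)
set_field(words, 0, "field1", 5)
set_field(words, -1, "field5", 7)
```

Because `get_field` works on whole arrays, it can also pull one field out of a large slice of the memmap in a single vectorised operation, which the per-element ctypes cast cannot do.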
