
Suppose I have an __array_interface__ dictionary and I would like to create a numpy view of this data from the dictionary itself. For example:

buff = {'shape': (3, 3), 'data': (140546686381536, False), 'typestr': '<f8'}
view = np.array(buff, copy=False)

However, this does not work, because np.array looks for the buffer or array interface as attributes of the object it is given, not as keys of a plain dict. A simple workaround is the following:

class numpy_holder(object):
    pass

holder = numpy_holder()
holder.__array_interface__ = buff
view = np.array(holder, copy=False)

This seems a bit roundabout. Am I missing a straightforward way to do this?

Daniel
  • Does your workaround work, or are you just speculating? – hpaulj Sep 07 '16 at 19:52
  • Why do you only have the dict? Did you manually create it to expose an array? – Dunes Sep 07 '16 at 19:55
  • I'd definitely just stick the dict on a wrapper object like you're doing here, though probably with an `__init__` method. – user2357112 Sep 07 '16 at 19:59
  • @hpaulj No the above works fine. – Daniel Sep 07 '16 at 21:11
  • @Dunes I'm exposing a C++ Matrix class. – Daniel Sep 07 '16 at 21:11
  • So the `data` value in `buff` is not derived from another numpy array. It's a buffer of your own creation. – hpaulj Sep 07 '16 at 21:37
  • @hpaulj In this example yes, but it doesn't have to be. As pointed out these data attributes are just pointers cast to an integer. – Daniel Sep 07 '16 at 21:42
  • Looks like `exposes an array interface` means has a valid `__array_interface__` attribute. Can't think of anything more direct. – hpaulj Sep 08 '16 at 01:39
  • @hpaulj It's just that having the `array_interface` as an attribute means a few silly workarounds. I still don't understand why your example below does not work on my laptop. – Daniel Sep 08 '16 at 13:46

2 Answers

Correction: with the right 'data' value, your holder does work in np.array (see the end of this answer).

np.array will definitely not work on the dictionary itself, since it expects an iterable, such as a list of lists, and parses the individual values.

There is a low-level constructor, np.ndarray, that takes a buffer parameter, and there is also np.frombuffer.

But my impression is that x.__array_interface__['data'][0] is an integer representation of the data buffer location, but not directly a pointer to the buffer. I've only used it to verify that a view shares the same data buffer, not to construct anything from it.
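As a rough check of what that integer is, it can be wrapped with ctypes and handed to np.frombuffer. This is only a sketch under assumptions: the array `src`, its size, and the variable names are illustrative, and `src` must stay alive for as long as the view is used.

```python
import ctypes
import numpy as np

# Illustrative source array; in practice the address could come from elsewhere.
src = np.arange(9, dtype='<f8')
addr, read_only = src.__array_interface__['data']

# Treat the integer as the address of src.size contiguous C doubles.
raw = (ctypes.c_double * src.size).from_address(addr)

# frombuffer wraps the ctypes buffer without copying; reshape returns a view.
view = np.frombuffer(raw, dtype='<f8').reshape(3, 3)

view[0, 0] = 42.0   # write through the view
print(src[0])       # the original array sees the change
```

If the printed value matches what was written through `view`, the integer behaves as a plain memory address, which agrees with the comments below.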

np.lib.stride_tricks.as_strided uses __array_interface__ for default stride and shape data, but gets the data from an array, not the __array_interface__ dictionary.
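A minimal sketch of that point, with an illustrative array `a`: as_strided is given the array itself, and only the shape and strides (in bytes) are supplied separately.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(12, dtype=np.int32)

# The data comes from `a`; shape and byte strides describe a 3x4 layout.
v = as_strided(a, shape=(3, 4), strides=(4 * a.itemsize, a.itemsize))

a[5] = -1
print(v[1, 1])   # element 5 of `a`, seen through the strided view
```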

===========

An example of using np.ndarray with an array's .data attribute:

In [303]: res
Out[303]: 
array([[ 0, 20, 50, 30],
       [ 0, 50, 50,  0],
       [ 0,  0, 75, 25]])
In [304]: res.__array_interface__
Out[304]: 
{'data': (178919136, False),
 'descr': [('', '<i4')],
 'shape': (3, 4),
 'strides': None,
 'typestr': '<i4',
 'version': 3}
In [305]: res.data
Out[305]: <memory at 0xb13ef72c>
In [306]: np.ndarray(buffer=res.data, shape=(4,3),dtype=int)
Out[306]: 
array([[ 0, 20, 50],
       [30,  0, 50],
       [50,  0,  0],
       [ 0, 75, 25]])
In [324]: np.frombuffer(res.data,dtype=int)
Out[324]: array([ 0, 20, 50, 30,  0, 50, 50,  0,  0,  0, 75, 25])

Both of these arrays are views.

OK, with your holder class I can do the same thing, using res.data as the data buffer. Your class creates an object that exposes the array interface.

In [379]: holder=numpy_holder()
In [380]: buff={'data':res.data, 'shape':(4,3), 'typestr':'<i4'}
In [381]: holder.__array_interface__ = buff
In [382]: np.array(holder, copy=False)
Out[382]: 
array([[ 0, 20, 50],
       [30,  0, 50],
       [50,  0,  0],
       [ 0, 75, 25]])
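Following the suggestion in the comments to give the wrapper an `__init__`, the holder could look roughly like the sketch below. The class name `ArrayInterfaceHolder` and its `owner` argument are hypothetical, and the integer address form of 'data' is used instead of res.data. Keeping a reference to the memory owner on the holder is one way to reduce the risk of invalid access, assuming the holder itself stays alive as long as the view does.

```python
import numpy as np


class ArrayInterfaceHolder:
    """Hypothetical wrapper that exposes a dict as __array_interface__."""

    def __init__(self, interface, owner=None):
        self.__array_interface__ = interface
        # Reference to whatever owns the memory, so that it is not freed
        # while this holder exists.
        self.owner = owner


res = np.arange(12, dtype='<i4')
ptr, _ = res.__array_interface__['data']
buff = {'data': (ptr, False), 'shape': (4, 3), 'typestr': '<i4', 'version': 3}

# np.asarray avoids a copy when the object exposes the array interface.
view = np.asarray(ArrayInterfaceHolder(buff, owner=res))
view[0, 0] = 99
print(res[0])   # the write through the view is visible in res
```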
hpaulj
  • Re `data[0]` being a representation of the buffer pointer: you're right, and you may wish to link the NumPy documentation for `__array_interface__`: http://docs.scipy.org/doc/numpy/reference/arrays.interface.html – Dunes Sep 07 '16 at 20:08
  • Also, using a view (rather than a copy) of the data pointed to by the pointer rather than Python's buffer interface could lead to memory leak issues. That is, the buffer interface allows other users to keep a buffer alive, should the original owner fall out of scope. (hope that makes sense :s) – Dunes Sep 07 '16 at 20:17
  • What version of python/numpy is this? I get a type error for the buffer. – Daniel Sep 07 '16 at 21:23
  • @Dunes I'd like to point out that one of the very large upsides of this technique is that it *is* a view. This allows you to manipulate C++ data with Python and NumPy. You cannot really get a memory leak, but you can get invalid access. This is usually handled by intelligent wrapping and smart pointers. – Daniel Sep 07 '16 at 21:26

Here's another approach:

import numpy as np


def arr_from_ptr(pointer, typestr, shape, copy=False,
                 read_only_flag=False):
    """Generates numpy array from memory address
    https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.interface.html

    Parameters
    ----------
    pointer : int
        Memory address

    typestr : str
        A string providing the basic type of the homogeneous array. The
        basic string format consists of 3 parts: a character
        describing the byteorder of the data (<: little-endian, >:
        big-endian, |: not-relevant), a character code giving the
        basic type of the array, and an integer providing the number
        of bytes the type uses.

        The basic type character codes are:

        - t Bit field (following integer gives the number of bits in the bit field).
        - b Boolean (integer type where all values are only True or False)
        - i Integer
        - u Unsigned integer
        - f Floating point
        - c Complex floating point
        - m Timedelta
        - M Datetime
        - O Object (i.e. the memory contains a pointer to PyObject)
        - S String (fixed-length sequence of char)
        - U Unicode (fixed-length sequence of Py_UNICODE)
        - V Other (void * – each item is a fixed-size chunk of memory)

        See https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.interface.html#__array_interface__

    shape : tuple
        Shape of array.

    copy : bool
        Copy array.  Default False

    read_only_flag : bool
        Read only array.  Default False.
    """
    buff = {'data': (pointer, read_only_flag),
            'typestr': typestr,
            'shape': shape}

    class numpy_holder():
        pass

    holder = numpy_holder()
    holder.__array_interface__ = buff
    return np.array(holder, copy=copy)

Usage:

# create array
arr = np.ones(10)

# grab pointer from array
pointer, read_only_flag = arr.__array_interface__['data']

# construct numpy array from an int pointer
arr_out = arr_from_ptr(pointer, '<f8', (10,))

# verify it's the same data
arr[0] = 0
assert np.allclose(arr, arr_out)
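The read-only flag can be checked in the same way. The sketch below repeats a stripped-down version of the function so that it runs on its own; `_Holder` and `view_from_ptr` are illustrative names, not part of NumPy.

```python
import numpy as np


class _Holder:
    """Bare object used only to carry __array_interface__."""


def view_from_ptr(pointer, typestr, shape, read_only=False):
    holder = _Holder()
    holder.__array_interface__ = {'data': (pointer, read_only),
                                  'typestr': typestr,
                                  'shape': shape}
    return np.asarray(holder)


arr = np.ones(10)
pointer, _ = arr.__array_interface__['data']

# Request a read-only view of the same memory.
ro = view_from_ptr(pointer, '<f8', (10,), read_only=True)
print(ro.flags.writeable)

try:
    ro[0] = 5.0
except ValueError as exc:
    # NumPy refuses assignment to a non-writeable array.
    print("write rejected:", exc)
```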
Alex Kaszynski