Numpy apply_along_axis not returning ndarray subclass

Question

I have an ndarray subclass with __array_wrap__ properly implemented, but np.apply_along_axis returns plain ndarrays rather than instances of my subclass. The code below reproduces the problem:

import numpy as np

class MySubClass(np.ndarray):

    def __new__(cls, input_array, info=None):
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        if obj is None: return
        self.info = getattr(obj, 'info', None)

    def __array_wrap__(self, out_arr, context=None):
        return np.ndarray.__array_wrap__(self, out_arr, context)

sample_ndarray = np.array([[0,5],[2.1,0]]) 
sample_subclass = MySubClass(sample_ndarray, info="Hi Stack Overflow")

# Find the smallest positive (>0) number along the first axis
min_positive = np.apply_along_axis(lambda x: np.min(np.extract(x>0,x)),
                                   0, sample_subclass)

# No more info
print(hasattr(min_positive, 'info'))
# Not a subclass
print(isinstance(min_positive, MySubClass))
# Is an ndarray
print(isinstance(min_positive, np.ndarray))

The most relevant question I could find was this one, but the consensus there appears to be that __array_wrap__ needs to be implemented, which I've done. Also, np.extract and np.min both return subclasses as expected; it is only with apply_along_axis that I see this behavior.

Is there any way to get my code to return my subclass? I am using numpy version 1.11.0.

This should be fixed in numpy 1.13, and as of https://github.com/numpy/numpy/pull/8441, will also call `__array_prepare__` — Eric, Feb 13 '17 at 20:35

hpaulj · Accepted Answer · 2016-07-01T02:53:33.233

Looking at the apply_along_axis code (via IPython's ??):

Type:        function
String form: <function apply_along_axis at 0xb5a73b6c>
File:        /usr/lib/python3/dist-packages/numpy/lib/shape_base.py
Definition:  np.apply_along_axis(func1d, axis, arr, *args, **kwargs)
Source:
def apply_along_axis(func1d, axis, arr, *args, **kwargs):
...
    outarr = zeros(outshape, asarray(res).dtype)
    outarr[tuple(ind)] = res
....
    return outarr

I skipped a lot of details, but basically it allocates the output with np.zeros from a shape and dtype, and makes no effort to account for an array subclass.
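The effect of that allocation pattern can be seen in isolation. The following is a minimal sketch, not the NumPy source: `Tagged` is a hypothetical do-nothing subclass and the shapes are arbitrary. It only shows that an array created by np.zeros stays a plain ndarray even when subclass instances are assigned into it.

```python
import numpy as np

class Tagged(np.ndarray):
    """Hypothetical subclass with no extra behaviour, used only for illustration."""

# A 1-D result such as func1d might return, viewed as the subclass
res = np.arange(3.0).view(Tagged)

# Same allocation pattern as in the excerpt: shape and dtype only
outarr = np.zeros((2, 3), np.asarray(res).dtype)
outarr[0] = res  # copies the values; the type of outarr is unaffected

print(type(res).__name__)     # Tagged
print(type(outarr).__name__)  # ndarray
```

Because the output array is built from scratch, nothing about the input's class or attributes is carried over unless the function does so explicitly.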

Many numpy functions delegate the action to the method of the array, or use _wrapit (for example, _wrapit(a, 'take', indices, axis, out, mode)).

Do you really need to use apply_along_axis? There's nothing magical about it. You can do the same iteration in your own code, and just as fast.
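For the calculation in the question, a hand-written loop might look like the sketch below. It assumes a 2-D input and reduction along axis 0. `min_positive_axis0` is a hypothetical helper name, and the subclass is rebuilt explicitly at the end rather than through any NumPy wrapping hook.

```python
import numpy as np

class MySubClass(np.ndarray):
    """The subclass from the question, without the __array_wrap__ override."""

    def __new__(cls, input_array, info=None):
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        self.info = getattr(obj, 'info', None)


def min_positive_axis0(arr):
    """Hypothetical helper: smallest value > 0 in each column of a 2-D array."""
    plain = np.asarray(arr)  # iterate over a plain ndarray view
    out = np.empty(plain.shape[1], dtype=plain.dtype)
    for j in range(plain.shape[1]):
        column = plain[:, j]
        out[j] = column[column > 0].min()
    # Rebuild the subclass explicitly, carrying over the metadata
    return MySubClass(out, info=getattr(arr, 'info', None))


sample = MySubClass(np.array([[0, 5], [2.1, 0]]), info="Hi Stack Overflow")
result = min_positive_axis0(sample)
print(type(result).__name__, result.info)
```

Note that a column with no positive entries would raise a ValueError from min(), just as the lambda in the question would.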

===================

Here are the 2 apply_along_axis examples, with alternative implementations. They are too small for meaningful timings, but I'm sure the alternatives are just as fast, if not faster:

In [3]: def my_func(a):
   ...:     return (a[0]+a[-1]*0.5)
In [4]: b=np.arange(1,10).reshape(3,3)

In [5]: np.apply_along_axis(my_func,0,b)
Out[5]: array([ 4.5,  6. ,  7.5])

In [6]: np.apply_along_axis(my_func,1,b)
Out[6]: array([  2.5,   7. ,  11.5])

Direct array implementation:

In [8]: b[0,:]+b[-1,:]*0.5
Out[8]: array([ 4.5,  6. ,  7.5])

In [9]: b[:,0]+b[:,-1]*0.5
Out[9]: array([  2.5,   7. ,  11.5])

2nd:

In [10]: c=np.array([[8,1,7],[4,3,9],[5,2,6]])

In [11]: np.apply_along_axis(sorted, 1, c)
Out[11]: 
array([[1, 7, 8],
       [3, 4, 9],
       [2, 5, 6]])

In [12]: d=np.zeros_like(c)
In [13]: for i in range(c.shape[0]):
   ....:     d[i,:] = sorted(c[i,:]) 

In [14]: d
Out[14]: 
array([[1, 7, 8],
       [3, 4, 9],
       [2, 5, 6]])

In the first I skip the iteration entirely; in the second I use the same allocate-and-iterate approach, with less overhead.

Look at np.matrix and np.ma for examples of how ndarray subclasses are implemented.

np.core.fromnumeric.py has a _wrapit function that is used by functions like np.take:

# functions that are now methods
def _wrapit(obj, method, *args, **kwds):
    try:
        wrap = obj.__array_wrap__
    except AttributeError:
        wrap = None
    result = getattr(asarray(obj), method)(*args, **kwds)
    if wrap:
        if not isinstance(result, mu.ndarray):
            result = asarray(result)
        result = wrap(result)
    return result

So it appears that if obj has an __array_wrap__ method, _wrapit applies it to the array result. You might be able to use that as a model for wrapping apply_along_axis to get back your own class.
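A wrapper modelled on _wrapit could be sketched as follows. This is an illustration under assumptions, not part of NumPy: `apply_along_axis_wrapped` is a hypothetical name, and the subclass here relies on the inherited ndarray.__array_wrap__, which casts the result to the subclass and triggers __array_finalize__.

```python
import numpy as np

class MySubClass(np.ndarray):
    """The subclass from the question, using the inherited __array_wrap__."""

    def __new__(cls, input_array, info=None):
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        self.info = getattr(obj, 'info', None)


def apply_along_axis_wrapped(func1d, axis, arr, *args, **kwargs):
    """Hypothetical wrapper: run apply_along_axis, then re-wrap the result."""
    # Same lookup as _wrapit: inputs without __array_wrap__ are left alone
    wrap = getattr(arr, '__array_wrap__', None)
    # Do the work on a plain ndarray so the result type is predictable
    result = np.apply_along_axis(func1d, axis, np.asarray(arr), *args, **kwargs)
    if wrap is not None:
        result = wrap(np.asarray(result))
    return result


sample = MySubClass(np.array([[0, 5], [2.1, 0]]), info="Hi Stack Overflow")
result = apply_along_axis_wrapped(
    lambda x: np.min(np.extract(x > 0, x)), 0, sample)
print(type(result).__name__, result.info)
```

One consequence of this design is that func1d receives plain ndarray slices, so any subclass-specific behaviour is unavailable inside it; only the final result is converted back.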

Thanks for the detailed answer. I was curious if there was any good design reason apply_along_axis didn't return a subclass, and it looks like the only one is convenience. Perhaps I can patch it. Although, you're right, I don't need that function. Even in my example, I could just use the axis argument to `min()` — Kyle Heuton, Jul 01 '16 at 18:42
It's possible that this function was written at a time when use of subclasses wasn't important. `np.matrix` has been around a long time, but it is only 2d. And `np.ma` usually needs to use its own functions/methods. But look at `np.kron` in `np.lib.shape_base.py`. It uses `asanyarray`, `subok` and a couple of custom `wrap` functions. — hpaulj, Jul 01 '16 at 20:13
