
For a 2D array, I would like to get the average of a particular slice in each row, where the slice indices are defined in the last two columns of each row.

Example:

sample = np.array([
    [ 0,  1,  2,  3,  4,  2,  5],
    [ 5,  6,  7,  8,  9,  0,  3],
    [10, 11, 12, 13, 14,  1,  4],
    [15, 16, 17, 18, 19,  3,  5],
    [20, 21, 22, 23, 24,  2,  4]
])

So for row 1, I would like to get sample[0][2:5].mean(), for row 2 sample[1][0:3].mean(), for row 3 sample[2][1:4].mean(), and so on.
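For reference, the desired result can be spelled out with a plain Python loop over the rows (a baseline sketch; the name expected is my own):

```python
import numpy as np

sample = np.array([
    [ 0,  1,  2,  3,  4,  2,  5],
    [ 5,  6,  7,  8,  9,  0,  3],
    [10, 11, 12, 13, 14,  1,  4],
    [15, 16, 17, 18, 19,  3,  5],
    [20, 21, 22, 23, 24,  2,  4]
])

# Slice each row by its own (start, stop) pair from the last two columns.
expected = np.array([row[row[-2]:row[-1]].mean() for row in sample])
print(expected)  # [ 3.   6.  12.  18.5 22.5]
```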

I came up with a way using apply_along_axis

def average_slice(x):
    return x[x[-2]:x[-1]].mean()

np.apply_along_axis(average_slice, 1, sample)
array([ 3. ,  6. , 12. , 18.5, 22.5])

However, 'apply_along_axis' seems to be very slow (see also: numpy np.apply_along_axis function speed up?).

From the source code, it seems that there are conversions to lists and direct looping, though I don't have a full understanding of this code:

https://github.com/numpy/numpy/blob/v1.22.0/numpy/lib/shape_base.py#L267-L414

I am wondering if there is a more computationally efficient solution than the one I came up with.

SantoshGupta7

2 Answers


A bit hacky, but here is one way using numpy.cumsum that is about 200x faster:

def faster(arr):
    ind = arr[:, -2:]                                      # per-row (start, stop) indices
    padded = np.pad(arr.cumsum(axis=1), ((0, 0), (1, 0)))  # row-wise prefix sums with a leading zero column
    res = np.diff(np.take_along_axis(padded, ind, axis=1))/np.diff(ind)
    return res.ravel()

faster(sample)

Output:

array([ 3. ,  6. , 12. , 18.5, 22.5])
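This works because the sum of any slice is the difference of two entries of a zero-padded prefix-sum array, so every row's slice sum can be read off without looping. A one-row sketch of the identity (the names a, i, j, padded are my own):

```python
import numpy as np

a = np.array([0, 1, 2, 3, 4])
i, j = 2, 5  # slice bounds, as in the last two columns of a row

# Zero-padded prefix sums: padded[k] == a[:k].sum()
padded = np.concatenate(([0], a.cumsum()))
assert padded[j] - padded[i] == a[i:j].sum()

# The slice mean follows by dividing by the slice length.
print((padded[j] - padded[i]) / (j - i))  # 3.0
```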

Benchmark:

large = sample[np.random.randint(0, 5, 10000)]

%timeit np.apply_along_axis(average_slice, 1, large)
# 47 ms ± 166 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit faster(large)
# 305 µs ± 2.36 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Validation:

np.array_equal(faster(large), np.apply_along_axis(average_slice, 1, large))
# True
Chris
  • What do you think of using masked array? See other answer submitted https://stackoverflow.com/a/71359001/3259896 . I tried timing it like in your solution, but my result says that intermediate results might be cached, so I can't seem to get a clean comparison – SantoshGupta7 Mar 05 '22 at 02:38

I also found that using a masked array works:

col_idxs = np.arange(sample.shape[1]) 
mask = (col_idxs < sample[:, [-2]]) | (col_idxs >= sample[:, [-1]])
np.ma.array(sample, mask=mask).mean(axis=1).data
array([ 3. ,  6. , 12. , 18.5, 22.5])
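For intuition, the mask is built by broadcasting a row vector of column indices against each row's (start, stop) column pair, marking everything outside the slice. A quick sketch of what it marks for the first row:

```python
import numpy as np

sample = np.array([
    [ 0,  1,  2,  3,  4,  2,  5],
    [ 5,  6,  7,  8,  9,  0,  3],
    [10, 11, 12, 13, 14,  1,  4],
    [15, 16, 17, 18, 19,  3,  5],
    [20, 21, 22, 23, 24,  2,  4]
])

col_idxs = np.arange(sample.shape[1])
# True = masked out (excluded from the mean); row 0's slice is [2, 5)
mask = (col_idxs < sample[:, [-2]]) | (col_idxs >= sample[:, [-1]])
print(mask[0])  # [ True  True False False False  True  True]
```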

I tried timing them like Chris did

def faster(arr):
    ind = arr[:, -2:]
    padded = np.pad(arr.cumsum(axis=1), ((0, 0), (1, 0)))
    res = np.diff(np.take_along_axis(padded, ind, axis=1))/np.diff(ind)
    return res.ravel()

def mfaster(arr):
    col_idxs = np.arange(arr.shape[1]) 
    mask = (col_idxs < arr[:, [-2]]) | (col_idxs >= arr[:, [-1]])
    return np.ma.array(arr, mask=mask).mean(axis=1).data

large = sample[np.random.randint(0, 5, 10000)]
%timeit faster(large)
The slowest run took 5.80 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 5: 1.46 ms per loop
%timeit mfaster(large)
100 loops, best of 5: 2.23 ms per loop

But the timing of the first one may be unreliable, since %timeit warned that an intermediate result might be cached.

EDIT

I tried it this way

import time

start = time.time()
for i in range(1000):
    large = sample[np.random.randint(0, 5, 10000)]
    faster(large)
end = time.time()
print(f'It took {end - start} seconds!')

It took 1.478731393814087 seconds!

start = time.time()
for i in range(1000):
    large = sample[np.random.randint(0, 5, 10000)]
    mfaster(large)
end = time.time()
print(f'It took {end - start} seconds!')

It took 2.5966928005218506 seconds!

So the cumsum solution still seems faster.

SantoshGupta7