How to speed up Numpy array slicing within a for loop?

Question

I have an original array, e.g.:

import numpy as np
original = np.array([56, 30, 48, 47, 39, 38, 44, 18, 64, 56, 34, 53, 74, 17, 72, 13, 30, 17, 53])

The desired output is an array made up of a fixed-size window sliding through multiple iterations, something like

[56, 30, 48, 47, 39, 38],
[30, 48, 47, 39, 38, 44],
[48, 47, 39, 38, 44, 18],
[47, 39, 38, 44, 18, 64],
[39, 38, 44, 18, 64, 56],
[38, 44, 18, 64, 56, 34],
[44, 18, 64, 56, 34, 53],
[18, 64, 56, 34, 53, 74],
[64, 56, 34, 53, 74, 17],
[56, 34, 53, 74, 17, 72]

At the moment I'm using

def myfunc():
    return np.array([original[i: i+k] for i in range(i_range)])

With parameters i_range = 10 and k = 6, Python's timeit module (10000 iterations) reports close to 0.1 seconds. Can this be improved 100x by any chance?

I've also tried Numba but the result wasn't ideal, as it shines better with larger arrays.

NOTE: the arrays used in this post are reduced for demo purposes; the actual size of original is around 500.

Thanks to all contributors, I've found what I was looking for, using the `numpy.lib.stride_tricks` module and a slightly modified version of the `rolling_window` function posted [here](https://stackoverflow.com/a/6811241/7613480). Source: https://rigtorp.se/2011/01/01/rolling-statistics-numpy.html – UdonN00dle, May 17 '21 at 16:56

score 1 · Answer 1 · answered May 17 '21 at 12:35

Use np.lib.stride_tricks.sliding_window_view
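To spell out how this applies to the question, here is a minimal sketch. It assumes NumPy 1.20 or later (the release that added sliding_window_view) and reuses original, k and i_range from the question; the name windows is only illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

original = np.array([56, 30, 48, 47, 39, 38, 44, 18, 64, 56,
                     34, 53, 74, 17, 72, 13, 30, 17, 53])
k = 6         # window length, from the question
i_range = 10  # number of windows to keep, from the question

# Build every length-k window as a read-only view (no data is copied),
# then keep only the first i_range windows.
windows = sliding_window_view(original, k)[:i_range]
print(windows)
```

The result is a read-only view of original; call windows.copy() if an independent, writable array is needed.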

RandomGuy


Referencing this new function is good, but you should elaborate on how it applies to this case. – hpaulj May 17 '21 at 16:21
There's literally the same example as the author wanted in the documentation – RandomGuy May 18 '21 at 07:06

score 1 · Accepted Answer · answered May 17 '21 at 13:55

As RandomGuy suggested, you can use stride_tricks:

np.lib.stride_tricks.as_strided(original, shape=(i_range, k), strides=original.strides * 2)

For larger arrays (and larger i_range and k) this is probably the most efficient option, as it returns a view and does not allocate any additional memory. The drawback is that editing the created array modifies the original array as well, unless you make a copy. The strides argument defines how many bytes to advance in memory along each axis; both values are the original array's own stride, which is 8 bytes for an int64 array, i.e. (8, 8) here.
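A short sketch of the view-versus-copy behaviour described above, assuming the question's original, k and i_range. The names step, view and independent are illustrative, and passing writeable=False is an added precaution rather than part of the answer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

original = np.array([56, 30, 48, 47, 39, 38, 44, 18, 64, 56,
                     34, 53, 74, 17, 72, 13, 30, 17, 53])
k, i_range = 6, 10

# Bytes between consecutive elements; read from the array instead of hardcoding 8.
step = original.strides[0]

# Row i starts i elements into the buffer; column j advances one element further.
# writeable=False guards against accidental writes through the view.
view = as_strided(original, shape=(i_range, k), strides=(step, step),
                  writeable=False)

shares = np.shares_memory(view, original)  # the view reuses original's buffer

independent = view.copy()   # a copy owns its own memory and is writable
independent[0, 0] = -1      # leaves original untouched
```

Note that as_strided performs no bounds checking, so the shape has to satisfy i_range + k - 1 <= len(original); otherwise the view reads memory outside the array.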

Another option, which works better for smaller arrays:

def myfunc2():
    i_s = np.arange(i_range).reshape(-1,1)+np.arange(k)
    return original[i_s]

This is faster than your original version. Neither approach, however, is 100x faster.
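To compare the variants on your own machine, a rough timing harness along these lines could be used. It assumes the question's parameters and 10000 timeit iterations; myfunc3 is a hypothetical name for the as_strided variant, and no timings are claimed here since they depend on hardware and NumPy version.

```python
import timeit
import numpy as np

original = np.array([56, 30, 48, 47, 39, 38, 44, 18, 64, 56,
                     34, 53, 74, 17, 72, 13, 30, 17, 53])
i_range, k = 10, 6

def myfunc():
    # Original approach: one slice per window, stacked into a new array.
    return np.array([original[i: i + k] for i in range(i_range)])

def myfunc2():
    # Fancy indexing with a broadcast (i_range, k) grid of indices.
    i_s = np.arange(i_range).reshape(-1, 1) + np.arange(k)
    return original[i_s]

def myfunc3():
    # Hypothetical wrapper around the as_strided call; returns a view.
    s = original.strides[0]
    return np.lib.stride_tricks.as_strided(original, (i_range, k), (s, s))

# Check that all variants agree before timing them.
same = all(np.array_equal(myfunc(), f()) for f in (myfunc2, myfunc3))

timings = {f.__name__: timeit.timeit(f, number=10000)
           for f in (myfunc, myfunc2, myfunc3)}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```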
