
I have an original array, e.g.:

import numpy as np
original = np.array([56, 30, 48, 47, 39, 38, 44, 18, 64, 56, 34, 53, 74, 17, 72, 13, 30, 17, 53])

The desired output is an array made up of a fixed-size window sliding through multiple iterations, something like

[56, 30, 48, 47, 39, 38],
[30, 48, 47, 39, 38, 44],
[48, 47, 39, 38, 44, 18],
[47, 39, 38, 44, 18, 64],
[39, 38, 44, 18, 64, 56],
[38, 44, 18, 64, 56, 34],
[44, 18, 64, 56, 34, 53],
[18, 64, 56, 34, 53, 74],
[64, 56, 34, 53, 74, 17],
[56, 34, 53, 74, 17, 72]

At the moment I'm using

def myfunc():
    return np.array([original[i: i+k] for i in range(i_range)])

With parameters i_range = 10 and k = 6, timing 10,000 iterations with Python's timeit module takes close to 0.1 seconds in total. Can this be improved 100x by any chance?

I've also tried Numba but the result wasn't ideal, as it shines better with larger arrays.

NOTE: the arrays used in this post are shortened for demonstration purposes; the actual size of original is around 500.

UdonN00dle
  • Thanks to all contributors, I've found what I was looking for, using the `numpy.lib.stride_tricks` module and a slightly modified version of the `rolling_window` function posted [here](https://stackoverflow.com/a/6811241/7613480). Source: https://rigtorp.se/2011/01/01/rolling-statistics-numpy.html – UdonN00dle May 17 '21 at 16:56

2 Answers


Use np.lib.stride_tricks.sliding_window_view
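
To make the suggestion concrete, here is a minimal sketch using the array and parameters from the question. It assumes NumPy 1.20 or newer, where sliding_window_view was added. The function yields every possible window, so the sketch slices the result down to the first i_range rows.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

original = np.array([56, 30, 48, 47, 39, 38, 44, 18, 64, 56,
                     34, 53, 74, 17, 72, 13, 30, 17, 53])
i_range, k = 10, 6  # values taken from the question

# Every length-k window of the 1-D array: shape (len(original) - k + 1, k).
all_windows = sliding_window_view(original, window_shape=k)

# Keep only the first i_range windows, as in the question.
windows = all_windows[:i_range]
```

The result is a read-only view onto original, so no data is copied; call windows.copy() if a writable, independent array is needed.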

RandomGuy

As RandomGuy suggested, you can use stride_tricks:

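np.lib.stride_tricks.as_strided(original, shape=(i_range, k), strides=(8, 8))

For larger arrays (and larger i_range and k) this is probably the most efficient option, as it does not allocate any additional memory. The drawback is that the result is a view: editing it modifies the original array as well, unless you make a copy. The (8, 8) argument gives the strides, i.e. how many bytes to advance in memory along each axis. I use 8 because that is the stride of the original array (one 8-byte integer per element); if the dtype may differ, use original.strides[0] instead of hard-coding 8. Note that as_strided does no bounds checking, so i_range + k - 1 must not exceed len(original).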
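
A slightly more defensive variant of the same idea might look like the sketch below. It reads the stride from the array rather than assuming 8 bytes, checks the bounds by hand, and requests a read-only view. The names step and independent are illustrative only.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

original = np.array([56, 30, 48, 47, 39, 38, 44, 18, 64, 56,
                     34, 53, 74, 17, 72, 13, 30, 17, 53])
i_range, k = 10, 6  # values taken from the question

# Bytes between consecutive elements; 8 for int64, 4 for int32.
step = original.strides[0]

# as_strided performs no bounds checking, so verify the last window fits.
assert i_range + k - 1 <= original.size

# Row i starts i elements in; column j advances j elements further.
windows = as_strided(original, shape=(i_range, k),
                     strides=(step, step), writeable=False)

# An independent array that can be modified without touching `original`.
independent = windows.copy()
```

Passing writeable=False guards against accidental writes through the view, which would otherwise change original and every overlapping window at once.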

Another option, which works better for smaller arrays:

def myfunc2():
    i_s = np.arange(i_range).reshape(-1,1)+np.arange(k)
    return original[i_s]

This is faster than your original version. Neither approach, however, is 100x faster.
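
To show why the index-based version works, the sketch below splits it into steps using the array and parameters from the question. The names starts, offsets and idx are illustrative. Broadcasting a column of start positions against a row of offsets produces a 2-D index array whose entry (i, j) is i + j.

```python
import numpy as np

original = np.array([56, 30, 48, 47, 39, 38, 44, 18, 64, 56,
                     34, 53, 74, 17, 72, 13, 30, 17, 53])
i_range, k = 10, 6  # values taken from the question

starts = np.arange(i_range).reshape(-1, 1)  # column vector, shape (10, 1)
offsets = np.arange(k)                      # row vector, shape (6,)

# Broadcasting (10, 1) + (6,) gives shape (10, 6) with idx[i, j] == i + j.
idx = starts + offsets

# Integer-array indexing gathers the elements into a new array (a copy).
windows = original[idx]
```

Unlike the stride-based approach, the result here is a fresh copy, so it can be modified without affecting original.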

Dinari