42

Lets say I have a Python Numpy array a.

a = numpy.array([1,2,3,4,5,6,7,8,9,10,11])

I want to create a matrix of sub sequences from this array of length 5 with stride 3. The results matrix hence will look as follows:

numpy.array([[1,2,3,4,5],[4,5,6,7,8],[7,8,9,10,11]])

One possible way of implementing this would be using a for-loop.

result_matrix = np.zeros((3, 5))
for i in range(0, len(a), 3):
  result_matrix[i] = a[i:i+5]

Is there a cleaner way to implement this in Numpy?

kafman
  • 2,862
  • 1
  • 29
  • 51
Stackd
  • 683
  • 1
  • 6
  • 12

3 Answers3

55

Approach #1 : Using broadcasting -

def broadcasting_app(a, L, S ):  # Window len = L, Stride len/stepsize = S
    nrows = ((a.size-L)//S)+1
    return a[S*np.arange(nrows)[:,None] + np.arange(L)]

Approach #2 : Using more efficient NumPy strides -

def strided_app(a, L, S ):  # Window len = L, Stride len/stepsize = S
    nrows = ((a.size-L)//S)+1
    n = a.strides[0]
    return np.lib.stride_tricks.as_strided(a, shape=(nrows,L), strides=(S*n,n))

Sample run -

In [143]: a
Out[143]: array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [144]: broadcasting_app(a, L = 5, S = 3)
Out[144]: 
array([[ 1,  2,  3,  4,  5],
       [ 4,  5,  6,  7,  8],
       [ 7,  8,  9, 10, 11]])

In [145]: strided_app(a, L = 5, S = 3)
Out[145]: 
array([[ 1,  2,  3,  4,  5],
       [ 4,  5,  6,  7,  8],
       [ 7,  8,  9, 10, 11]])
Divakar
  • 218,885
  • 19
  • 262
  • 358
  • Thanks, I tried this : X = np.arange(100) Y = strided_app(X, 4, 1) Which gives Y as expected, and now : Z = strided_app(Y, 8, 4)# I want Z to view Y with a moving window of length 8 and step 4, but this results in junk. Can you please correct? – volatile Jan 19 '17 at 15:28
  • @volatile I am not sure what you are expecting with `Z`? Would `Z = strided_app(X, 8, 4)` give the desired `Z`? Are you expecting `Z` as 3D array instead? – Divakar Jan 19 '17 at 16:42
  • Do you also get a crazy result from `strided_app(np.array([np.arange(6)]),3,1)`? I'm using python 3.6 and numpy 1.13.1 in Anaconda. – Ziofil Sep 12 '17 at 05:11
  • @Ziofil `strided_app` assumes 1D array. `np.array([np.arange(6)])` is a 2D array. Not sure why you need to wrap it with `np.array()` as `np.arange(6)` is already an array. What exactly are you trying to do with `np.array()`? – Divakar Sep 12 '17 at 05:14
  • Nothing in particular, I noticed the overflows and I thought you might find it amusing. – Ziofil Sep 12 '17 at 05:37
  • 3
    I have used `as_strided` previously but found that it caused a very serious memory leak. This isn't an issue for small arrays but even using 64 GB of RAM on a server, my python programs raised MemoryError. Highly recommend using the `broadcasting_app` method. – pacificgilly1992 Dec 09 '17 at 08:27
  • 1
    Dude this is so automagical!. I was implementing Shi-Tomasi corner detection algo where I had to create a window for each pixel and compute something complex. This method immediately gave me all the windows!!! – insanely_sin Nov 18 '18 at 09:07
  • as_strided is not recommended by the developers of numpy for general usage see notes here https://docs.scipy.org/doc/numpy/reference/generated/numpy.lib.stride_tricks.as_strided.html – kkawabat Oct 30 '19 at 18:04
  • 1
    @kkawabat They are simply saying that we need to be careful when using it, understanding what it does. That `writeable` flag could be added to on the safer side. Modules like `scikit-image` also [uses `as_strided`](https://github.com/scikit-image/scikit-image/blob/master/skimage/util/shape.py#L98). – Divakar Oct 30 '19 at 18:09
  • @Divakar There is a line at the bottom of the reference that I linked, "For these reasons it is advisable to avoid as_strided when possible.". I guess "general usage" is vague but if a method requires care in usage it should be noted for the general audience who might stumble upon the code. – kkawabat Oct 30 '19 at 18:14
  • @Divakar can you please explain me the broadcasting_app part – Yami Kanashi Apr 02 '20 at 23:56
  • Thank you @Divakar, your function helped quite a bit. I'd suggest checking to ensure the ndarray is contiguous and setting the output as not writable, or you could run into problems with people modifying it. I added an alternative solution below. – Marie R Apr 05 '20 at 18:01
  • @Divakar: For the `strided_app`, is it a typo in `n = a.strides[0]`. Should it be `n = a.strides[1]` ? – Andy L. Sep 18 '20 at 00:28
  • 1
    @AndyL. Well input array is 1D, so `n = a.strides[0]` is good. – Divakar Sep 18 '20 at 05:35
  • ah, I missed that. I tested it on 2-D array, that's why I thought it is a typo :) – Andy L. Sep 18 '20 at 08:40
  • Thanks @Divakar for your comments – Ranjan Pal May 02 '21 at 16:41
10

Starting in Numpy 1.20, we can make use of the new sliding_window_view to slide/roll over windows of elements.

And coupled with a stepping [::3], it simply becomes:

from numpy.lib.stride_tricks import sliding_window_view

# values = np.array([1,2,3,4,5,6,7,8,9,10,11])
sliding_window_view(values, window_shape = 5)[::3]
# array([[ 1,  2,  3,  4,  5],
#        [ 4,  5,  6,  7,  8],
#        [ 7,  8,  9, 10, 11]])

where the intermediate result of the sliding is:

sliding_window_view(values, window_shape = 5)
# array([[ 1,  2,  3,  4,  5],
#        [ 2,  3,  4,  5,  6],
#        [ 3,  4,  5,  6,  7],
#        [ 4,  5,  6,  7,  8],
#        [ 5,  6,  7,  8,  9],
#        [ 6,  7,  8,  9, 10],
#        [ 7,  8,  9, 10, 11]])
Xavier Guihot
  • 54,987
  • 21
  • 291
  • 190
0

Modified version of @Divakar's code with checking to ensure that memory is contiguous and that the returned array cannot be modified. (Variable names changed for my DSP application).

def frame(a, framelen, frameadv):
"""frame - Frame a 1D array
a - 1D array
framelen - Samples per frame
frameadv - Samples between starts of consecutive frames
   Set to framelen for non-overlaping consecutive frames

Modified from Divakar's 10/17/16 11:20 solution:
https://stackoverflow.com/questions/40084931/taking-subarrays-from-numpy-array-with-given-stride-stepsize

CAVEATS:
Assumes array is contiguous
Output is not writable as there are multiple views on the same memory

"""

if not isinstance(a, np.ndarray) or \
   not (a.flags['C_CONTIGUOUS'] or a.flags['F_CONTIGUOUS']):
    raise ValueError("Input array a must be a contiguous numpy array")

# Output
nrows = ((a.size-framelen)//frameadv)+1
oshape = (nrows, framelen)

# Size of each element in a
n = a.strides[0]

# Indexing in the new object will advance by frameadv * element size
ostrides = (frameadv*n, n)
return np.lib.stride_tricks.as_strided(a, shape=oshape,
                                       strides=ostrides, writeable=False)
Marie R
  • 270
  • 3
  • 9