2

When using np.lib.stride_tricks.as_strided, how can I handle a 2D array whose data values are nested arrays? Is there a preferred, efficient approach?

Specifically, I have a 2D np.array like the following, where each row is an array of length 2:

[[1., 2.],[3., 4.],[5.,6.],[7.,8.],[9.,10.]...]

I want to reshape for rolling over as follows:

[[[1., 2.],[3., 4.],[5.,6.]],
 [[3., 4.],[5.,6.],[7.,8.]],
 [[5.,6.],[7.,8.],[9.,10.]],
  ...
]

I have looked at similar answers (e.g. this rolling window function); however, when I use them I cannot keep the inner arrays/tuples intact.

For example with a window length of 3: I have tried a shape of (len(seq)+3-1, 3, 2) and a stride of (2 * 8, 2 * 8, 8), but no luck. Maybe I am missing something obvious?

Cheers.


EDIT: It is easy to produce a functionally identical solution using Python built-ins (which can be optimised using e.g. np.arange similar to Divakar's solution), however, what about using as_strided? From my understanding, this could be used for a highly efficient solution?


3 Answers

4

IIUC you could do something like this -

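import numpy as np

def rolling_window2D(a, n):
    # a: 2D input array
    # n: group/sliding window length
    # Broadcast window start rows against in-window offsets to index all windows at once.
    return a[np.arange(a.shape[0]-n+1)[:,None] + np.arange(n)]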
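To see why the indexing expression works, it can help to print the index grid that broadcasting builds. This is only a minimal sketch; the names `starts`, `offsets`, `idx` and `windows` are illustrative and not part of the function above.

```python
import numpy as np

a = np.arange(1, 11).reshape(5, 2)   # same sample data as in the run below
n = 3                                # window length

starts = np.arange(a.shape[0] - n + 1)[:, None]  # column vector: first row of each window
offsets = np.arange(n)                           # row vector: offsets inside a window
idx = starts + offsets                           # broadcasts to shape (3, 3)
print(idx)
# [[0 1 2]
#  [1 2 3]
#  [2 3 4]]

windows = a[idx]                     # integer-array indexing along axis 0
print(windows.shape)                 # (3, 3, 2)
print(np.shares_memory(a, windows))  # False: advanced indexing returns a copy
```

Since the result is a copy, memory use grows with the window length, which is the main practical difference from a strided view.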

Sample run -

In [110]: a
Out[110]: 
array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])

In [111]: rolling_window2D(a,3)
Out[111]: 
array([[[ 1,  2],
        [ 3,  4],
        [ 5,  6]],

       [[ 3,  4],
        [ 5,  6],
        [ 7,  8]],

       [[ 5,  6],
        [ 7,  8],
        [ 9, 10]]])
Divakar
  • Thanks, this is functionally correct! Performance-wise, however, is this not inferior to an `as_strided` solution like the one I was attempting? It should clearly be faster than any built-in `range` solution, though. – AskingForAFriend Aug 29 '16 at 11:14
  • @Kappers Well I haven't really played around with `strides` much, so I can't comment on the performance aspect. So, at the least consider this as an alternative to strides in case `strides` method isn't working out for you, that's what I gathered from the question. – Divakar Aug 29 '16 at 11:17
2

What was wrong with your as_strided trial? It works for me.

In [28]: x=np.arange(1,11.).reshape(5,2)
In [29]: x.shape
Out[29]: (5, 2)
In [30]: x.strides
Out[30]: (16, 8)
In [31]: np.lib.stride_tricks.as_strided(x,shape=(3,3,2),strides=(16,16,8))
Out[31]: 
array([[[  1.,   2.],
        [  3.,   4.],
        [  5.,   6.]],

       [[  3.,   4.],
        [  5.,   6.],
        [  7.,   8.]],

       [[  5.,   6.],
        [  7.,   8.],
        [  9.,  10.]]])

On my first edit I used an int array, so had to use (8,8,4) for the strides.
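Because strides are given in bytes, the right numbers depend on the dtype. One way to avoid hard-coding them, sketched here with illustrative variable names (`row_step`, `item_step`), is to reuse the array's own `strides`:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

for dtype in (np.float64, np.int32):
    x = np.arange(1, 11, dtype=dtype).reshape(5, 2)
    row_step, item_step = x.strides   # (16, 8) for float64, (8, 4) for int32
    # Stepping one window forward and one row forward inside a window
    # both advance by one row of x, hence row_step appears twice.
    w = as_strided(x, shape=(3, 3, 2), strides=(row_step, row_step, item_step))
    print(dtype.__name__, x.strides, w[1].tolist())
```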

Your shape could be wrong. If too large it starts seeing values off the end of the data buffer.

   [[  7.00000000e+000,   8.00000000e+000],
    [  9.00000000e+000,   1.00000000e+001],
    [  8.19968827e-257,   5.30498948e-313]]])

Here only the display format changes; the 7, 8, 9, 10 are still there, and the last row is whatever happens to sit in memory past the end of the buffer. Writing to those slots could be dangerous and corrupt other parts of your program. as_strided is best used for read-only purposes; writes/sets are trickier.
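A minimal sketch of a guarded wrapper, assuming a 2D input and windows taken over rows. The helper name `rolling_rows` is hypothetical, and the `writeable=False` argument assumes NumPy 1.12 or later. It computes the window count as `rows - n + 1`, unlike the `len(seq)+3-1` shape in the question, so the view cannot run past the buffer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def rolling_rows(a, n):
    """Hypothetical helper: read-only view of length-n windows over the rows of a 2D array."""
    a = np.asarray(a)
    if a.ndim != 2 or not 1 <= n <= a.shape[0]:
        raise ValueError("need a 2D array and 1 <= n <= number of rows")
    num_windows = a.shape[0] - n + 1   # minus, not plus: keeps the view inside the buffer
    row_step, item_step = a.strides    # byte steps taken from the array itself
    return as_strided(a,
                      shape=(num_windows, n, a.shape[1]),
                      strides=(row_step, row_step, item_step),
                      writeable=False)  # assumes NumPy >= 1.12

x = np.arange(1, 11.).reshape(5, 2)
w = rolling_rows(x, 3)
print(w.shape)                  # (3, 3, 2)
print(np.shares_memory(x, w))   # True: a view, no data copied
```

Marking the view read-only turns an accidental write into an error instead of silent corruption; if you need to modify windows, copy them first.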

hpaulj
  • Hm, maybe I missed something very obvious... Thanks for playing around, I will verify as soon as possible! Unfortunately I have been too busy to sit down with the associated project over the last day. – AskingForAFriend Aug 31 '16 at 06:47
  • Thanks, I indeed answered my own question and was too stupid to realise - sigh. In the end, this didn't really bring any performance to the table - I'll need to revisit if using a more complex/higher dimension data structure, or maybe just larger data sets. – AskingForAFriend Aug 31 '16 at 18:17
0

Your task is similar to this one, so I adapted that solution slightly.

# Rolling window for 2D arrays in NumPy
import numpy as np

def rolling_window(a, shape):  # rolling window for 2D array
    s = (a.shape[0] - shape[0] + 1,) + (a.shape[1] - shape[1] + 1,) + shape
    strides = a.strides + a.strides
    return np.lib.stride_tricks.as_strided(a, shape=s, strides=strides)

x = np.array([[1,2],[3,4],[5,6],[7,8],[9,10],[3,4],[5,6],[7,8],[11,12]])
y = np.array([[3,4],[5,6],[7,8]])
found = np.all(np.all(rolling_window(x, y.shape) == y, axis=2), axis=2)
print(found.nonzero()[0])
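The intermediate shapes are not obvious from the one-liner, so here is the same search unrolled as a rough sketch. The variable names `shape`, `windows` and `match` are illustrative, and `writeable=False` assumes NumPy 1.12 or later.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.array([[1,2],[3,4],[5,6],[7,8],[9,10],[3,4],[5,6],[7,8],[11,12]])
y = np.array([[3,4],[5,6],[7,8]])

# Same shape and stride arithmetic as rolling_window above, written out inline.
shape = (x.shape[0] - y.shape[0] + 1, x.shape[1] - y.shape[1] + 1) + y.shape
windows = as_strided(x, shape=shape, strides=x.strides + x.strides, writeable=False)
print(windows.shape)                      # (7, 1, 3, 2)

match = (windows == y).all(axis=(2, 3))   # one boolean per window position
print(match.shape)                        # (7, 1)
print(np.argwhere(match)[:, 0])           # [1 5]: row offsets where y occurs in x
```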
FooBar167