
Currently I am accessing multiple slices as follows:

First, I allocate an array that will be reassigned many times:

X = np.zeros( (batch_size, window, 5) )

This is the assignment loop that will be run multiple times (batch_indices has different indices each time but the same shape):

for i, b in enumerate(batch_indices):
    X[i] = Xs[b:b+window]

Is there a more efficient way? I feel like there should be syntax similar to:

X = Xs[ [slice(b,b+window) for b in batch_indices] ]

Xs is 2-dimensional, but X should end up as a 3-dimensional np.array. Think of it this way: Xs is one long multi-dimensional time series, and X needs to be a numpy array containing many slices of that time series.

jamis

1 Answer


Approach #1

One vectorized approach is to build all the sliding-window indices at once and index into Xs with them:

X = Xs[np.asarray(batch_indices)[:,None] + np.arange(window)]
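To see why this works, here is a minimal sketch with made-up sizes (n_steps, n_features and the particular batch_indices are assumptions for illustration, not values from the question). The column of start offsets broadcasts against the row of in-window offsets to give a 2-D index array, and indexing Xs with it yields the 3-D result. The original loop is included only as a reference to compare against.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_steps, n_features, window = 20, 5, 4
Xs = np.arange(n_steps * n_features, dtype=float).reshape(n_steps, n_features)
batch_indices = [0, 3, 10]

# Start offsets as a column (3, 1) plus in-window offsets as a row (4,)
# broadcast to an index array of shape (3, 4).
idx = np.asarray(batch_indices)[:, None] + np.arange(window)

# Integer-array indexing on axis 0 gives shape (3, 4, n_features).
X = Xs[idx]

# Reference result from the loop in the question.
X_loop = np.zeros((len(batch_indices), window, n_features))
for i, b in enumerate(batch_indices):
    X_loop[i] = Xs[b:b + window]

print(X.shape, np.array_equal(X, X_loop))
```

Note that every start index b must satisfy b + window <= len(Xs); otherwise the integer indexing raises an IndexError, whereas the slicing loop would fail on a shape mismatch instead.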

Approach #2

A more memory-efficient approach is to create the sliding windows as a view with np.lib.stride_tricks.as_strided. This avoids building the sliding-window index array used in the previous approach; you then simply index the windowed view with batch_indices:

X = strided_axis0(Xs,window)[np.asarray(batch_indices)]

The strides-based function strided_axis0 is from here.
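Since the linked definition is not reproduced here, the following is a sketch of what such a helper could look like, assuming a 2-D input and windows taken along axis 0. It may differ in detail from the linked version. The writeable=False argument is an added safety choice, because the windows overlap in memory.

```python
import numpy as np

def strided_axis0(a, L):
    """Return a read-only view of all length-L windows along axis 0 of a 2-D array."""
    n_windows = a.shape[0] - L + 1
    n_cols = a.shape[1]
    s0, s1 = a.strides
    # Window k starts k rows into `a`, so both the window axis and the
    # within-window axis step by one row stride (s0).
    return np.lib.stride_tricks.as_strided(
        a, shape=(n_windows, L, n_cols), strides=(s0, s0, s1), writeable=False
    )

# Hypothetical data, for illustration only.
window = 4
Xs = np.arange(100, dtype=float).reshape(20, 5)
batch_indices = [0, 3, 10]

# The view itself copies nothing; the integer-array indexing copies
# only the selected windows into a new (3, 4, 5) array.
X = strided_axis0(Xs, window)[np.asarray(batch_indices)]

# Reference result from the loop in the question.
X_loop = np.stack([Xs[b:b + window] for b in batch_indices])
print(X.shape, np.array_equal(X, X_loop))
```

as_strided does no bounds checking on the shape and strides it is given, so the window count has to be computed carefully, as above.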

Divakar