Most efficient way to implement rolling windows on NumPy arrays

Question

Given a small window, I'm trying to find the most similar window within a long sequence. I initially used SciPy's correlation filter, which was fast (under a second for a window of length 10k and a sequence of length 600k) but did not actually land on the most similar window.

Now I'm using a for-loop to find the window with the least MSE, but the code is painfully slow!

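import numpy as np

# `window` is the query window; `training_data` is the full sequence.
min_mse = np.inf
min_mse_idx = None
# The +1 makes sure the final window position is also checked.
for i in range(len(training_data) - self.window_size + 1):
  real_window = training_data[i:i + self.window_size]
  mse = np.mean(np.square(window - real_window))
  if mse < min_mse:
    min_mse = mse
    min_mse_idx = i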

Does NumPy, SciPy, or any other Python library provide a more efficient way of solving this problem? The sequence is a NumPy array of shape (600000, 16) and the windows are usually (10000, 16).

Seems very relevant - [`Compute mean squared, absolute deviation and custom similarity measure - Python/NumPy`](https://stackoverflow.com/questions/41330517/compute-mean-squared-absolute-deviation-and-custom-similarity-measure-python). — Divakar, Dec 22 '19 at 10:23
The solutions offered in that question work best for images and not long sequences. — Mohammed Farahmand, Dec 22 '19 at 19:44
The only difference I see is that you won't be traversing along the width. So, the math stays the same, just the number of axes would be one less. — Divakar, Dec 22 '19 at 19:59
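
To make the comment above concrete, here is a rough sketch of how that math might carry over to a 2-D sequence that is only traversed along its first axis. It relies on the expansion sum((w - x)^2) = sum(w^2) - 2*sum(w*x) + sum(x^2): the cross term is a cross-correlation, computed here with `scipy.signal.fftconvolve`, and the per-slice sum of squares comes from a prefix sum. The function name `sliding_mse` and its arguments are made up for illustration, and this is not necessarily what the linked answers do. Because of FFT rounding, the values are approximate.

```python
import numpy as np
from scipy.signal import fftconvolve


def sliding_mse(sequence, window):
    """MSE between `window` and every same-length slice of `sequence`.

    Hypothetical helper: `sequence` has shape (N, C), `window` has
    shape (W, C) with W <= N. Returns an array of length N - W + 1.
    """
    sequence = np.asarray(sequence, dtype=np.float64)
    window = np.asarray(window, dtype=np.float64)
    w, channels = window.shape

    # Term 1: sum of squares of the query window (a scalar).
    window_sq = np.sum(window ** 2)

    # Term 2: sum of squares of each slice, from a prefix sum over rows.
    row_sq = np.sum(sequence ** 2, axis=1)
    prefix = np.concatenate(([0.0], np.cumsum(row_sq)))
    slice_sq = prefix[w:] - prefix[:-w]

    # Term 3: cross term sum(window * slice). Correlation is convolution
    # with the kernel reversed along the time axis; channels are summed.
    cross = fftconvolve(sequence, window[::-1], mode="valid", axes=0).sum(axis=1)

    mse = (slice_sq - 2.0 * cross + window_sq) / (w * channels)
    # Rounding can push tiny values slightly below zero.
    return np.maximum(mse, 0.0)
```

With the names from the question, the best match would then be `min_mse_idx = int(np.argmin(sliding_mse(training_data, window)))`. If exactness matters, one option is to recompute the MSE directly for the few lowest-scoring candidates.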

