Given a small window, I'm trying to find the most similar window within a long sequence. I initially used SciPy's correlation filter, which was pretty fast (less than a second for a window of length 10k and a sequence of length 600k) but did not actually land on the most similar window.
Now I'm using a for-loop to find the window with the least MSE, but the code is painfully slow!
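import numpy as np

min_mse = np.inf
min_mse_idx = None
# +1 so that the last full window (starting at len - window_size) is also checked
for i in range(len(training_data) - self.window_size + 1):
    # Candidate slice of the long sequence, same shape as `window`
    real_window = training_data[i:i + self.window_size]
    mse = np.mean(np.square(window - real_window))
    # Keep the start index of the best match seen so far
    if mse < min_mse:
        min_mse = mse
        min_mse_idx = i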
Does NumPy, SciPy, or any other Python library provide a more efficient way of solving this problem? The sequence is a NumPy array of shape (600000, 16) and the windows are usually (10000, 16).
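For reference, here is one direction that might work, as a rough sketch rather than a tested solution on the real data. The squared error between the window w and a slice x_i expands to sum(w^2) - 2*sum(w*x_i) + sum(x_i^2). Plain correlation only gives the middle term, which would explain why it does not land on the least-MSE window: it ignores the energy of each slice, sum(x_i^2). The middle term can come from `scipy.signal.correlate` with `method="fft"`, and the slice energy from a cumulative sum. The function name `best_window_mse` and its arguments are made up for illustration.

```python
import numpy as np
from scipy.signal import correlate


def best_window_mse(sequence, window):
    """Return (start_index, mse) of the slice of `sequence` closest to `window`.

    Hypothetical helper: both inputs are assumed to be 2-D arrays with the
    same number of columns, e.g. (600000, 16) and (10000, 16).
    """
    sequence = np.asarray(sequence, dtype=np.float64)
    window = np.asarray(window, dtype=np.float64)
    m, channels = window.shape

    # sum(x_i^2) for every slice, computed with a cumulative sum over rows.
    row_energy = np.square(sequence).sum(axis=1)
    csum = np.concatenate(([0.0], np.cumsum(row_energy)))
    slice_energy = csum[m:] - csum[:-m]  # length: len(sequence) - m + 1

    # sum(w * x_i) for every slice. With equal column counts, "valid" mode
    # yields a single output column, i.e. the sum over all channels.
    cross = correlate(sequence, window, mode="valid", method="fft")[:, 0]

    # sum(w^2) is a constant shared by every slice.
    window_energy = np.square(window).sum()

    sse = slice_energy - 2.0 * cross + window_energy
    # FFT round-off can push tiny values slightly below zero.
    mse = np.maximum(sse, 0.0) / (m * channels)

    idx = int(np.argmin(mse))
    return idx, float(mse[idx])
```

Because the FFT introduces floating-point round-off, slices with nearly identical MSE may be ranked differently than in the loop; re-checking the top few candidates with the exact `np.mean(np.square(...))` would guard against that.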