TL;DR: Is there anyway I can get rid of my second for
-loop?
I have a time series of points on a 2D-grid. To get rid of fast fluctuations of their position, I average the coordinates over a window of frames. Now in my case, it's possible for the points to cover a larger distance than usual. I don't want to include frames for a specific point, if it travels farther than the cut_off
value.
In the first for
-loop, I go over all frames and define the moving window. I then calculate the distances between the current frame and each frame in the moving window. After I grab only those positions from all frames, where both the x
and y
component did not travel farther than cut_off
. Now I want to calculate the mean positions for every point from all these selected frames of the moving window (note: the number of selected frames can be smaller than n_window
). This leads me to the second for
-loop. Here I iterate over all points and actually grab the positions from the frames, in which the current point did not travel farther than cut_off
. From these selected frames I calculate the mean value of the coordinates and use it as the new value for the current frame.
This very last for
-loop slows down the whole processing. I can't come up with a better way to accomplish this calculation. Any suggestions?
MWE
Put in comments for clarification.
import numpy as np
# Generate a timeseries with 1000 frames, each
# containing 50 individual points defined by their
# x and y coordinates
n_frames = 1000
n_points = 50
n_coordinates = 2
timeseries = np.random.randint(-100, 100, [n_frames, n_points, n_coordinates])
# Set window size to 20 frames
n_window = 20
# Distance cut off
cut_off = 60
# Set up empty array to hold results
avg_data_store = np.zeros([n_frames, timeseries.shape[1], 2])
# Iterate over all frames
for frame in np.arange(0, n_frames):
# Set the frame according to the window size that we're looking at
t_before = int(frame - (n_window / 2))
t_after = int(frame + (n_window / 2))
# If we're trying to access frames below 0, set the lowest one to 0
if t_before < 0:
t_before = 0
# Trying to access frames that are not in the trajectory, set to last frame
if t_after > n_frames - 1:
t_after = n_frames - 1
# Grab x and y coordinates for all points in the corresponding window
pos_before = timeseries[t_before:frame]
pos_after = timeseries[frame + 1:t_after + 1]
pos_now = timeseries[frame]
# Calculate the distance between the current frame and the windows before/after
d_before = np.abs(pos_before - pos_now)
d_after = np.abs(pos_after - pos_now)
# Grab indices of frames+points, that are below the cut off
arg_before = np.argwhere(np.all(d_before < cut_off, axis=2))
arg_after = np.argwhere(np.all(d_after < cut_off, axis=2))
# Iterate over all points
for i in range(0, timeseries.shape[1]):
# Create temp array
temp_stack = pos_now[i]
# Grab all frames in which the current point did _not_
# travel farther than `cut_off`
all_before = arg_before[arg_before[:, 1] == i][:, 0]
all_after = arg_after[arg_after[:, 1] == i][:, 0]
# Grab the corresponding positions for this points in these frames
all_pos_before = pos_before[all_before, i]
all_pos_after = pos_after[all_after, i]
# If we have any frames for that point before / after
# stack them into the temp array
if all_pos_before.size > 0:
temp_stack = np.vstack([all_pos_before, temp_stack])
if all_pos_after.size > 0:
temp_stack = np.vstack([temp_stack, all_pos_after])
# Calculate the moving window average for the selection of frames
avg_data_store[frame, i] = temp_stack.mean(axis=0)