I have the following data:
data = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.5, 0.2, 0.1, -0.1,
        -0.2, -0.3, -0.4, -0.5, -0.6, -0.7, -0.9, -1.2, -0.1, -0.7]
Every time a data point differs from the last recorded value by more than the step size, I want to record it. If it does not, I want to keep the old value until the cumulative change exceeds the step size. I achieve this iteratively like so:
import numpy as np
import pandas as pd

step = 0.5
df_steps = pd.Series(data)  # recorded values, overwritten in place
df = df_steps.copy()        # untouched copy of the original values

yesterday = None
for today in df_steps.index:
    if yesterday is not None:
        # Compare today's original value with the last recorded value.
        if abs(df.loc[today] - df_steps.loc[yesterday]) > step:
            df_steps.loc[today] = df.loc[today]
        else:
            df_steps.loc[today] = df_steps.loc[yesterday]
    yesterday = today
My final result is:
[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.7, 0.7, 0.7, 0.7, 0.1, 0.1, 0.1, 0.1, 0.1, -0.5, -0.5, -0.5, -0.5, -1.2, -0.1, -0.7]
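For checking candidate solutions against this result, the same rule can be written as a standalone function on a NumPy array. This is only a sketch of the loop above, not a vectorized answer; the function name `record_steps` is made up for illustration.

```python
import numpy as np

def record_steps(values, step):
    """Hold the last recorded value until a new value differs from it by more than `step`."""
    values = np.asarray(values, dtype=float)
    out = np.empty_like(values)
    if values.size == 0:
        return out
    last = values[0]          # the first value is always recorded
    out[0] = last
    for i in range(1, values.size):
        # Record the new value only when it moved more than `step`
        # away from the last recorded value (strict comparison).
        if abs(values[i] - last) > step:
            last = values[i]
        out[i] = last
    return out

data = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.5, 0.2, 0.1, -0.1,
        -0.2, -0.3, -0.4, -0.5, -0.6, -0.7, -0.9, -1.2, -0.1, -0.7]
result = record_steps(data, 0.5)
```

Each output element depends on the previously recorded output, not just on the input, which is what makes a direct vectorized form non-obvious.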
Problem and Question
The problem is that this relies on an explicit Python loop (I agree with the second answer here). My question is: how can I achieve the same result in a vectorized fashion?
Attempts
My attempt is the following, but it does not match the result:
(df.diff().cumsum().replace(np.nan, 0) / step).astype(int)
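One likely reason the attempt does not match: `diff().cumsum()` telescopes to each value minus the first value, so it measures change from the start of the series rather than from the last recorded value, and dividing by `step` yields bucket numbers instead of data values. A small check of that identity, offered as a sketch and not as a solution:

```python
import numpy as np
import pandas as pd

data = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.5, 0.2, 0.1, -0.1,
        -0.2, -0.3, -0.4, -0.5, -0.6, -0.7, -0.9, -1.2, -0.1, -0.7]
step = 0.5
s = pd.Series(data)

# Cumulative sum of the differences, with the leading NaN replaced by 0.
cum = s.diff().cumsum().fillna(0)

# The same quantity written directly: offset from the first value.
offset = s - s.iloc[0]

# True up to floating-point rounding.
same = np.allclose(cum, offset)

# The attempt therefore buckets the offset from the first value;
# it never resets the reference point when a value is recorded.
attempt = (cum / step).astype(int)
```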