I want to calculate the maximum of a column over a sliding window. This solution provides a remedy that works when the index is numeric, but my index is a DatetimeIndex. I do not want to create a new index column to work around this. Instead, I want to solve it directly with rolling because my data is large. I also do not want to use a for loop. Here is a reproducible example:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

np.random.seed(0)
dtidx = pd.date_range(start='6/1/2022', end='6/2/2022', freq='min')
df = pd.DataFrame(np.cumsum(np.random.randn(len(dtidx))), index=dtidx)
ax = df.plot(figsize=(20, 4))
window = 60
df.rolling(window).max().plot(ax=ax)
plt.show()
df.rolling(window).apply(lambda x: x.idxmax())

[Figure: the signal with its 60-minute rolling maximum overlaid]

My question

How can I pick out only the local maxima and mark them on the graph? Currently, the rolling maximum shows each peak with a 60-minute delay.

Saeed
  • Do you want the position of the maximums or the datetime at which the maximum occurred? – Onyambu Jun 06 '22 at 17:19
  • @onyambu: I want the datetime at which each maximum happens so that I can plot them on the underlying graph. To me, the position and the datetime at which a maximum happens are the same. – Saeed Jun 06 '22 at 17:24
  • in that case does `df.rolling(60).apply(lambda x: x.reset_index(drop = True).idxmax())` help? – Onyambu Jun 06 '22 at 17:26
  • @onyambu: No. It gives me a weird answer. It returns `2022-06-01 23:59:00 28.0 2022-06-02 00:00:00 27.0` while the original signal is `2022-06-01 23:59:00 -32.199537 2022-06-02 00:00:00 -32.052102`. Also, this new dataframe has the same length as the original one. However, it should have fewer than 20 points. – Saeed Jun 06 '22 at 17:40
  • No, no, don't look at the indices but rather at the values within. Let me edit it for you. – Onyambu Jun 06 '22 at 17:41
  • @onyambu: I am looking at the values. I have included them in my previous comment. They are not right. I edited my previous comment. Please take a look at it. – Saeed Jun 06 '22 at 17:45
  • Check the indices it gives: `df.iloc[df.rolling(60).apply(lambda x: x.reset_index(drop = True).idxmax()).dropna().astype(int).to_numpy().flatten()].index` – Onyambu Jun 06 '22 at 17:45
  • @onyambu: please check your answers and run them. This new line gives 1382 indices which is not right. – Saeed Jun 06 '22 at 17:48
  • The size should be 1382. Note that the window is 60, so the number of values will be 1441-60+1. Even `df.rolling(window).max().dropna().size` should give you 1382. – Onyambu Jun 06 '22 at 17:52
  • @onyambu: I think you are misunderstanding the problem. Just take a look at the picture: we have fewer than 20 local maxima. I want their coordinates. What you are giving me is the orange curve that I have in my picture! – Saeed Jun 06 '22 at 17:56
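
To make the disagreement in the comments concrete, here is a minimal sketch that reuses the question's setup (the names `rolled` and `n_valid` are only illustrative). It shows why a plain `rolling(window)` result has 1441 - 60 + 1 = 1382 non-NaN rows: it produces one value per window position, not one value per peak.

```python
import numpy as np
import pandas as pd

# Same data as in the question: a random walk sampled every minute.
np.random.seed(0)
dtidx = pd.date_range(start='6/1/2022', end='6/2/2022', freq='min')
df = pd.DataFrame(np.cumsum(np.random.randn(len(dtidx))), index=dtidx)

window = 60
rolled = df.rolling(window).max()

# The first window - 1 rows are NaN because the window is not yet full.
n_valid = int(rolled.notna().sum().iloc[0])
print(len(df), n_valid)
```

So a rolling aggregate on its own cannot return "fewer than 20 points"; some extra step is needed to keep only the rows where the signal itself is the maximum of its neighbourhood.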

1 Answer

According to this solution you can do the following:

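from scipy.signal import argrelextrema
import numpy as np
import matplotlib.pyplot as plt

window = 20

# Rows that are >= every neighbour within `window` samples on each side
maxima = df.iloc[argrelextrema(df.values, np.greater_equal, order=window)[0]]
fig, ax = plt.subplots(figsize=(20, 4))
ax.scatter(maxima.index, maxima.values, color='r')
ax.plot(df.index, df.values, color='b')
plt.show()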

[Figure: the signal in blue with the detected local maxima marked in red]
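
For readers who would rather stay within `rolling`, as the question originally asked, one possible sketch is to compare each value with a centered rolling maximum and keep the rows where the two are equal. This is not the method from the linked solution. It assumes regularly spaced samples, so that an integer window is meaningful, and the names `order`, `centered_max`, `is_peak` and `peaks` are hypothetical. It is intended to mirror the `np.greater_equal` comparison with `order=window` used above.

```python
import numpy as np
import pandas as pd

# Same data as in the question, kept as a Series for simplicity.
np.random.seed(0)
dtidx = pd.date_range(start='6/1/2022', end='6/2/2022', freq='min')
s = pd.Series(np.cumsum(np.random.randn(len(dtidx))), index=dtidx)

order = 20  # number of samples to look at on each side of a point

# Maximum over the 2*order + 1 samples centered on each point;
# min_periods=1 lets the window shrink at the two ends of the series.
centered_max = s.rolling(2 * order + 1, center=True, min_periods=1).max()

# A point is kept when it equals the maximum of its own neighbourhood.
is_peak = s == centered_max
peaks = s[is_peak]  # still indexed by the original DatetimeIndex
print(peaks)
```

Because `peaks` keeps the DatetimeIndex, it can be drawn on the existing axes with something like `ax.scatter(peaks.index, peaks.values, color='r')`, with no extra index column and no explicit loop.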

Saeed