
Is there a more efficient way in pandas, NumPy, or another library to find and return the indexes of local maxima that also exceed a threshold and are separated by a minimum distance, called local_max below? The code I have seems to work, but it is not very clean.

import pandas as pd
import numpy as np

np.random.seed(101)
df = pd.DataFrame(np.random.randn(20, 1), columns=['surge'])

    surge
0   2.706850
1   0.628133
2   0.907969
3   0.503826
4   0.651118
5   -0.319318
6   -0.848077
7   0.605965
8   -2.018168
9   0.740122
10  0.528813
11  -0.589001
12  0.188695
13  -0.758872
14  -0.933237
15  0.955057
16  0.190794
17  1.978757
18  2.605967
19  0.683509

surge_threshold = .7
local_max = 5  # separate results exceeding the surge threshold by at least 5 rows and keep the highest local surge value

# look for a surge
df.dropna(inplace=True)
i = df.first_valid_index()
markers = []
while i <= df.index[-1]:  # <= so that the last row is checked too
    if df.loc[i,'surge'] > surge_threshold:
        # check if markers is an empty list
        if markers:
            if (i - markers[-1] < local_max):
                if (df.loc[i,'surge'] >= df.loc[markers[-1],'surge']):
                    markers[-1] = i
            else:
                markers.append(i)
        else:
            markers.append(i)
    i += 1
print(markers)

The result for this df is [0, 9, 18], because those are the indexes of the local maxima that exceed .7 and are separated by at least 5 rows.
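
One way the loop above might be tidied is sketched here. The helper name `scan_markers` and its parameter names are made up for illustration, and the values are typed in from the table above instead of being regenerated from the seed. It pre-selects the rows above the threshold with `np.flatnonzero` and then applies the same left-to-right rule. It works on positions, so gaps left in the index by `dropna` should not matter.

```python
import numpy as np
import pandas as pd

# Values copied from the table in the question.
surge = pd.Series(
    [2.706850, 0.628133, 0.907969, 0.503826, 0.651118,
     -0.319318, -0.848077, 0.605965, -2.018168, 0.740122,
     0.528813, -0.589001, 0.188695, -0.758872, -0.933237,
     0.955057, 0.190794, 1.978757, 2.605967, 0.683509],
    name="surge",
)


def scan_markers(values, threshold, min_gap):
    """Hypothetical helper: left-to-right scan, same rule as the loop above."""
    values = np.asarray(values, dtype=float)
    # Positions of all rows strictly above the threshold.
    candidates = np.flatnonzero(values > threshold)
    markers = []
    for pos in candidates:
        if markers and pos - markers[-1] < min_gap:
            # Too close to the previous marker: keep whichever is higher.
            if values[pos] >= values[markers[-1]]:
                markers[-1] = pos
        else:
            markers.append(pos)
    return [int(p) for p in markers]


positions = scan_markers(surge.to_numpy(), threshold=0.7, min_gap=5)
labels = surge.index[positions].tolist()  # map positions back to index labels
print(labels)  # [0, 9, 18]
```

The Python loop now only visits rows above the threshold, but the outcome still depends on scanning from the start, exactly as in the original code.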

beandip
  • Is there any reason why 0 should be included? – Scott Boston Nov 25 '19 at 19:33
  • This is not quite well-defined: You'd get different results if you looped from the end instead of the start. You could define your expected behavior so that it's what you get by your current approach, but that's still a smell, so is this really what you want? – fuglede Nov 25 '19 at 19:42
  • Hi, sorry, pretty bad I know. Just learning. I put in a better version of my code that doesn't start with 0 and runs the same forward and backward. – beandip Nov 25 '19 at 20:54
  • There are quite a few pretty decent answers to similar questions on SO. E.g. https://stackoverflow.com/a/35289406/11610186 – Grzegorz Skibinski Nov 25 '19 at 21:19
  • I tried to make argrelextrema work. It's not really for finance apps. It takes more code to massage the input and output, including mapping indexes back to the indexes of the df. It doesn't take a threshold as input, so the number of maxima is based on distance and not dollars. I agree it could be made to work for my case. – beandip Nov 25 '19 at 22:09
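
If SciPy is an option, `scipy.signal.find_peaks` may be a closer fit than `argrelextrema`, since it accepts both a `height` threshold and a minimum `distance` between peaks. Here is a rough sketch, with the values typed in from the table in the question. Two differences from the loop in the question are worth flagging: it only reports interior peaks, so row 0 is not returned, and when peaks are closer than `distance` it drops the smaller ones first, so the result should not depend on the direction of the scan.

```python
import pandas as pd
from scipy.signal import find_peaks

# Values copied from the table in the question.
surge = pd.Series(
    [2.706850, 0.628133, 0.907969, 0.503826, 0.651118,
     -0.319318, -0.848077, 0.605965, -2.018168, 0.740122,
     0.528813, -0.589001, 0.188695, -0.758872, -0.933237,
     0.955057, 0.190794, 1.978757, 2.605967, 0.683509],
    name="surge",
)

# height: minimum peak value; distance: minimum spacing in rows between peaks.
peaks, properties = find_peaks(surge.to_numpy(), height=0.7, distance=5)

# find_peaks returns positions; map them back to the index labels.
markers = surge.index[peaks].tolist()
print(markers)  # [2, 9, 18]
```

Here row 15 is dropped because it is within 5 rows of the higher peak at row 18, and row 2 is picked up as the first interior peak above the threshold.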
