Pandas find idxmax() between range

Question

I have this timeseries df:

                    Current
2018-09-01 00:00      -0.01
2018-09-01 00:01      -0.03
2018-09-01 00:02      -0.01
2018-09-01 00:03       0.03
2018-09-01 00:04      -0.02
2018-09-01 00:05      -0.04
2018-09-01 00:06       0.05

I am trying to find the first instance of a Current value being > 0.01. If I use

findValue = (df['Current'] > 0.01).idxmax()

I get:

2018-09-01 00:03 0.03

However, I would like to ignore the first 5 rows, so that the result is

 2018-09-01 00:06       0.05

I have tried using shift():

findValue = (df['Current'] > 0.01).shift(5).idxmax()

But this doesn't seem right...

jezrael · Accepted Answer · 2019-01-22T10:16:59.627

You can use iloc to select all rows except the first 5 by position:

N = 5
findValue = (df['Current'].iloc[N:] > 0.01).idxmax()
print (findValue)
2018-09-01 00:06
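
To reproduce this end to end, here is a minimal sketch that rebuilds the sample frame from the question. The one-minute DatetimeIndex and the name `first_label` are assumptions for illustration; the question does not show how the frame was created. It also looks up the matching value, since the question asks for the row and `idxmax` returns only the label.

```python
import pandas as pd

# Rebuild the sample data from the question (a DatetimeIndex is assumed here).
idx = pd.date_range("2018-09-01 00:00", periods=7, freq="min")
df = pd.DataFrame(
    {"Current": [-0.01, -0.03, -0.01, 0.03, -0.02, -0.04, 0.05]},
    index=idx,
)

N = 5  # number of leading rows to ignore

# Boolean mask over the rows from position N onward.
mask = df["Current"].iloc[N:] > 0.01

# idxmax returns the label of the first True in the mask.
first_label = mask.idxmax()

# Use the label to fetch the value at that timestamp.
print(first_label, df.loc[first_label, "Current"])
```

Because `idxmax` returns an index label rather than a position, the result can be passed straight to `df.loc` even though the mask was built on a slice.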

Another idea is to build a second boolean mask from np.arange and the length of the DataFrame, then chain the two masks with &:

import numpy as np

# m1: values above the threshold; m2: rows at position 5 or later
m1 = df['Current'] > 0.01
m2 = np.arange(len(df)) >= 5
findValue = (m1 & m2).idxmax()
print(findValue)
2018-09-01 00:06

If you need to select by a value in the DatetimeIndex instead:

findValue = (df['Current'].loc['2018-09-01 00:05':] > 0.01).idxmax()
print (findValue)
2018-09-01 00:06:00

m1 = df['Current'] > 0.01
m2 = df.index >= '2018-09-01 00:05'
findValue = (m1 & m2).idxmax()
print (findValue)
2018-09-01 00:06:00

BUT:

If no value matches, the mask is all False, so idxmax returns the label of the first False value, which is the first row:

m1 = df['Current'] > 5.01
m2 = np.arange(len(df)) >= 5
findValue = (m1 & m2).idxmax()

print (findValue)
2018-09-01 00:00:00

A possible solution is to use next with iter, which falls back to a default when nothing matches:

m1 = df['Current'] > 5.01
m2 = np.arange(len(df)) >= 5
findValue = next(iter(df.index[m1 & m2]), 'no exist')

print (findValue)
no exist
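
An alternative guard is to check the mask with `any()` before asking for the position of the first True. The sketch below wraps that in a helper; the names `first_match_after`, `skip` and `default` are hypothetical and not part of pandas, and the sample series is rebuilt from the question with an assumed one-minute DatetimeIndex.

```python
import numpy as np
import pandas as pd

def first_match_after(series, threshold, skip, default=None):
    """Return the index label of the first value > threshold at position
    >= skip, or `default` when no such value exists (hypothetical helper)."""
    # Combine the value condition with a positional condition.
    mask = (series > threshold).to_numpy() & (np.arange(len(series)) >= skip)
    # An all-False mask would make argmax return 0, so check first.
    if not mask.any():
        return default
    # argmax on a boolean array gives the position of the first True.
    return series.index[int(mask.argmax())]

# Sample data from the question (DatetimeIndex assumed).
s = pd.Series(
    [-0.01, -0.03, -0.01, 0.03, -0.02, -0.04, 0.05],
    index=pd.date_range("2018-09-01 00:00", periods=7, freq="min"),
    name="Current",
)

print(first_match_after(s, 0.01, skip=5))
print(first_match_after(s, 5.01, skip=5, default="no exist"))
```

Returning an explicit default keeps the "no match" case distinguishable from a genuine match on the first row, which plain `idxmax` cannot do.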

If performance is important, see this nice Q/A by @jpp: Efficiently return the index of the first value satisfying condition in array.

Thanks. For the first solution, how can I get just the index of the 5th row (Start of the idxmax range: `2018-09-01 00:04`?) — warrenfitzhenry, Jan 22 '19 at 11:00
@wazzahenry - use `df.index[4]`, because Python counts from `0` — jezrael, Jan 22 '19 at 11:09
