I need to find the local maxima and minima in a pandas DataFrame. At first this looks like the same question as "Pandas finding local max and min", but neither of the solutions suggested there seems correct.

In[876]: import pandas as pd
    ...: 
    ...: df = pd.DataFrame({'data': [1, 1, 2, 2, 1, 0, 0, -2, 0]})
    ...: 
    ...: # Test 1, missing max in df.iloc[3], min correct
    ...: df['min'] = df.data[(df.data.shift(1) > df.data) & (df.data.shift(-1) > df.data)]
    ...: df['max'] = df.data[(df.data.shift(1) < df.data) & (df.data.shift(-1) < df.data)]
    ...: 
In[877]: df
Out[877]: 
   data  min  max
0     1  NaN  NaN
1     1  NaN  NaN
2     2  NaN  NaN
3     2  NaN  NaN
4     1  NaN  NaN
5     0  NaN  NaN
6     0  NaN  NaN
7    -2 -2.0  NaN
8     0  NaN  NaN
In[878]: # Test 2, max incorrect, min incorrect
    ...: # max in iloc = 3, 6
    ...: # min in iloc = 1, 7
    ...: df['min'] = df.data[(df.data.shift(1) >= df.data) & (df.data.shift(-1) > df.data)]
    ...: df['max'] = df.data[(df.data.shift(1) <= df.data) & (df.data.shift(-1) < df.data)]
    ...: 
In[879]: df
Out[879]: 
   data  min  max
0     1  NaN  NaN
1     1  1.0  NaN
2     2  NaN  NaN
3     2  NaN  2.0
4     1  NaN  NaN
5     0  NaN  NaN
6     0  NaN  0.0
7    -2 -2.0  NaN
8     0  NaN  NaN

I would like to identify the local minimums and maximums, not the plateau values. The correct identification would be:

  • maximum in iloc 2 or 3 (Doesn't matter)
  • minimum in iloc 7

One solution would be to start writing loops and if/else branches, but that gets ugly quickly. My guess is that there is a simpler pandas solution that I lack the know-how to find. Any help would be deeply appreciated.

I'm new to both Python and Stack Overflow so I hope you will forgive any newbie mistakes, cheers.

särimner
  • Why is max in position 6 incorrect? – Dani Mesejo Dec 17 '20 at 14:55
  • When you have 12344455332, are the only local max values 5 and 5? – Tarik Dec 17 '20 at 14:56
  • @DaniMesejo: The value at iloc=6 lies between the previous max and the minimum at iloc=7, so it is only a plateau on the way down. There must be a local min before a new local max. – särimner Dec 17 '20 at 15:18
  • @Tarik: Yes. The problem in my example is that plateaus (between a local max and min) are registered. Mins and maxes come in pairs, or as a single (local) max or min. (In your example there are no locals, just a global max.) – särimner Dec 17 '20 at 15:31

1 Answer

IIUC, you want to find the local max and min after collapsing each run of consecutive equal values into a single value, so do the following:

import pandas as pd

df = pd.DataFrame({'data': [1, 1, 2, 2, 1, 0, 0, -2, 0]})

# remove consecutive duplicates; .copy() avoids a SettingWithCopyWarning
# when the 'min' and 'max' columns are added below
res = df[df['data'] != df['data'].shift()].copy()

# find min and max
res['min'] = res.data[(res.data.shift(1) > res.data) & (res.data.shift(-1) > res.data)]
res['max'] = res.data[(res.data.shift(1) < res.data) & (res.data.shift(-1) < res.data)]

# put back in original df
output = pd.concat((df, res[['min', 'max']]), axis=1)
print(output)

Output

   data  min  max
0     1  NaN  NaN
1     1  NaN  NaN
2     2  NaN  2.0
3     2  NaN  NaN
4     1  NaN  NaN
5     0  NaN  NaN
6     0  NaN  NaN
7    -2 -2.0  NaN
8     0  NaN  NaN
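
The same idea can be wrapped in a small helper so it is reusable on any column. This is only a minimal sketch under a few assumptions: the name `local_extrema` is hypothetical (it is not a pandas function), a plateau is marked at its first row, and the first and last rows are never reported as extrema because they have only one neighbour.

```python
import pandas as pd


def local_extrema(series: pd.Series) -> pd.DataFrame:
    """Mark strict local minima and maxima, treating each plateau as one point.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Keep only the first value of every run of equal consecutive values.
    dedup = series[series != series.shift()]

    # Compare each remaining value with its neighbours in the de-duplicated series.
    prev, nxt = dedup.shift(1), dedup.shift(-1)
    is_min = (prev > dedup) & (nxt > dedup)
    is_max = (prev < dedup) & (nxt < dedup)

    # Align back onto the original index; rows that are not extrema become NaN.
    return pd.DataFrame({
        'data': series,
        'min': dedup[is_min].reindex(series.index),
        'max': dedup[is_max].reindex(series.index),
    })


result = local_extrema(pd.Series([1, 1, 2, 2, 1, 0, 0, -2, 0]))
print(result)
```

For the data in the question this marks the max at iloc 2 and the min at iloc 7, matching the output above.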
Dani Mesejo