I'm trying to filter a groupby so that each group keeps only the rows from the start of the group up to its first local max, and I'm having some trouble.

To select the local max, I'm using x[((x.B.diff().fillna(1) >= 0).cumprod()) == 1].tail(1)

To get the rows I want, I figured I'd try to use groupby filter and try to get rows with indices smaller than the index of the first local max of the group. (Maybe there's a better way?)

Here's what I'm working on so far:

df.groupby('Flag').filter(lambda x: x.index.values < x.index.get_loc(x[((x.B.diff().fillna(1) >= 0).cumprod()) == 1].tail(1)))

With this I'm currently getting a TypeError that says that one of the rows is an invalid key. I'm assuming I've got some malformed code in the line above.

Sample Data:

            Flag              B
60738       10.0           27.2
60739       10.0           27.3
60740       10.0           27.4
60741       10.0           27.6
60742       10.0           27.8
60743       10.0           28.1
60744       10.0           28.4
60745       10.0           28.7
60746       10.0           29.0
60747       10.0           29.3
60748       10.0           29.6
60749       10.0           29.9
60750       10.0           29.9
60751       10.0           29.9
60752       10.0           29.9
60753       10.0           29.9
60754       10.0           30.1
60755       10.0           30.4
60756       10.0           30.6
60757       10.0           30.9
60758       10.0           31.1
60759       10.0           31.3
60760       10.0           31.6
60761       10.0           31.9
60762       10.0           32.3
60763       10.0           32.6
60764       10.0           33.0
60765       10.0           33.1
60766       10.0           33.3
60767       10.0           33.5
60768       10.0           33.9
60769       10.0           34.3
60770       10.0           34.6
60771       10.0           35.0
60772       10.0           35.4
60773       10.0           35.7
60774       10.0           36.1
60775       10.0           36.2
60776       10.0           36.1
60777       10.0           36.0
60778       10.0           35.8
60779       10.0           35.5
60780       10.0           35.0
60781       10.0           34.6
60782       10.0           34.0
60783       10.0           33.6
60784       10.0           33.3
60785       10.0           33.0
60786       10.0           32.7
60787       10.0           32.4

For this group (Flag 10), I believe the result should contain indexes 60738-60775.

John
  • Can we get some example data? https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-example It's hard (if not impossible) to know what's wrong without it. – FHTMitchell Apr 06 '18 at 15:18
  • Added some sample data – John Apr 06 '18 at 15:35

1 Answer

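I think you need scipy's argrelextrema, which returns the positions of the local maxima in an array:

import numpy as np
from scipy.signal import argrelextrema

# For each Flag group, take the row at the first local maximum of B
df.groupby('Flag').apply(lambda x: x.iloc[argrelextrema(x['B'].values, np.greater)[0][0], :])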

Out[1508]: 
60775  Flag     B
Flag             
10.0   10.0  36.2
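
The row above is the peak itself, while the question asked for every row from the start of the group up to that peak. Here is a minimal sketch of one way to get those rows, reusing the cumulative-product mask from the question instead of scipy. The helper name rows_until_first_peak and its parameters are hypothetical, and it assumes the rows of each group are already in order and that a repeated value (a plateau) still counts as rising.

```python
import pandas as pd


def rows_until_first_peak(df, group_col="Flag", value_col="B"):
    """Keep, per group, the rows from the group start up to the first decrease."""
    # Row-to-row change within each group; the first row of a group has no
    # predecessor (NaN), so treat it as non-decreasing.
    step = df.groupby(group_col)[value_col].diff().fillna(1)
    # The running product of (step >= 0) stays 1 until the first decrease
    # in a group and is 0 for every row after it.
    keep = (step >= 0).astype(int).groupby(df[group_col]).cumprod() == 1
    return df[keep]
```

On the sample data this should keep indexes 60738 through 60775 for group 10. A group in which B never decreases is kept whole.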
BENY
  • For some groupings, B doesn't change. Is there a way for me to get that static value for those groupings along with the local max for those groupings that change? I guess it's not really a local max at that point, but I'd like to capture the value. – John Apr 06 '18 at 16:06
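
For groups where B never changes, argrelextrema with np.greater finds no peak, because the comparison is strict, so the [0][0] lookup in the answer raises an IndexError. A possible workaround, sketched under the assumption that the last row of such a group is an acceptable stand-in for the "static value" (the helper name first_peak_or_last is hypothetical):

```python
import numpy as np
import pandas as pd
from scipy.signal import argrelextrema


def first_peak_or_last(group: pd.DataFrame, value_col: str = "B") -> pd.Series:
    """Return the row at the first strict local max, or the last row if none."""
    # Positions of values strictly greater than both neighbours.
    peaks = argrelextrema(group[value_col].to_numpy(), np.greater)[0]
    # Assumption: with no strict peak (flat or monotonic group), use the last row.
    pos = peaks[0] if len(peaks) else len(group) - 1
    return group.iloc[pos]

# Intended use, mirroring the answer:
# df.groupby('Flag').apply(first_peak_or_last)
```

The same fallback also applies to groups that only increase, since the endpoints are never reported as strict maxima.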