I'm trying to filter a groupby so that each group keeps only the rows from the start of the group up to its first local max, and I'm having some trouble.
To select the local max, I'm using x[((x.B.diff().fillna(1) >= 0).cumprod()) == 1].tail(1)
To get the rows I want, I figured I'd use groupby filter and keep the rows whose index is smaller than the index of the first local max of the group. (Maybe there's a better way?)
Here's what I'm working on so far:
df.groupby('Flag').filter(lambda x: x.index.values < x.index.get_loc(x[((x.B.diff().fillna(1) >= 0).cumprod()) == 1].tail(1)))
With this I'm currently getting a TypeError saying that one of the rows is an invalid key. I'm assuming something in the line above is malformed.
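As far as I can tell, there are two likely culprits. Index.get_loc expects a single index label, but .tail(1) hands it a one-row DataFrame, which would explain the "invalid key" message. Separately, groupby filter expects one True/False per group and keeps or drops the whole group, so it can't select rows within a group. Here is a minimal sketch of both points on a single made-up group (the frame x, its index 100-104, and the names peak_label, peak_pos, head, and kept are hypothetical, just for illustration):

```python
import pandas as pd

# Hypothetical single group: rises to 3.0, dips, then rises again.
x = pd.DataFrame(
    {"Flag": [10.0] * 5, "B": [1.0, 2.0, 3.0, 2.5, 4.0]},
    index=[100, 101, 102, 103, 104],
)

# Rows before the first decrease in B (same expression as in the question).
rising = x[(x.B.diff().fillna(1) >= 0).cumprod() == 1]

# get_loc wants a label, so pull the label out of the one-row selection.
peak_label = rising.index[-1]
peak_pos = x.index.get_loc(peak_label)

# Positional slice from the group start through the first local max.
head = x.iloc[: peak_pos + 1]

# filter takes one boolean per group and returns whole groups only.
kept = x.groupby("Flag").filter(lambda g: g.B.max() > 3)
```

So even with the get_loc call repaired, filter would return either all of a group or none of it, which isn't what I'm after.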
Sample Data:
Flag B
60738 10.0 27.2
60739 10.0 27.3
60740 10.0 27.4
60741 10.0 27.6
60742 10.0 27.8
60743 10.0 28.1
60744 10.0 28.4
60745 10.0 28.7
60746 10.0 29.0
60747 10.0 29.3
60748 10.0 29.6
60749 10.0 29.9
60750 10.0 29.9
60751 10.0 29.9
60752 10.0 29.9
60753 10.0 29.9
60754 10.0 30.1
60755 10.0 30.4
60756 10.0 30.6
60757 10.0 30.9
60758 10.0 31.1
60759 10.0 31.3
60760 10.0 31.6
60761 10.0 31.9
60762 10.0 32.3
60763 10.0 32.6
60764 10.0 33.0
60765 10.0 33.1
60766 10.0 33.3
60767 10.0 33.5
60768 10.0 33.9
60769 10.0 34.3
60770 10.0 34.6
60771 10.0 35.0
60772 10.0 35.4
60773 10.0 35.7
60774 10.0 36.1
60775 10.0 36.2
60776 10.0 36.1
60777 10.0 36.0
60778 10.0 35.8
60779 10.0 35.5
60780 10.0 35.0
60781 10.0 34.6
60782 10.0 34.0
60783 10.0 33.6
60784 10.0 33.3
60785 10.0 33.0
60786 10.0 32.7
60787 10.0 32.4
I believe that for this group (Flag 10) the result should contain indexes 60738 through 60775.
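To make the goal concrete, here is a rough sketch of the kind of result I'm looking for, using a per-group boolean mask built with groupby transform instead of filter. This is only one possible approach, not necessarily the best one. The small two-group frame and the helper name up_to_first_local_max are made up for illustration. Like my original expression, it uses >= 0, so a flat stretch before the drop counts as still rising.

```python
import pandas as pd

# Hypothetical data: two groups, each rising and then falling.
df = pd.DataFrame(
    {
        "Flag": [10.0, 10.0, 10.0, 10.0, 10.0, 20.0, 20.0, 20.0, 20.0],
        "B": [27.2, 27.3, 27.3, 27.1, 27.5, 5.0, 6.0, 5.5, 7.0],
    },
    index=range(60738, 60747),
)


def up_to_first_local_max(b: pd.Series) -> pd.Series:
    """Mark rows from the group start through the first local max."""
    non_decreasing = b.diff().fillna(1) >= 0
    # cumprod stays 1 until the first decrease, then is 0 for the rest.
    return non_decreasing.cumprod()


# transform returns one value per row, aligned to df's index;
# astype(bool) turns the 1/0 values into a usable mask.
mask = df.groupby("Flag")["B"].transform(up_to_first_local_max).astype(bool)
result = df[mask]
```

On this toy frame, result keeps rows 60738-60740 for Flag 10 and rows 60743-60744 for Flag 20. Applied to the sample data above, I'd expect the same logic to keep 60738 through 60775.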