I have a DataFrame that I want to split into multiple groups. Each group will be a sequence of rows where the column difference
is equal to 1. If not, skip it and find the next row with difference
equal to 1 and start a new group.
For example this:
id difference
0 001 1
1 001 1
2 001 1
3 001 1
4 001 1
5 001 1
6 001 2
7 001 2
8 001 1
9 001 1
10 001 1
11 001 1
12 001 4
13 001 1
14 001 1
15 001 1
16 001 1
17 001 1
18 001 1
19 001 1
will be 3 dfs first: from 0 to 5 (including 5), second: from 8 to 11, and third: from 13 to 19
Right now I do it this way, and I am new to pandas. Is there any other efficient way of doing it?
grouped = g.df((g['difference'] != g['difference'].shift()).cumsum())
for group_id, group in grouped:
if (group['difference'].iloc[0] < 1.1) & (group['difference'].iloc[0] > 0.9) and len(
group.index) > 1:
#do stuff...