I have a dataframe with the following structure - Start, End and Height.
Some properties of the dataframe:
- A row in the dataframe always starts where the previous row ended, i.e. if the end of row n is 100 then the start of row n+1 is 101.
- The height of row n+1 is always different from the height of row n (this is the reason the data is in different rows).
I'd like to group the dataframe so that heights fall into buckets of width 5, i.e. the buckets are 0, 1-5, 6-10, 11-15 and >15, and consecutive rows whose heights land in the same bucket are merged into a single row.
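To make the bucket boundaries concrete, here is a minimal sketch of how they might be expressed with `pd.cut`. The label strings (`"1_5"`, `"gt_15"`, etc.) are my own assumption, modelled on the `6_10` style in the expected output, and the sketch assumes heights are non-negative integers.

```python
import pandas as pd

# Right-inclusive edges for the buckets 0, 1-5, 6-10, 11-15 and >15.
# The lower edge of -1 exists only so that a height of exactly 0 gets its own bucket.
BINS = [-1, 0, 5, 10, 15, float("inf")]
# Hypothetical label names, following the "6_10" style of the expected output.
LABELS = ["0", "1_5", "6_10", "11_15", "gt_15"]

heights = pd.Series([0, 5, 6, 10, 12, 15, 16])
buckets = pd.cut(heights, bins=BINS, labels=LABELS)
print(buckets.tolist())
```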
See the example session below; what I'm looking for is the implementation of the group_by_bucket function.
I looked at other questions but couldn't find an exact answer to this.
Thanks in advance!
>>> import pandas as pd
>>> d = pd.DataFrame([[1, 3, 8], [4, 10, 7], [11, 17, 6], [18, 26, 12], [27, 30, 15], [31, 40, 6], [41, 42, 7]], columns=['start', 'end', 'height'])
>>> d
   start  end  height
0      1    3       8
1      4   10       7
2     11   17       6
3     18   26      12
4     27   30      15
5     31   40       6
6     41   42       7
>>> d_gb = group_by_bucket(d)
>>> d_gb
   start  end height_grouped
0      1   17           6_10
1     18   30          11_15
2     31   42           6_10
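
For reference, the behaviour I'm after could be sketched roughly as follows. This is only one possible approach, not a confirmed solution: it assumes non-negative integer heights, uses `pd.cut` for the buckets, and detects runs of consecutive equal buckets with the `shift`/`cumsum` idiom. The label names other than `6_10` and `11_15` are made up, and the first height is taken as 8, as in the printed frame.

```python
import pandas as pd


def group_by_bucket(df):
    """Merge consecutive rows whose heights fall in the same bucket (sketch)."""
    bins = [-1, 0, 5, 10, 15, float("inf")]
    labels = ["0", "1_5", "6_10", "11_15", "gt_15"]  # hypothetical label names
    bucket = pd.cut(df["height"], bins=bins, labels=labels).astype(str)

    # A new run starts whenever the bucket differs from the previous row's bucket;
    # the cumulative sum of those change flags gives each run its own id.
    run_id = (bucket != bucket.shift()).cumsum().rename("run")

    return (
        df.assign(height_grouped=bucket)
        .groupby(run_id)
        .agg(
            start=("start", "first"),  # start of the first row in the run
            end=("end", "last"),       # end of the last row in the run
            height_grouped=("height_grouped", "first"),
        )
        .reset_index(drop=True)
    )


# Heights as shown in the printed frame (first height is 8).
d = pd.DataFrame(
    [[1, 3, 8], [4, 10, 7], [11, 17, 6], [18, 26, 12], [27, 30, 15], [31, 40, 6], [41, 42, 7]],
    columns=["start", "end", "height"],
)
d_gb = group_by_bucket(d)
print(d_gb)
```

Grouping on the run id rather than on the bucket label itself is what keeps the two separate `6_10` stretches (rows 0-2 and rows 5-6) from being merged into one.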