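I'm quite new to pandas, so I may be doing something silly. I have a 1-dimensional numpy array and want to use pandas to group its values so that a new group starts wherever the gap between consecutive sorted values exceeds 5, and then collect the original indices of each group. Below is my incomplete attempt (Python 3.8).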
import numpy as np
import pandas as pd
values = np.array([20, 40, 48, 42, 25]) # unsorted 1-dimensional array (named values to avoid shadowing the built-in input)
dataframe = pd.DataFrame({"v": values}).sort_values("v")
"""
dataframe is:
    v
0  20
4  25
1  40
3  42
2  48
"""
# start a new group whenever the gap to the previous sorted value is greater than 5
dataframe["group"] = dataframe["v"].diff().gt(5).cumsum()
"""
dataframe is:
    v  group
0  20      0
4  25      0
1  40      1
3  42      1
2  48      2
"""
result = dataframe.???????
What I want to get as result is something like one of the following (the original indices that fall into each group):
{0: [0, 4], 1: [1, 3], 2: [2]}
or
[[0, 4], [1, 3], [2]]
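For what it's worth, one way the missing step might be filled in is to read the row labels of each group from the groupby object. This is only a sketch under the assumptions of the attempt above (a gap threshold of 5 and the default integer index); the array is called values here rather than input to avoid shadowing the built-in, and as_dict and as_list are names made up for illustration.

```python
import numpy as np
import pandas as pd

values = np.array([20, 40, 48, 42, 25])  # same data as in the attempt
dataframe = pd.DataFrame({"v": values}).sort_values("v")
# New group whenever the gap to the previous sorted value exceeds 5.
dataframe["group"] = dataframe["v"].diff().gt(5).cumsum()

# .groups maps each group label to the index labels of its rows;
# the index still holds the positions in the original, unsorted array.
grouped = dataframe.groupby("group").groups

# Convert to plain Python ints and lists (hypothetical result names).
as_dict = {int(label): [int(i) for i in index]
           for label, index in sorted(grouped.items())}
as_list = list(as_dict.values())

print(as_dict)
print(as_list)
```

Within each group the indices come out in ascending order of value, because the frame was sorted before grouping.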
A solution that does the equivalent without pandas would of course be welcome too.
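As a rough idea of what a pandas-free version could look like, the sketch below uses only numpy: argsort for the sorted order, diff for the gaps, and split to cut the order at each large gap. It assumes the same threshold of 5 as above; the names order, gaps, breaks and groups are made up for illustration.

```python
import numpy as np

values = np.array([20, 40, 48, 42, 25])  # same data as in the attempt

# Original indices arranged in ascending order of value.
order = np.argsort(values, kind="stable")

# Gaps between consecutive sorted values.
gaps = np.diff(values[order])

# Positions in `order` where a new group starts (gap greater than 5).
breaks = np.flatnonzero(gaps > 5) + 1

# Cut the index order at those positions and convert to plain lists.
groups = [chunk.tolist() for chunk in np.split(order, breaks)]

print(groups)
```

If the dict form is preferred, dict(enumerate(groups)) gives the same grouping keyed by group number.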