I'm new to Python and have a simple question that I haven't found an answer to yet. Let's say I have a time series c(t):
t_ c_
1 40
2 41
3 4
4 5
5 7
6 20
7 20
8 8
9 90
10 99
11 10
12 5
13 8
14 8
15 19
I now want to evaluate this series with respect to how long the value c stays continuously within certain ranges, and how often runs of each duration occur.
The result would therefore have three columns: c (binned), duration (binned), and frequency. Applied to the simple example, the result could look as follows:
c_ Dt_ Freq_
0-50 8 1
50-100 2 1
0-50 5 1
Can you give me some advice?
Thanks in advance,
Ulrike
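For reference, a minimal sketch of how the table above could be produced with pandas. The bin edges [0, 50, 100] and the names starts_new_run, runs and Dt are assumptions chosen to match the example, not part of any given solution.

```python
import pandas as pd

# Sample data copied from the table above.
series = pd.DataFrame({
    "t": range(1, 16),
    "c": [40, 41, 4, 5, 7, 20, 20, 8, 90, 99, 10, 5, 8, 8, 19],
})

# Assign each value of c to a bin: (0, 50] or (50, 100] (assumed edges).
bins = pd.cut(series["c"], [0, 50, 100])

# A new run starts wherever the bin differs from the previous row's bin.
starts_new_run = bins != bins.shift()

# The cumulative sum gives every consecutive run its own id.
run_ids = starts_new_run.cumsum()

# One row per run: the bin it lies in and how many samples it lasted.
runs = bins.groupby(run_ids).agg(["first", "count"])
runs.columns = ["c", "Dt"]
print(runs)
```

Each run occurs once here, so the frequency column is 1 throughout. The duration is counted in rows, which equals time only if t is evenly spaced.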
//EDIT: Thank you for the replies! My example data was somewhat flawed, so it couldn't show part of my question. Here is a new data series:
series=
t c
1 1
2 1
3 10
4 10
5 10
6 1
7 1
8 50
9 50
10 50
12 1
13 1
14 1
If I apply the code proposed by Christoph below:
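import pandas as pd
# Assign each value of c to a bin: (-1, 5] or (5, 100]
bins = pd.cut(series['c'], [-1, 5, 100])
# Despite its name, this is True where the bin differs from the previous row
same_as_prev = (bins != bins.shift())
run_ids = same_as_prev.cumsum()
# One row per run: the bin it lies in and its length in rows
result = bins.groupby(run_ids).aggregate(["first", "count"])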
I receive a result like this:
first count
(-1, 5] 2
(5, 100] 3
(-1, 5] 2
(5, 100] 3
(-1, 5] 3
But what I'm actually interested in is something like this:
c length freq
(-1, 5] 2 2
(-1, 5] 3 1
(5, 100] 3 2
How do I achieve this? And how could I plot it in a KDE plot?
Best,
Ulrike
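For the KDE part, a rough sketch assuming that the quantity of interest is the distribution of run lengths, and that matplotlib and SciPy are installed (pandas needs both for Series.plot.kde). The variable names lengths and ax are illustrative.

```python
import pandas as pd

# Values of c copied from the new series above.
c = pd.Series([1, 1, 10, 10, 10, 1, 1, 50, 50, 50, 1, 1, 1])

# Length of every consecutive run of values falling into the same bin.
bins = pd.cut(c, [-1, 5, 100])
run_ids = (bins != bins.shift()).cumsum()
lengths = bins.groupby(run_ids).count()

# Kernel density estimate over all run lengths, drawn with pandas' plotting API.
ax = lengths.plot.kde()
ax.set_xlabel("run length")
```

In a script, importing matplotlib.pyplot and calling plt.show() afterwards should display the figure. With only five runs the curve is not very meaningful, and a separate curve per bin needs at least two different run lengths in that bin, which the (5, 100] bin of this sample does not have.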