I have a raw dataframe that looks like this:
      codcet     placa_encoded  date        time_seconds  velocidade
5031  490191222  431.0          2021-03-11  70079.0       51
5032  490221211  431.0          2021-03-11  72219.0       55
7991  490361213  562.0          2021-03-11  28559.0       24
7992  490361232  562.0          2021-03-11  29102.0       29
7993  490361221  562.0          2021-03-11  30183.0       33
...
The numbers on the far left are index labels from the original dataset.
My goal is to convert this into a dataframe indexed by placa_encoded and by n, a counter within each group, so that it looks like this:
placa_encoded  n  time_seconds  velocidade  codcet
431.0          0  70079.0       51          490191222
431.0          1  72219.0       55          490221211
562.0          0  28559.0       24          490361213
562.0          1  29102.0       29          490361232
562.0          2  30183.0       33          490361221
That is, I aim to groupby('placa_encoded') and then add another column n that counts the position within each group. Each row should be indexed by both placa_encoded and n. I think I can use cumcount() to do this, but it's unclear to me how to add it as a column, since groupby doesn't produce a dataframe I can assign to. I looked at this question, but it seems they use .count() to convert it to a dataframe, and I want to preserve the data instead of getting any counts. I also tried pd.DataFrame(gbplaca) and pd.DataFrame(gbplaca.groups), to no avail.
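For reference, here is a minimal sketch of the cumcount() idea described above. It assumes the five sample rows are rebuilt by hand into a frame called df (a hypothetical name, as is result), and it relies on cumcount() returning a Series aligned with the original index, which is what would allow it to be assigned back as a column. I have not checked it against the full dataset.

```python
import pandas as pd

# Sample rows copied from the question; the original index labels are kept.
# The names `df` and `result` are illustrative assumptions.
df = pd.DataFrame(
    {
        "codcet": [490191222, 490221211, 490361213, 490361232, 490361221],
        "placa_encoded": [431.0, 431.0, 562.0, 562.0, 562.0],
        "date": ["2021-03-11"] * 5,
        "time_seconds": [70079.0, 72219.0, 28559.0, 29102.0, 30183.0],
        "velocidade": [51, 55, 24, 29, 33],
    },
    index=[5031, 5032, 7991, 7992, 7993],
)

# cumcount() numbers the rows within each group starting at 0, in their
# current order, and returns a Series aligned with df's index, so it can
# be assigned directly without converting the groupby object.
df["n"] = df.groupby("placa_encoded").cumcount()

# Move placa_encoded and n into a two-level index and keep the data columns.
result = df.set_index(["placa_encoded", "n"])[
    ["time_seconds", "velocidade", "codcet"]
]
print(result)
```

If the counter should follow time rather than the existing row order, sorting by time_seconds before the groupby would presumably be needed; the sample rows already happen to be in time order within each group.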
Thank you so much!