1

I have a dataset grouped by an ID, but I want to add a new label according to the date the ID occurred.

My dataset looks like this:

ID Date
1 10/11/20
1 15/11/20
1 13/12/20
1 16/12/20
2 10/11/19
2 15/11/19
2 13/12/20
2 16/12/20
3 10/11/19
3 15/11/19
3 13/12/20
3 16/12/20
4 10/11/19
4 15/11/19
4 13/12/20
4 16/12/20

And so on.

I would like my dataset, when ordered, to then look like this:

ID Date Order
1 10/11/20 1
1 15/11/20 2
1 13/12/20 3
1 16/12/20 4
2 10/11/19 1
2 15/11/19 2
2 13/12/20 3
2 16/12/20 4
3 10/11/19 1
3 15/11/19 2
3 13/12/20 3
3 16/12/20 4
4 10/11/19 1
4 15/11/19 2
4 13/12/20 3
4 16/12/20 4

Does anybody know if there is a way to do this or a function that could help?

  • 1
    `df.groupby('id')['Date'].rank()`? `df.groupby('id').cumcount() + 1 ` would also work if data is sorted. – Quang Hoang Jul 27 '22 at 15:50
  • In Pandas these are "windowing operations" ( similar to SQL window functions). Pandas can only do it with numeric types so you have to create a date index first. see https://stackoverflow.com/questions/53433242/pandas-dataframe-window-using-date-index and https://pandas.pydata.org/docs/user_guide/window.html – Davos Jul 27 '22 at 15:53

0 Answers0