
A quick one, but I am a bit stuck on it. I have a dataframe with a column of 3 classes: 0, 1, 2. The idea is to calculate a cumsum for each class, possibly using groupby, although there could be some other way.

Here is my df:

| classes |
|    1    |
|    0    |
|    1    |
|    2    |
|    1    |
|    2    |
|    0    |
|    0    |

The 'ID' column I'd like to see:

| classes | ID |
|    1    | 1  |
|    0    | 1  |
|    1    | 2  |
|    2    | 1  |
|    1    | 3  |
|    2    | 2  |
|    0    | 2  |
|    0    | 3  |
....etc...

Any ideas?

Piotr

2 Answers


You need to use cumcount here, not cumsum :) cumcount numbers the rows within each group starting at 0, hence the + 1.

df['ID'] = df.groupby('classes').cumcount() + 1
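
To see why cumsum is the wrong tool here, a small sketch on the data from the question may help. The variable names (summed, counted, ones) and the helper column "one" are made up for illustration; only groupby, cumsum, cumcount and assign are actual pandas calls.

```python
import pandas as pd

# The sample data from the question
df = pd.DataFrame({"classes": [1, 0, 1, 2, 1, 2, 0, 0]})

# cumsum adds up the class labels themselves, so it is not a per-class counter
summed = df.groupby("classes")["classes"].cumsum()

# cumcount numbers the rows within each group from 0; add 1 for a 1-based ID
counted = df.groupby("classes").cumcount() + 1

# cumsum does give the same result if it runs over a helper column of ones
# ("one" is a hypothetical column name used only for this illustration)
ones = df.assign(one=1).groupby("classes")["one"].cumsum()

print(pd.DataFrame({"classes": df["classes"], "cumsum": summed, "cumcount+1": counted}))
```

So if you really want cumsum, it has to run over a column of ones rather than over the class labels; cumcount just skips that extra step.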
Nk03

You're looking for GroupBy.cumcount, I believe:

df["ID"] = df.groupby("classes").cumcount() + 1

to get

    classes   ID
0          1   1
1          0   1
2          1   2
3          2   1
4          1   3
5          2   2
6          0   2
7          0   3
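
If it helps to see what this does under the hood, here is a rough plain-Python equivalent: keep a running counter per class and record its value at each row. This is only a sketch of the idea, not how pandas implements it; the names seen and ids are made up for illustration.

```python
from collections import defaultdict

import pandas as pd

# The sample data from the question
classes = [1, 0, 1, 2, 1, 2, 0, 0]

# Plain-Python version: one running counter per class value
seen = defaultdict(int)
ids = []
for c in classes:
    seen[c] += 1          # this class has now been seen one more time
    ids.append(seen[c])   # the 1-based occurrence number for this row

# pandas version from the answer
df = pd.DataFrame({"classes": classes})
df["ID"] = df.groupby("classes").cumcount() + 1

# Both approaches should agree row by row
assert df["ID"].tolist() == ids
```

The groupby version keeps the original row order and avoids the Python-level loop, which is why I'd go with it on a real dataframe.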
Mustafa Aydın