2

I have a data frame with a category column.

For each class within the category column, I would like a repeating sequence from 1 to n. For example, in the below table, for each unique value in the category column (e.g. a, b, c etc.), I would like to have a repeating sequence from 1 to 3 in the corresponding sequence column

id category sequence
1 a 1
2 a 2
3 a 3
4 a 1
5 a 2
6 a 3
7 b 1
8 b 2
9 b 3
10 b 1
11 b 2
12 b 3

I would very much appreciate any suggestions

Marley
  • 121
  • 1
  • 9
  • Does this answer your question? [How to add sequential counter column on groups using Pandas groupby](https://stackoverflow.com/questions/23435270/how-to-add-sequential-counter-column-on-groups-using-pandas-groupby) – wwnde Jan 06 '21 at 00:51

2 Answers2

4

Use cumcount:

df['sequence'] = (df.groupby('category').cumcount() % 3) + 1
print(df)

Output

    id category  sequence
0    1        a         1
1    2        a         2
2    3        a         3
3    4        a         1
4    5        a         2
5    6        a         3
6    7        b         1
7    8        b         2
8    9        b         3
9   10        b         1
10  11        b         2
11  12        b         3

As an alternative:

df['sequence'] = df.groupby('category').cumcount().mod(3).add(1)
Dani Mesejo
  • 61,499
  • 6
  • 49
  • 76
0

If there are always a multiple of 3 rows per group, then you don't need to .groupby. You can use string multiplication and convert to list / Series:

df['sequence'] = list('123'*(df.shape[0]//3))

Or if the column datqa type must be int, then:

df['sequence'] = pd.Series(list('123'*(df.shape[0]//3))).astype(int)
df

Out[1]: 
    id category  sequence
0    1        a         1
1    2        a         2
2    3        a         3
3    4        a         1
4    5        a         2
5    6        a         3
6    7        b         1
7    8        b         2
8    9        b         3
9   10        b         1
10  11        b         2
11  12        b         3
David Erickson
  • 16,433
  • 2
  • 19
  • 35