How can I add a column to a pandas DataFrame that uniquely identifies grouped data?

Asked Mar 12 '18 at 16:03

Active Mar 12 '18 at 16:03

Viewed 34 times

Starting with a dataframe like this:

df = pd.DataFrame({"Cat1":['a','a', 'b','b','c','c','c', 'c', 'c'], "Cat2":[1,1,1,2,1,1,2,2,2]})

df
Out[2]: 
  Cat1  Cat2
0    a     1
1    a     1
2    b     1
3    b     2
4    c     1
5    c     1
6    c     2
7    c     2
8    c     2

How can I manipulate the data in order to produce a count of each "Cat1"-"Cat2" pair, in increasing order? The result would look like this:

df = pd.DataFrame({"Cat1":['a','a', 'b','b','c','c','c', 'c', 'c'], "Cat2":[1,1,1,2,1,1,2,2,2], "Count":[1,2,1,1,1,2,1,2,3]})

df
Out[4]: 
  Cat1  Cat2  Count
0    a     1      1
1    a     1      2
2    b     1      1
3    b     2      1
4    c     1      1
5    c     1      2
6    c     2      1
7    c     2      2
8    c     2      3

asked Mar 12 '18 at 16:03

wesanyer

Use `df['Count'] = df.groupby(['Cat1','Cat2']).cumcount() + 1` – jezrael Mar 12 '18 at 16:06
1

great, thanks for the help. sorry for the duplicate question, I couldn't figure out the correct way to search for this to see if it had been answered before. – wesanyer Mar 12 '18 at 16:18

How can I add a column to a pandas DataFrame that uniquely identifies grouped data?

0 Answers0