I am working on a data frame that contains over 70 actions. One column groups those actions. I want to create a new column that holds the rank of each string in an existing column. The following is a sample of the data frame:

import pandas as pd

DF = pd.DataFrame()
DF['template'] = ['Attk','Attk','Attk','Attk','Attk','Attk','Def','Def','Def','Def','Def','Def','Accuracy','Accuracy','Accuracy','Accuracy','Accuracy','Accuracy']
DF['Stats'] = ['Goal','xG','xA','Goal','xG','xA','Block','interception','tackles','Block','interception','tackles','Acc.passes','Acc.actions','Acc.crosses','Acc.passes','Acc.actions','Acc.crosses']
# Sort so that Stats are in alphabetical order within each template
DF = DF.sort_values(['template','Stats'])

The new column I want to create should group by template and rank the Stats in alphabetical order within each group.

The expected data frame (shown as an image in the original post) adds a rank column that restarts at 1 within each template.

Each template has 10 to 15 Stats.

halfer
Zephyr

1 Answer

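Use GroupBy.transform with a lambda function and pd.factorize. Because factorize numbers the values from 0, add 1 to the result. Note that factorize numbers values in order of first appearance, so this relies on DF already being sorted by template and Stats: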

# factorize returns (codes, uniques); the codes start at 0
f = lambda x: pd.factorize(x)[0]
DF['Order'] = DF.groupby('template')['Stats'].transform(f) + 1
print(DF)
    template         Stats  Order
13  Accuracy   Acc.actions      1
16  Accuracy   Acc.actions      1
14  Accuracy   Acc.crosses      2
17  Accuracy   Acc.crosses      2
12  Accuracy    Acc.passes      3
15  Accuracy    Acc.passes      3
0       Attk          Goal      1
3       Attk          Goal      1
2       Attk            xA      2
5       Attk            xA      2
1       Attk            xG      3
4       Attk            xG      3
6        Def         Block      1
9        Def         Block      1
7        Def  interception      2
10       Def  interception      2
8        Def       tackles      3
11       Def       tackles      3
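
If the data frame is not sorted beforehand, a variant that may help is passing sort=True to pd.factorize, so that the codes follow the sorted order of the unique values rather than their order of appearance. The following is only a minimal sketch on a shortened, deliberately unsorted sample; the helper name alphabetical_rank is made up for illustration and is not part of the answer above. Keep in mind that Python's default string ordering is case-sensitive, so uppercase letters sort before lowercase ones.

```python
import pandas as pd

# Shortened, deliberately unsorted sample (assumption: same column names as in the question)
DF = pd.DataFrame({
    'template': ['Attk', 'Attk', 'Attk', 'Def', 'Def', 'Def'],
    'Stats': ['xG', 'Goal', 'xA', 'tackles', 'Block', 'interception'],
})

def alphabetical_rank(stats):
    # Hypothetical helper: sort=True makes the codes follow the sorted unique values
    codes, _ = pd.factorize(stats, sort=True)
    # Codes start at 0, so shift them to start at 1
    return codes + 1

DF['Order'] = DF.groupby('template')['Stats'].transform(alphabetical_rank)
print(DF)
```

With this variant the row order of DF is left untouched, and rows with the same Stats value within a template receive the same rank.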
jezrael