1

have a data frame data as below

   InsuranceId InsuranceStatus  Date
0  Ins1234     DuePayment       2020-06-07 23:59:43.123456+00:00
1  Ins1234     Successful       2019-06-07 23:59:43.123456+00:00
2  Ins1234     Successful       2018-06-07 23:59:43.123456+00:00
3  Ins5678     DuePayment       2020-07-07 22:59:32.123421+00:00
4  Ins5678     Successful       2019-07-07 22:59:32.123421+00:00
5  Ins5678     Successful       2018-07-07 22:59:32.123421+00:00

trying to create a rownumber/rank on basis of group by InsuranceId and max(Date)

df['RowNum'] = df.groupby('InsuranceId')['InsuranceStatus']['Date'].rank(method="first", ascending=True)

and 

df['RowNum'] = df.groupby(by=['InsuranceId'])['InsuranceStatus']['Date'].transform(lambda x: x.rank())

by referring SQL-like window functions in PANDAS: Row Numbering in Python Pandas Dataframe

Error: Index Error: Columns status already selected 

trying to achieve below output

   InsuranceId InsuranceStatus  Date                                RowNum
0  Ins1234     DuePayment       2020-06-07 23:59:43.123456+00:00    1
1  Ins1234     Successful       2019-06-07 23:59:43.123456+00:00    2
2  Ins1234     Successful       2018-06-07 23:59:43.123456+00:00    3
3  Ins5678     DuePayment       2020-07-07 22:59:32.123421+00:00    1
4  Ins5678     Successful       2019-07-07 22:59:32.123421+00:00    2
5  Ins5678     Successful       2018-07-07 22:59:32.123421+00:00    3

Is there anything i am missing to add. pls any suggestions

Final output:

   InsuranceId InsuranceStatus  Date                                
   Ins1234     DuePayment       2020-06-07 23:59:43.123456+00:00    
   Ins5678     DuePayment       2020-07-07 22:59:32.123421+00:00    
Ravi
  • 793
  • 3
  • 16
  • 29

1 Answers1

2

Using rank. Just pass the value you want to do group by and rank the column which needs to be ranked.

df['Rank'] = df.groupby(by=['InsuranceId'])['Date'].rank(method='max',ascending=False)
df[df['Rank']==1]

Output:

InsuranceId InsuranceStatus  Date                                
Ins1234     DuePayment       2020-06-07 23:59:43.123456+00:00    
Ins5678     DuePayment       2020-07-07 22:59:32.123421+00:00    
Equinox
  • 6,483
  • 3
  • 23
  • 32