I have a dataframe as follows : data_frame:
column_A column_B, column_C
2 3 5 row= 0
2 3 6 row= 1
3 4 2 row= 2
3 4 9 row= 3
what am trying to do is to check the rows that have same values for their column_A and their column_B, and get the maximum value for each group obtained: To be more precise: in my exemple I would like to get the row 1 and row 3. So the expected output should be:
column_A column_B, column_C
2 3 6 row= 1
3 4 9 row= 3
I tried to apply groupby with max method to column_C: it seems to work but What i got as output is column_C. Here is my code:
test_df=pd.DataFrame([[2,3,5],[2,3,6],[3,4,2],[3,4,9]],columns=['column_A','column_B','column_C'])
result= test_df.groupby(['column_A','column_B'], sort=False)['column_C'].max()
print(result)
and here is the result
column_A column_B
2 3 6
3 4 9
Name: column_C, dtype: int64
I know why I have a serie ( since I apply max method on the column_C ), but i couldn't figure out a way to get the corresponding rows instead of only getting column_C values.