I have a pandas DataFrame. I want to select only the rows where column C has the largest value within each group of column B. For example, when B is "one" the maximum value of C is 311, so I would like the row where B = "one" and C = 311.
import pandas as pd
import numpy as np
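# The original dict listed key 'A' twice; only the last value is kept, so the first entry is dropped here.
df2 = pd.DataFrame({
    'A': pd.Categorical(["test1", "test2", "test3", "test4"]),
    'B': pd.Categorical(["one", "one", "two", "two"]),
    'C': np.array([311, 42, 31, 41]),
    'D': np.array([9, 8, 7, 6]),
})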
My attempt so far, which does not give the result I want:
df2.groupby('C').max()
The output should be:
    A    B    C  D
test1  one  311  9
test4  two   41  6
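One possible way to get this result, offered as a sketch rather than a definitive answer: group by B (not C), find the index label of the largest C in each group with `idxmax`, and use those labels to pull the full rows. The names `idx` and `result` are just illustrative, and `observed=True` is an assumption added so that unused categories in B would not produce empty groups.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    'A': pd.Categorical(["test1", "test2", "test3", "test4"]),
    'B': pd.Categorical(["one", "one", "two", "two"]),
    'C': np.array([311, 42, 31, 41]),
    'D': np.array([9, 8, 7, 6]),
})

# For each group in B, get the index label of the row with the largest C.
idx = df2.groupby('B', observed=True)['C'].idxmax()

# Select the complete rows at those labels.
result = df2.loc[idx]
print(result)
```

Note that `idxmax` keeps only the first row per group if several rows tie for the maximum. If tied rows should all be kept, comparing C against `df2.groupby('B', observed=True)['C'].transform('max')` and filtering on that boolean mask may be a better fit.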