I have a data frame that has repeating values in 2 columns and I only want to keep the highest value of each combination. For the following data frame:
import numpy as np
import pandas as pd

df = pd.DataFrame(
np.array([['A', 'B', 3], ['A', 'B', 6], ['C', 'D', 9], ['C', 'D', 2], ['C', 'B', 4]]))
df
how would I get this dataframe as a result:
|A|B|6|
|C|D|9|
|C|B|4|
Here's my code:
df = df.groupby([0]).max().sort_values(2,ascending=False)
df
and this is what it returns:
|A|B|6|
|C|D|9|
My problem with my code is that it only groups on the first column (so CD is treated the same as CB, but I want 2 separate rows returned). I want to keep the row with the highest value for every combination of the first two columns. A very similar but different post is this one. Can someone please let me know what I can do to fix this problem? Thanks!
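To make the goal concrete, here is a minimal sketch of the behaviour I am after, assuming that grouping on both columns 0 and 1 is the right approach. Note that it builds the frame from a plain list rather than `np.array`, which is my own assumption so that column 2 stays numeric instead of being converted to strings; `result` is just an illustrative name.

```python
import pandas as pd

# Same rows as above, but built from a plain list (assumption) so that
# column 2 holds integers and max() compares numbers, not strings.
df = pd.DataFrame(
    [['A', 'B', 3], ['A', 'B', 6], ['C', 'D', 9], ['C', 'D', 2], ['C', 'B', 4]]
)

# Group on both key columns so that (C, D) and (C, B) stay separate,
# then keep the largest value of column 2 within each group.
# sort=False keeps the groups in order of first appearance.
result = df.groupby([0, 1], as_index=False, sort=False)[2].max()
print(result)
```

With `as_index=False` the two key columns stay as regular columns, so the result has the same three-column shape as the table I want.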