I have a pandas DataFrame like the one below:
import pandas as pd

df = pd.DataFrame({"A": ["apple", "apple", "banana", "banana", "banana", "pineapple"],
                   "B": [0.5, 0.77, 0.32, 0.16, 0.05, 1],
                   "C": [132, 44, 32, 11, 0, 5]})
Now I want to create a new DataFrame from this one that keeps, for each unique value of column A, only the row with the highest value of column B and discards the other rows. The desired result would look like this:
A B C
apple 0.77 44
banana 0.32 32
pineapple 1 5
Is there an elegant and efficient way of doing this in Python? (The real DataFrame is quite big and has more columns besides C.)
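To make the requirement concrete, here is a minimal sketch of one way the selection could be written. It assumes the index labels are unique and that, when several rows in a group tie on B, keeping the first one is acceptable; neither assumption is stated above. The names best_rows, result and alt are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["apple", "apple", "banana", "banana", "banana", "pineapple"],
    "B": [0.5, 0.77, 0.32, 0.16, 0.05, 1],
    "C": [132, 44, 32, 11, 0, 5],
})

# For each group in A, idxmax gives the index label of the row with the largest B.
# Assumption: index labels are unique, so each label selects exactly one row.
best_rows = df.groupby("A")["B"].idxmax()

# Select those rows with all their columns (C and any others) intact.
result = df.loc[best_rows].reset_index(drop=True)

# Alternative sketch: sort by B descending, then keep the first row seen per A.
alt = (
    df.sort_values("B", ascending=False)
      .drop_duplicates("A")
      .sort_values("A")
      .reset_index(drop=True)
)

print(result)
```

Both variants keep every column of the selected rows, which matters when the real DataFrame has more columns than C. Which one is faster on a large DataFrame likely depends on the number of groups, so it would be worth timing both on the real data.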