I want to do almost the same thing as this question.
However, the approach in the accepted answer by @jezrael is far too slow on my dataset: the original dataframe has ~300k rows, and the nlargest(1) command takes a few minutes to run. Furthermore, when I tried it on a dataframe limited with head(1000), I did not get a single row per group; I got back exactly the same Series that value_counts returns.
In my own words: my dataset has two columns, like this:
Session  Rating
A        Positive
A        Positive
A        Positive
A        Negative
B        Negative
B        Negative
C        Positive
C        Negative
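For reproducibility, here is a minimal sketch that builds the sample above. The variable name df and the literal values are only the toy example, not my real ~300k-row data.

```python
import pandas as pd

# Toy sample matching the table above (not the real dataset).
df = pd.DataFrame({
    "Session": ["A", "A", "A", "A", "B", "B", "C", "C"],
    "Rating": ["Positive", "Positive", "Positive", "Negative",
               "Negative", "Negative", "Positive", "Negative"],
})

print(df)
```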
Using counts = df.groupby('Session')['Rating'].value_counts() I get a Series object like this:
Session  Rating
A        Positive    3
         Negative    1
B        Negative    2
C        Positive    1
         Negative    1
How do I get a dataframe that includes only the Rating with the maximum count for each Session? In cases where several ratings tie for the maximum (such as C), I would like to exclude that Session from the returned table.
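To make the desired result concrete, here is a rough sketch of one way it could be computed on the toy sample. It is only an illustration using size, transform and duplicated, not the approach from the linked answer, and its speed on a large dataframe is untested here. The names counts, max_count, top and result, and the column name Count, are made up for the example.

```python
import pandas as pd

# Toy sample from the question (not the real dataset).
df = pd.DataFrame({
    "Session": ["A", "A", "A", "A", "B", "B", "C", "C"],
    "Rating": ["Positive", "Positive", "Positive", "Negative",
               "Negative", "Negative", "Positive", "Negative"],
})

# One row per (Session, Rating) pair with its count, as a flat dataframe.
counts = df.groupby(["Session", "Rating"]).size().reset_index(name="Count")

# Largest count within each Session, aligned to every row of counts.
max_count = counts.groupby("Session")["Count"].transform("max")

# Keep only the rows that reach their Session's maximum.
top = counts[counts["Count"] == max_count]

# A Session that still has more than one row has a tie, so drop it entirely.
result = top[~top.duplicated(subset="Session", keep=False)].reset_index(drop=True)

print(result)
```

On the sample, this should leave one row for A (Positive, 3) and one for B (Negative, 2), with C dropped because of the tie.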