How to get rows of a DataFrame conditionally

Question

I have the following DataFrame:

I need to create a new DataFrame from it that keeps only the maximum value of col3 for each unique pair of col1 and col2 values. The end result needs to look like the following:

For example, there are 2 rows where col1 is a and col2 is c. In one row, col3 is 1, and in the other, col3 is 2. I only need the row where col3 is 2, because 2 is the max value of col3 where col1 is a and col2 is c.

What would be the most elegant way of doing this without making it too complex?

hi there 1k rep user can you add a [mcve] please also see [ask] — Umar.H, Aug 07 '20 at 15:20
Please provide a small set of sample data **as text** that we can copy and paste. Include the corresponding desired result. Check out the guide on [how to make good reproducible pandas examples](https://stackoverflow.com/a/20159305/3620003). — timgeb, Aug 07 '20 at 15:20

score 0 · Accepted Answer · answered Aug 07 '20 at 15:20

df.groupby(['col1','col2'],as_index=False)['col3'].max()
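
As a minimal sketch of how the one-liner above behaves: the question's original table is not reproduced here, so the sample values below are hypothetical and only the column names col1, col2 and col3 come from the question.

```python
import pandas as pd

# Hypothetical sample data; the question's own table is not available.
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b"],
    "col2": ["c", "c", "d", "c", "c"],
    "col3": [1, 2, 5, 3, 4],
})

# Group by the (col1, col2) pair and take the largest col3 in each group.
# as_index=False keeps col1 and col2 as ordinary columns in the result.
result = df.groupby(["col1", "col2"], as_index=False)["col3"].max()
print(result)
```

With this sample, the two rows where col1 is a and col2 is c collapse into a single row with col3 equal to 2.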

sangmoo

This is REALLY elegant! What if there were a 4th column whose value I would like to keep as well, whatever it happens to be on that row? Your solution leaves a potential 4th column out. – edn Aug 07 '20 at 15:32
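
One possible way to keep the other columns, offered as a sketch rather than as the answerer's method: select whole rows with `idxmax` instead of aggregating a single column. The col4 column and all sample values are hypothetical, and the sketch assumes the DataFrame has a unique index.

```python
import pandas as pd

# Hypothetical sample data with an extra column (col4) to carry along.
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b"],
    "col2": ["c", "c", "d", "c", "c"],
    "col3": [1, 2, 5, 3, 4],
    "col4": ["p", "q", "r", "s", "t"],
})

# For each (col1, col2) group, idxmax gives the index label of the row
# holding the largest col3 value.
idx = df.groupby(["col1", "col2"])["col3"].idxmax()

# Select those full rows, so col4 (and any other column) is preserved.
result = df.loc[idx].reset_index(drop=True)
print(result)
```

If several rows in a group tie for the maximum, `idxmax` picks the first of them, so exactly one row per group is returned.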
